* [PATCH v3 0/5] ARM Error Source Table V2 Support
@ 2025-01-15 8:42 Ruidong Tian
2025-01-15 8:42 ` [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver Ruidong Tian
` (5 more replies)
0 siblings, 6 replies; 16+ messages in thread
From: Ruidong Tian @ 2025-01-15 8:42 UTC (permalink / raw)
To: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: tianruidong
AEST provides a mechanism for hardware to directly notify the kernel to
handle RAS errors through interrupts, which is also known as kernel-first
mode.
AEST's Advantages
========================
1. AEST uses EL1 interrupts to report CEs/DEs, making it more lightweight
than GHES (the firmware-first solution on Arm).
2. The lightweight AEST allows the system to report every CE, enabling user
applications to use this information for memory error prediction.
AEST Driver Architecture
========================
The AEST driver consists of three components:
- AEST device: handles interrupts and manages AEST nodes and records.
- AEST node: corresponds to a RAS node in hardware [1].
- AEST record: a set of RAS registers.
They are organized together as follows.
┌──────────────────────────────────────────────────┐
│ AEST Driver Device Management │
│┌─────────────┐ ┌──────────┐ ┌───────────┐ │
││ AEST Device ├─┬─►│AEST Node ├──┬─►│AEST Record│ │
│└─────────────┘ │ └──────────┘ │ └───────────┘ │
│ │ . │ ┌───────────┐ │
│ │ . ├─►│AEST Record│ │
│ │ . │ └───────────┘ │
│ │ ┌──────────┐ │ . │
│ ├─►│AEST Node │ │ . │
│ │ └──────────┘ │ . │
│ │ │ ┌───────────┐ │
│ │ ┌──────────┐ └─►│AEST Record│ │
│ └─►│AEST Node │ └───────────┘ │
│ └──────────┘ │
└──────────────────────────────────────────────────┘
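The hierarchy in the diagram can be sketched as a pair of nested arrays. The structures and the valid-bit walk below are an illustrative user-space model (the `toy_*` names are made up for this sketch), not the driver's actual layout:

```c
#include <assert.h>

/* Simplified model: one device owns several nodes, each node owns
 * several records (RAS register sets). */
struct toy_record {
	unsigned long err_status;     /* ERR<n>STATUS image */
};

struct toy_node {
	struct toy_record *records;   /* one register set per record */
	int record_count;
};

struct toy_device {
	struct toy_node *nodes;       /* one entry per hardware RAS node */
	int node_count;
};

/* Walk device -> nodes -> records, counting records whose valid bit
 * (bit 30, ERR_STATUS_V in the patch) is set. */
static int toy_count_valid(const struct toy_device *dev)
{
	int i, j, n = 0;

	for (i = 0; i < dev->node_count; i++)
		for (j = 0; j < dev->nodes[i].record_count; j++)
			if (dev->nodes[i].records[j].err_status & (1UL << 30))
				n++;
	return n;
}
```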
AEST Interrupt Handle
=====================
Once an AEST interrupt occurs:
1. The AEST device traverses all AEST nodes to locate the errored record.
2. There are two types of records in each node:
- report record: the node can locate the errored record through a bitmap
in the ERRGSR register.
- poll record: the node must poll every record to check whether it errored.
3. Process the record:
- if the error is corrected, reset the CE threshold and print it.
- if the error is deferred, dump the registers and call memory_failure().
- if the error is uncorrected, panic; in practice a UE usually raises an
exception rather than an interrupt.
4. Decode the record: the AEST driver notifies other drivers, like EDAC,
to decode the RAS registers.
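The severity dispatch in step 3 can be sketched as a classifier over ERR<n>STATUS bits. The bit positions match the architectural layout (UE bit 29, CE bits [25:24], DE bit 23); the `toy_dispatch()` helper and action names are only an illustration of the policy above, not the driver's code:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_STATUS_UE  (1u << 29)   /* uncorrected error */
#define TOY_STATUS_CE  (3u << 24)   /* corrected error (2-bit field) */
#define TOY_STATUS_DE  (1u << 23)   /* deferred error */

enum toy_action { TOY_IGNORE, TOY_LOG, TOY_MEMORY_FAILURE, TOY_PANIC };

static enum toy_action toy_dispatch(uint32_t err_status)
{
	if (err_status & TOY_STATUS_UE)   /* uncorrected: panic */
		return TOY_PANIC;
	if (err_status & TOY_STATUS_DE)   /* deferred: memory_failure() */
		return TOY_MEMORY_FAILURE;
	if (err_status & TOY_STATUS_CE)   /* corrected: reset threshold, log */
		return TOY_LOG;
	return TOY_IGNORE;
}
```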
AEST Error Injection
====================
The AEST driver provides an error injection interface (software simulation
instead of real hardware errors); details can be found in patch 3.
Address Translation
===================
As described in section 2.2 of [0], the error address reported by an AEST
record may be a node-specific Logical Address (LA) rather than the System
Physical Address (SPA) used by the kernel, so the driver needs to translate
the LA to an SPA; this is similar to AMD ATL [2]. Patch 4 therefore
introduces a common translation function for both AMD and ARM.
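The ERR<n>ADDR layout makes the LA/SPA distinction explicit: the AI bit (bit 61) flags a non-physical address, and bits [55:0] hold the address itself, mirroring the patch's ERR_ADDR_AI / ERR_ADDR_PADDR macros. A minimal sketch of that check, with an illustrative helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ADDR_AI     (1ULL << 61)          /* address is not a SPA */
#define TOY_ADDR_PADDR  ((1ULL << 56) - 1)    /* GENMASK_ULL(55, 0) */

/*
 * Extract a System Physical Address from an ERR<n>ADDR value. Returns
 * false when the AI bit marks the address as node-specific, in which
 * case an ATL-style logical-to-physical translation is needed first.
 */
static bool toy_err_addr_to_spa(uint64_t err_addr, uint64_t *spa)
{
	if (err_addr & TOY_ADDR_AI)
		return false;
	*spa = err_addr & TOY_ADDR_PADDR;
	return true;
}
```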
I have tested this series on the T-Head Yitian710 SoC.
Future work:
1. Add CE storm mitigation.
2. Support AEST vendor node.
This series is based on Tyler Baicar's patches [1], which have not had a v2
sent to the mailing list yet. Changes from the original patches:
1. Add a genpool to collect all AEST errors, and log them from a workqueue
rather than in irq context.
2. Use a single aest_proc function for both the system register interface
and the MMIO interface.
3. Restructure some structures and functions to make them clearer.
4. Address all comments from Tyler Baicar's mailing list thread.
Changes from V2:
https://lore.kernel.org/all/20240321025317.114621-1-tianruidong@linux.alibaba.com/
1. Tomohiro Misono
- dump register before panic
2. Baolin Wang & Shuai Xue: address all comments.
3. Support AEST V2.
Changes from V1:
https://lore.kernel.org/all/20240304111517.33001-1-tianruidong@linux.alibaba.com/
1. Marc Zyngier
- Use readq/writeq_relaxed instead of readq/writeq for MMIO address.
- Add sync for system register operation.
- Use irq_is_percpu_devid() helper to identify a per-CPU interrupt.
- Other fix.
2. Set the RAS CE threshold in the AEST driver.
3. Enable the RAS interrupt explicitly in the driver.
4. UER and UEO trigger memory_failure() rather than panic.
[0]: https://developer.arm.com/documentation/den0085/0101/
[1]: https://lore.kernel.org/all/20211124170708.3874-1-baicar@os.amperecomputing.com/
[2]: https://lore.kernel.org/all/20240123041401.79812-2-yazen.ghannam@amd.com/
Ruidong Tian (5):
ACPI/RAS/AEST: Initial AEST driver
RAS/AEST: Introduce AEST driver sysfs interface
RAS/AEST: Introduce AEST inject interface to test AEST driver
RAS/ATL: Unified ATL interface for ARM64 and AMD
trace, ras: add ARM RAS extension trace event
Documentation/ABI/testing/debugfs-aest | 115 +++
MAINTAINERS | 11 +
arch/arm64/include/asm/ras.h | 95 +++
drivers/acpi/arm64/Kconfig | 11 +
drivers/acpi/arm64/Makefile | 1 +
drivers/acpi/arm64/aest.c | 340 ++++++++
drivers/acpi/arm64/init.c | 2 +
drivers/acpi/arm64/init.h | 1 +
drivers/edac/amd64_edac.c | 2 +-
drivers/ras/Kconfig | 1 +
drivers/ras/Makefile | 1 +
drivers/ras/aest/Kconfig | 17 +
drivers/ras/aest/Makefile | 7 +
drivers/ras/aest/aest-core.c | 1017 ++++++++++++++++++++++++
drivers/ras/aest/aest-inject.c | 151 ++++
drivers/ras/aest/aest-sysfs.c | 230 ++++++
drivers/ras/aest/aest.h | 338 ++++++++
drivers/ras/amd/atl/core.c | 4 +-
drivers/ras/amd/atl/internal.h | 2 +-
drivers/ras/amd/atl/umc.c | 3 +-
drivers/ras/ras.c | 27 +-
include/linux/acpi_aest.h | 68 ++
include/linux/cpuhotplug.h | 1 +
include/linux/ras.h | 17 +-
include/ras/ras_event.h | 71 ++
25 files changed, 2510 insertions(+), 23 deletions(-)
create mode 100644 Documentation/ABI/testing/debugfs-aest
create mode 100644 arch/arm64/include/asm/ras.h
create mode 100644 drivers/acpi/arm64/aest.c
create mode 100644 drivers/ras/aest/Kconfig
create mode 100644 drivers/ras/aest/Makefile
create mode 100644 drivers/ras/aest/aest-core.c
create mode 100644 drivers/ras/aest/aest-inject.c
create mode 100644 drivers/ras/aest/aest-sysfs.c
create mode 100644 drivers/ras/aest/aest.h
create mode 100644 include/linux/acpi_aest.h
--
2.33.1
* [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
@ 2025-01-15 8:42 ` Ruidong Tian
2025-01-17 10:50 ` Tomohiro Misono (Fujitsu)
2025-02-19 20:49 ` Borislav Petkov
2025-01-15 8:42 ` [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface Ruidong Tian
` (4 subsequent siblings)
5 siblings, 2 replies; 16+ messages in thread
From: Ruidong Tian @ 2025-01-15 8:42 UTC (permalink / raw)
To: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: tianruidong, Tyler Baicar
Add support for parsing the ARM Error Source Table and basic handling of
errors reported through both memory mapped and system register interfaces.
Assume system register interfaces are only registered with private
peripheral interrupts (PPIs); otherwise there is no guarantee the
core handling the error is the core which took the error and has the
syndrome info in its system registers.
In kernel-first mode, all configuration is controlled by the kernel,
including the CE threshold and interrupt enable/disable.
All detected errors are processed as follows:
- CE, DE: use a workqueue to log these errors.
- UER, UEO: log the error and call memory_failure() from a workqueue.
- UC, UEU: panic in irq context.
Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
---
MAINTAINERS | 10 +
arch/arm64/include/asm/ras.h | 95 ++++
drivers/acpi/arm64/Kconfig | 11 +
drivers/acpi/arm64/Makefile | 1 +
drivers/acpi/arm64/aest.c | 335 ++++++++++++
drivers/acpi/arm64/init.c | 2 +
drivers/acpi/arm64/init.h | 1 +
drivers/ras/Kconfig | 1 +
drivers/ras/Makefile | 1 +
drivers/ras/aest/Kconfig | 17 +
drivers/ras/aest/Makefile | 5 +
drivers/ras/aest/aest-core.c | 976 +++++++++++++++++++++++++++++++++++
drivers/ras/aest/aest.h | 323 ++++++++++++
include/linux/acpi_aest.h | 68 +++
include/linux/cpuhotplug.h | 1 +
include/linux/ras.h | 8 +
16 files changed, 1855 insertions(+)
create mode 100644 arch/arm64/include/asm/ras.h
create mode 100644 drivers/acpi/arm64/aest.c
create mode 100644 drivers/ras/aest/Kconfig
create mode 100644 drivers/ras/aest/Makefile
create mode 100644 drivers/ras/aest/aest-core.c
create mode 100644 drivers/ras/aest/aest.h
create mode 100644 include/linux/acpi_aest.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 637ddd44245f..d757f9339627 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -330,6 +330,16 @@ S: Maintained
F: drivers/acpi/arm64
F: include/linux/acpi_iort.h
+ACPI AEST
+M: Ruidong Tian <tianruidong@linux.alibaba.com>
+L: linux-acpi@vger.kernel.org
+L: linux-arm-kernel@lists.infradead.org
+S: Supported
+F: arch/arm64/include/asm/ras.h
+F: drivers/acpi/arm64/aest.c
+F: drivers/ras/aest/
+F: include/linux/acpi_aest.h
+
ACPI FOR RISC-V (ACPI/riscv)
M: Sunil V L <sunilvl@ventanamicro.com>
L: linux-acpi@vger.kernel.org
diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h
new file mode 100644
index 000000000000..7676add8a0ed
--- /dev/null
+++ b/arch/arm64/include/asm/ras.h
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_RAS_H
+#define __ASM_RAS_H
+
+#include <linux/types.h>
+#include <linux/bits.h>
+
+/* ERR<n>FR */
+#define ERR_FR_CE GENMASK_ULL(54, 53)
+#define ERR_FR_RP BIT(15)
+#define ERR_FR_CEC GENMASK_ULL(14, 12)
+
+#define ERR_FR_RP_SINGLE_COUNTER 0
+#define ERR_FR_RP_DOUBLE_COUNTER 1
+
+#define ERR_FR_CEC_0B_COUNTER 0
+#define ERR_FR_CEC_8B_COUNTER BIT(1)
+#define ERR_FR_CEC_16B_COUNTER BIT(2)
+
+/* ERR<n>STATUS */
+#define ERR_STATUS_AV BIT(31)
+#define ERR_STATUS_V BIT(30)
+#define ERR_STATUS_UE BIT(29)
+#define ERR_STATUS_ER BIT(28)
+#define ERR_STATUS_OF BIT(27)
+#define ERR_STATUS_MV BIT(26)
+#define ERR_STATUS_CE (BIT(25) | BIT(24))
+#define ERR_STATUS_DE BIT(23)
+#define ERR_STATUS_PN BIT(22)
+#define ERR_STATUS_UET (BIT(21) | BIT(20))
+#define ERR_STATUS_CI BIT(19)
+#define ERR_STATUS_IERR GENMASK_ULL(15, 8)
+#define ERR_STATUS_SERR GENMASK_ULL(7, 0)
+
+/* These bits are write-one-to-clear */
+#define ERR_STATUS_W1TC (ERR_STATUS_AV | ERR_STATUS_V | ERR_STATUS_UE | \
+ ERR_STATUS_ER | ERR_STATUS_OF | ERR_STATUS_MV | \
+ ERR_STATUS_CE | ERR_STATUS_DE | ERR_STATUS_PN | \
+ ERR_STATUS_UET | ERR_STATUS_CI)
+
+#define ERR_STATUS_UET_UC 0
+#define ERR_STATUS_UET_UEU 1
+#define ERR_STATUS_UET_UEO 2
+#define ERR_STATUS_UET_UER 3
+
+/* ERR<n>CTLR */
+#define ERR_CTLR_CFI BIT(8)
+#define ERR_CTLR_FI BIT(3)
+#define ERR_CTLR_UI BIT(2)
+
+/* ERR<n>ADDR */
+#define ERR_ADDR_AI BIT(61)
+#define ERR_ADDR_PADDR GENMASK_ULL(55, 0)
+
+/* ERR<n>MISC0 */
+
+/* ERR<n>FR.CEC == 0b010, ERR<n>FR.RP == 0 */
+#define ERR_MISC0_8B_OF BIT(39)
+#define ERR_MISC0_8B_CEC GENMASK_ULL(38, 32)
+
+/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 0 */
+#define ERR_MISC0_16B_OF BIT(47)
+#define ERR_MISC0_16B_CEC GENMASK_ULL(46, 32)
+
+#define ERR_MISC0_CEC_SHIFT 31
+
+#define ERR_8B_CEC_MAX (ERR_MISC0_8B_CEC >> ERR_MISC0_CEC_SHIFT)
+#define ERR_16B_CEC_MAX (ERR_MISC0_16B_CEC >> ERR_MISC0_CEC_SHIFT)
+
+/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 1 */
+#define ERR_MISC0_16B_OFO BIT(63)
+#define ERR_MISC0_16B_CECO GENMASK_ULL(62, 48)
+#define ERR_MISC0_16B_OFR BIT(47)
+#define ERR_MISC0_16B_CECR GENMASK_ULL(46, 32)
+
+/* ERRDEVARCH */
+#define ERRDEVARCH_REV GENMASK(19, 16)
+
+enum ras_ce_threshold {
+ RAS_CE_THRESHOLD_0B,
+ RAS_CE_THRESHOLD_8B,
+ RAS_CE_THRESHOLD_16B,
+ RAS_CE_THRESHOLD_32B,
+ UNKNOWN,
+};
+
+struct ras_ext_regs {
+ u64 err_fr;
+ u64 err_ctlr;
+ u64 err_status;
+ u64 err_addr;
+ u64 err_misc[4];
+};
+
+#endif /* __ASM_RAS_H */
diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig
index b3ed6212244c..c8eb6de95733 100644
--- a/drivers/acpi/arm64/Kconfig
+++ b/drivers/acpi/arm64/Kconfig
@@ -21,3 +21,14 @@ config ACPI_AGDI
config ACPI_APMT
bool
+
+config ACPI_AEST
+ bool "ARM Error Source Table Support"
+ depends on ARM64_RAS_EXTN
+
+ help
+ The Arm Error Source Table (AEST) provides details on ACPI
+ extensions that enable kernel-first handling of errors in a
+ system that supports the Armv8 RAS extensions.
+
+ If set, the kernel will report and log hardware errors.
diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
index 05ecde9eaabe..8e240b281fd1 100644
--- a/drivers/acpi/arm64/Makefile
+++ b/drivers/acpi/arm64/Makefile
@@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
obj-$(CONFIG_ACPI_IORT) += iort.o
obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
obj-$(CONFIG_ARM_AMBA) += amba.o
+obj-$(CONFIG_ACPI_AEST) += aest.o
obj-y += dma.o init.o
obj-y += thermal_cpufreq.o
diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c
new file mode 100644
index 000000000000..6dba9c23e04e
--- /dev/null
+++ b/drivers/acpi/arm64/aest.c
@@ -0,0 +1,335 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM Error Source Table Support
+ *
+ * Copyright (c) 2024, Alibaba Group.
+ */
+
+#include <linux/xarray.h>
+#include <linux/platform_device.h>
+#include <linux/acpi_aest.h>
+
+#include "init.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt) "ACPI AEST: " fmt
+
+static struct xarray *aest_array;
+
+static void __init aest_init_interface(struct acpi_aest_hdr *hdr,
+ struct acpi_aest_node *node)
+{
+ struct acpi_aest_node_interface_header *interface;
+
+ interface = ACPI_ADD_PTR(struct acpi_aest_node_interface_header, hdr,
+ hdr->node_interface_offset);
+
+ node->type = hdr->type;
+ node->interface_hdr = interface;
+
+ switch (interface->group_format) {
+ case ACPI_AEST_NODE_GROUP_FORMAT_4K: {
+ struct acpi_aest_node_interface_4k *interface_4k =
+ (struct acpi_aest_node_interface_4k *)(interface + 1);
+
+ node->common = &interface_4k->common;
+ node->record_implemented =
+ (unsigned long *)&interface_4k->error_record_implemented;
+ node->status_reporting =
+ (unsigned long *)&interface_4k->error_status_reporting;
+ node->addressing_mode =
+ (unsigned long *)&interface_4k->addressing_mode;
+ break;
+ }
+ case ACPI_AEST_NODE_GROUP_FORMAT_16K: {
+ struct acpi_aest_node_interface_16k *interface_16k =
+ (struct acpi_aest_node_interface_16k *)(interface + 1);
+
+ node->common = &interface_16k->common;
+ node->record_implemented =
+ (unsigned long *)interface_16k->error_record_implemented;
+ node->status_reporting =
+ (unsigned long *)interface_16k->error_status_reporting;
+ node->addressing_mode =
+ (unsigned long *)interface_16k->addressing_mode;
+ break;
+ }
+ case ACPI_AEST_NODE_GROUP_FORMAT_64K: {
+ struct acpi_aest_node_interface_64k *interface_64k =
+ (struct acpi_aest_node_interface_64k *)(interface + 1);
+
+ node->common = &interface_64k->common;
+ node->record_implemented =
+ (unsigned long *)interface_64k->error_record_implemented;
+ node->status_reporting =
+ (unsigned long *)interface_64k->error_status_reporting;
+ node->addressing_mode =
+ (unsigned long *)interface_64k->addressing_mode;
+ break;
+ }
+ default:
+ pr_err("invalid group format: %d\n", interface->group_format);
+ }
+
+ node->interrupt = ACPI_ADD_PTR(struct acpi_aest_node_interrupt_v2,
+ hdr, hdr->node_interrupt_offset);
+
+ node->interrupt_count = hdr->node_interrupt_count;
+}
+
+static int __init acpi_aest_init_node_common(struct acpi_aest_hdr *aest_hdr,
+ struct acpi_aest_node *node)
+{
+ int ret;
+ struct aest_hnode *hnode;
+ u64 error_device_id;
+
+ aest_init_interface(aest_hdr, node);
+
+ error_device_id = node->common->error_node_device;
+
+ hnode = xa_load(aest_array, error_device_id);
+ if (!hnode) {
+ hnode = kmalloc(sizeof(*hnode), GFP_KERNEL);
+ if (!hnode) {
+ ret = -ENOMEM;
+ goto free;
+ }
+ INIT_LIST_HEAD(&hnode->list);
+ hnode->uid = error_device_id;
+ hnode->count = 0;
+ hnode->type = node->type;
+ xa_store(aest_array, error_device_id, hnode, GFP_KERNEL);
+ }
+
+ list_add_tail(&node->list, &hnode->list);
+ hnode->count++;
+
+ return 0;
+
+free:
+ kfree(node);
+ return ret;
+}
+
+static int __init
+acpi_aest_init_node_default(struct acpi_aest_hdr *aest_hdr)
+{
+ struct acpi_aest_node *node;
+
+ node = kzalloc(sizeof(*node), GFP_KERNEL);
+ if (!node)
+ return -ENOMEM;
+
+ node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
+ aest_hdr->node_specific_offset);
+
+ return acpi_aest_init_node_common(aest_hdr, node);
+}
+
+static int __init
+acpi_aest_init_processor_node(struct acpi_aest_hdr *aest_hdr)
+{
+ struct acpi_aest_node *node;
+
+ node = kzalloc(sizeof(*node), GFP_KERNEL);
+ if (!node)
+ return -ENOMEM;
+
+ node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
+ aest_hdr->node_specific_offset);
+
+ node->processor_spec_pointer = ACPI_ADD_PTR(void, node->spec_pointer,
+ sizeof(struct acpi_aest_processor));
+
+ return acpi_aest_init_node_common(aest_hdr, node);
+}
+
+static int __init acpi_aest_init_node(struct acpi_aest_hdr *header)
+{
+ switch (header->type) {
+ case ACPI_AEST_PROCESSOR_ERROR_NODE:
+ return acpi_aest_init_processor_node(header);
+ case ACPI_AEST_VENDOR_ERROR_NODE:
+ case ACPI_AEST_SMMU_ERROR_NODE:
+ case ACPI_AEST_GIC_ERROR_NODE:
+ case ACPI_AEST_PCIE_ERROR_NODE:
+ case ACPI_AEST_PROXY_ERROR_NODE:
+ case ACPI_AEST_MEMORY_ERROR_NODE:
+ return acpi_aest_init_node_default(header);
+ default:
+ pr_err("acpi table header type is invalid: %d\n", header->type);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int __init acpi_aest_init_nodes(struct acpi_table_header *aest_table)
+{
+ struct acpi_aest_hdr *aest_node, *aest_end;
+ struct acpi_table_aest *aest;
+ int rc;
+
+ aest = (struct acpi_table_aest *)aest_table;
+ aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
+ sizeof(struct acpi_table_header));
+ aest_end = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
+ aest_table->length);
+
+ while (aest_node < aest_end) {
+ if (((u64)aest_node + aest_node->length) > (u64)aest_end) {
+ pr_warn(FW_WARN "AEST node pointer overflow, bad table.\n");
+ return -EINVAL;
+ }
+
+ rc = acpi_aest_init_node(aest_node);
+ if (rc)
+ return rc;
+
+ aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest_node,
+ aest_node->length);
+ }
+
+ return 0;
+}
+
+static int
+acpi_aest_parse_irqs(struct platform_device *pdev, struct acpi_aest_node *anode,
+ struct resource *res, int *res_idx, int irqs[2])
+{
+ int i;
+ struct acpi_aest_node_interrupt_v2 *interrupt;
+ int trigger, irq;
+
+ for (i = 0; i < anode->interrupt_count; i++) {
+ interrupt = &anode->interrupt[i];
+ if (irqs[interrupt->type])
+ continue;
+
+ trigger = (interrupt->flags & AEST_INTERRUPT_MODE) ?
+ ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
+
+ irq = acpi_register_gsi(&pdev->dev, interrupt->gsiv, trigger,
+ ACPI_ACTIVE_HIGH);
+ if (irq <= 0) {
+ pr_err("failed to map AEST GSI %d\n", interrupt->gsiv);
+ return irq;
+ }
+
+ res[*res_idx].start = irq;
+ res[*res_idx].end = irq;
+ res[*res_idx].flags = IORESOURCE_IRQ;
+ res[*res_idx].name = interrupt->type ? "eri" : "fhi";
+
+ (*res_idx)++;
+
+ irqs[interrupt->type] = irq;
+ }
+
+ return 0;
+}
+
+static int __init acpi_aest_alloc_pdev(void)
+{
+ int ret, j, size;
+ struct aest_hnode *ahnode = NULL;
+ unsigned long i;
+ struct platform_device *pdev;
+ struct acpi_device *companion;
+ struct acpi_aest_node *anode;
+ char uid[16];
+ struct resource *res;
+
+ xa_for_each(aest_array, i, ahnode) {
+ int irq[2] = { 0 };
+
+ res = kcalloc(ahnode->count + 2, sizeof(*res), GFP_KERNEL);
+ if (!res) {
+ ret = -ENOMEM;
+ break;
+ }
+
+ pdev = platform_device_alloc("AEST", i);
+ if (!pdev) {
+ ret = -ENOMEM;
+ break;
+ }
+
+ ret = snprintf(uid, sizeof(uid), "%u", (u32)i);
+ companion = acpi_dev_get_first_match_dev("ARMHE000", uid, -1);
+ if (companion)
+ ACPI_COMPANION_SET(&pdev->dev, companion);
+
+ j = 0;
+ list_for_each_entry(anode, &ahnode->list, list) {
+ if (anode->interface_hdr->type !=
+ ACPI_AEST_NODE_SYSTEM_REGISTER) {
+ res[j].name = "AEST:RECORD";
+ res[j].start = anode->interface_hdr->address;
+ size = anode->interface_hdr->error_record_count *
+ sizeof(struct ras_ext_regs);
+ res[j].end = res[j].start + size - 1;
+ res[j].flags = IORESOURCE_MEM;
+ }
+
+ ret = acpi_aest_parse_irqs(pdev, anode, res, &j, irq);
+ if (ret) {
+ platform_device_put(pdev);
+ break;
+ }
+ }
+
+ ret = platform_device_add_resources(pdev, res, j);
+ if (ret)
+ break;
+
+ ret = platform_device_add_data(pdev, &ahnode, sizeof(ahnode));
+ if (ret)
+ break;
+
+ ret = platform_device_add(pdev);
+ if (ret)
+ break;
+ }
+
+ kfree(res);
+ if (ret)
+ platform_device_put(pdev);
+
+ return ret;
+}
+
+void __init acpi_aest_init(void)
+{
+ acpi_status status;
+ int ret;
+ struct acpi_table_header *aest_table;
+
+ status = acpi_get_table(ACPI_SIG_AEST, 0, &aest_table);
+ if (ACPI_FAILURE(status)) {
+ if (status != AE_NOT_FOUND) {
+ const char *msg = acpi_format_exception(status);
+
+ pr_err("Failed to get table, %s\n", msg);
+ }
+
+ return;
+ }
+
+ aest_array = kzalloc(sizeof(struct xarray), GFP_KERNEL);
+ xa_init(aest_array);
+
+ ret = acpi_aest_init_nodes(aest_table);
+ if (ret) {
+ pr_err("Failed init aest node %d\n", ret);
+ goto out;
+ }
+
+ ret = acpi_aest_alloc_pdev();
+ if (ret)
+ pr_err("Failed alloc pdev %d\n", ret);
+
+out:
+ acpi_put_table(aest_table);
+}
diff --git a/drivers/acpi/arm64/init.c b/drivers/acpi/arm64/init.c
index 7a47d8095a7d..b0c768923831 100644
--- a/drivers/acpi/arm64/init.c
+++ b/drivers/acpi/arm64/init.c
@@ -12,4 +12,6 @@ void __init acpi_arch_init(void)
acpi_iort_init();
if (IS_ENABLED(CONFIG_ARM_AMBA))
acpi_amba_init();
+ if (IS_ENABLED(CONFIG_ACPI_AEST))
+ acpi_aest_init();
}
diff --git a/drivers/acpi/arm64/init.h b/drivers/acpi/arm64/init.h
index dcc277977194..3902d1676068 100644
--- a/drivers/acpi/arm64/init.h
+++ b/drivers/acpi/arm64/init.h
@@ -5,3 +5,4 @@ void __init acpi_agdi_init(void);
void __init acpi_apmt_init(void);
void __init acpi_iort_init(void);
void __init acpi_amba_init(void);
+void __init acpi_aest_init(void);
diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
index fc4f4bb94a4c..61a2a05d9c94 100644
--- a/drivers/ras/Kconfig
+++ b/drivers/ras/Kconfig
@@ -33,6 +33,7 @@ if RAS
source "arch/x86/ras/Kconfig"
source "drivers/ras/amd/atl/Kconfig"
+source "drivers/ras/aest/Kconfig"
config RAS_FMPM
tristate "FRU Memory Poison Manager"
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
index 11f95d59d397..72411ee9deaf 100644
--- a/drivers/ras/Makefile
+++ b/drivers/ras/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_RAS_CEC) += cec.o
obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
obj-y += amd/atl/
+obj-y += aest/
diff --git a/drivers/ras/aest/Kconfig b/drivers/ras/aest/Kconfig
new file mode 100644
index 000000000000..6d436d911bea
--- /dev/null
+++ b/drivers/ras/aest/Kconfig
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# ARM Error Source Table Support
+#
+# Copyright (c) 2024, Alibaba Group.
+#
+
+config AEST
+ tristate "ARM AEST Driver"
+ depends on ACPI_AEST && RAS
+
+ help
+ The Arm Error Source Table (AEST) provides details on ACPI
+ extensions that enable kernel-first handling of errors in a
+ system that supports the Armv8 RAS extensions.
+
+ If set, the kernel will report and log hardware errors.
diff --git a/drivers/ras/aest/Makefile b/drivers/ras/aest/Makefile
new file mode 100644
index 000000000000..a6ba7e36fb43
--- /dev/null
+++ b/drivers/ras/aest/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_AEST) += aest.o
+
+aest-y := aest-core.o
diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
new file mode 100644
index 000000000000..060a1eedee0a
--- /dev/null
+++ b/drivers/ras/aest/aest-core.c
@@ -0,0 +1,976 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM Error Source Table Support
+ *
+ * Copyright (c) 2021-2024, Alibaba Group.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/panic.h>
+#include <linux/platform_device.h>
+#include <linux/xarray.h>
+#include <linux/cpuhotplug.h>
+#include <linux/genalloc.h>
+#include <linux/ras.h>
+
+#include "aest.h"
+
+DEFINE_PER_CPU(struct aest_device, percpu_adev);
+
+#undef pr_fmt
+#define pr_fmt(fmt) "AEST: " fmt
+
+/*
+ * This memory pool is only to be used to save AEST events in AEST irq
+ * context. There can be at most 500 events outstanding.
+ */
+#define AEST_NODE_ALLOCED_MAX 500
+
+#define AEST_LOG_PREFIX_BUFFER 64
+
+BLOCKING_NOTIFIER_HEAD(aest_decoder_chain);
+
+static void aest_print(struct aest_event *event)
+{
+ static atomic_t seqno = { 0 };
+ unsigned int curr_seqno;
+ char pfx_seq[AEST_LOG_PREFIX_BUFFER];
+ int index;
+ struct ras_ext_regs *regs;
+
+ curr_seqno = atomic_inc_return(&seqno);
+ snprintf(pfx_seq, sizeof(pfx_seq), "{%u}" HW_ERR, curr_seqno);
+ pr_info("%sHardware error from AEST %s\n", pfx_seq, event->node_name);
+
+ switch (event->type) {
+ case ACPI_AEST_PROCESSOR_ERROR_NODE:
+ pr_err("%s Error from CPU%d\n", pfx_seq, event->id0);
+ break;
+ case ACPI_AEST_MEMORY_ERROR_NODE:
+ pr_err("%s Error from memory at SRAT proximity domain %#x\n",
+ pfx_seq, event->id0);
+ break;
+ case ACPI_AEST_SMMU_ERROR_NODE:
+ pr_err("%s Error from SMMU IORT node %#x subcomponent %#x\n",
+ pfx_seq, event->id0, event->id1);
+ break;
+ case ACPI_AEST_VENDOR_ERROR_NODE:
+ pr_err("%s Error from vendor hid %8.8s uid %#x\n",
+ pfx_seq, event->hid, event->id1);
+ break;
+ case ACPI_AEST_GIC_ERROR_NODE:
+ pr_err("%s Error from GIC type %#x instance %#x\n",
+ pfx_seq, event->id0, event->id1);
+ break;
+ default:
+ pr_err("%s Unknown AEST node type\n", pfx_seq);
+ return;
+ }
+
+ index = event->index;
+ regs = &event->regs;
+
+ pr_err("%s ERR%dFR: 0x%llx\n", pfx_seq, index, regs->err_fr);
+ pr_err("%s ERR%dCTRL: 0x%llx\n", pfx_seq, index, regs->err_ctlr);
+ pr_err("%s ERR%dSTATUS: 0x%llx\n", pfx_seq, index, regs->err_status);
+ if (regs->err_status & ERR_STATUS_AV)
+ pr_err("%s ERR%dADDR: 0x%llx\n", pfx_seq, index,
+ regs->err_addr);
+
+ if (regs->err_status & ERR_STATUS_MV) {
+ pr_err("%s ERR%dMISC0: 0x%llx\n", pfx_seq, index,
+ regs->err_misc[0]);
+ pr_err("%s ERR%dMISC1: 0x%llx\n", pfx_seq, index,
+ regs->err_misc[1]);
+ pr_err("%s ERR%dMISC2: 0x%llx\n", pfx_seq, index,
+ regs->err_misc[2]);
+ pr_err("%s ERR%dMISC3: 0x%llx\n", pfx_seq, index,
+ regs->err_misc[3]);
+ }
+}
+
+static void aest_handle_memory_failure(u64 addr)
+{
+ unsigned long pfn;
+
+ pfn = PHYS_PFN(addr);
+
+ if (!pfn_valid(pfn)) {
+ pr_warn(HW_ERR "Invalid physical address: %#llx\n", addr);
+ return;
+ }
+
+#ifdef CONFIG_MEMORY_FAILURE
+ memory_failure(pfn, 0);
+#endif
+}
+
+static void init_aest_event(struct aest_event *event, struct aest_record *record,
+ struct ras_ext_regs *regs)
+{
+ struct aest_node *node = record->node;
+ struct acpi_aest_node *info = node->info;
+
+ event->type = node->type;
+ event->node_name = node->name;
+ switch (node->type) {
+ case ACPI_AEST_PROCESSOR_ERROR_NODE:
+ if (info->processor->flags & (ACPI_AEST_PROC_FLAG_SHARED |
+ ACPI_AEST_PROC_FLAG_GLOBAL))
+ event->id0 = smp_processor_id();
+ else
+ event->id0 = info->processor->processor_id;
+
+ event->id1 = info->processor->resource_type;
+ break;
+ case ACPI_AEST_MEMORY_ERROR_NODE:
+ event->id0 = info->memory->srat_proximity_domain;
+ break;
+ case ACPI_AEST_SMMU_ERROR_NODE:
+ event->id0 = info->smmu->iort_node_reference;
+ event->id1 = info->smmu->subcomponent_reference;
+ break;
+ case ACPI_AEST_VENDOR_ERROR_NODE:
+ event->id0 = 0;
+ event->id1 = info->vendor->acpi_uid;
+ event->hid = info->vendor->acpi_hid;
+ break;
+ case ACPI_AEST_GIC_ERROR_NODE:
+ event->id0 = info->gic->interface_type;
+ event->id1 = info->gic->instance_id;
+ break;
+ default:
+ event->id0 = 0;
+ event->id1 = 0;
+ }
+
+ memcpy(&event->regs, regs, sizeof(*regs));
+ event->index = record->index;
+ event->addressing_mode = record->addressing_mode;
+}
+
+static int
+aest_node_gen_pool_add(struct aest_device *adev, struct aest_record *record,
+ struct ras_ext_regs *regs)
+{
+ struct aest_event *event;
+
+ if (!adev->pool)
+ return -EINVAL;
+
+ event = (void *)gen_pool_alloc(adev->pool, sizeof(*event));
+ if (!event)
+ return -ENOMEM;
+
+ init_aest_event(event, record, regs);
+ llist_add(&event->llnode, &adev->event_list);
+
+ return 0;
+}
+
+static void aest_log(struct aest_record *record, struct ras_ext_regs *regs)
+{
+ struct aest_device *adev = record->node->adev;
+
+ if (!aest_node_gen_pool_add(adev, record, regs))
+ schedule_work(&adev->aest_work);
+}
+
+void aest_register_decode_chain(struct notifier_block *nb)
+{
+ blocking_notifier_chain_register(&aest_decoder_chain, nb);
+}
+EXPORT_SYMBOL_GPL(aest_register_decode_chain);
+
+void aest_unregister_decode_chain(struct notifier_block *nb)
+{
+ blocking_notifier_chain_unregister(&aest_decoder_chain, nb);
+}
+EXPORT_SYMBOL_GPL(aest_unregister_decode_chain);
+
+static void aest_node_pool_process(struct work_struct *work)
+{
+ struct llist_node *head;
+ struct aest_event *event;
+ struct aest_device *adev = container_of(work, struct aest_device,
+ aest_work);
+ u64 status, addr;
+
+ head = llist_del_all(&adev->event_list);
+ if (!head)
+ return;
+
+ head = llist_reverse_order(head);
+ llist_for_each_entry(event, head, llnode) {
+ aest_print(event);
+
+ /* TODO: translate Logical Addresses to System Physical Addresses */
+ if (event->addressing_mode == AEST_ADDREESS_LA ||
+ (event->regs.err_addr & ERR_ADDR_AI)) {
+ pr_notice("Cannot translate LA to SPA\n");
+ addr = 0;
+ } else
+ addr = event->regs.err_addr & ((1UL << CONFIG_ARM64_PA_BITS) - 1);
+
+ status = event->regs.err_status;
+ if (addr && ((status & ERR_STATUS_UE) || (status & ERR_STATUS_DE)))
+ aest_handle_memory_failure(addr);
+
+ blocking_notifier_call_chain(&aest_decoder_chain, 0, event);
+ gen_pool_free(adev->pool, (unsigned long)event,
+ sizeof(*event));
+ }
+}
+
+static int aest_node_pool_init(struct aest_device *adev)
+{
+ unsigned long addr, size;
+
+ size = ilog2(sizeof(struct aest_event));
+ adev->pool = devm_gen_pool_create(adev->dev, size, -1,
+ dev_name(adev->dev));
+ if (!adev->pool)
+ return -ENOMEM;
+
+ size = PAGE_ALIGN(sizeof(struct aest_event) * AEST_NODE_ALLOCED_MAX);
+ addr = (unsigned long)devm_kzalloc(adev->dev, size, GFP_KERNEL);
+ if (!addr)
+ return -ENOMEM;
+
+ return gen_pool_add(adev->pool, addr, size, -1);
+}
+
+static void aest_panic(struct aest_record *record, struct ras_ext_regs *regs, char *msg)
+{
+ struct aest_event event = { 0 };
+
+ init_aest_event(&event, record, regs);
+
+ aest_print(&event);
+
+ panic(msg);
+}
+
+static void aest_proc_record(struct aest_record *record, void *data)
+{
+ struct ras_ext_regs regs = {0};
+ int *count = data;
+
+ regs.err_status = record_read(record, ERXSTATUS);
+ if (!(regs.err_status & ERR_STATUS_V))
+ return;
+
+ (*count)++;
+
+ if (regs.err_status & ERR_STATUS_AV)
+ regs.err_addr = record_read(record, ERXADDR);
+
+ regs.err_fr = record->fr;
+ regs.err_ctlr = record_read(record, ERXCTLR);
+
+ if (regs.err_status & ERR_STATUS_MV) {
+ regs.err_misc[0] = record_read(record, ERXMISC0);
+ regs.err_misc[1] = record_read(record, ERXMISC1);
+ if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
+ regs.err_misc[2] = record_read(record, ERXMISC2);
+ regs.err_misc[3] = record_read(record, ERXMISC3);
+ }
+
+ if (record->node->info->interface_hdr->flags &
+ AEST_XFACE_FLAG_CLEAR_MISC) {
+ record_write(record, ERXMISC0, 0);
+ record_write(record, ERXMISC1, 0);
+ if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
+ record_write(record, ERXMISC2, 0);
+ record_write(record, ERXMISC3, 0);
+ }
+ /* ce count is 0 if the record does not support ce */
+ } else if (record->ce.count > 0)
+ record_write(record, ERXMISC0, record->ce.reg_val);
+ }
+
+ /* panic if unrecoverable and uncontainable error encountered */
+ if ((regs.err_status & ERR_STATUS_UE) &&
+ (regs.err_status & ERR_STATUS_UET) > ERR_STATUS_UET_UEU)
+ aest_panic(record, &regs, "AEST: unrecoverable error encountered");
+
+ aest_log(record, &regs);
+
+ /* Write-one-to-clear the bits we've seen */
+ regs.err_status &= ERR_STATUS_W1TC;
+
+ /* A multi-bit field needs all-ones written to clear it. */
+ if (regs.err_status & ERR_STATUS_CE)
+ regs.err_status |= ERR_STATUS_CE;
+
+ /* A multi-bit field needs all-ones written to clear it. */
+ if (regs.err_status & ERR_STATUS_UET)
+ regs.err_status |= ERR_STATUS_UET;
+
+ record_write(record, ERXSTATUS, regs.err_status);
+}
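As an aside, the write-one-to-clear logic above can be sketched in plain C. The mask values below are illustrative stand-ins, not the architectural ERR_STATUS layout:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative masks only; the real ERR_STATUS field positions come
 * from the Arm RAS System Architecture. */
#define STATUS_V    (1ULL << 30)            /* single-bit, W1TC */
#define STATUS_UE   (1ULL << 29)            /* single-bit, W1TC */
#define STATUS_CE   (3ULL << 24)            /* two-bit field */
#define STATUS_W1TC (STATUS_V | STATUS_UE | STATUS_CE)

/* Build the value written back to the status register: keep only the
 * W1TC bits that were observed, and widen any partially-set multi-bit
 * field to its full mask, since such fields only clear when all of
 * their bits are written as ones. */
static uint64_t w1tc_clear_mask(uint64_t status)
{
	status &= STATUS_W1TC;
	if (status & STATUS_CE)
		status |= STATUS_CE;
	return status;
}
```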
+
+static void
+aest_node_foreach_record(void (*func)(struct aest_record *, void *),
+ struct aest_node *node, void *data,
+ unsigned long *bitmap)
+{
+ int i;
+
+ for_each_clear_bit(i, bitmap, node->record_count) {
+ aest_select_record(node, i);
+
+ func(&node->records[i], data);
+
+ aest_sync(node);
+ }
+}
+
+static int aest_proc(struct aest_node *node)
+{
+ int count = 0, i, j, size = node->record_count;
+ u64 err_group = 0;
+
+ aest_node_dbg(node, "Poll bit %*pb\n", size, node->record_implemented);
+ aest_node_foreach_record(aest_proc_record, node, &count,
+ node->record_implemented);
+
+ if (!node->errgsr)
+ return count;
+
+ aest_node_dbg(node, "Report bit %*pb\n", size, node->status_reporting);
+ for (i = 0; i < BITS_TO_U64(size); i++) {
+ err_group = readq_relaxed((void *)node->errgsr + i * 8);
+ aest_node_dbg(node, "errgsr[%d]: 0x%llx\n", i, err_group);
+
+ for_each_set_bit(j, (unsigned long *)&err_group,
+ BITS_PER_TYPE(u64)) {
+ /*
+ * The error group base is only valid for Memory Mapped nodes,
+ * so the driver does not need to write the select register or
+ * issue a sync.
+ */
+ if (test_bit(i * BITS_PER_TYPE(u64) + j, node->status_reporting))
+ continue;
+ aest_proc_record(&node->records[i * BITS_PER_TYPE(u64) + j],
+ &count);
+ }
+ }
+
+ return count;
+}
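As a sketch of the ERRGSR walk above, a plain-C equivalent of scanning each 64-bit group word and mapping word i, bit j to record index i * 64 + j (without the kernel's for_each_set_bit helper) is:

```c
#include <assert.h>
#include <stdint.h>

/* Collect the global record indices of all pending errors reported in
 * an array of ERRGSR group words; word i, bit j maps to record index
 * i * 64 + j. Returns the number of pending records found. */
static int collect_pending(const uint64_t *errgsr, int nwords, int *out)
{
	int i, j, n = 0;

	for (i = 0; i < nwords; i++)
		for (j = 0; j < 64; j++)
			if (errgsr[i] & (1ULL << j))
				out[n++] = i * 64 + j;
	return n;
}
```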
+
+static irqreturn_t aest_irq_func(int irq, void *input)
+{
+ struct aest_device *adev = input;
+ int i;
+
+ for (i = 0; i < adev->node_cnt; i++)
+ aest_proc(&adev->nodes[i]);
+
+ return IRQ_HANDLED;
+}
+
+static void aest_enable_irq(struct aest_record *record)
+{
+ u64 err_ctlr;
+ struct aest_device *adev = record->node->adev;
+
+ err_ctlr = record_read(record, ERXCTLR);
+
+ if (adev->irq[ACPI_AEST_NODE_FAULT_HANDLING])
+ err_ctlr |= (ERR_CTLR_FI | ERR_CTLR_CFI);
+ if (adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY])
+ err_ctlr |= ERR_CTLR_UI;
+
+ record_write(record, ERXCTLR, err_ctlr);
+}
+
+static void aest_config_irq(struct aest_node *node)
+{
+ int i;
+ struct acpi_aest_node_interrupt_v2 *interrupt;
+
+ if (!node->irq_config)
+ return;
+
+ for (i = 0; i < node->info->interrupt_count; i++) {
+ interrupt = &node->info->interrupt[i];
+
+ if (interrupt->type == ACPI_AEST_NODE_FAULT_HANDLING)
+ writeq_relaxed(interrupt->gsiv, node->irq_config);
+
+ if (interrupt->type == ACPI_AEST_NODE_ERROR_RECOVERY)
+ writeq_relaxed(interrupt->gsiv, node->irq_config + 8);
+
+ aest_node_dbg(node, "config irq type %d gsiv %d at %llx",
+ interrupt->type, interrupt->gsiv,
+ (u64)node->irq_config);
+ }
+}
+
+static enum ras_ce_threshold aest_get_ce_threshold(struct aest_record *record)
+{
+ u64 err_fr, err_fr_cec, err_fr_rp;
+
+ err_fr = record->fr;
+ err_fr_cec = FIELD_GET(ERR_FR_CEC, err_fr);
+ err_fr_rp = FIELD_GET(ERR_FR_RP, err_fr);
+
+ if (err_fr_cec == ERR_FR_CEC_0B_COUNTER)
+ return RAS_CE_THRESHOLD_0B;
+ else if (err_fr_rp == ERR_FR_RP_DOUBLE_COUNTER)
+ return RAS_CE_THRESHOLD_32B;
+ else if (err_fr_cec == ERR_FR_CEC_8B_COUNTER)
+ return RAS_CE_THRESHOLD_8B;
+ else if (err_fr_cec == ERR_FR_CEC_16B_COUNTER)
+ return RAS_CE_THRESHOLD_16B;
+ else
+ return UNKNOWN;
+}
+
+static const struct ce_threshold_info ce_info[] = {
+ [RAS_CE_THRESHOLD_0B] = { 0 },
+ [RAS_CE_THRESHOLD_8B] = {
+ .max_count = ERR_8B_CEC_MAX,
+ .mask = ERR_MISC0_8B_CEC,
+ .shift = ERR_MISC0_CEC_SHIFT,
+ },
+ [RAS_CE_THRESHOLD_16B] = {
+ .max_count = ERR_16B_CEC_MAX,
+ .mask = ERR_MISC0_16B_CEC,
+ .shift = ERR_MISC0_CEC_SHIFT,
+ },
+ /* TODO: Support 32B CEC threshold. */
+ [RAS_CE_THRESHOLD_32B] = { 0 },
+};
+
+static void aest_set_ce_threshold(struct aest_record *record)
+{
+ u64 err_misc0, ce_count;
+ struct ce_threshold *ce = &record->ce;
+ const struct ce_threshold_info *info;
+
+ record->threshold_type = aest_get_ce_threshold(record);
+
+ switch (record->threshold_type) {
+ case RAS_CE_THRESHOLD_0B:
+ aest_record_dbg(record, "CE threshold not supported!\n");
+ return;
+ case RAS_CE_THRESHOLD_8B:
+ aest_record_dbg(record, "supports 8-bit CE threshold\n");
+ break;
+ case RAS_CE_THRESHOLD_16B:
+ aest_record_dbg(record, "supports 16-bit CE threshold\n");
+ break;
+ case RAS_CE_THRESHOLD_32B:
+ aest_record_dbg(record, "32-bit CE threshold not supported yet\n");
+ break;
+ default:
+ aest_record_dbg(record, "Unknown CE threshold type!\n");
+ }
+
+ err_misc0 = record_read(record, ERXMISC0);
+ info = &ce_info[record->threshold_type];
+ ce->info = info;
+ ce_count = (err_misc0 & info->mask) >> info->shift;
+ if (ce_count) {
+ ce->count = ce_count;
+ ce->threshold = info->max_count - ce_count + 1;
+ ce->reg_val = err_misc0;
+ aest_record_dbg(record, "CE threshold is %llx, controlled by FW",
+ ce->threshold);
+ return;
+ }
+
+ /* Default CE threshold is 1. */
+ ce->count = info->max_count;
+ ce->threshold = DEFAULT_CE_THRESHOLD;
+ ce->reg_val = err_misc0 | info->mask;
+
+ record_write(record, ERXMISC0, ce->reg_val);
+ aest_record_dbg(record, "CE threshold is %llx, controlled by Kernel",
+ ce->threshold);
+}
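The preset math above (writing count = max - threshold + 1, so that the counter overflows after `threshold` further corrected errors) can be sketched for a hypothetical 8-bit CEC field; the mask and shift values are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit corrected-error counter field in MISC0; the real
 * mask and shift depend on the record's CEC format. */
#define CEC_MASK  0xFF00ULL
#define CEC_SHIFT 8
#define CEC_MAX   0xFFULL

/* Preset the counter so it overflows (raising the fault-handling
 * interrupt) after `threshold` more corrected errors. */
static uint64_t cec_preset(uint64_t misc0, uint64_t threshold)
{
	uint64_t count = CEC_MAX - threshold + 1;

	return (misc0 & ~CEC_MASK) | (count << CEC_SHIFT);
}

/* Corrected errors remaining until overflow for a given MISC0 value. */
static uint64_t cec_remaining(uint64_t misc0)
{
	uint64_t count = (misc0 & CEC_MASK) >> CEC_SHIFT;

	return CEC_MAX - count + 1;
}
```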
+
+static int aest_register_irq(struct aest_device *adev)
+{
+ int i, irq, ret;
+ char *irq_desc;
+
+ irq_desc = devm_kasprintf(adev->dev, GFP_KERNEL, "%s.%s.",
+ dev_driver_string(adev->dev),
+ dev_name(adev->dev));
+ if (!irq_desc)
+ return -ENOMEM;
+
+ for (i = 0; i < MAX_GSI_PER_NODE; i++) {
+ irq = adev->irq[i];
+
+ if (!irq)
+ continue;
+
+ if (irq_is_percpu_devid(irq)) {
+ ret = request_percpu_irq(irq, aest_irq_func,
+ irq_desc,
+ adev->adev_oncore);
+ if (ret)
+ goto free;
+ } else {
+ ret = devm_request_irq(adev->dev, irq, aest_irq_func,
+ 0, irq_desc, adev);
+ if (ret)
+ return ret;
+ }
+ }
+ return 0;
+
+free:
+ while (--i >= 0) {
+ irq = adev->irq[i];
+
+ if (irq && irq_is_percpu_devid(irq))
+ free_percpu_irq(irq, adev->adev_oncore);
+ }
+
+ return ret;
+}
+
+static int
+aest_init_record(struct aest_record *record, int i, struct aest_node *node)
+{
+ struct device *dev = node->adev->dev;
+
+ record->name = devm_kasprintf(dev, GFP_KERNEL, "record%d", i);
+ if (!record->name)
+ return -ENOMEM;
+
+ if (node->base)
+ record->regs_base = node->base + sizeof(struct ras_ext_regs) * i;
+
+ record->access = &aest_access[node->info->interface_hdr->type];
+ record->addressing_mode = test_bit(i, node->info->addressing_mode);
+ record->index = i;
+ record->node = node;
+ record->fr = record_read(record, ERXFR);
+
+ return 0;
+}
+
+static void aest_online_record(struct aest_record *record, void *data)
+{
+ if (record->fr & ERR_FR_CE)
+ aest_set_ce_threshold(record);
+
+ aest_enable_irq(record);
+}
+
+static void aest_online_oncore_node(struct aest_node *node)
+{
+ int count;
+
+ count = aest_proc(node);
+ aest_node_dbg(node, "Found %d error(s) on CPU%d before AEST probe\n",
+ count, smp_processor_id());
+
+ aest_node_foreach_record(aest_online_record, node, NULL,
+ node->record_implemented);
+
+ aest_node_foreach_record(aest_online_record, node, NULL,
+ node->status_reporting);
+}
+
+static void aest_online_oncore_dev(void *data)
+{
+ int fhi_irq, eri_irq, i;
+ struct aest_device *adev = this_cpu_ptr(data);
+
+ for (i = 0; i < adev->node_cnt; i++)
+ aest_online_oncore_node(&adev->nodes[i]);
+
+ fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
+ if (fhi_irq > 0)
+ enable_percpu_irq(fhi_irq, IRQ_TYPE_NONE);
+ eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
+ if (eri_irq > 0)
+ enable_percpu_irq(eri_irq, IRQ_TYPE_NONE);
+}
+
+static void aest_offline_oncore_dev(void *data)
+{
+ int fhi_irq, eri_irq;
+ struct aest_device *adev = this_cpu_ptr(data);
+
+ fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
+ if (fhi_irq > 0)
+ disable_percpu_irq(fhi_irq);
+ eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
+ if (eri_irq > 0)
+ disable_percpu_irq(eri_irq);
+}
+
+static void aest_online_dev(struct aest_device *adev)
+{
+ int count, i;
+ struct aest_node *node;
+
+ for (i = 0; i < adev->node_cnt; i++) {
+ node = &adev->nodes[i];
+
+ if (!node->name)
+ continue;
+
+ count = aest_proc(node);
+ aest_node_dbg(node, "Found %d error(s) before AEST probe\n", count);
+
+ aest_config_irq(node);
+
+ aest_node_foreach_record(aest_online_record, node, NULL,
+ node->record_implemented);
+ aest_node_foreach_record(aest_online_record, node, NULL,
+ node->status_reporting);
+ }
+}
+
+static int aest_starting_cpu(unsigned int cpu)
+{
+ pr_debug("CPU%d starting\n", cpu);
+ aest_online_oncore_dev(&percpu_adev);
+
+ return 0;
+}
+
+static int aest_dying_cpu(unsigned int cpu)
+{
+ pr_debug("CPU%d dying\n", cpu);
+ aest_offline_oncore_dev(&percpu_adev);
+
+ return 0;
+}
+
+static void aest_device_remove(struct platform_device *pdev)
+{
+ struct aest_device *adev = platform_get_drvdata(pdev);
+ int i;
+
+ platform_set_drvdata(pdev, NULL);
+
+ if (adev->type != ACPI_AEST_PROCESSOR_ERROR_NODE)
+ return;
+
+ on_each_cpu(aest_offline_oncore_dev, adev->adev_oncore, 1);
+
+ for (i = 0; i < MAX_GSI_PER_NODE; i++) {
+ if (adev->irq[i])
+ free_percpu_irq(adev->irq[i], adev->adev_oncore);
+ }
+}
+
+static int get_aest_node_ver(struct aest_node *node)
+{
+ u64 reg;
+ void *devarch_base;
+
+ if (node->type == ACPI_AEST_GIC_ERROR_NODE) {
+ devarch_base = ioremap(node->info->interface_hdr->address +
+ GIC_ERRDEVARCH, PAGE_SIZE);
+ if (!devarch_base)
+ return 0;
+
+ reg = readl_relaxed(devarch_base);
+ iounmap(devarch_base);
+
+ return FIELD_GET(ERRDEVARCH_REV, reg);
+ }
+
+ return FIELD_GET(ID_AA64PFR0_EL1_RAS_MASK, read_cpuid(ID_AA64PFR0_EL1));
+}
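get_aest_node_ver() relies on FIELD_GET() to extract a contiguous bit-field by its mask. A minimal plain-C equivalent, with an illustrative mask standing in for ERRDEVARCH.REVISION, looks like:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mask only, standing in for ERRDEVARCH.REVISION. */
#define REV_MASK 0x000F0000ULL

/* Minimal FIELD_GET equivalent: shift the register down by the mask's
 * lowest set bit position, then apply the shifted mask. The mask must
 * be non-zero. */
static uint64_t field_get(uint64_t mask, uint64_t reg)
{
	unsigned int shift = 0;

	while (!((mask >> shift) & 1))
		shift++;
	return (reg & mask) >> shift;
}
```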
+
+static char *alloc_aest_node_name(struct aest_node *node)
+{
+ char *name;
+
+ switch (node->type) {
+ case ACPI_AEST_PROCESSOR_ERROR_NODE:
+ name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%d",
+ aest_node_name[node->type],
+ node->info->processor->processor_id);
+ break;
+ case ACPI_AEST_MEMORY_ERROR_NODE:
+ case ACPI_AEST_SMMU_ERROR_NODE:
+ case ACPI_AEST_VENDOR_ERROR_NODE:
+ case ACPI_AEST_GIC_ERROR_NODE:
+ case ACPI_AEST_PCIE_ERROR_NODE:
+ case ACPI_AEST_PROXY_ERROR_NODE:
+ name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%llx",
+ aest_node_name[node->type],
+ node->info->interface_hdr->address);
+ break;
+ default:
+ name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "Unknown");
+ }
+
+ return name;
+}
+
+static int
+aest_node_set_errgsr(struct aest_device *adev, struct aest_node *node)
+{
+ struct acpi_aest_node *anode = node->info;
+ u64 errgsr_base = anode->common->error_group_register_base;
+
+ if (anode->interface_hdr->type != ACPI_AEST_NODE_MEMORY_MAPPED)
+ return 0;
+
+ if (!node->base)
+ return 0;
+
+ if (!(anode->interface_hdr->flags & AEST_XFACE_FLAG_ERROR_GROUP)) {
+ node->errgsr = node->base + ERXGROUP;
+ return 0;
+ }
+
+ if (!errgsr_base)
+ return -EINVAL;
+
+ node->errgsr = devm_ioremap(adev->dev, errgsr_base, PAGE_SIZE);
+ if (!node->errgsr)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int aest_init_node(struct aest_device *adev, struct aest_node *node,
+ struct acpi_aest_node *anode)
+{
+ int i, ret;
+ u64 address, size, flags;
+
+ node->adev = adev;
+ node->info = anode;
+ node->type = anode->type;
+ node->version = get_aest_node_ver(node);
+ node->name = alloc_aest_node_name(node);
+ if (!node->name)
+ return -ENOMEM;
+ node->record_implemented = anode->record_implemented;
+ node->status_reporting = anode->status_reporting;
+
+ address = anode->interface_hdr->address;
+ size = anode->interface_hdr->error_record_count *
+ sizeof(struct ras_ext_regs);
+ if (address) {
+ node->base = devm_ioremap(adev->dev, address, size);
+ if (!node->base)
+ return -ENOMEM;
+ }
+
+ flags = anode->interface_hdr->flags;
+ address = node->info->common->fault_inject_register_base;
+ if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
+ node->inj = devm_ioremap(adev->dev, address, PAGE_SIZE);
+ if (!node->inj)
+ return -ENOMEM;
+ }
+
+ address = node->info->common->interrupt_config_register_base;
+ if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
+ node->irq_config = devm_ioremap(adev->dev, address, PAGE_SIZE);
+ if (!node->irq_config)
+ return -ENOMEM;
+ }
+
+ ret = aest_node_set_errgsr(adev, node);
+ if (ret)
+ return ret;
+
+ node->record_count = anode->interface_hdr->error_record_count;
+ node->records = devm_kcalloc(adev->dev, node->record_count,
+ sizeof(struct aest_record), GFP_KERNEL);
+ if (!node->records)
+ return -ENOMEM;
+
+ for (i = 0; i < node->record_count; i++) {
+ ret = aest_init_record(&node->records[i], i, node);
+ if (ret)
+ return ret;
+ }
+ aest_node_dbg(node, "%d records, base: %llx, errgsr: %llx\n",
+ node->record_count, (u64)node->base, (u64)node->errgsr);
+ return 0;
+}
+
+static int
+aest_init_nodes(struct aest_device *adev, struct aest_hnode *ahnode)
+{
+ struct acpi_aest_node *anode;
+ struct aest_node *node;
+ int ret, i = 0;
+
+ adev->node_cnt = ahnode->count;
+ adev->nodes = devm_kcalloc(adev->dev, adev->node_cnt,
+ sizeof(struct aest_node), GFP_KERNEL);
+ if (!adev->nodes)
+ return -ENOMEM;
+
+ list_for_each_entry(anode, &ahnode->list, list) {
+ adev->type = anode->type;
+
+ node = &adev->nodes[i++];
+ ret = aest_init_node(adev, node, anode);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int __setup_ppi(struct aest_device *adev)
+{
+ int cpu, i;
+ struct aest_device *oncore_adev;
+ struct aest_node *oncore_node;
+ size_t size;
+
+ adev->adev_oncore = &percpu_adev;
+ for_each_possible_cpu(cpu) {
+ oncore_adev = per_cpu_ptr(&percpu_adev, cpu);
+ memcpy(oncore_adev, adev, sizeof(struct aest_device));
+
+ oncore_adev->nodes = devm_kcalloc(adev->dev,
+ oncore_adev->node_cnt,
+ sizeof(struct aest_node),
+ GFP_KERNEL);
+ if (!oncore_adev->nodes)
+ return -ENOMEM;
+
+ size = adev->node_cnt * sizeof(struct aest_node);
+ memcpy(oncore_adev->nodes, adev->nodes, size);
+ for (i = 0; i < oncore_adev->node_cnt; i++) {
+ oncore_node = &oncore_adev->nodes[i];
+ oncore_node->records = devm_kcalloc(adev->dev,
+ oncore_node->record_count,
+ sizeof(struct aest_record), GFP_KERNEL);
+ if (!oncore_node->records)
+ return -ENOMEM;
+
+ size = oncore_node->record_count *
+ sizeof(struct aest_record);
+ memcpy(oncore_node->records, adev->nodes[i].records,
+ size);
+ }
+
+ aest_dev_dbg(adev, "Init device on CPU%d.\n", cpu);
+ }
+
+ return 0;
+}
+
+static int aest_setup_irq(struct platform_device *pdev, struct aest_device *adev)
+{
+ int fhi_irq, eri_irq;
+
+ fhi_irq = platform_get_irq_byname_optional(pdev, "fhi");
+ if (fhi_irq > 0)
+ adev->irq[0] = fhi_irq;
+
+ eri_irq = platform_get_irq_byname_optional(pdev, "eri");
+ if (eri_irq > 0)
+ adev->irq[1] = eri_irq;
+
+ /* Allocate and initialise the percpu device pointer for PPI */
+ if ((fhi_irq > 0 && irq_is_percpu(fhi_irq)) ||
+ (eri_irq > 0 && irq_is_percpu(eri_irq)))
+ return __setup_ppi(adev);
+
+ return 0;
+}
+
+static int aest_device_probe(struct platform_device *pdev)
+{
+ int ret;
+ struct aest_device *adev;
+ struct aest_hnode *ahnode;
+
+ ahnode = *((struct aest_hnode **)pdev->dev.platform_data);
+ if (!ahnode)
+ return -ENODEV;
+
+ adev = devm_kzalloc(&pdev->dev, sizeof(*adev), GFP_KERNEL);
+ if (!adev)
+ return -ENOMEM;
+
+ adev->dev = &pdev->dev;
+ INIT_WORK(&adev->aest_work, aest_node_pool_process);
+ ret = aest_node_pool_init(adev);
+ if (ret) {
+ aest_dev_err(adev, "Failed to init AEST node pool\n");
+ return ret;
+ }
+ init_llist_head(&adev->event_list);
+ adev->uid = ahnode->uid;
+ aest_set_name(adev, ahnode);
+
+ ret = aest_init_nodes(adev, ahnode);
+ if (ret)
+ return ret;
+
+ ret = aest_setup_irq(pdev, adev);
+ if (ret)
+ return ret;
+
+ ret = aest_register_irq(adev);
+ if (ret) {
+ aest_dev_err(adev, "register irq failed\n");
+ return ret;
+ }
+
+ platform_set_drvdata(pdev, adev);
+
+ if (aest_dev_is_oncore(adev))
+ ret = cpuhp_setup_state(CPUHP_AP_ARM_AEST_STARTING,
+ "drivers/acpi/arm64/aest:starting",
+ aest_starting_cpu, aest_dying_cpu);
+ else
+ aest_online_dev(adev);
+ if (ret)
+ return ret;
+
+ aest_dev_dbg(adev, "Node cnt: %x, uid: %x, irq: %d, %d\n",
+ adev->node_cnt, adev->uid, adev->irq[0], adev->irq[1]);
+
+ return 0;
+}
+
+static const struct acpi_device_id acpi_aest_ids[] = {
+ {"ARMHE000", 0},
+ {}
+};
+
+static struct platform_driver aest_driver = {
+ .driver = {
+ .name = "AEST",
+ .acpi_match_table = acpi_aest_ids,
+ },
+ .probe = aest_device_probe,
+ .remove = aest_device_remove,
+};
+
+static int __init aest_init(void)
+{
+ return platform_driver_register(&aest_driver);
+}
+module_init(aest_init);
+
+static void __exit aest_exit(void)
+{
+ platform_driver_unregister(&aest_driver);
+}
+module_exit(aest_exit);
+
+MODULE_DESCRIPTION("ARM AEST Driver");
+MODULE_AUTHOR("Ruidong Tian <tianruidong@linux.alibaba.com>");
+MODULE_LICENSE("GPL");
+
diff --git a/drivers/ras/aest/aest.h b/drivers/ras/aest/aest.h
new file mode 100644
index 000000000000..04005aad3617
--- /dev/null
+++ b/drivers/ras/aest/aest.h
@@ -0,0 +1,323 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * ARM Error Source Table Support
+ *
+ * Copyright (c) 2021-2024, Alibaba Group.
+ */
+
+#include <linux/acpi_aest.h>
+#include <asm/ras.h>
+
+#define MAX_GSI_PER_NODE 2
+#define AEST_MAX_PPI 3
+#define DEFAULT_CE_THRESHOLD 1
+
+#define record_read(record, offset) \
+ record->access->read(record->regs_base, offset)
+#define record_write(record, offset, val) \
+ record->access->write(record->regs_base, offset, val)
+
+#define aest_dev_err(__adev, format, ...) \
+ dev_err((__adev)->dev, format, ##__VA_ARGS__)
+#define aest_dev_info(__adev, format, ...) \
+ dev_info((__adev)->dev, format, ##__VA_ARGS__)
+#define aest_dev_dbg(__adev, format, ...) \
+ dev_dbg((__adev)->dev, format, ##__VA_ARGS__)
+
+#define aest_node_err(__node, format, ...) \
+ dev_err((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
+#define aest_node_info(__node, format, ...) \
+ dev_info((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
+#define aest_node_dbg(__node, format, ...) \
+ dev_dbg((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
+
+#define aest_record_err(__record, format, ...) \
+ dev_err((__record)->node->adev->dev, "%s: %s: " format, \
+ (__record)->node->name, (__record)->name, ##__VA_ARGS__)
+#define aest_record_info(__record, format, ...) \
+ dev_info((__record)->node->adev->dev, "%s: %s: " format, \
+ (__record)->node->name, (__record)->name, ##__VA_ARGS__)
+#define aest_record_dbg(__record, format, ...) \
+ dev_dbg((__record)->node->adev->dev, "%s: %s: " format, \
+ (__record)->node->name, (__record)->name, ##__VA_ARGS__)
+
+#define ERXFR 0x0
+#define ERXCTLR 0x8
+#define ERXSTATUS 0x10
+#define ERXADDR 0x18
+#define ERXMISC0 0x20
+#define ERXMISC1 0x28
+#define ERXMISC2 0x30
+#define ERXMISC3 0x38
+
+#define ERXGROUP 0xE00
+#define GIC_ERRDEVARCH 0xFFBC
+
+extern struct xarray *aest_array;
+
+struct aest_event {
+ struct llist_node llnode;
+ char *node_name;
+ u32 type;
+ /*
+ * Different nodes have different meanings:
+ * - Processor node : processor number.
+ * - Memory node : SRAT proximity domain.
+ * - SMMU node : IORT proximity domain.
+ * - GIC node : interface type.
+ */
+ u32 id0;
+ /*
+ * Different nodes have different meanings:
+ * - Processor node : processor resource type.
+ * - Memory node : None.
+ * - SMMU node : subcomponent reference.
+ * - Vendor node : Unique ID.
+ * - GIC node : instance identifier.
+ */
+ u32 id1;
+ char *hid; // Vendor node : hardware ID.
+ u32 index;
+ u64 ce_threshold;
+ int addressing_mode;
+ struct ras_ext_regs regs;
+
+ void *vendor_data;
+ size_t vendor_data_size;
+};
+
+struct aest_access {
+ u64 (*read)(void *base, u32 offset);
+ void (*write)(void *base, u32 offset, u64 val);
+};
+
+struct ce_threshold_info {
+ const u64 max_count;
+ const u64 mask;
+ const u64 shift;
+};
+
+struct ce_threshold {
+ const struct ce_threshold_info *info;
+ u64 count;
+ u64 threshold;
+ u64 reg_val;
+};
+
+struct aest_record {
+ char *name;
+ int index;
+ void __iomem *regs_base;
+
+ /*
+ * This bit specifies the addressing mode to populate the ERR_ADDR
+ * register:
+ * 0b: Error record reports System Physical Addresses (SPA) in
+ * the ERR_ADDR register.
+ * 1b: Error record reports error node-specific Logical Addresses (LA)
+ * in the ERR_ADDR register. The OS must use other means to
+ * translate the reported LA into an SPA.
+ */
+ int addressing_mode;
+ u64 fr;
+ struct aest_node *node;
+
+ struct dentry *debugfs;
+ struct ce_threshold ce;
+ enum ras_ce_threshold threshold_type;
+ const struct aest_access *access;
+
+ void *vendor_data;
+ size_t vendor_data_size;
+};
+
+struct aest_node {
+ char *name;
+ u8 type;
+ void *errgsr;
+ void *inj;
+ void *irq_config;
+ void *base;
+
+ /*
+ * This bitmap indicates which of the error records within this error
+ * node must be polled for error status.
+ * Bit[n] of this field pertains to error record corresponding to
+ * index n in this error group.
+ * Bit[n] = 0b: Error record at index n needs to be polled.
+ * Bit[n] = 1b: Error record at index n does not need to be polled.
+ */
+ unsigned long *record_implemented;
+ /*
+ * This bitmap indicates which of the error records within this error
+ * node support error status reporting using ERRGSR register.
+ * Bit[n] of this field pertains to error record corresponding to
+ * index n in this error group.
+ * Bit[n] = 0b: Error record at index n supports error status reporting
+ * through ERRGSR.S.
+ * Bit[n] = 1b: Error record at index n does not support error
+ * reporting through the ERRGSR.S bit. If this error
+ * record is implemented, it must be polled explicitly
+ * for error events.
+ */
+ unsigned long *status_reporting;
+ int version;
+
+ struct aest_device *adev;
+ struct acpi_aest_node *info;
+ struct dentry *debugfs;
+
+ int record_count;
+ struct aest_record *records;
+
+ struct aest_node __percpu *oncore_node;
+};
+
+struct aest_device {
+ struct device *dev;
+ u32 type;
+ int node_cnt;
+ struct aest_node *nodes;
+
+ struct work_struct aest_work;
+ struct gen_pool *pool;
+ struct llist_head event_list;
+
+ int irq[MAX_GSI_PER_NODE];
+ u32 uid;
+ struct aest_device __percpu *adev_oncore;
+
+ struct dentry *debugfs;
+};
+
+struct aest_node_context {
+ struct aest_node *node;
+ unsigned long *bitmap;
+ void (*func)(struct aest_record *record,
+ void *data);
+ void *data;
+ int ret;
+};
+
+#define CASE_READ(res, x) \
+ case (x): { \
+ res = read_sysreg_s(SYS_##x##_EL1); \
+ break; \
+ }
+
+#define CASE_WRITE(val, x) \
+ case (x): { \
+ write_sysreg_s((val), SYS_##x##_EL1); \
+ break; \
+ }
+
+static inline u64 aest_sysreg_read(void *__unused, u32 offset)
+{
+ u64 res;
+
+ switch (offset) {
+ CASE_READ(res, ERXFR)
+ CASE_READ(res, ERXCTLR)
+ CASE_READ(res, ERXSTATUS)
+ CASE_READ(res, ERXADDR)
+ CASE_READ(res, ERXMISC0)
+ CASE_READ(res, ERXMISC1)
+ CASE_READ(res, ERXMISC2)
+ CASE_READ(res, ERXMISC3)
+ default:
+ res = 0;
+ }
+ return res;
+}
+
+static inline void aest_sysreg_write(void *base, u32 offset, u64 val)
+{
+ switch (offset) {
+ CASE_WRITE(val, ERXFR)
+ CASE_WRITE(val, ERXCTLR)
+ CASE_WRITE(val, ERXSTATUS)
+ CASE_WRITE(val, ERXADDR)
+ CASE_WRITE(val, ERXMISC0)
+ CASE_WRITE(val, ERXMISC1)
+ CASE_WRITE(val, ERXMISC2)
+ CASE_WRITE(val, ERXMISC3)
+ default:
+ return;
+ }
+}
+
+static inline u64 aest_iomem_read(void *base, u32 offset)
+{
+ return readq_relaxed(base + offset);
+}
+
+static inline void aest_iomem_write(void *base, u32 offset, u64 val)
+{
+ writeq_relaxed(val, base + offset);
+}
+
+/* The access method is selected by the AEST interface type. */
+static const struct aest_access aest_access[] = {
+ [ACPI_AEST_NODE_SYSTEM_REGISTER] = {
+ .read = aest_sysreg_read,
+ .write = aest_sysreg_write,
+ },
+
+ [ACPI_AEST_NODE_MEMORY_MAPPED] = {
+ .read = aest_iomem_read,
+ .write = aest_iomem_write,
+ },
+ [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] = {
+ .read = aest_iomem_read,
+ .write = aest_iomem_write,
+ },
+ { }
+};
+
+static inline bool aest_dev_is_oncore(struct aest_device *adev)
+{
+ return adev->type == ACPI_AEST_PROCESSOR_ERROR_NODE;
+}
+
+/*
+ * Each PE may have multiple error records; software must select the
+ * error record to be accessed through the Error Record System
+ * registers.
+ */
+static inline void aest_select_record(struct aest_node *node, int index)
+{
+ if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE) {
+ write_sysreg_s(index, SYS_ERRSELR_EL1);
+ isb();
+ }
+}
+
+/* Ensure all writes have taken effect. */
+static inline void aest_sync(struct aest_node *node)
+{
+ if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE)
+ isb();
+}
+
+static const char * const aest_node_name[] = {
+ [ACPI_AEST_PROCESSOR_ERROR_NODE] = "processor",
+ [ACPI_AEST_MEMORY_ERROR_NODE] = "memory",
+ [ACPI_AEST_SMMU_ERROR_NODE] = "smmu",
+ [ACPI_AEST_VENDOR_ERROR_NODE] = "vendor",
+ [ACPI_AEST_GIC_ERROR_NODE] = "gic",
+ [ACPI_AEST_PCIE_ERROR_NODE] = "pcie",
+ [ACPI_AEST_PROXY_ERROR_NODE] = "proxy",
+};
+
+static inline int
+aest_set_name(struct aest_device *adev, struct aest_hnode *ahnode)
+{
+ adev->dev->init_name = devm_kasprintf(adev->dev, GFP_KERNEL,
+ "%s%d", aest_node_name[ahnode->type],
+ adev->uid);
+ if (!adev->dev->init_name)
+ return -ENOMEM;
+
+ return 0;
+}
diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h
new file mode 100644
index 000000000000..1c2191791504
--- /dev/null
+++ b/include/linux/acpi_aest.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ACPI_AEST_H__
+#define __ACPI_AEST_H__
+
+#include <linux/acpi.h>
+#include <asm/ras.h>
+
+/* AEST component */
+#define ACPI_AEST_PROC_FLAG_GLOBAL (1<<0)
+#define ACPI_AEST_PROC_FLAG_SHARED (1<<1)
+
+#define AEST_ADDRESS_SPA 0
+#define AEST_ADDRESS_LA 1
+
+/* AEST interrupt */
+#define AEST_INTERRUPT_MODE BIT(0)
+#define AEST_INTERRUPT_FHI_MODE BIT(1)
+
+#define AEST_INTERRUPT_FHI_UE_SUPPORT BIT(0)
+#define AEST_INTERRUPT_FHI_UE_NO_SUPPORT BIT(1)
+
+#define AEST_MAX_INTERRUPT_PER_NODE 3
+
+/* AEST interface */
+
+#define AEST_XFACE_FLAG_SHARED (1<<0)
+#define AEST_XFACE_FLAG_CLEAR_MISC (1<<1)
+#define AEST_XFACE_FLAG_ERROR_DEVICE (1<<2)
+#define AEST_XFACE_FLAG_AFFINITY (1<<3)
+#define AEST_XFACE_FLAG_ERROR_GROUP (1<<4)
+#define AEST_XFACE_FLAG_FAULT_INJECT (1<<5)
+#define AEST_XFACE_FLAG_INT_CONFIG (1<<6)
+
+struct aest_hnode {
+ struct list_head list;
+ int count;
+ u32 uid;
+ int type;
+};
+
+struct acpi_aest_node {
+ struct list_head list;
+ int type;
+ struct acpi_aest_node_interface_header *interface_hdr;
+ unsigned long *record_implemented;
+ unsigned long *status_reporting;
+ unsigned long *addressing_mode;
+ struct acpi_aest_node_interface_common *common;
+ union {
+ struct acpi_aest_processor *processor;
+ struct acpi_aest_memory *memory;
+ struct acpi_aest_smmu *smmu;
+ struct acpi_aest_vendor_v2 *vendor;
+ struct acpi_aest_gic *gic;
+ struct acpi_aest_pcie *pcie;
+ struct acpi_aest_proxy *proxy;
+ void *spec_pointer;
+ };
+ union {
+ struct acpi_aest_processor_cache *cache;
+ struct acpi_aest_processor_tlb *tlb;
+ struct acpi_aest_processor_generic *generic;
+ void *processor_spec_pointer;
+ };
+ struct acpi_aest_node_interrupt_v2 *interrupt;
+ int interrupt_count;
+};
+#endif /* __ACPI_AEST_H__ */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index a04b73c40173..acf0e3957fdd 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -179,6 +179,7 @@ enum cpuhp_state {
CPUHP_AP_CSKY_TIMER_STARTING,
CPUHP_AP_TI_GP_TIMER_STARTING,
CPUHP_AP_HYPERV_TIMER_STARTING,
+ CPUHP_AP_ARM_AEST_STARTING,
/* Must be the last timer callback */
CPUHP_AP_DUMMY_TIMER_STARTING,
CPUHP_AP_ARM_XEN_STARTING,
diff --git a/include/linux/ras.h b/include/linux/ras.h
index a64182bc72ad..1c777af6a1af 100644
--- a/include/linux/ras.h
+++ b/include/linux/ras.h
@@ -53,4 +53,12 @@ static inline unsigned long
amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err) { return -EINVAL; }
#endif /* CONFIG_AMD_ATL */
+#if IS_ENABLED(CONFIG_AEST)
+void aest_register_decode_chain(struct notifier_block *nb);
+void aest_unregister_decode_chain(struct notifier_block *nb);
+#else
+static inline void aest_register_decode_chain(struct notifier_block *nb) {}
+static inline void aest_unregister_decode_chain(struct notifier_block *nb) {}
+#endif /* CONFIG_AEST */
+
#endif /* __RAS_H__ */
--
2.33.1
* [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
2025-01-15 8:42 ` [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver Ruidong Tian
@ 2025-01-15 8:42 ` Ruidong Tian
2025-01-18 2:37 ` kernel test robot
2025-02-19 20:55 ` Borislav Petkov
2025-01-15 8:42 ` [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver Ruidong Tian
From: Ruidong Tian @ 2025-01-15 8:42 UTC (permalink / raw)
To: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: tianruidong
Expose certain AEST driver information to userspace.
Only root can access these interfaces because they include
hardware-sensitive information.
Every AEST device creates one platform device; an oncore device,
such as a CPU error node, also creates a directory named "ras" under
each CPU device that includes all records of that core:
ls /sys/kernel/debug/aest/
record0 record1 ...
All details at:
Documentation/ABI/testing/debugfs-aest
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
---
Documentation/ABI/testing/debugfs-aest | 98 +++++++++++
MAINTAINERS | 1 +
drivers/acpi/arm64/aest.c | 3 +
drivers/ras/aest/Makefile | 1 +
drivers/ras/aest/aest-core.c | 35 ++++
drivers/ras/aest/aest-sysfs.c | 226 +++++++++++++++++++++++++
drivers/ras/aest/aest.h | 15 +-
7 files changed, 378 insertions(+), 1 deletion(-)
create mode 100644 Documentation/ABI/testing/debugfs-aest
create mode 100644 drivers/ras/aest/aest-sysfs.c
diff --git a/Documentation/ABI/testing/debugfs-aest b/Documentation/ABI/testing/debugfs-aest
new file mode 100644
index 000000000000..39d9c85843ef
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-aest
@@ -0,0 +1,98 @@
+What: /sys/kernel/debug/aest/<name>.<uid>/
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ Directory representing an AEST device. <name> is the device
+ type, such as:
+
+ processor
+ memory
+ smmu
+ ...
+ <uid> is the unique ID for this device.
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/*
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ Attributes for the AEST nodes belonging to this device. The
+ node name format is: <Node Type>-<Node Address>
+
+ See more at:
+ https://developer.arm.com/documentation/den0085/latest/
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/type
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RO) Returns a number indicating the AEST node type:
+
+ 0 : Processor
+ 1 : Memory Controller
+ 2 : SMMU
+ 3 : Vendor-defined
+ 4 : GIC
+ 5 : PCIe Root Complex
+ 6 : Proxy error
+
+ See more at:
+ https://developer.arm.com/documentation/den0085/latest/
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/error_node_device
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RO) ACPI _UID field of the Arm error node device in DSDT
+ that describes this error node.
+
+ See more at:
+ https://developer.arm.com/documentation/den0085/latest/
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/ce_threshold
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (WO) Writes the CE threshold to all records of this node. Fails
+ if the input exceeds the maximum threshold.
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/err_count
+Date: June 2024
+KernelVersion 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RO) Outputs error statistics for all error records of this node.
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/record<index>/err_*
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RO) Reads the err_* register and returns its value.
+
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/record<index>/ce_threshold
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RW) Reads and writes the CE threshold for this record. The
+ write fails if the input exceeds the maximum threshold.
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/record<index>/err_count
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RO) Outputs error statistics for this record.
diff --git a/MAINTAINERS b/MAINTAINERS
index d757f9339627..fe9ae27fdbec 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -335,6 +335,7 @@ M: Ruidong Tian <tianruidond@linux.alibaba.com>
L: linux-acpi@vger.kernel.org
L: linux-arm-kernel@lists.infradead.org
S: Supported
+F: Documentation/ABI/testing/debugfs-aest
F: arch/arm64/include/asm/ras.h
F: drivers/acpi/arm64/aest.c
F: drivers/ras/aest/
diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c
index 6dba9c23e04e..312ddd5c15f5 100644
--- a/drivers/acpi/arm64/aest.c
+++ b/drivers/acpi/arm64/aest.c
@@ -318,6 +318,9 @@ void __init acpi_aest_init(void)
}
aest_array = kzalloc(sizeof(struct xarray), GFP_KERNEL);
+ if (!aest_array)
+ return;
+
xa_init(aest_array);
ret = acpi_aest_init_nodes(aest_table);
diff --git a/drivers/ras/aest/Makefile b/drivers/ras/aest/Makefile
index a6ba7e36fb43..75495413d2b6 100644
--- a/drivers/ras/aest/Makefile
+++ b/drivers/ras/aest/Makefile
@@ -3,3 +3,4 @@
obj-$(CONFIG_AEST) += aest.o
aest-y := aest-core.o
+aest-y += aest-sysfs.o
diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
index 060a1eedee0a..12d0a32ecda9 100644
--- a/drivers/ras/aest/aest-core.c
+++ b/drivers/ras/aest/aest-core.c
@@ -20,6 +20,9 @@ DEFINE_PER_CPU(struct aest_device, percpu_adev);
#undef pr_fmt
#define pr_fmt(fmt) "AEST: " fmt
+#ifdef CONFIG_DEBUG_FS
+struct dentry *aest_debugfs;
+#endif
/*
* This memory pool is only to be used to save AEST node in AEST irq context.
* There can be 500 AEST node at most.
@@ -165,6 +168,27 @@ aest_node_gen_pool_add(struct aest_device *adev, struct aest_record *record,
init_aest_event(event, record, regs);
llist_add(&event->llnode, &adev->event_list);
+ if (regs->err_status & ERR_STATUS_CE)
+ record->count.ce++;
+ if (regs->err_status & ERR_STATUS_DE)
+ record->count.de++;
+ if (regs->err_status & ERR_STATUS_UE) {
+ switch (regs->err_status & ERR_STATUS_UET) {
+ case ERR_STATUS_UET_UC:
+ record->count.uc++;
+ break;
+ case ERR_STATUS_UET_UEU:
+ record->count.ueu++;
+ break;
+ case ERR_STATUS_UET_UER:
+ record->count.uer++;
+ break;
+ case ERR_STATUS_UET_UEO:
+ record->count.ueo++;
+ break;
+ }
+ }
+
return 0;
}
@@ -938,10 +962,13 @@ static int aest_device_probe(struct platform_device *pdev)
if (ret)
return ret;
+ aest_dev_init_debugfs(adev);
+
aest_dev_dbg(adev, "Node cnt: %x, uid: %x, irq: %d, %d\n",
adev->node_cnt, adev->uid, adev->irq[0], adev->irq[1]);
return 0;
}
static const struct acpi_device_id acpi_aest_ids[] = {
@@ -960,12 +987,20 @@ static struct platform_driver aest_driver = {
static int __init aest_init(void)
{
+#ifdef CONFIG_DEBUG_FS
+ aest_debugfs = debugfs_create_dir("aest", NULL);
+#endif
+
return platform_driver_register(&aest_driver);
}
module_init(aest_init);
static void __exit aest_exit(void)
{
+#ifdef CONFIG_DEBUG_FS
+ debugfs_remove(aest_debugfs);
+#endif
+
platform_driver_unregister(&aest_driver);
}
module_exit(aest_exit);
diff --git a/drivers/ras/aest/aest-sysfs.c b/drivers/ras/aest/aest-sysfs.c
new file mode 100644
index 000000000000..f19cd2b5edb2
--- /dev/null
+++ b/drivers/ras/aest/aest-sysfs.c
@@ -0,0 +1,226 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM Error Source Table Support
+ *
+ * Copyright (c) 2024, Alibaba Group.
+ */
+
+#include "aest.h"
+
+static void
+aest_store_threshold(struct aest_record *record, void *data)
+{
+ u64 err_misc0, *threshold = data;
+ struct ce_threshold *ce = &record->ce;
+
+ if (*threshold > ce->info->max_count)
+ return;
+
+ ce->threshold = *threshold;
+ ce->count = ce->info->max_count - ce->threshold + 1;
+
+ err_misc0 = record_read(record, ERXMISC0);
+ ce->reg_val = (err_misc0 & ~ce->info->mask) |
+ (ce->count << ce->info->shift);
+
+ record_write(record, ERXMISC0, ce->reg_val);
+}
+
+static void
+aest_error_count(struct aest_record *record, void *data)
+{
+ struct record_count *count = data;
+
+ count->ce += record->count.ce;
+ count->de += record->count.de;
+ count->uc += record->count.uc;
+ count->ueu += record->count.ueu;
+ count->uer += record->count.uer;
+ count->ueo += record->count.ueo;
+}
+
+/*******************************************************************************
+ *
+ * Debugfs for AEST node
+ *
+ ******************************************************************************/
+
+static int aest_node_err_count_show(struct seq_file *m, void *data)
+{
+ struct aest_node *node = data;
+ struct record_count count = { 0 };
+ int i;
+
+ for (i = 0; i < node->record_count; i++)
+ aest_error_count(&node->records[i], &count);
+
+ seq_printf(m, "CE: %llu\n"
+ "DE: %llu\n"
+ "UC: %llu\n"
+ "UEU: %llu\n"
+ "UEO: %llu\n"
+ "UER: %llu\n",
+ count.ce, count.de, count.uc, count.ueu,
+ count.ueo, count.uer);
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(aest_node_err_count);
+
+/*******************************************************************************
+ *
+ * Attribute for AEST record
+ *
+ ******************************************************************************/
+
+#define DEFINE_AEST_DEBUGFS_ATTR(name, offset) \
+static int name##_get(void *data, u64 *val) \
+{ \
+ struct aest_record *record = data; \
+ *val = record_read(record, offset); \
+ return 0; \
+} \
+static int name##_set(void *data, u64 val) \
+{ \
+ struct aest_record *record = data; \
+ record_write(record, offset, val); \
+ return 0; \
+} \
+DEFINE_DEBUGFS_ATTRIBUTE(name##_ops, name##_get, name##_set, "%#llx\n")
+
+DEFINE_AEST_DEBUGFS_ATTR(err_fr, ERXFR);
+DEFINE_AEST_DEBUGFS_ATTR(err_ctrl, ERXCTLR);
+DEFINE_AEST_DEBUGFS_ATTR(err_status, ERXSTATUS);
+DEFINE_AEST_DEBUGFS_ATTR(err_addr, ERXADDR);
+DEFINE_AEST_DEBUGFS_ATTR(err_misc0, ERXMISC0);
+DEFINE_AEST_DEBUGFS_ATTR(err_misc1, ERXMISC1);
+DEFINE_AEST_DEBUGFS_ATTR(err_misc2, ERXMISC2);
+DEFINE_AEST_DEBUGFS_ATTR(err_misc3, ERXMISC3);
+
+static int record_ce_threshold_get(void *data, u64 *val)
+{
+ struct aest_record *record = data;
+
+ *val = record->ce.threshold;
+ return 0;
+}
+
+static int record_ce_threshold_set(void *data, u64 val)
+{
+ u64 threshold = val;
+ struct aest_record *record = data;
+
+ aest_store_threshold(record, &threshold);
+
+ return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(record_ce_threshold_ops, record_ce_threshold_get,
+ record_ce_threshold_set, "%llu\n");
+
+static int aest_record_err_count_show(struct seq_file *m, void *data)
+{
+ struct aest_record *record = data;
+ struct record_count count = { 0 };
+
+ aest_error_count(record, &count);
+
+ seq_printf(m, "CE: %llu\n"
+ "DE: %llu\n"
+ "UC: %llu\n"
+ "UEU: %llu\n"
+ "UEO: %llu\n"
+ "UER: %llu\n",
+ count.ce, count.de, count.uc, count.ueu,
+ count.ueo, count.uer);
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(aest_record_err_count);
+
+static void aest_record_init_debugfs(struct aest_record *record)
+{
+ debugfs_create_file("err_fr", 0600, record->debugfs, record,
+ &err_fr_ops);
+ debugfs_create_file("err_ctrl", 0600, record->debugfs, record,
+ &err_ctrl_ops);
+ debugfs_create_file("err_status", 0600, record->debugfs, record,
+ &err_status_ops);
+ debugfs_create_file("err_addr", 0600, record->debugfs, record,
+ &err_addr_ops);
+ debugfs_create_file("err_misc0", 0600, record->debugfs, record,
+ &err_misc0_ops);
+ debugfs_create_file("err_misc1", 0600, record->debugfs, record,
+ &err_misc1_ops);
+ debugfs_create_file("err_misc2", 0600, record->debugfs, record,
+ &err_misc2_ops);
+ debugfs_create_file("err_misc3", 0600, record->debugfs, record,
+ &err_misc3_ops);
+ debugfs_create_file("err_count", 0400, record->debugfs, record,
+ &aest_record_err_count_fops);
+ debugfs_create_file("ce_threshold", 0600, record->debugfs, record,
+ &record_ce_threshold_ops);
+}
+
+static void
+aest_node_init_debugfs(struct aest_node *node)
+{
+ int i;
+ struct aest_record *record;
+
+ debugfs_create_u32("error_node_device", 0400, node->debugfs,
+ &node->info->common->error_node_device);
+ debugfs_create_file("err_count", 0400, node->debugfs, node,
+ &aest_node_err_count_fops);
+
+ for (i = 0; i < node->record_count; i++) {
+ record = &node->records[i];
+ if (!record->name)
+ continue;
+ record->debugfs = debugfs_create_dir(record->name,
+ node->debugfs);
+ aest_record_init_debugfs(record);
+ }
+}
+
+static void
+aest_oncore_dev_init_debugfs(struct aest_device *adev)
+{
+ int cpu, i;
+ struct aest_node *node;
+ struct aest_device *percpu_dev;
+ char name[16];
+
+ for_each_possible_cpu(cpu) {
+ percpu_dev = per_cpu_ptr(adev->adev_oncore, cpu);
+
+ snprintf(name, sizeof(name), "processor%u", cpu);
+ percpu_dev->debugfs = debugfs_create_dir(name, aest_debugfs);
+
+ for (i = 0; i < adev->node_cnt; i++) {
+ node = &adev->nodes[i];
+
+ node->debugfs = debugfs_create_dir(node->name,
+ percpu_dev->debugfs);
+ aest_node_init_debugfs(node);
+ }
+ }
+}
+
+void aest_dev_init_debugfs(struct aest_device *adev)
+{
+ int i;
+ struct aest_node *node;
+
+ adev->debugfs = debugfs_create_dir(dev_name(adev->dev), aest_debugfs);
+ if (aest_dev_is_oncore(adev)) {
+ aest_oncore_dev_init_debugfs(adev);
+ return;
+ }
+
+ for (i = 0; i < adev->node_cnt; i++) {
+ node = &adev->nodes[i];
+ if (!node->name)
+ continue;
+ node->debugfs = debugfs_create_dir(node->name, adev->debugfs);
+ aest_node_init_debugfs(node);
+ }
+}
diff --git a/drivers/ras/aest/aest.h b/drivers/ras/aest/aest.h
index 04005aad3617..d9a52e39b1b9 100644
--- a/drivers/ras/aest/aest.h
+++ b/drivers/ras/aest/aest.h
@@ -7,6 +7,7 @@
#include <linux/acpi_aest.h>
#include <asm/ras.h>
+#include <linux/debugfs.h>
#define MAX_GSI_PER_NODE 2
#define AEST_MAX_PPI 3
@@ -53,7 +54,7 @@
#define ERXGROUP 0xE00
#define GIC_ERRDEVARCH 0xFFBC
-extern struct xarray *aest_array;
+extern struct dentry *aest_debugfs;
struct aest_event {
struct llist_node llnode;
@@ -104,6 +105,15 @@ struct ce_threshold {
u64 reg_val;
};
+struct record_count {
+ u64 ce;
+ u64 de;
+ u64 uc;
+ u64 uer;
+ u64 ueo;
+ u64 ueu;
+};
+
struct aest_record {
char *name;
int index;
@@ -125,6 +135,7 @@ struct aest_record {
struct dentry *debugfs;
struct ce_threshold ce;
enum ras_ce_threshold threshold_type;
+ struct record_count count;
const struct aest_access *access;
void *vendor_data;
@@ -321,3 +332,5 @@ aest_set_name(struct aest_device *adev, struct aest_hnode *ahnode)
return 0;
}
+
+void aest_dev_init_debugfs(struct aest_device *adev);
--
2.33.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
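For readers experimenting with the debugfs interface added above, here is a minimal sketch of parsing the text format emitted by aest_node_err_count_show() / aest_record_err_count_show(). The field names come from the seq_printf() in the patch; the sample counter values are fabricated for illustration.

```python
# Parse the err_count debugfs format produced by the AEST driver's
# seq_printf(): one "<KIND>: <count>" pair per line.
def parse_err_count(text):
    counts = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        counts[key.strip()] = int(value)
    return counts

# Fabricated sample, matching the format in aest_node_err_count_show().
sample = "CE: 12\nDE: 3\nUC: 0\nUEU: 0\nUEO: 1\nUER: 0\n"
counts = parse_err_count(sample)
print(counts["CE"])  # 12
```

A monitoring daemon could poll this file and feed the CE count into a memory error predictor, as the cover letter suggests.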
* [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
2025-01-15 8:42 ` [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver Ruidong Tian
2025-01-15 8:42 ` [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface Ruidong Tian
@ 2025-01-15 8:42 ` Ruidong Tian
2025-01-17 7:07 ` kernel test robot
2025-01-15 8:42 ` [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD Ruidong Tian
` (2 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Ruidong Tian @ 2025-01-15 8:42 UTC (permalink / raw)
To: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: tianruidong
The AEST injection interface helps to test how the AEST driver processes
an error record that raises an error.
This interface only raises a SW-simulated error rather than a HW error.
Example1:
1. write RAS register value to err_* file:
echo 0x... > <debugfs>/aest/<dev>/<node>/inject/err_fr
echo 0x... > <debugfs>/aest/<dev>/<node>/inject/err_status
echo 0x... > <debugfs>/aest/<dev>/<node>/inject/err_addr
echo 0x... > <debugfs>/aest/<dev>/<node>/inject/err_*
2. trigger the error:
echo -1 > <debugfs>/aest/<dev>/<node>/inject/inject
AEST driver will process this error with error register value specified
by user.
Example2:
1. just trigger the error:
echo n (record_count > n >= 0) > <debugfs>/aest/<dev>/<node>/inject/inject
AEST driver will process this error with error register values read
from record<n> of this node.
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
---
Documentation/ABI/testing/debugfs-aest | 17 +++
drivers/ras/aest/Makefile | 1 +
drivers/ras/aest/aest-inject.c | 151 +++++++++++++++++++++++++
drivers/ras/aest/aest-sysfs.c | 8 +-
drivers/ras/aest/aest.h | 2 +
5 files changed, 177 insertions(+), 2 deletions(-)
create mode 100644 drivers/ras/aest/aest-inject.c
diff --git a/Documentation/ABI/testing/debugfs-aest b/Documentation/ABI/testing/debugfs-aest
index 39d9c85843ef..4d3f4464cf98 100644
--- a/Documentation/ABI/testing/debugfs-aest
+++ b/Documentation/ABI/testing/debugfs-aest
@@ -96,3 +96,20 @@ KernelVersion 6.10
Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
Description:
(RO) Outputs error statistics for all this records.
+
+What: /sys/kernel/debug/aest/<dev_name>/<node_name>/inject/inject
+Date: June 2024
+KernelVersion: 6.10
+Contact: Ruidong Tian <tianruidong@linux.alibaba.com>
+Description:
+ (RW) Write any integer to this file to trigger the error
+ injection. Make sure you have specified all necessary error
+ parameters, i.e. this write should be the last step when
+ injecting errors.
+
+ Accepts values -1 or n (0 <= n < <record_count>).
+ -1 : If you write -1, make sure you have written all the err_*
+ files first; the driver will use those err_* values to
+ process the AEST error.
+ n : The driver will read record<n> of this error node to
+ collect the error register values, and use them to process
+ the AEST error.
diff --git a/drivers/ras/aest/Makefile b/drivers/ras/aest/Makefile
index 75495413d2b6..5ee10fc8b2e9 100644
--- a/drivers/ras/aest/Makefile
+++ b/drivers/ras/aest/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_AEST) += aest.o
aest-y := aest-core.o
aest-y += aest-sysfs.o
+aest-y += aest-inject.o
diff --git a/drivers/ras/aest/aest-inject.c b/drivers/ras/aest/aest-inject.c
new file mode 100644
index 000000000000..2ca074aa021c
--- /dev/null
+++ b/drivers/ras/aest/aest-inject.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM Error Source Table Support
+ *
+ * Copyright (c) 2024, Alibaba Group.
+ */
+
+#include "aest.h"
+
+static struct ras_ext_regs regs_inj;
+static u64 hard_inject_val;
+
+struct inj_attr {
+ struct attribute attr;
+ ssize_t (*show)(struct aest_node *n, struct inj_attr *a, char *b);
+ ssize_t (*store)(struct aest_node *n, struct inj_attr *a, const char *b,
+ size_t c);
+};
+
+struct aest_inject {
+ struct aest_node *node;
+ struct kobject kobj;
+};
+
+#define to_inj(k) container_of(k, struct aest_inject, kobj)
+#define to_inj_attr(a) container_of(a, struct inj_attr, attr)
+
+static u64 aest_sysreg_read_inject(void *__unused, u32 offset)
+{
+ u64 *p = (u64 *)&regs_inj;
+
+ return p[offset/8];
+}
+
+static void aest_sysreg_write_inject(void *base, u32 offset, u64 val)
+{
+ u64 *p = (u64 *)&regs_inj;
+
+ p[offset/8] = val;
+}
+
+static u64 aest_iomem_read_inject(void *base, u32 offset)
+{
+ u64 *p = (u64 *)&regs_inj;
+
+ return p[offset/8];
+}
+
+static void aest_iomem_write_inject(void *base, u32 offset, u64 val)
+{
+ u64 *p = (u64 *)®s_inj;
+
+ p[offset/8] = val;
+}
+
+static struct aest_access aest_access_inject[] = {
+ [ACPI_AEST_NODE_SYSTEM_REGISTER] = {
+ .read = aest_sysreg_read_inject,
+ .write = aest_sysreg_write_inject,
+ },
+
+ [ACPI_AEST_NODE_MEMORY_MAPPED] = {
+ .read = aest_iomem_read_inject,
+ .write = aest_iomem_write_inject,
+ },
+ [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] = {
+ .read = aest_iomem_read_inject,
+ .write = aest_iomem_write_inject,
+ },
+ { }
+};
+
+static int inject_store(void *data, u64 val)
+{
+ int i = val, count = 0;
+ struct aest_record record_inj, *record;
+ struct aest_node node_inj, *node = data;
+
+ if (i >= (int)node->info->interface_hdr->error_record_count)
+ return -EINVAL;
+
+ memcpy(&node_inj, node, sizeof(*node));
+ node_inj.name = "AEST-injection";
+
+ record_inj.access = &aest_access_inject[node->info->interface_hdr->type];
+ record_inj.node = &node_inj;
+ record_inj.index = i;
+ if (i >= 0) {
+ record = &node->records[i];
+ regs_inj.err_fr = record_read(record, ERXFR);
+ regs_inj.err_ctlr = record_read(record, ERXCTLR);
+ regs_inj.err_status = record_read(record, ERXSTATUS);
+ regs_inj.err_addr = record_read(record, ERXADDR);
+ regs_inj.err_misc[0] = record_read(record, ERXMISC0);
+ regs_inj.err_misc[1] = record_read(record, ERXMISC1);
+ regs_inj.err_misc[2] = record_read(record, ERXMISC2);
+ regs_inj.err_misc[3] = record_read(record, ERXMISC3);
+ }
+
+ regs_inj.err_status |= ERR_STATUS_V;
+
+ aest_proc_record(&record_inj, &count);
+
+ if (count != 1)
+ return -EIO;
+
+ return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(inject_ops, NULL, inject_store, "%llu\n");
+
+static int hard_inject_store(void *data, u64 val)
+{
+ struct aest_node *node = data;
+
+ if (!node->inj)
+ return -EPERM;
+
+ if (val >= node->record_count)
+ return -ENODEV;
+
+ if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE) {
+ aest_select_record(node, val);
+ write_sysreg_s(hard_inject_val, SYS_ERXPFGCTL_EL1);
+ write_sysreg_s(0x100, SYS_ERXPFGCDN_EL1);
+ aest_sync(node);
+ } else
+ writeq_relaxed(hard_inject_val, node->inj + val * 8);
+
+ return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(hard_inject_ops, NULL, hard_inject_store, "%llu\n");
+
+void aest_inject_init_debugfs(struct aest_node *node)
+{
+ struct dentry *inj;
+
+ inj = debugfs_create_dir("inject", node->debugfs);
+
+ debugfs_create_u64("err_fr", 0600, inj, &regs_inj.err_fr);
+ debugfs_create_u64("err_ctrl", 0600, inj, &regs_inj.err_ctlr);
+ debugfs_create_u64("err_status", 0600, inj, &regs_inj.err_status);
+ debugfs_create_u64("err_addr", 0600, inj, &regs_inj.err_addr);
+ debugfs_create_u64("err_misc0", 0600, inj, &regs_inj.err_misc[0]);
+ debugfs_create_u64("err_misc1", 0600, inj, &regs_inj.err_misc[1]);
+ debugfs_create_u64("err_misc2", 0600, inj, &regs_inj.err_misc[2]);
+ debugfs_create_u64("err_misc3", 0600, inj, &regs_inj.err_misc[3]);
+ debugfs_create_file("inject", 0200, inj, node, &inject_ops);
+
+ debugfs_create_file("hard_inject", 0600, inj, node, &hard_inject_ops);
+ debugfs_create_u64("hard_inject_val", 0600, inj, &hard_inject_val);
+}
diff --git a/drivers/ras/aest/aest-sysfs.c b/drivers/ras/aest/aest-sysfs.c
index f19cd2b5edb2..ba913556fc03 100644
--- a/drivers/ras/aest/aest-sysfs.c
+++ b/drivers/ras/aest/aest-sysfs.c
@@ -192,8 +192,8 @@ aest_oncore_dev_init_debugfs(struct aest_device *adev)
for_each_possible_cpu(cpu) {
percpu_dev = per_cpu_ptr(adev->adev_oncore, cpu);
- snprintf(name, sizeof(name), "processor%u", cpu);
- percpu_dev->debugfs = debugfs_create_dir(name, aest_debugfs);
+ snprintf(name, sizeof(name), "CPU%u", cpu);
+ percpu_dev->debugfs = debugfs_create_dir(name, adev->debugfs);
for (i = 0; i < adev->node_cnt; i++) {
node = &adev->nodes[i];
@@ -210,6 +210,9 @@ void aest_dev_init_debugfs(struct aest_device *adev)
int i;
struct aest_node *node;
+ if (!aest_debugfs) {
+ dev_err(adev->dev, "debugfs not enabled\n");
+ return;
+ }
+
adev->debugfs = debugfs_create_dir(dev_name(adev->dev), aest_debugfs);
if (aest_dev_is_oncore(adev)) {
aest_oncore_dev_init_debugfs(adev);
@@ -222,5 +225,6 @@ void aest_dev_init_debugfs(struct aest_device *adev)
continue;
node->debugfs = debugfs_create_dir(node->name, adev->debugfs);
aest_node_init_debugfs(node);
+ aest_inject_init_debugfs(node);
}
}
diff --git a/drivers/ras/aest/aest.h b/drivers/ras/aest/aest.h
index d9a52e39b1b9..90a96e2666d3 100644
--- a/drivers/ras/aest/aest.h
+++ b/drivers/ras/aest/aest.h
@@ -334,3 +334,5 @@ aest_set_name(struct aest_device *adev, struct aest_hnode *ahnode)
}
void aest_dev_init_debugfs(struct aest_device *adev);
+void aest_inject_init_debugfs(struct aest_node *node);
+void aest_proc_record(struct aest_record *record, void *data);
--
2.33.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
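The inject accessors above (aest_sysreg_read_inject() and friends) treat struct ras_ext_regs as a flat array of u64 slots indexed by byte offset / 8, so the same offsets used for real register access index the simulated register file. A small model of that indexing scheme, with illustrative offsets only (the real ERX* offsets are defined in the driver headers):

```python
# Model the inject accessors' view of struct ras_ext_regs as a flat
# array of u64 slots indexed by byte offset / 8. Offsets here are
# illustrative, not the real ERX* values.
regs = [0] * 8  # err_fr, err_ctlr, err_status, err_addr, err_misc[0..3]

def write_reg(offset, val):
    regs[offset // 8] = val

def read_reg(offset):
    return regs[offset // 8]

write_reg(16, 0xDEAD)        # byte offset 16 lands in the third slot
assert read_reg(16) == 0xDEAD
assert regs[2] == 0xDEAD     # same slot, addressed directly
```

This is why the injection path can reuse the normal record_read()/record_write() code: only the access callbacks change.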
* [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
` (2 preceding siblings ...)
2025-01-15 8:42 ` [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver Ruidong Tian
@ 2025-01-15 8:42 ` Ruidong Tian
2025-01-17 6:14 ` kernel test robot
2025-02-19 21:00 ` Borislav Petkov
2025-01-15 8:42 ` [PATCH v3 5/5] trace, ras: add ARM RAS extension trace event Ruidong Tian
2025-02-19 20:30 ` [PATCH v3 0/5] ARM Error Source Table V2 Support Borislav Petkov
5 siblings, 2 replies; 16+ messages in thread
From: Ruidong Tian @ 2025-01-15 8:42 UTC (permalink / raw)
To: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: tianruidong
Translating a device normalized address on AMD, also called a logical
address, to a system physical address is a common RAS operation. Provide
a common interface for both AMD and ARM.
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
---
drivers/edac/amd64_edac.c | 2 +-
drivers/ras/aest/aest-core.c | 12 ++++++------
drivers/ras/amd/atl/core.c | 4 ++--
drivers/ras/amd/atl/internal.h | 2 +-
drivers/ras/amd/atl/umc.c | 3 ++-
drivers/ras/ras.c | 24 +++++++++++-------------
include/linux/ras.h | 9 ++++-----
7 files changed, 27 insertions(+), 29 deletions(-)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ddfbdb66b794..1e9c96e4daa8 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2832,7 +2832,7 @@ static void decode_umc_error(int node_id, struct mce *m)
a_err.ipid = m->ipid;
a_err.cpu = m->extcpu;
- sys_addr = amd_convert_umc_mca_addr_to_sys_addr(&a_err);
+ sys_addr = convert_ras_la_to_spa(&a_err);
if (IS_ERR_VALUE(sys_addr)) {
err.err_code = ERR_NORM_ADDR;
goto log_error;
diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
index 12d0a32ecda9..0530880ded3e 100644
--- a/drivers/ras/aest/aest-core.c
+++ b/drivers/ras/aest/aest-core.c
@@ -228,16 +228,16 @@ static void aest_node_pool_process(struct work_struct *work)
llist_for_each_entry(event, head, llnode) {
aest_print(event);
- /* TODO: translate Logical Addresses to System Physical Addresses */
+ addr = event->regs.err_addr & ((1UL << CONFIG_ARM64_PA_BITS) - 1);
+
if (event->addressing_mode == AEST_ADDREESS_LA ||
- (event->regs.err_addr & ERR_ADDR_AI)) {
- pr_notice("Can not translate LA to SPA\n");
- addr = 0;
- } else
+ (event->regs.err_addr & ERR_ADDR_AI))
+ addr = convert_ras_la_to_spa(event);
+ else
addr = event->regs.err_addr & ((1UL << CONFIG_ARM64_PA_BITS) - 1);
status = event->regs.err_status;
- if (addr && ((status & ERR_STATUS_UE) || (status & ERR_STATUS_DE)))
+ if (addr && !IS_ERR_VALUE(addr) &&
+ ((status & ERR_STATUS_UE) || (status & ERR_STATUS_DE)))
aest_handle_memory_failure(addr);
blocking_notifier_call_chain(&aest_decoder_chain, 0, event);
diff --git a/drivers/ras/amd/atl/core.c b/drivers/ras/amd/atl/core.c
index 4197e10993ac..6b5f0f65bf8e 100644
--- a/drivers/ras/amd/atl/core.c
+++ b/drivers/ras/amd/atl/core.c
@@ -207,7 +207,7 @@ static int __init amd_atl_init(void)
/* Increment this module's recount so that it can't be easily unloaded. */
__module_get(THIS_MODULE);
- amd_atl_register_decoder(convert_umc_mca_addr_to_sys_addr);
+ atl_register_decoder(convert_umc_mca_addr_to_sys_addr);
pr_info("AMD Address Translation Library initialized\n");
return 0;
@@ -219,7 +219,7 @@ static int __init amd_atl_init(void)
*/
static void __exit amd_atl_exit(void)
{
- amd_atl_unregister_decoder();
+ atl_unregister_decoder();
}
module_init(amd_atl_init);
diff --git a/drivers/ras/amd/atl/internal.h b/drivers/ras/amd/atl/internal.h
index 143d04c779a8..42686189decb 100644
--- a/drivers/ras/amd/atl/internal.h
+++ b/drivers/ras/amd/atl/internal.h
@@ -277,7 +277,7 @@ int denormalize_address(struct addr_ctx *ctx);
int dehash_address(struct addr_ctx *ctx);
unsigned long norm_to_sys_addr(u8 socket_id, u8 die_id, u8 coh_st_inst_id, unsigned long addr);
-unsigned long convert_umc_mca_addr_to_sys_addr(struct atl_err *err);
+unsigned long convert_umc_mca_addr_to_sys_addr(void *data);
u64 add_base_and_hole(struct addr_ctx *ctx, u64 addr);
u64 remove_base_and_hole(struct addr_ctx *ctx, u64 addr);
diff --git a/drivers/ras/amd/atl/umc.c b/drivers/ras/amd/atl/umc.c
index dc8aa12f63c8..aa13f7fd7ba4 100644
--- a/drivers/ras/amd/atl/umc.c
+++ b/drivers/ras/amd/atl/umc.c
@@ -395,8 +395,9 @@ static u8 get_coh_st_inst_id(struct atl_err *err)
return FIELD_GET(UMC_CHANNEL_NUM, err->ipid);
}
-unsigned long convert_umc_mca_addr_to_sys_addr(struct atl_err *err)
+unsigned long convert_umc_mca_addr_to_sys_addr(void *data)
{
+ struct atl_err *err = data;
u8 socket_id = topology_physical_package_id(err->cpu);
u8 coh_st_inst_id = get_coh_st_inst_id(err);
unsigned long addr = get_addr(err->addr);
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index a6e4792a1b2e..e5f23a8279c2 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -10,36 +10,34 @@
#include <linux/ras.h>
#include <linux/uuid.h>
-#if IS_ENABLED(CONFIG_AMD_ATL)
/*
* Once set, this function pointer should never be unset.
*
* The library module will set this pointer if it successfully loads. The module
* should not be unloaded except for testing and debug purposes.
*/
-static unsigned long (*amd_atl_umc_na_to_spa)(struct atl_err *err);
+static unsigned long (*atl_ras_la_to_spa)(void *err);
-void amd_atl_register_decoder(unsigned long (*f)(struct atl_err *))
+void atl_register_decoder(unsigned long (*f)(void *))
{
- amd_atl_umc_na_to_spa = f;
+ atl_ras_la_to_spa = f;
}
-EXPORT_SYMBOL_GPL(amd_atl_register_decoder);
+EXPORT_SYMBOL_GPL(atl_register_decoder);
-void amd_atl_unregister_decoder(void)
+void atl_unregister_decoder(void)
{
- amd_atl_umc_na_to_spa = NULL;
+ atl_ras_la_to_spa = NULL;
}
-EXPORT_SYMBOL_GPL(amd_atl_unregister_decoder);
+EXPORT_SYMBOL_GPL(atl_unregister_decoder);
-unsigned long amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err)
+unsigned long convert_ras_la_to_spa(void *err)
{
- if (!amd_atl_umc_na_to_spa)
+ if (!atl_ras_la_to_spa)
return -EINVAL;
- return amd_atl_umc_na_to_spa(err);
+ return atl_ras_la_to_spa(err);
}
-EXPORT_SYMBOL_GPL(amd_convert_umc_mca_addr_to_sys_addr);
-#endif /* CONFIG_AMD_ATL */
+EXPORT_SYMBOL_GPL(convert_ras_la_to_spa);
#define CREATE_TRACE_POINTS
#define TRACE_INCLUDE_PATH ../../include/ras
diff --git a/include/linux/ras.h b/include/linux/ras.h
index 1c777af6a1af..2e90556779d2 100644
--- a/include/linux/ras.h
+++ b/include/linux/ras.h
@@ -43,16 +43,15 @@ struct atl_err {
};
#if IS_ENABLED(CONFIG_AMD_ATL)
-void amd_atl_register_decoder(unsigned long (*f)(struct atl_err *));
-void amd_atl_unregister_decoder(void);
void amd_retire_dram_row(struct atl_err *err);
-unsigned long amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err);
#else
static inline void amd_retire_dram_row(struct atl_err *err) { }
-static inline unsigned long
-amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err) { return -EINVAL; }
#endif /* CONFIG_AMD_ATL */
+void atl_register_decoder(unsigned long (*f)(void *));
+void atl_unregister_decoder(void);
+unsigned long convert_ras_la_to_spa(void *err);
+
#if IS_ENABLED(CONFIG_AEST)
void aest_register_decode_chain(struct notifier_block *nb);
void aest_unregister_decode_chain(struct notifier_block *nb);
--
2.33.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 5/5] trace, ras: add ARM RAS extension trace event
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
` (3 preceding siblings ...)
2025-01-15 8:42 ` [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD Ruidong Tian
@ 2025-01-15 8:42 ` Ruidong Tian
2025-02-19 20:30 ` [PATCH v3 0/5] ARM Error Source Table V2 Support Borislav Petkov
5 siblings, 0 replies; 16+ messages in thread
From: Ruidong Tian @ 2025-01-15 8:42 UTC (permalink / raw)
To: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: tianruidong, Tyler Baicar
Add a trace event for hardware errors reported by the ARMv8
RAS extension registers.
Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
---
drivers/acpi/arm64/aest.c | 2 +
drivers/ras/aest/aest-core.c | 6 +++
drivers/ras/ras.c | 3 ++
include/ras/ras_event.h | 71 ++++++++++++++++++++++++++++++++++++
4 files changed, 82 insertions(+)
diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c
index 312ddd5c15f5..adc12174f4e0 100644
--- a/drivers/acpi/arm64/aest.c
+++ b/drivers/acpi/arm64/aest.c
@@ -11,6 +11,8 @@
#include "init.h"
+#include <ras/ras_event.h>
+
#undef pr_fmt
#define pr_fmt(fmt) "ACPI AEST: " fmt
diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
index 0530880ded3e..e72df9a79b96 100644
--- a/drivers/ras/aest/aest-core.c
+++ b/drivers/ras/aest/aest-core.c
@@ -13,6 +13,8 @@
#include <linux/genalloc.h>
#include <linux/ras.h>
+#include <ras/ras_event.h>
+
#include "aest.h"
DEFINE_PER_CPU(struct aest_device, percpu_adev);
@@ -90,6 +92,10 @@ static void aest_print(struct aest_event *event)
pr_err("%s ERR%dMISC3: 0x%llx\n", pfx_seq, index,
regs->err_misc[3]);
}
+
+ trace_arm_ras_ext_event(event->type, event->id0, event->id1,
+ event->index, event->hid, &event->regs,
+ event->vendor_data, event->vendor_data_size);
}
static void aest_handle_memory_failure(u64 addr)
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index e5f23a8279c2..2a5a440e4c29 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -72,6 +72,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
EXPORT_TRACEPOINT_SYMBOL_GPL(non_standard_event);
EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event);
+#ifdef CONFIG_ARM64_RAS_EXTN
+EXPORT_TRACEPOINT_SYMBOL_GPL(arm_ras_ext_event);
+#endif
static int __init parse_ras_param(char *str)
{
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index e5f7ee0864e7..119e8c4e6d20 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -338,6 +338,77 @@ TRACE_EVENT(aer_event,
"Not available")
);
+/*
+ * ARM RAS Extension Events Report
+ *
+ * This event is generated when an error reported by the ARM RAS extension
+ * hardware is detected.
+ */
+
+#ifdef CONFIG_ARM64_RAS_EXTN
+#include <asm/ras.h>
+TRACE_EVENT(arm_ras_ext_event,
+
+ TP_PROTO(const u8 type,
+ const u32 id0,
+ const u32 id1,
+ const u32 index,
+ char *hid,
+ struct ras_ext_regs *regs,
+ const u8 *data,
+ const u32 len),
+
+ TP_ARGS(type, id0, id1, index, hid, regs, data, len),
+
+ TP_STRUCT__entry(
+ __field(u8, type)
+ __field(u32, id0)
+ __field(u32, id1)
+ __field(u32, index)
+ __field(char *, hid)
+ __field(u64, err_fr)
+ __field(u64, err_ctlr)
+ __field(u64, err_status)
+ __field(u64, err_addr)
+ __field(u64, err_misc0)
+ __field(u64, err_misc1)
+ __field(u64, err_misc2)
+ __field(u64, err_misc3)
+ __field(u32, len)
+ __dynamic_array(u8, buf, len)
+ ),
+
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->id0 = id0;
+ __entry->id1 = id1;
+ __entry->index = index;
+ __entry->hid = hid;
+ __entry->err_fr = regs->err_fr;
+ __entry->err_ctlr = regs->err_ctlr;
+ __entry->err_status = regs->err_status;
+ __entry->err_addr = regs->err_addr;
+ __entry->err_misc0 = regs->err_misc[0];
+ __entry->err_misc1 = regs->err_misc[1];
+ __entry->err_misc2 = regs->err_misc[2];
+ __entry->err_misc3 = regs->err_misc[3];
+ __entry->len = len;
+ memcpy(__get_dynamic_array(buf), data, len);
+ ),
+
+ TP_printk("type: %d; id0: %d; id1: %d; index: %d; hid: %s; "
+ "ERR_FR: %llx; ERR_CTLR: %llx; ERR_STATUS: %llx; "
+ "ERR_ADDR: %llx; ERR_MISC0: %llx; ERR_MISC1: %llx; "
+ "ERR_MISC2: %llx; ERR_MISC3: %llx; data len:%d; raw data:%s",
+ __entry->type, __entry->id0, __entry->id1, __entry->index,
+ __entry->hid, __entry->err_fr, __entry->err_ctlr,
+ __entry->err_status, __entry->err_addr, __entry->err_misc0,
+ __entry->err_misc1, __entry->err_misc2, __entry->err_misc3,
+ __entry->len,
+ __print_hex(__get_dynamic_array(buf), __entry->len))
+);
+#endif
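Aside: the raw payload above is rendered with __print_hex(). For readers unfamiliar with that output, a rough userspace C equivalent follows; the helper name hex_dump is hypothetical and not a kernel API.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Roughly what __print_hex() produces for the dynamic buf array:
 * len bytes rendered as space-separated hex pairs.  The caller must
 * provide at least 3 * len bytes of output space. */
static size_t hex_dump(char *out, const unsigned char *data, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		n += (size_t)sprintf(out + n, i ? " %02x" : "%02x", data[i]);
	return n;
}
```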
+
/*
* memory-failure recovery action result event
*
--
2.33.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD
2025-01-15 8:42 ` [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD Ruidong Tian
@ 2025-01-17 6:14 ` kernel test robot
2025-02-19 21:00 ` Borislav Petkov
1 sibling, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-01-17 6:14 UTC (permalink / raw)
To: Ruidong Tian, catalin.marinas, will, lpieralisi, guohanjun,
sudeep.holla, xueshuai, baolin.wang, linux-kernel, linux-acpi,
linux-arm-kernel, rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: oe-kbuild-all, tianruidong
Hi Ruidong,
kernel test robot noticed the following build errors:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on rafael-pm/bleeding-edge arm64/for-next/core ras/edac-for-next linus/master tip/smp/core v6.13-rc7 next-20250116]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ruidong-Tian/ACPI-RAS-AEST-Initial-AEST-driver/20250115-164601
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20250115084228.107573-5-tianruidong%40linux.alibaba.com
patch subject: [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD
config: x86_64-randconfig-074-20250117 (https://download.01.org/0day-ci/archive/20250117/202501171437.tCtg3x1c-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250117/202501171437.tCtg3x1c-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501171437.tCtg3x1c-lkp@intel.com/
All errors (new ones prefixed by >>):
drivers/ras/amd/fmpm.c: In function 'save_spa':
>> drivers/ras/amd/fmpm.c:329:15: error: implicit declaration of function 'amd_convert_umc_mca_addr_to_sys_addr'; did you mean 'convert_umc_mca_addr_to_sys_addr'? [-Werror=implicit-function-declaration]
329 | spa = amd_convert_umc_mca_addr_to_sys_addr(&a_err);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| convert_umc_mca_addr_to_sys_addr
cc1: some warnings being treated as errors
--
drivers/ras/amd/atl/umc.c: In function 'retire_row_mi300':
>> drivers/ras/amd/atl/umc.c:333:24: error: implicit declaration of function 'amd_convert_umc_mca_addr_to_sys_addr'; did you mean 'convert_umc_mca_addr_to_sys_addr'? [-Werror=implicit-function-declaration]
333 | addr = amd_convert_umc_mca_addr_to_sys_addr(a_err);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| convert_umc_mca_addr_to_sys_addr
cc1: some warnings being treated as errors
vim +329 drivers/ras/amd/fmpm.c
6f15e617cc9932 Yazen Ghannam 2024-02-13 291
838850c50884cd Yazen Ghannam 2024-03-01 292 static void save_spa(struct fru_rec *rec, unsigned int entry,
838850c50884cd Yazen Ghannam 2024-03-01 293 u64 addr, u64 id, unsigned int cpu)
838850c50884cd Yazen Ghannam 2024-03-01 294 {
838850c50884cd Yazen Ghannam 2024-03-01 295 unsigned int i, fru_idx, spa_entry;
838850c50884cd Yazen Ghannam 2024-03-01 296 struct atl_err a_err;
838850c50884cd Yazen Ghannam 2024-03-01 297 unsigned long spa;
838850c50884cd Yazen Ghannam 2024-03-01 298
838850c50884cd Yazen Ghannam 2024-03-01 299 if (entry >= max_nr_entries) {
838850c50884cd Yazen Ghannam 2024-03-01 300 pr_warn_once("FRU descriptor entry %d out-of-bounds (max: %d)\n",
838850c50884cd Yazen Ghannam 2024-03-01 301 entry, max_nr_entries);
838850c50884cd Yazen Ghannam 2024-03-01 302 return;
838850c50884cd Yazen Ghannam 2024-03-01 303 }
838850c50884cd Yazen Ghannam 2024-03-01 304
838850c50884cd Yazen Ghannam 2024-03-01 305 /* spa_nr_entries is always multiple of max_nr_entries */
838850c50884cd Yazen Ghannam 2024-03-01 306 for (i = 0; i < spa_nr_entries; i += max_nr_entries) {
838850c50884cd Yazen Ghannam 2024-03-01 307 fru_idx = i / max_nr_entries;
838850c50884cd Yazen Ghannam 2024-03-01 308 if (fru_records[fru_idx] == rec)
838850c50884cd Yazen Ghannam 2024-03-01 309 break;
838850c50884cd Yazen Ghannam 2024-03-01 310 }
838850c50884cd Yazen Ghannam 2024-03-01 311
838850c50884cd Yazen Ghannam 2024-03-01 312 if (i >= spa_nr_entries) {
838850c50884cd Yazen Ghannam 2024-03-01 313 pr_warn_once("FRU record %d not found\n", i);
838850c50884cd Yazen Ghannam 2024-03-01 314 return;
838850c50884cd Yazen Ghannam 2024-03-01 315 }
838850c50884cd Yazen Ghannam 2024-03-01 316
838850c50884cd Yazen Ghannam 2024-03-01 317 spa_entry = i + entry;
838850c50884cd Yazen Ghannam 2024-03-01 318 if (spa_entry >= spa_nr_entries) {
838850c50884cd Yazen Ghannam 2024-03-01 319 pr_warn_once("spa_entries[] index out-of-bounds\n");
838850c50884cd Yazen Ghannam 2024-03-01 320 return;
838850c50884cd Yazen Ghannam 2024-03-01 321 }
838850c50884cd Yazen Ghannam 2024-03-01 322
838850c50884cd Yazen Ghannam 2024-03-01 323 memset(&a_err, 0, sizeof(struct atl_err));
838850c50884cd Yazen Ghannam 2024-03-01 324
838850c50884cd Yazen Ghannam 2024-03-01 325 a_err.addr = addr;
838850c50884cd Yazen Ghannam 2024-03-01 326 a_err.ipid = id;
838850c50884cd Yazen Ghannam 2024-03-01 327 a_err.cpu = cpu;
838850c50884cd Yazen Ghannam 2024-03-01 328
838850c50884cd Yazen Ghannam 2024-03-01 @329 spa = amd_convert_umc_mca_addr_to_sys_addr(&a_err);
838850c50884cd Yazen Ghannam 2024-03-01 330 if (IS_ERR_VALUE(spa)) {
838850c50884cd Yazen Ghannam 2024-03-01 331 pr_debug("Failed to get system address\n");
838850c50884cd Yazen Ghannam 2024-03-01 332 return;
838850c50884cd Yazen Ghannam 2024-03-01 333 }
838850c50884cd Yazen Ghannam 2024-03-01 334
838850c50884cd Yazen Ghannam 2024-03-01 335 spa_entries[spa_entry] = spa;
838850c50884cd Yazen Ghannam 2024-03-01 336 pr_debug("fru_idx: %u, entry: %u, spa_entry: %u, spa: 0x%016llx\n",
838850c50884cd Yazen Ghannam 2024-03-01 337 fru_idx, entry, spa_entry, spa_entries[spa_entry]);
838850c50884cd Yazen Ghannam 2024-03-01 338 }
838850c50884cd Yazen Ghannam 2024-03-01 339
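Aside: the spa_entries[] lookup above strides through the array in blocks of max_nr_entries per FRU record, relying on spa_nr_entries being a multiple of max_nr_entries. A standalone sketch of the same bounds-checked index math (spa_slot is a hypothetical helper, not part of the patch):

```c
#include <assert.h>

/* spa_nr_entries is always a multiple of max_nr_entries, so record
 * fru_idx's block of SPA slots starts at fru_idx * max_nr_entries.
 * Returns the slot index, or -1 when either bound is exceeded, the
 * same two cases save_spa() warns about. */
static int spa_slot(unsigned int fru_idx, unsigned int entry,
		    unsigned int max_nr_entries, unsigned int spa_nr_entries)
{
	unsigned int spa_entry = fru_idx * max_nr_entries + entry;

	if (entry >= max_nr_entries || spa_entry >= spa_nr_entries)
		return -1;	/* out of bounds */
	return (int)spa_entry;
}
```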
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver
2025-01-15 8:42 ` [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver Ruidong Tian
@ 2025-01-17 7:07 ` kernel test robot
0 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-01-17 7:07 UTC (permalink / raw)
To: Ruidong Tian, catalin.marinas, will, lpieralisi, guohanjun,
sudeep.holla, xueshuai, baolin.wang, linux-kernel, linux-acpi,
linux-arm-kernel, rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: llvm, oe-kbuild-all, tianruidong
Hi Ruidong,
kernel test robot noticed the following build errors:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on rafael-pm/bleeding-edge arm64/for-next/core ras/edac-for-next linus/master tip/smp/core v6.13-rc7 next-20250116]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ruidong-Tian/ACPI-RAS-AEST-Initial-AEST-driver/20250115-164601
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20250115084228.107573-4-tianruidong%40linux.alibaba.com
patch subject: [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver
config: arm64-allmodconfig (https://download.01.org/0day-ci/archive/20250117/202501171406.o7oztilo-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250117/202501171406.o7oztilo-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501171406.o7oztilo-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/ras/aest/aest-core.c:280:13: error: static declaration of 'aest_proc_record' follows non-static declaration
280 | static void aest_proc_record(struct aest_record *record, void *data)
| ^
drivers/ras/aest/aest.h:338:6: note: previous declaration is here
338 | void aest_proc_record(struct aest_record *record, void *data);
| ^
1 error generated.
vim +/aest_proc_record +280 drivers/ras/aest/aest-core.c
b6c745ae1213b2 Ruidong Tian 2025-01-15 279
b6c745ae1213b2 Ruidong Tian 2025-01-15 @280 static void aest_proc_record(struct aest_record *record, void *data)
b6c745ae1213b2 Ruidong Tian 2025-01-15 281 {
b6c745ae1213b2 Ruidong Tian 2025-01-15 282 struct ras_ext_regs regs = {0};
b6c745ae1213b2 Ruidong Tian 2025-01-15 283 int *count = data;
b6c745ae1213b2 Ruidong Tian 2025-01-15 284
b6c745ae1213b2 Ruidong Tian 2025-01-15 285 regs.err_status = record_read(record, ERXSTATUS);
b6c745ae1213b2 Ruidong Tian 2025-01-15 286 if (!(regs.err_status & ERR_STATUS_V))
b6c745ae1213b2 Ruidong Tian 2025-01-15 287 return;
b6c745ae1213b2 Ruidong Tian 2025-01-15 288
b6c745ae1213b2 Ruidong Tian 2025-01-15 289 (*count)++;
b6c745ae1213b2 Ruidong Tian 2025-01-15 290
b6c745ae1213b2 Ruidong Tian 2025-01-15 291 if (regs.err_status & ERR_STATUS_AV)
b6c745ae1213b2 Ruidong Tian 2025-01-15 292 regs.err_addr = record_read(record, ERXADDR);
b6c745ae1213b2 Ruidong Tian 2025-01-15 293
b6c745ae1213b2 Ruidong Tian 2025-01-15 294 regs.err_fr = record->fr;
b6c745ae1213b2 Ruidong Tian 2025-01-15 295 regs.err_ctlr = record_read(record, ERXCTLR);
b6c745ae1213b2 Ruidong Tian 2025-01-15 296
b6c745ae1213b2 Ruidong Tian 2025-01-15 297 if (regs.err_status & ERR_STATUS_MV) {
b6c745ae1213b2 Ruidong Tian 2025-01-15 298 regs.err_misc[0] = record_read(record, ERXMISC0);
b6c745ae1213b2 Ruidong Tian 2025-01-15 299 regs.err_misc[1] = record_read(record, ERXMISC1);
b6c745ae1213b2 Ruidong Tian 2025-01-15 300 if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
b6c745ae1213b2 Ruidong Tian 2025-01-15 301 regs.err_misc[2] = record_read(record, ERXMISC2);
b6c745ae1213b2 Ruidong Tian 2025-01-15 302 regs.err_misc[3] = record_read(record, ERXMISC3);
b6c745ae1213b2 Ruidong Tian 2025-01-15 303 }
b6c745ae1213b2 Ruidong Tian 2025-01-15 304
b6c745ae1213b2 Ruidong Tian 2025-01-15 305 if (record->node->info->interface_hdr->flags &
b6c745ae1213b2 Ruidong Tian 2025-01-15 306 AEST_XFACE_FLAG_CLEAR_MISC) {
b6c745ae1213b2 Ruidong Tian 2025-01-15 307 record_write(record, ERXMISC0, 0);
b6c745ae1213b2 Ruidong Tian 2025-01-15 308 record_write(record, ERXMISC1, 0);
b6c745ae1213b2 Ruidong Tian 2025-01-15 309 if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
b6c745ae1213b2 Ruidong Tian 2025-01-15 310 record_write(record, ERXMISC2, 0);
b6c745ae1213b2 Ruidong Tian 2025-01-15 311 record_write(record, ERXMISC3, 0);
b6c745ae1213b2 Ruidong Tian 2025-01-15 312 }
b6c745ae1213b2 Ruidong Tian 2025-01-15 313 			/* ce count is 0 if the record does not support ce */
b6c745ae1213b2 Ruidong Tian 2025-01-15 314 } else if (record->ce.count > 0)
b6c745ae1213b2 Ruidong Tian 2025-01-15 315 record_write(record, ERXMISC0, record->ce.reg_val);
b6c745ae1213b2 Ruidong Tian 2025-01-15 316 }
b6c745ae1213b2 Ruidong Tian 2025-01-15 317
b6c745ae1213b2 Ruidong Tian 2025-01-15 318 /* panic if unrecoverable and uncontainable error encountered */
b6c745ae1213b2 Ruidong Tian 2025-01-15 319 if ((regs.err_status & ERR_STATUS_UE) &&
b6c745ae1213b2 Ruidong Tian 2025-01-15 320 (regs.err_status & ERR_STATUS_UET) > ERR_STATUS_UET_UEU)
b6c745ae1213b2 Ruidong Tian 2025-01-15 321 		aest_panic(record, &regs, "AEST: unrecoverable error encountered");
b6c745ae1213b2 Ruidong Tian 2025-01-15 322
b6c745ae1213b2 Ruidong Tian 2025-01-15 323 	aest_log(record, &regs);
b6c745ae1213b2 Ruidong Tian 2025-01-15 324
b6c745ae1213b2 Ruidong Tian 2025-01-15 325 /* Write-one-to-clear the bits we've seen */
b6c745ae1213b2 Ruidong Tian 2025-01-15 326 regs.err_status &= ERR_STATUS_W1TC;
b6c745ae1213b2 Ruidong Tian 2025-01-15 327
b6c745ae1213b2 Ruidong Tian 2025-01-15 328 	/* A multi-bit field needs all-ones written to clear it. */
b6c745ae1213b2 Ruidong Tian 2025-01-15 329 if (regs.err_status & ERR_STATUS_CE)
b6c745ae1213b2 Ruidong Tian 2025-01-15 330 regs.err_status |= ERR_STATUS_CE;
b6c745ae1213b2 Ruidong Tian 2025-01-15 331
b6c745ae1213b2 Ruidong Tian 2025-01-15 332 	/* A multi-bit field needs all-ones written to clear it. */
b6c745ae1213b2 Ruidong Tian 2025-01-15 333 if (regs.err_status & ERR_STATUS_UET)
b6c745ae1213b2 Ruidong Tian 2025-01-15 334 regs.err_status |= ERR_STATUS_UET;
b6c745ae1213b2 Ruidong Tian 2025-01-15 335
b6c745ae1213b2 Ruidong Tian 2025-01-15 336 record_write(record, ERXSTATUS, regs.err_status);
b6c745ae1213b2 Ruidong Tian 2025-01-15 337 }
b6c745ae1213b2 Ruidong Tian 2025-01-15 338
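For context, the clearing sequence at the end of aest_proc_record() can be exercised in isolation: keep only the write-one-to-clear bits that were observed, then widen any set multi-bit field (CE, UET) to all-ones. A minimal userspace sketch using a reduced subset of the ERR<n>STATUS masks from the patch; status_clear_mask is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the ERR<n>STATUS masks defined in the patch's
 * arch/arm64/include/asm/ras.h. */
#define ERR_STATUS_AV   (1u << 31)
#define ERR_STATUS_V    (1u << 30)
#define ERR_STATUS_CE   ((1u << 25) | (1u << 24))
#define ERR_STATUS_UET  ((1u << 21) | (1u << 20))
#define ERR_STATUS_W1TC (ERR_STATUS_AV | ERR_STATUS_V | \
			 ERR_STATUS_CE | ERR_STATUS_UET)

/* Value to write back to ERR<n>STATUS: every observed W1TC bit is
 * cleared, and multi-bit fields must be written as all-ones if any
 * of their bits were set. */
static uint32_t status_clear_mask(uint32_t status)
{
	uint32_t out = status & ERR_STATUS_W1TC;

	if (out & ERR_STATUS_CE)
		out |= ERR_STATUS_CE;	/* widen partial CE to all-ones */
	if (out & ERR_STATUS_UET)
		out |= ERR_STATUS_UET;	/* widen partial UET to all-ones */
	return out;
}
```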
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
2025-01-15 8:42 ` [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver Ruidong Tian
@ 2025-01-17 10:50 ` Tomohiro Misono (Fujitsu)
2025-02-06 8:32 ` Ruidong Tian
2025-02-19 20:49 ` Borislav Petkov
1 sibling, 1 reply; 16+ messages in thread
From: Tomohiro Misono (Fujitsu) @ 2025-01-17 10:50 UTC (permalink / raw)
To: 'Ruidong Tian', catalin.marinas@arm.com, will@kernel.org,
lpieralisi@kernel.org, guohanjun@huawei.com, sudeep.holla@arm.com,
xueshuai@linux.alibaba.com, baolin.wang@linux.alibaba.com,
linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, rafael@kernel.org,
lenb@kernel.org, tony.luck@intel.com, bp@alien8.de,
yazen.ghannam@amd.com
Cc: Tyler Baicar
Hello, some comments below.
> Subject: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
>
> Add support for parsing the ARM Error Source Table and basic handling of
> errors reported through both memory mapped and system register interfaces.
>
> Assume system register interfaces are only registered with private
> peripheral interrupts (PPIs); otherwise there is no guarantee the
> core handling the error is the core which took the error and has the
> syndrome info in its system registers.
>
> In kernel-first mode, all configuration is controlled by the kernel,
> including the CE threshold and interrupt enable/disable.
>
> All detected errors will be processed as follows:
> - CE, DE: use a workqueue to log these errors.
> - UER, UEO: log it and call memory_failure() in a workqueue.
> - UC, UEU: panic in irq context.
>
> Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
> ---
> MAINTAINERS | 10 +
> arch/arm64/include/asm/ras.h | 95 ++++
> drivers/acpi/arm64/Kconfig | 11 +
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/aest.c | 335 ++++++++++++
> drivers/acpi/arm64/init.c | 2 +
> drivers/acpi/arm64/init.h | 1 +
> drivers/ras/Kconfig | 1 +
> drivers/ras/Makefile | 1 +
> drivers/ras/aest/Kconfig | 17 +
> drivers/ras/aest/Makefile | 5 +
> drivers/ras/aest/aest-core.c | 976 +++++++++++++++++++++++++++++++++++
> drivers/ras/aest/aest.h | 323 ++++++++++++
> include/linux/acpi_aest.h | 68 +++
> include/linux/cpuhotplug.h | 1 +
> include/linux/ras.h | 8 +
> 16 files changed, 1855 insertions(+)
> create mode 100644 arch/arm64/include/asm/ras.h
> create mode 100644 drivers/acpi/arm64/aest.c
> create mode 100644 drivers/ras/aest/Kconfig
> create mode 100644 drivers/ras/aest/Makefile
> create mode 100644 drivers/ras/aest/aest-core.c
> create mode 100644 drivers/ras/aest/aest.h
> create mode 100644 include/linux/acpi_aest.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 637ddd44245f..d757f9339627 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -330,6 +330,16 @@ S: Maintained
> F: drivers/acpi/arm64
> F: include/linux/acpi_iort.h
>
> +ACPI AEST
> +M:	Ruidong Tian <tianruidong@linux.alibaba.com>
> +L: linux-acpi@vger.kernel.org
> +L: linux-arm-kernel@lists.infradead.org
> +S: Supported
> +F: arch/arm64/include/asm/ras.h
> +F: drivers/acpi/arm64/aest.c
> +F: drivers/ras/aest/
> +F: include/linux/acpi_aest.h
> +
> ACPI FOR RISC-V (ACPI/riscv)
> M: Sunil V L <sunilvl@ventanamicro.com>
> L: linux-acpi@vger.kernel.org
> diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h
> new file mode 100644
> index 000000000000..7676add8a0ed
> --- /dev/null
> +++ b/arch/arm64/include/asm/ras.h
> @@ -0,0 +1,95 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_RAS_H
> +#define __ASM_RAS_H
> +
> +#include <linux/types.h>
> +#include <linux/bits.h>
> +
> +/* ERR<n>FR */
> +#define ERR_FR_CE GENMASK_ULL(54, 53)
> +#define ERR_FR_RP BIT(15)
> +#define ERR_FR_CEC GENMASK_ULL(14, 12)
> +
> +#define ERR_FR_RP_SINGLE_COUNTER 0
> +#define ERR_FR_RP_DOUBLE_COUNTER 1
> +
> +#define ERR_FR_CEC_0B_COUNTER 0
> +#define ERR_FR_CEC_8B_COUNTER BIT(1)
> +#define ERR_FR_CEC_16B_COUNTER BIT(2)
> +
> +/* ERR<n>STATUS */
> +#define ERR_STATUS_AV BIT(31)
> +#define ERR_STATUS_V BIT(30)
> +#define ERR_STATUS_UE BIT(29)
> +#define ERR_STATUS_ER BIT(28)
> +#define ERR_STATUS_OF BIT(27)
> +#define ERR_STATUS_MV BIT(26)
> +#define ERR_STATUS_CE (BIT(25) | BIT(24))
> +#define ERR_STATUS_DE BIT(23)
> +#define ERR_STATUS_PN BIT(22)
> +#define ERR_STATUS_UET (BIT(21) | BIT(20))
> +#define ERR_STATUS_CI BIT(19)
> +#define ERR_STATUS_IERR GENMASK_ULL(15, 8)
> +#define ERR_STATUS_SERR GENMASK_ULL(7, 0)
> +
> +/* These bits are write-one-to-clear */
> +#define ERR_STATUS_W1TC (ERR_STATUS_AV | ERR_STATUS_V | ERR_STATUS_UE | \
> + ERR_STATUS_ER | ERR_STATUS_OF | ERR_STATUS_MV | \
> + ERR_STATUS_CE | ERR_STATUS_DE | ERR_STATUS_PN | \
> + ERR_STATUS_UET | ERR_STATUS_CI)
> +
> +#define ERR_STATUS_UET_UC 0
> +#define ERR_STATUS_UET_UEU 1
> +#define ERR_STATUS_UET_UEO 2
> +#define ERR_STATUS_UET_UER 3
> +
> +/* ERR<n>CTLR */
> +#define ERR_CTLR_CFI BIT(8)
> +#define ERR_CTLR_FI BIT(3)
> +#define ERR_CTLR_UI BIT(2)
> +
> +/* ERR<n>ADDR */
> +#define ERR_ADDR_AI BIT(61)
> +#define ERR_ADDR_PADDR GENMASK_ULL(55, 0)
> +
> +/* ERR<n>MISC0 */
> +
> +/* ERR<n>FR.CEC == 0b010, ERR<n>FR.RP == 0 */
> +#define ERR_MISC0_8B_OF BIT(39)
> +#define ERR_MISC0_8B_CEC GENMASK_ULL(38, 32)
> +
> +/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 0 */
> +#define ERR_MISC0_16B_OF BIT(47)
> +#define ERR_MISC0_16B_CEC GENMASK_ULL(46, 32)
> +
> +#define ERR_MISC0_CEC_SHIFT 31
> +
> +#define ERR_8B_CEC_MAX (ERR_MISC0_8B_CEC >> ERR_MISC0_CEC_SHIFT)
> +#define ERR_16B_CEC_MAX (ERR_MISC0_16B_CEC >> ERR_MISC0_CEC_SHIFT)
> +
> +/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 1 */
> +#define ERR_MISC0_16B_OFO BIT(63)
> +#define ERR_MISC0_16B_CECO GENMASK_ULL(62, 48)
> +#define ERR_MISC0_16B_OFR BIT(47)
> +#define ERR_MISC0_16B_CECR GENMASK_ULL(46, 32)
> +
> +/* ERRDEVARCH */
> +#define ERRDEVARCH_REV GENMASK(19, 16)
> +
> +enum ras_ce_threshold {
> + RAS_CE_THRESHOLD_0B,
> + RAS_CE_THRESHOLD_8B,
> + RAS_CE_THRESHOLD_16B,
> + RAS_CE_THRESHOLD_32B,
> + UNKNOWN,
> +};
> +
> +struct ras_ext_regs {
> + u64 err_fr;
> + u64 err_ctlr;
> + u64 err_status;
> + u64 err_addr;
> + u64 err_misc[4];
> +};
> +
> +#endif /* __ASM_RAS_H */
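Aside: the MISC0 field extraction these masks enable can be sketched in plain C. GENMASK_ULL is reimplemented for userspace; misc0_8b_ce_count() and misc0_8b_overflow() are illustrative helpers, not from the patch. (Note the patch derives its *_CEC_MAX values with a shift of 31, while the raw 8-bit counter field itself starts at bit 32.)

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK_ULL(h, l). */
#define GENMASK_ULL(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* ERR<n>FR.CEC == 0b010, ERR<n>FR.RP == 0: 8-bit CE counter layout. */
#define ERR_MISC0_8B_OF   (1ULL << 39)
#define ERR_MISC0_8B_CEC  GENMASK_ULL(38, 32)

/* Extract the corrected-error count from an ERR<n>MISC0 value. */
static uint64_t misc0_8b_ce_count(uint64_t misc0)
{
	return (misc0 & ERR_MISC0_8B_CEC) >> 32;
}

/* Has the corrected-error counter overflowed? */
static int misc0_8b_overflow(uint64_t misc0)
{
	return (misc0 & ERR_MISC0_8B_OF) != 0;
}
```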
> diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig
> index b3ed6212244c..c8eb6de95733 100644
> --- a/drivers/acpi/arm64/Kconfig
> +++ b/drivers/acpi/arm64/Kconfig
> @@ -21,3 +21,14 @@ config ACPI_AGDI
>
> config ACPI_APMT
> bool
> +
> +config ACPI_AEST
> + bool "ARM Error Source Table Support"
> + depends on ARM64_RAS_EXTN
> +
> + help
> + The Arm Error Source Table (AEST) provides details on ACPI
> + extensions that enable kernel-first handling of errors in a
> + system that supports the Armv8 RAS extensions.
> +
> + If set, the kernel will report and log hardware errors.
> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> index 05ecde9eaabe..8e240b281fd1 100644
> --- a/drivers/acpi/arm64/Makefile
> +++ b/drivers/acpi/arm64/Makefile
> @@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
> obj-$(CONFIG_ACPI_IORT) += iort.o
> obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
> obj-$(CONFIG_ARM_AMBA) += amba.o
> +obj-$(CONFIG_ACPI_AEST) += aest.o
> obj-y += dma.o init.o
> obj-y += thermal_cpufreq.o
> diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c
> new file mode 100644
> index 000000000000..6dba9c23e04e
> --- /dev/null
> +++ b/drivers/acpi/arm64/aest.c
> @@ -0,0 +1,335 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ARM Error Source Table Support
> + *
> + * Copyright (c) 2024, Alibaba Group.
> + */
> +
> +#include <linux/xarray.h>
> +#include <linux/platform_device.h>
> +#include <linux/acpi_aest.h>
> +
> +#include "init.h"
> +
> +#undef pr_fmt
> +#define pr_fmt(fmt) "ACPI AEST: " fmt
> +
> +static struct xarray *aest_array;
> +
> +static void __init aest_init_interface(struct acpi_aest_hdr *hdr,
> + struct acpi_aest_node *node)
> +{
> + struct acpi_aest_node_interface_header *interface;
> +
> + interface = ACPI_ADD_PTR(struct acpi_aest_node_interface_header, hdr,
> + hdr->node_interface_offset);
> +
> + node->type = hdr->type;
> + node->interface_hdr = interface;
> +
> + switch (interface->group_format) {
> + case ACPI_AEST_NODE_GROUP_FORMAT_4K: {
> + struct acpi_aest_node_interface_4k *interface_4k =
> + (struct acpi_aest_node_interface_4k *)(interface + 1);
> +
> + node->common = &interface_4k->common;
> + node->record_implemented =
> + (unsigned long *)&interface_4k->error_record_implemented;
> + node->status_reporting =
> + (unsigned long *)&interface_4k->error_status_reporting;
> + node->addressing_mode =
> + (unsigned long *)&interface_4k->addressing_mode;
> + break;
> + }
> + case ACPI_AEST_NODE_GROUP_FORMAT_16K: {
> + struct acpi_aest_node_interface_16k *interface_16k =
> + (struct acpi_aest_node_interface_16k *)(interface + 1);
> +
> + node->common = &interface_16k->common;
> + node->record_implemented =
> + (unsigned long *)interface_16k->error_record_implemented;
> + node->status_reporting =
> + (unsigned long *)interface_16k->error_status_reporting;
> + node->addressing_mode =
> + (unsigned long *)interface_16k->addressing_mode;
> + break;
> + }
> + case ACPI_AEST_NODE_GROUP_FORMAT_64K: {
> + struct acpi_aest_node_interface_64k *interface_64k =
> + (struct acpi_aest_node_interface_64k *)(interface + 1);
> +
> + node->common = &interface_64k->common;
> + node->record_implemented =
> + (unsigned long *)interface_64k->error_record_implemented;
> + node->status_reporting =
> + (unsigned long *)interface_64k->error_status_reporting;
> + node->addressing_mode =
> + (unsigned long *)interface_64k->addressing_mode;
> + break;
> + }
> + default:
> + pr_err("invalid group format: %d\n", interface->group_format);
> + }
> +
> + node->interrupt = ACPI_ADD_PTR(struct acpi_aest_node_interrupt_v2,
> + hdr, hdr->node_interrupt_offset);
> +
> + node->interrupt_count = hdr->node_interrupt_count;
> +}
> +
> +static int __init acpi_aest_init_node_common(struct acpi_aest_hdr *aest_hdr,
> + struct acpi_aest_node *node)
> +{
> + int ret;
> + struct aest_hnode *hnode;
> + u64 error_device_id;
> +
> + aest_init_interface(aest_hdr, node);
> +
> + error_device_id = node->common->error_node_device;
I think I see a problem with this.
From the spec[1], I understand that the error node device is optional and
the error node device field is only valid when the error node device valid flag is set.
[1] https://developer.arm.com/documentation/den0085/latest/
Previous versions worked on systems without an error node device (i.e. systems
without an ARMHE000 definition in the DSDT), but this version doesn't.
Do we need to rely on information from the error node device when a system
has one? I thought the AEST table carries the necessary information in all
cases, and I would like to know why this version uses a different approach from v2.
Also, I wonder whether there could be a system where only some nodes have the valid flag set.
> +
> + hnode = xa_load(aest_array, error_device_id);
> + if (!hnode) {
> + hnode = kmalloc(sizeof(*hnode), GFP_KERNEL);
> + if (!hnode) {
> + ret = -ENOMEM;
> + goto free;
> + }
> + INIT_LIST_HEAD(&hnode->list);
> + hnode->uid = error_device_id;
> + hnode->count = 0;
> + hnode->type = node->type;
> + xa_store(aest_array, error_device_id, hnode, GFP_KERNEL);
> + }
> +
> + list_add_tail(&node->list, &hnode->list);
> + hnode->count++;
> +
> + return 0;
> +
> +free:
> + kfree(node);
> + return ret;
> +}
> +
> +static int __init
> +acpi_aest_init_node_default(struct acpi_aest_hdr *aest_hdr)
> +{
> + struct acpi_aest_node *node;
> +
> + node = kzalloc(sizeof(*node), GFP_KERNEL);
> + if (!node)
> + return -ENOMEM;
> +
> + node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
> + aest_hdr->node_specific_offset);
> +
> + return acpi_aest_init_node_common(aest_hdr, node);
> +}
> +
> +static int __init
> +acpi_aest_init_processor_node(struct acpi_aest_hdr *aest_hdr)
> +{
> + struct acpi_aest_node *node;
> +
> + node = kzalloc(sizeof(*node), GFP_KERNEL);
> + if (!node)
> + return -ENOMEM;
> +
> + node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
> + aest_hdr->node_specific_offset);
> +
> + node->processor_spec_pointer = ACPI_ADD_PTR(void, node->spec_pointer,
> + sizeof(struct acpi_aest_processor));
> +
> + return acpi_aest_init_node_common(aest_hdr, node);
> +}
> +
> +static int __init acpi_aest_init_node(struct acpi_aest_hdr *header)
> +{
> + switch (header->type) {
> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> + return acpi_aest_init_processor_node(header);
> + case ACPI_AEST_VENDOR_ERROR_NODE:
> + case ACPI_AEST_SMMU_ERROR_NODE:
> + case ACPI_AEST_GIC_ERROR_NODE:
> + case ACPI_AEST_PCIE_ERROR_NODE:
> + case ACPI_AEST_PROXY_ERROR_NODE:
> + case ACPI_AEST_MEMORY_ERROR_NODE:
> + return acpi_aest_init_node_default(header);
> + default:
> + pr_err("acpi table header type is invalid: %d\n", header->type);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static int __init acpi_aest_init_nodes(struct acpi_table_header *aest_table)
> +{
> + struct acpi_aest_hdr *aest_node, *aest_end;
> + struct acpi_table_aest *aest;
> + int rc;
> +
> + aest = (struct acpi_table_aest *)aest_table;
> + aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
> + sizeof(struct acpi_table_header));
> + aest_end = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
> + aest_table->length);
> +
> + while (aest_node < aest_end) {
> + if (((u64)aest_node + aest_node->length) > (u64)aest_end) {
> + pr_warn(FW_WARN "AEST node pointer overflow, bad table.\n");
> + return -EINVAL;
> + }
> +
> + rc = acpi_aest_init_node(aest_node);
> + if (rc)
> + return rc;
> +
> + aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest_node,
> + aest_node->length);
> + }
> +
> + return 0;
> +}
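Aside: the loop above is the standard ACPI pattern for walking variable-length subtables with an end-of-table bounds check. A self-contained sketch of the same guard; walk_records and struct rec_hdr are illustrative stand-ins for the acpi_aest_hdr walk, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Variable-length record header, loosely modelled on acpi_aest_hdr. */
struct rec_hdr {
	uint8_t  type;
	uint16_t length;	/* total record length, header included */
};

/* Walk a buffer of variable-length records, refusing any record whose
 * declared length runs past the end of the table -- the same guard
 * acpi_aest_init_nodes() applies.  Returns records seen, or -1. */
static int walk_records(const uint8_t *buf, size_t total)
{
	size_t off = 0;
	int count = 0;

	while (off < total) {
		uint16_t len;

		if (off + sizeof(struct rec_hdr) > total)
			return -1;	/* header itself is truncated */
		memcpy(&len, buf + off + offsetof(struct rec_hdr, length),
		       sizeof(len));
		if (len < sizeof(struct rec_hdr) || off + len > total)
			return -1;	/* malformed or overflowing record */
		count++;
		off += len;
	}
	return count;
}
```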
> +
> +static int
> +acpi_aest_parse_irqs(struct platform_device *pdev, struct acpi_aest_node *anode,
> + struct resource *res, int *res_idx, int irqs[2])
> +{
> + int i;
> + struct acpi_aest_node_interrupt_v2 *interrupt;
> + int trigger, irq;
> +
> + for (i = 0; i < anode->interrupt_count; i++) {
> + interrupt = &anode->interrupt[i];
> + if (irqs[interrupt->type])
> + continue;
> +
> + trigger = (interrupt->flags & AEST_INTERRUPT_MODE) ?
> + ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
> +
> + irq = acpi_register_gsi(&pdev->dev, interrupt->gsiv, trigger,
> + ACPI_ACTIVE_HIGH);
> + if (irq <= 0) {
> + pr_err("failed to map AEST GSI %d\n", interrupt->gsiv);
> + return irq;
> + }
> +
> + res[*res_idx].start = irq;
> + res[*res_idx].end = irq;
> + res[*res_idx].flags = IORESOURCE_IRQ;
> + res[*res_idx].name = interrupt->type ? "eri" : "fhi";
> +
> + (*res_idx)++;
> +
> + irqs[interrupt->type] = irq;
> + }
> +
> + return 0;
> +}
> +
> +static int __init acpi_aest_alloc_pdev(void)
> +{
> + int ret, j, size;
> + struct aest_hnode *ahnode = NULL;
> + unsigned long i;
> + struct platform_device *pdev;
> + struct acpi_device *companion;
> + struct acpi_aest_node *anode;
> + char uid[16];
> + struct resource *res;
> +
> + xa_for_each(aest_array, i, ahnode) {
> + int irq[2] = { 0 };
> +
> + res = kcalloc(ahnode->count + 2, sizeof(*res), GFP_KERNEL);
Why is +2 needed?
> + if (!res) {
> + ret = -ENOMEM;
> + break;
> + }
> +
> + pdev = platform_device_alloc("AEST", i);
> + if (IS_ERR(pdev)) {
> + ret = PTR_ERR(pdev);
> + break;
> + }
> +
> + ret = snprintf(uid, sizeof(uid), "%u", (u32)i);
> + companion = acpi_dev_get_first_match_dev("ARMHE000", uid, -1);
> + if (companion)
> + ACPI_COMPANION_SET(&pdev->dev, companion);
> +
> + j = 0;
> + list_for_each_entry(anode, &ahnode->list, list) {
> + if (anode->interface_hdr->type !=
> + ACPI_AEST_NODE_SYSTEM_REGISTER) {
> + res[j].name = "AEST:RECORD";
> + res[j].start = anode->interface_hdr->address;
> + size = anode->interface_hdr->error_record_count *
> + sizeof(struct ras_ext_regs);
> + res[j].end = res[j].start + size;
> + res[j].flags = IORESOURCE_MEM;
Will these fields be overwritten by acpi_aest_parse_irqs() below?
> + }
> +
> + ret = acpi_aest_parse_irqs(pdev, anode, res, &j, irq);
> + if (ret) {
> + platform_device_put(pdev);
> + break;
> + }
> + }
> +
> + ret = platform_device_add_resources(pdev, res, j);
> + if (ret)
> + break;
> +
> + ret = platform_device_add_data(pdev, &ahnode, sizeof(ahnode));
> + if (ret)
> + break;
> +
> + ret = platform_device_add(pdev);
> + if (ret)
> + break;
> + }
> +
> + kfree(res);
> + if (ret)
> + platform_device_put(pdev);
> +
> + return ret;
> +}
> +
> +void __init acpi_aest_init(void)
> +{
> + acpi_status status;
> + int ret;
> + struct acpi_table_header *aest_table;
> +
> + status = acpi_get_table(ACPI_SIG_AEST, 0, &aest_table);
> + if (ACPI_FAILURE(status)) {
> + if (status != AE_NOT_FOUND) {
> + const char *msg = acpi_format_exception(status);
> +
> + pr_err("Failed to get table, %s\n", msg);
> + }
> +
> + return;
> + }
> +
> + aest_array = kzalloc(sizeof(struct xarray), GFP_KERNEL);
> + xa_init(aest_array);
> +
> + ret = acpi_aest_init_nodes(aest_table);
> + if (ret) {
> + pr_err("Failed init aest node %d\n", ret);
> + goto out;
> + }
> +
> + ret = acpi_aest_alloc_pdev();
> + if (ret)
> + pr_err("Failed alloc pdev %d\n", ret);
> +
> +out:
> + acpi_put_table(aest_table);
> +}
> diff --git a/drivers/acpi/arm64/init.c b/drivers/acpi/arm64/init.c
> index 7a47d8095a7d..b0c768923831 100644
> --- a/drivers/acpi/arm64/init.c
> +++ b/drivers/acpi/arm64/init.c
> @@ -12,4 +12,6 @@ void __init acpi_arch_init(void)
> acpi_iort_init();
> if (IS_ENABLED(CONFIG_ARM_AMBA))
> acpi_amba_init();
> + if (IS_ENABLED(CONFIG_ACPI_AEST))
> + acpi_aest_init();
> }
> diff --git a/drivers/acpi/arm64/init.h b/drivers/acpi/arm64/init.h
> index dcc277977194..3902d1676068 100644
> --- a/drivers/acpi/arm64/init.h
> +++ b/drivers/acpi/arm64/init.h
> @@ -5,3 +5,4 @@ void __init acpi_agdi_init(void);
> void __init acpi_apmt_init(void);
> void __init acpi_iort_init(void);
> void __init acpi_amba_init(void);
> +void __init acpi_aest_init(void);
> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
> index fc4f4bb94a4c..61a2a05d9c94 100644
> --- a/drivers/ras/Kconfig
> +++ b/drivers/ras/Kconfig
> @@ -33,6 +33,7 @@ if RAS
>
> source "arch/x86/ras/Kconfig"
> source "drivers/ras/amd/atl/Kconfig"
> +source "drivers/ras/aest/Kconfig"
>
> config RAS_FMPM
> tristate "FRU Memory Poison Manager"
> diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
> index 11f95d59d397..72411ee9deaf 100644
> --- a/drivers/ras/Makefile
> +++ b/drivers/ras/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_RAS_CEC) += cec.o
>
> obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
> obj-y += amd/atl/
> +obj-y += aest/
> diff --git a/drivers/ras/aest/Kconfig b/drivers/ras/aest/Kconfig
> new file mode 100644
> index 000000000000..6d436d911bea
> --- /dev/null
> +++ b/drivers/ras/aest/Kconfig
> @@ -0,0 +1,17 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# ARM Error Source Table Support
> +#
> +# Copyright (c) 2024, Alibaba Group.
> +#
> +
> +config AEST
> + tristate "ARM AEST Driver"
> + depends on ACPI_AEST && RAS
> + help
> + The Arm Error Source Table (AEST) provides details on ACPI
> + extensions that enable kernel-first handling of errors in a
> + system that supports the Armv8 RAS extensions.
> +
> + If set, the kernel will report and log hardware errors.
> diff --git a/drivers/ras/aest/Makefile b/drivers/ras/aest/Makefile
> new file mode 100644
> index 000000000000..a6ba7e36fb43
> --- /dev/null
> +++ b/drivers/ras/aest/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +obj-$(CONFIG_AEST) += aest.o
> +
> +aest-y := aest-core.o
> diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
> new file mode 100644
> index 000000000000..060a1eedee0a
> --- /dev/null
> +++ b/drivers/ras/aest/aest-core.c
> @@ -0,0 +1,976 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ARM Error Source Table Support
> + *
> + * Copyright (c) 2021-2024, Alibaba Group.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/panic.h>
> +#include <linux/platform_device.h>
> +#include <linux/xarray.h>
> +#include <linux/cpuhotplug.h>
> +#include <linux/genalloc.h>
> +#include <linux/ras.h>
> +
> +#include "aest.h"
> +
> +DEFINE_PER_CPU(struct aest_device, percpu_adev);
> +
> +#undef pr_fmt
> +#define pr_fmt(fmt) "AEST: " fmt
> +
> +/*
> + * This memory pool is only to be used to save AEST events in AEST irq
> + * context. There can be at most 500 outstanding events.
> + */
> +#define AEST_NODE_ALLOCED_MAX 500
> +
> +#define AEST_LOG_PREFIX_BUFFER 64
> +
> +BLOCKING_NOTIFIER_HEAD(aest_decoder_chain);
> +
> +static void aest_print(struct aest_event *event)
> +{
> + static atomic_t seqno = { 0 };
> + unsigned int curr_seqno;
> + char pfx_seq[AEST_LOG_PREFIX_BUFFER];
> + int index;
> + struct ras_ext_regs *regs;
> +
> + curr_seqno = atomic_inc_return(&seqno);
> + snprintf(pfx_seq, sizeof(pfx_seq), "{%u}" HW_ERR, curr_seqno);
> + pr_info("%sHardware error from AEST %s\n", pfx_seq, event->node_name);
> +
> + switch (event->type) {
> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> + pr_err("%s Error from CPU%d\n", pfx_seq, event->id0);
> + break;
> + case ACPI_AEST_MEMORY_ERROR_NODE:
> + pr_err("%s Error from memory at SRAT proximity domain %#x\n",
> + pfx_seq, event->id0);
> + break;
> + case ACPI_AEST_SMMU_ERROR_NODE:
> + pr_err("%s Error from SMMU IORT node %#x subcomponent %#x\n",
> + pfx_seq, event->id0, event->id1);
> + break;
> + case ACPI_AEST_VENDOR_ERROR_NODE:
> + pr_err("%s Error from vendor hid %8.8s uid %#x\n",
> + pfx_seq, event->hid, event->id1);
> + break;
> + case ACPI_AEST_GIC_ERROR_NODE:
> + pr_err("%s Error from GIC type %#x instance %#x\n",
> + pfx_seq, event->id0, event->id1);
> + break;
> + default:
> + pr_err("%s Unknown AEST node type\n", pfx_seq);
> + return;
> + }
> +
> + index = event->index;
> + regs = &event->regs;
> +
> + pr_err("%s ERR%dFR: 0x%llx\n", pfx_seq, index, regs->err_fr);
> + pr_err("%s ERR%dCTLR: 0x%llx\n", pfx_seq, index, regs->err_ctlr);
> + pr_err("%s ERR%dSTATUS: 0x%llx\n", pfx_seq, index, regs->err_status);
> + if (regs->err_status & ERR_STATUS_AV)
> + pr_err("%s ERR%dADDR: 0x%llx\n", pfx_seq, index,
> + regs->err_addr);
> +
> + if (regs->err_status & ERR_STATUS_MV) {
> + pr_err("%s ERR%dMISC0: 0x%llx\n", pfx_seq, index,
> + regs->err_misc[0]);
> + pr_err("%s ERR%dMISC1: 0x%llx\n", pfx_seq, index,
> + regs->err_misc[1]);
> + pr_err("%s ERR%dMISC2: 0x%llx\n", pfx_seq, index,
> + regs->err_misc[2]);
> + pr_err("%s ERR%dMISC3: 0x%llx\n", pfx_seq, index,
> + regs->err_misc[3]);
> + }
> +}
> +
> +static void aest_handle_memory_failure(u64 addr)
> +{
> + unsigned long pfn;
> +
> + pfn = PHYS_PFN(addr);
> +
> + if (!pfn_valid(pfn)) {
> + pr_warn(HW_ERR "Invalid physical address: %#llx\n", addr);
> + return;
> + }
> +
> +#ifdef CONFIG_MEMORY_FAILURE
> + memory_failure(pfn, 0);
> +#endif
> +}
> +
> +static void init_aest_event(struct aest_event *event, struct aest_record *record,
> + struct ras_ext_regs *regs)
> +{
> + struct aest_node *node = record->node;
> + struct acpi_aest_node *info = node->info;
> +
> + event->type = node->type;
> + event->node_name = node->name;
> + switch (node->type) {
> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> + if (info->processor->flags & (ACPI_AEST_PROC_FLAG_SHARED |
> + ACPI_AEST_PROC_FLAG_GLOBAL))
> + event->id0 = smp_processor_id();
In the "else" case, the ACPI processor ID will be set for id0. So, how
about using get_acpi_id_for_cpu(smp_processor_id()) here for consistency?
> + else
> + event->id0 = info->processor->processor_id;
> +
> + event->id1 = info->processor->resource_type;
> + break;
> + case ACPI_AEST_MEMORY_ERROR_NODE:
> + event->id0 = info->memory->srat_proximity_domain;
> + break;
> + case ACPI_AEST_SMMU_ERROR_NODE:
> + event->id0 = info->smmu->iort_node_reference;
> + event->id1 = info->smmu->subcomponent_reference;
> + break;
> + case ACPI_AEST_VENDOR_ERROR_NODE:
> + event->id0 = 0;
> + event->id1 = info->vendor->acpi_uid;
> + event->hid = info->vendor->acpi_hid;
> + break;
> + case ACPI_AEST_GIC_ERROR_NODE:
> + event->id0 = info->gic->interface_type;
> + event->id1 = info->gic->instance_id;
> + break;
> + default:
> + event->id0 = 0;
> + event->id1 = 0;
> + }
> +
> + memcpy(&event->regs, regs, sizeof(*regs));
> + event->index = record->index;
> + event->addressing_mode = record->addressing_mode;
> +}
> +
> +static int
> +aest_node_gen_pool_add(struct aest_device *adev, struct aest_record *record,
> + struct ras_ext_regs *regs)
> +{
> + struct aest_event *event;
> +
> + if (!adev->pool)
> + return -EINVAL;
> +
> + event = (void *)gen_pool_alloc(adev->pool, sizeof(*event));
> + if (!event)
> + return -ENOMEM;
> +
> + init_aest_event(event, record, regs);
> + llist_add(&event->llnode, &adev->event_list);
> +
> + return 0;
> +}
> +
> +static void aest_log(struct aest_record *record, struct ras_ext_regs *regs)
> +{
> + struct aest_device *adev = record->node->adev;
> +
> + if (!aest_node_gen_pool_add(adev, record, regs))
> + schedule_work(&adev->aest_work);
> +}
> +
> +void aest_register_decode_chain(struct notifier_block *nb)
> +{
> + blocking_notifier_chain_register(&aest_decoder_chain, nb);
> +}
> +EXPORT_SYMBOL_GPL(aest_register_decode_chain);
> +
> +void aest_unregister_decode_chain(struct notifier_block *nb)
> +{
> + blocking_notifier_chain_unregister(&aest_decoder_chain, nb);
> +}
> +EXPORT_SYMBOL_GPL(aest_unregister_decode_chain);
> +
> +static void aest_node_pool_process(struct work_struct *work)
> +{
> + struct llist_node *head;
> + struct aest_event *event;
> + struct aest_device *adev = container_of(work, struct aest_device,
> + aest_work);
> + u64 status, addr;
> +
> + head = llist_del_all(&adev->event_list);
> + if (!head)
> + return;
> +
> + head = llist_reverse_order(head);
> + llist_for_each_entry(event, head, llnode) {
> + aest_print(event);
> +
> + /* TODO: translate Logical Addresses to System Physical Addresses */
> + if (event->addressing_mode == AEST_ADDREESS_LA ||
> + (event->regs.err_addr & ERR_ADDR_AI)) {
> + pr_notice("Cannot translate LA to SPA\n");
> + addr = 0;
> + } else {
> + addr = event->regs.err_addr & GENMASK(CONFIG_ARM64_PA_BITS - 1, 0);
> + }
> +
> + status = event->regs.err_status;
> + if (addr && ((status & ERR_STATUS_UE) || (status & ERR_STATUS_DE)))
> + aest_handle_memory_failure(addr);
> +
> + blocking_notifier_call_chain(&aest_decoder_chain, 0, event);
> + gen_pool_free(adev->pool, (unsigned long)event,
> + sizeof(*event));
> + }
> +}
> +
> +static int aest_node_pool_init(struct aest_device *adev)
> +{
> + unsigned long addr, size;
> + int order;
> +
> + order = ilog2(sizeof(struct aest_event));
> + adev->pool = devm_gen_pool_create(adev->dev, order, -1,
> + dev_name(adev->dev));
> + if (!adev->pool)
> + return -ENOMEM;
> +
> + size = PAGE_ALIGN(sizeof(struct aest_event) * AEST_NODE_ALLOCED_MAX);
> + addr = (unsigned long)devm_kzalloc(adev->dev, size, GFP_KERNEL);
> + if (!addr)
> + return -ENOMEM;
> +
> + return gen_pool_add(adev->pool, addr, size, -1);
> +}
> +
> +static void aest_panic(struct aest_record *record, struct ras_ext_regs *regs, char *msg)
> +{
> + struct aest_event event = { 0 };
> +
> + init_aest_event(&event, record, regs);
> +
> + aest_print(&event);
> +
> + panic(msg);
> +}
> +
> +static void aest_proc_record(struct aest_record *record, void *data)
> +{
> + struct ras_ext_regs regs = {0};
> + int *count = data;
> +
> + regs.err_status = record_read(record, ERXSTATUS);
> + if (!(regs.err_status & ERR_STATUS_V))
> + return;
> +
> + (*count)++;
> +
> + if (regs.err_status & ERR_STATUS_AV)
> + regs.err_addr = record_read(record, ERXADDR);
> +
> + regs.err_fr = record->fr;
> + regs.err_ctlr = record_read(record, ERXCTLR);
> +
> + if (regs.err_status & ERR_STATUS_MV) {
> + regs.err_misc[0] = record_read(record, ERXMISC0);
> + regs.err_misc[1] = record_read(record, ERXMISC1);
> + if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
> + regs.err_misc[2] = record_read(record, ERXMISC2);
> + regs.err_misc[3] = record_read(record, ERXMISC3);
> + }
> +
> + if (record->node->info->interface_hdr->flags &
> + AEST_XFACE_FLAG_CLEAR_MISC) {
> + record_write(record, ERXMISC0, 0);
> + record_write(record, ERXMISC1, 0);
> + if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
> + record_write(record, ERXMISC2, 0);
> + record_write(record, ERXMISC3, 0);
> + }
> + /* CE count is 0 if the record does not support CE */
> + } else if (record->ce.count > 0)
> + record_write(record, ERXMISC0, record->ce.reg_val);
> + }
> +
> + /* panic if unrecoverable and uncontainable error encountered */
> + if ((regs.err_status & ERR_STATUS_UE) &&
> + (regs.err_status & ERR_STATUS_UET) > ERR_STATUS_UET_UEU)
> + aest_panic(record, ®s, "AEST: unrecoverable error encountered");
I think we need to use FIELD_GET to get the correct value:
u64 ue = FIELD_GET(ERR_STATUS_UET, regs.err_status);
if ((regs.err_status & ERR_STATUS_UE) &&
(ue == ERR_STATUS_UET_UC || ue == ERR_STATUS_UET_UEU))
> +
> + aest_log(record, ®s);
> +
> + /* Write-one-to-clear the bits we've seen */
> + regs.err_status &= ERR_STATUS_W1TC;
> +
> + /* Multi-bit fields need all-ones written to clear. */
> + if (regs.err_status & ERR_STATUS_CE)
> + regs.err_status |= ERR_STATUS_CE;
> +
> + /* Multi-bit fields need all-ones written to clear. */
> + if (regs.err_status & ERR_STATUS_UET)
> + regs.err_status |= ERR_STATUS_UET;
> +
> + record_write(record, ERXSTATUS, regs.err_status);
> +}
> +
> +static void
> +aest_node_foreach_record(void (*func)(struct aest_record *, void *),
> + struct aest_node *node, void *data,
> + unsigned long *bitmap)
> +{
> + int i;
> +
> + for_each_clear_bit(i, bitmap, node->record_count) {
> + aest_select_record(node, i);
> +
> + func(&node->records[i], data);
> +
> + aest_sync(node);
> + }
> +}
> +
> +static int aest_proc(struct aest_node *node)
> +{
> + int count = 0, i, j, size = node->record_count;
> + u64 err_group = 0;
> +
> + aest_node_dbg(node, "Poll bit %*pb\n", size, node->record_implemented);
> + aest_node_foreach_record(aest_proc_record, node, &count,
> + node->record_implemented);
> +
> + if (!node->errgsr)
> + return count;
> +
> + aest_node_dbg(node, "Report bit %*pb\n", size, node->status_reporting);
> + for (i = 0; i < BITS_TO_U64(size); i++) {
> + err_group = readq_relaxed((void *)node->errgsr + i * 8);
> + aest_node_dbg(node, "errgsr[%d]: 0x%llx\n", i, err_group);
> +
> + for_each_set_bit(j, (unsigned long *)&err_group,
> + BITS_PER_TYPE(u64)) {
> + /*
> + * Error group base is only valid for Memory Mapped
> + * nodes, so the driver does not need to write the
> + * select register and sync.
> + */
> + if (test_bit(i * BITS_PER_TYPE(u64) + j, node->status_reporting))
> + continue;
> + aest_proc_record(&node->records[j], &count);
> + }
> + }
> +
> + return count;
> +}
> +
> +static irqreturn_t aest_irq_func(int irq, void *input)
> +{
> + struct aest_device *adev = input;
> + int i;
> +
> + for (i = 0; i < adev->node_cnt; i++)
> + aest_proc(&adev->nodes[i]);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static void aest_enable_irq(struct aest_record *record)
> +{
> + u64 err_ctlr;
> + struct aest_device *adev = record->node->adev;
> +
> + err_ctlr = record_read(record, ERXCTLR);
> +
> + if (adev->irq[ACPI_AEST_NODE_FAULT_HANDLING])
> + err_ctlr |= (ERR_CTLR_FI | ERR_CTLR_CFI);
> + if (adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY])
> + err_ctlr |= ERR_CTLR_UI;
> +
> + record_write(record, ERXCTLR, err_ctlr);
> +}
> +
> +static void aest_config_irq(struct aest_node *node)
> +{
> + int i;
> + struct acpi_aest_node_interrupt_v2 *interrupt;
> +
> + if (!node->irq_config)
> + return;
> +
> + for (i = 0; i < node->info->interrupt_count; i++) {
> + interrupt = &node->info->interrupt[i];
> +
> + if (interrupt->type == ACPI_AEST_NODE_FAULT_HANDLING)
> + writeq_relaxed(interrupt->gsiv, node->irq_config);
> +
> + if (interrupt->type == ACPI_AEST_NODE_ERROR_RECOVERY)
> + writeq_relaxed(interrupt->gsiv, node->irq_config + 8);
> +
> + aest_node_dbg(node, "config irq type %d gsiv %d at %llx",
> + interrupt->type, interrupt->gsiv,
> + (u64)node->irq_config);
> + }
> +}
> +
> +static enum ras_ce_threshold aest_get_ce_threshold(struct aest_record *record)
> +{
> + u64 err_fr, err_fr_cec, err_fr_rp = -1;
> +
> + err_fr = record->fr;
> + err_fr_cec = FIELD_GET(ERR_FR_CEC, err_fr);
> + err_fr_rp = FIELD_GET(ERR_FR_RP, err_fr);
> +
> + if (err_fr_cec == ERR_FR_CEC_0B_COUNTER)
> + return RAS_CE_THRESHOLD_0B;
> + else if (err_fr_rp == ERR_FR_RP_DOUBLE_COUNTER)
> + return RAS_CE_THRESHOLD_32B;
> + else if (err_fr_cec == ERR_FR_CEC_8B_COUNTER)
> + return RAS_CE_THRESHOLD_8B;
> + else if (err_fr_cec == ERR_FR_CEC_16B_COUNTER)
> + return RAS_CE_THRESHOLD_16B;
> + else
> + return UNKNOWN;
> +
> +}
> +
> +static const struct ce_threshold_info ce_info[] = {
> + [RAS_CE_THRESHOLD_0B] = { 0 },
> + [RAS_CE_THRESHOLD_8B] = {
> + .max_count = ERR_8B_CEC_MAX,
> + .mask = ERR_MISC0_8B_CEC,
> + .shift = ERR_MISC0_CEC_SHIFT,
> + },
> + [RAS_CE_THRESHOLD_16B] = {
> + .max_count = ERR_16B_CEC_MAX,
> + .mask = ERR_MISC0_16B_CEC,
> + .shift = ERR_MISC0_CEC_SHIFT,
> + },
> + /* TODO: Support 32B CEC threshold. */
> + [RAS_CE_THRESHOLD_32B] = { 0 },
> +};
> +
> +static void aest_set_ce_threshold(struct aest_record *record)
> +{
> + u64 err_misc0, ce_count;
> + struct ce_threshold *ce = &record->ce;
> + const struct ce_threshold_info *info;
> +
> + record->threshold_type = aest_get_ce_threshold(record);
> +
> + switch (record->threshold_type) {
> + case RAS_CE_THRESHOLD_0B:
> + aest_record_dbg(record, "does not support CE threshold!\n");
> + return;
> + case RAS_CE_THRESHOLD_8B:
> + aest_record_dbg(record, "supports 8 bit CE threshold!\n");
> + break;
> + case RAS_CE_THRESHOLD_16B:
> + aest_record_dbg(record, "supports 16 bit CE threshold!\n");
> + break;
> + case RAS_CE_THRESHOLD_32B:
> + aest_record_dbg(record, "32 bit CE threshold not supported yet!\n");
> + break;
> + default:
> + aest_record_dbg(record, "Unknown misc0 CE threshold!\n");
> + }
> +
> + err_misc0 = record_read(record, ERXMISC0);
> + info = &ce_info[record->threshold_type];
> + ce->info = info;
> + ce_count = (err_misc0 & info->mask) >> info->shift;
> + if (ce_count) {
> + ce->count = ce_count;
> + ce->threshold = info->max_count - ce_count + 1;
> + ce->reg_val = err_misc0;
> + aest_record_dbg(record, "CE threshold is %llx, controlled by FW",
> + ce->threshold);
> + return;
> + }
> +
> + /* Default CE threshold is 1. */
> + ce->count = info->max_count;
> + ce->threshold = DEFAULT_CE_THRESHOLD;
> + ce->reg_val = err_misc0 | info->mask;
> +
> + record_write(record, ERXMISC0, ce->reg_val);
> + aest_record_dbg(record, "CE threshold is %llx, controlled by Kernel",
> + ce->threshold);
> +}
> +
> +static int aest_register_irq(struct aest_device *adev)
> +{
> + int i, irq, ret;
> + char *irq_desc;
> +
> + irq_desc = devm_kasprintf(adev->dev, GFP_KERNEL, "%s.%s.",
> + dev_driver_string(adev->dev),
> + dev_name(adev->dev));
> + if (!irq_desc)
> + return -ENOMEM;
> +
> + for (i = 0; i < MAX_GSI_PER_NODE; i++) {
> + irq = adev->irq[i];
> +
> + if (!irq)
> + continue;
> +
> + if (irq_is_percpu_devid(irq)) {
> + ret = request_percpu_irq(irq, aest_irq_func,
> + irq_desc,
> + adev->adev_oncore);
> + if (ret)
> + goto free;
> + } else {
> + ret = devm_request_irq(adev->dev, irq, aest_irq_func,
> + 0, irq_desc, adev);
> + if (ret)
> + return ret;
> + }
> + }
> + return 0;
> +
> +free:
> + for (i--; i >= 0; i--) {
> + irq = adev->irq[i];
> +
> + if (irq_is_percpu_devid(irq))
> + free_percpu_irq(irq, adev->adev_oncore);
> + }
> +
> + return ret;
> +}
> +
> +static int
> +aest_init_record(struct aest_record *record, int i, struct aest_node *node)
> +{
> + struct device *dev = node->adev->dev;
> +
> + record->name = devm_kasprintf(dev, GFP_KERNEL, "record%d", i);
> + if (!record->name)
> + return -ENOMEM;
> +
> + if (node->base)
> + record->regs_base = node->base + sizeof(struct ras_ext_regs) * i;
> +
> + record->access = &aest_access[node->info->interface_hdr->type];
> + record->addressing_mode = test_bit(i, node->info->addressing_mode);
> + record->index = i;
> + record->node = node;
> + record->fr = record_read(record, ERXFR);
> +
> + return 0;
> +}
> +
> +static void aest_online_record(struct aest_record *record, void *data)
> +{
> + if (record->fr & ERR_FR_CE)
> + aest_set_ce_threshold(record);
> +
> + aest_enable_irq(record);
> +}
> +
> +static void aest_online_oncore_node(struct aest_node *node)
> +{
> + int count;
> +
> + count = aest_proc(node);
> + aest_node_dbg(node, "Find %d error on CPU%d before AEST probe\n",
> + count, smp_processor_id());
> +
> + aest_node_foreach_record(aest_online_record, node, NULL,
> + node->record_implemented);
> +
> + aest_node_foreach_record(aest_online_record, node, NULL,
> + node->status_reporting);
> +}
> +
> +static void aest_online_oncore_dev(void *data)
> +{
> + int fhi_irq, eri_irq, i;
> + struct aest_device *adev = this_cpu_ptr(data);
> +
> + for (i = 0; i < adev->node_cnt; i++)
> + aest_online_oncore_node(&adev->nodes[i]);
> +
> + fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
> + if (fhi_irq > 0)
> + enable_percpu_irq(fhi_irq, IRQ_TYPE_NONE);
> + eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
> + if (eri_irq > 0)
> + enable_percpu_irq(eri_irq, IRQ_TYPE_NONE);
> +}
> +
> +static void aest_offline_oncore_dev(void *data)
> +{
> + int fhi_irq, eri_irq;
> + struct aest_device *adev = this_cpu_ptr(data);
> +
> + fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
> + if (fhi_irq > 0)
> + disable_percpu_irq(fhi_irq);
> + eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
> + if (eri_irq > 0)
> + disable_percpu_irq(eri_irq);
> +}
> +
> +static void aest_online_dev(struct aest_device *adev)
> +{
> + int count, i;
> + struct aest_node *node;
> +
> + for (i = 0; i < adev->node_cnt; i++) {
> + node = &adev->nodes[i];
> +
> + if (!node->name)
> + continue;
> +
> + count = aest_proc(node);
> + aest_node_dbg(node, "Find %d error before AEST probe\n", count);
> +
> + aest_config_irq(node);
> +
> + aest_node_foreach_record(aest_online_record, node, NULL,
> + node->record_implemented);
> + aest_node_foreach_record(aest_online_record, node, NULL,
> + node->status_reporting);
> + }
> +}
> +
> +static int aest_starting_cpu(unsigned int cpu)
> +{
> + pr_debug("CPU%d starting\n", cpu);
> + aest_online_oncore_dev(&percpu_adev);
> +
> + return 0;
> +}
> +
> +static int aest_dying_cpu(unsigned int cpu)
> +{
> + pr_debug("CPU%d dying\n", cpu);
> + aest_offline_oncore_dev(&percpu_adev);
> +
> + return 0;
> +}
> +
> +static void aest_device_remove(struct platform_device *pdev)
> +{
> + struct aest_device *adev = platform_get_drvdata(pdev);
> + int i;
> +
> + platform_set_drvdata(pdev, NULL);
> +
> + if (adev->type != ACPI_AEST_PROCESSOR_ERROR_NODE)
> + return;
> +
> + on_each_cpu(aest_offline_oncore_dev, adev->adev_oncore, 1);
> +
> + for (i = 0; i < MAX_GSI_PER_NODE; i++) {
> + if (adev->irq[i])
> + free_percpu_irq(adev->irq[i], adev->adev_oncore);
> + }
> +}
> +
> +static int get_aest_node_ver(struct aest_node *node)
> +{
> + u64 reg;
> + void *devarch_base;
> +
> + if (node->type == ACPI_AEST_GIC_ERROR_NODE) {
> + devarch_base = ioremap(node->info->interface_hdr->address +
> + GIC_ERRDEVARCH, PAGE_SIZE);
> + if (!devarch_base)
> + return 0;
> +
> + reg = readl_relaxed(devarch_base);
> + iounmap(devarch_base);
> +
> + return FIELD_GET(ERRDEVARCH_REV, reg);
> + }
> +
> + return FIELD_GET(ID_AA64PFR0_EL1_RAS_MASK, read_cpuid(ID_AA64PFR0_EL1));
> +}
> +
> +static char *alloc_aest_node_name(struct aest_node *node)
> +{
> + char *name;
> +
> + switch (node->type) {
> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%d",
> + aest_node_name[node->type],
> + node->info->processor->processor_id);
> + break;
> + case ACPI_AEST_MEMORY_ERROR_NODE:
> + case ACPI_AEST_SMMU_ERROR_NODE:
> + case ACPI_AEST_VENDOR_ERROR_NODE:
> + case ACPI_AEST_GIC_ERROR_NODE:
> + case ACPI_AEST_PCIE_ERROR_NODE:
> + case ACPI_AEST_PROXY_ERROR_NODE:
> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%llx",
> + aest_node_name[node->type],
> + node->info->interface_hdr->address);
> + break;
> + default:
> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "Unknown");
> + }
> +
> + return name;
> +}
> +
> +static int
> +aest_node_set_errgsr(struct aest_device *adev, struct aest_node *node)
> +{
> + struct acpi_aest_node *anode = node->info;
> + u64 errgsr_base = anode->common->error_group_register_base;
> +
> + if (anode->interface_hdr->type != ACPI_AEST_NODE_MEMORY_MAPPED)
> + return 0;
> +
> + if (!node->base)
> + return 0;
> +
> + if (!(anode->interface_hdr->flags & AEST_XFACE_FLAG_ERROR_GROUP)) {
> + node->errgsr = node->base + ERXGROUP;
> + return 0;
> + }
> +
> + if (!errgsr_base)
> + return -EINVAL;
> +
> + node->errgsr = devm_ioremap(adev->dev, errgsr_base, PAGE_SIZE);
> + if (!node->errgsr)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static int aest_init_node(struct aest_device *adev, struct aest_node *node,
> + struct acpi_aest_node *anode)
> +{
> + int i, ret;
> + u64 address, size, flags;
> +
> + node->adev = adev;
> + node->info = anode;
> + node->type = anode->type;
> + node->version = get_aest_node_ver(node);
> + node->name = alloc_aest_node_name(node);
> + if (!node->name)
> + return -ENOMEM;
> + node->record_implemented = anode->record_implemented;
> + node->status_reporting = anode->status_reporting;
> +
> + address = anode->interface_hdr->address;
> + size = anode->interface_hdr->error_record_count *
> + sizeof(struct ras_ext_regs);
> + if (address) {
> + node->base = devm_ioremap(adev->dev, address, size);
> + if (!node->base)
> + return -ENOMEM;
> + }
> +
> + flags = anode->interface_hdr->flags;
> + address = node->info->common->fault_inject_register_base;
> + if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
> + node->inj = devm_ioremap(adev->dev, address, PAGE_SIZE);
> + if (!node->inj)
> + return -ENOMEM;
> + }
> +
> + address = node->info->common->interrupt_config_register_base;
> + if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
> + node->irq_config = devm_ioremap(adev->dev, address, PAGE_SIZE);
> + if (!node->irq_config)
> + return -ENOMEM;
> + }
> +
> + ret = aest_node_set_errgsr(adev, node);
> + if (ret)
> + return ret;
> +
> + node->record_count = anode->interface_hdr->error_record_count;
> + node->records = devm_kcalloc(adev->dev, node->record_count,
> + sizeof(struct aest_record), GFP_KERNEL);
> + if (!node->records)
> + return -ENOMEM;
> +
> + for (i = 0; i < node->record_count; i++) {
> + ret = aest_init_record(&node->records[i], i, node);
> + if (ret)
> + return ret;
> + }
> + aest_node_dbg(node, "%d records, base: %llx, errgsr: %llx\n",
> + node->record_count, (u64)node->base, (u64)node->errgsr);
> + return 0;
> +}
> +
> +static int
> +aest_init_nodes(struct aest_device *adev, struct aest_hnode *ahnode)
> +{
> + struct acpi_aest_node *anode;
> + struct aest_node *node;
> + int ret, i = 0;
> +
> + adev->node_cnt = ahnode->count;
> + adev->nodes = devm_kcalloc(adev->dev, adev->node_cnt,
> + sizeof(struct aest_node), GFP_KERNEL);
> + if (!adev->nodes)
> + return -ENOMEM;
> +
> + list_for_each_entry(anode, &ahnode->list, list) {
> + adev->type = anode->type;
> +
> + node = &adev->nodes[i++];
> + ret = aest_init_node(adev, node, anode);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int __setup_ppi(struct aest_device *adev)
> +{
> + int cpu, i;
> + struct aest_device *oncore_adev;
> + struct aest_node *oncore_node;
> + size_t size;
> +
> + adev->adev_oncore = &percpu_adev;
> + for_each_possible_cpu(cpu) {
> + oncore_adev = per_cpu_ptr(&percpu_adev, cpu);
> + memcpy(oncore_adev, adev, sizeof(struct aest_device));
> +
> + oncore_adev->nodes = devm_kcalloc(adev->dev,
> + oncore_adev->node_cnt,
> + sizeof(struct aest_node),
> + GFP_KERNEL);
> + if (!oncore_adev->nodes)
> + return -ENOMEM;
> +
> + size = adev->node_cnt * sizeof(struct aest_node);
> + memcpy(oncore_adev->nodes, adev->nodes, size);
> + for (i = 0; i < oncore_adev->node_cnt; i++) {
> + oncore_node = &oncore_adev->nodes[i];
> + oncore_node->records = devm_kcalloc(adev->dev,
> + oncore_node->record_count,
> + sizeof(struct aest_record), GFP_KERNEL);
> + if (!oncore_node->records)
> + return -ENOMEM;
> +
> + size = oncore_node->record_count *
> + sizeof(struct aest_record);
> + memcpy(oncore_node->records, adev->nodes[i].records,
> + size);
> + }
> +
> + aest_dev_dbg(adev, "Init device on CPU%d.\n", cpu);
> + }
> +
> + return 0;
> +}
> +
> +static int aest_setup_irq(struct platform_device *pdev, struct aest_device *adev)
> +{
> + int fhi_irq, eri_irq;
> +
> + fhi_irq = platform_get_irq_byname_optional(pdev, "fhi");
> + if (fhi_irq > 0)
> + adev->irq[0] = fhi_irq;
> +
> + eri_irq = platform_get_irq_byname_optional(pdev, "eri");
> + if (eri_irq > 0)
> + adev->irq[1] = eri_irq;
> +
> + /* Allocate and initialise the percpu device pointer for PPI */
> + if (irq_is_percpu(fhi_irq) || irq_is_percpu(eri_irq))
> + return __setup_ppi(adev);
> +
> + return 0;
> +}
> +
> +static int aest_device_probe(struct platform_device *pdev)
> +{
> + int ret;
> + struct aest_device *adev;
> + struct aest_hnode *ahnode;
> +
> + ahnode = *((struct aest_hnode **)pdev->dev.platform_data);
> + if (!ahnode)
> + return -ENODEV;
> +
> + adev = devm_kzalloc(&pdev->dev, sizeof(*adev), GFP_KERNEL);
> + if (!adev)
> + return -ENOMEM;
> +
> + adev->dev = &pdev->dev;
> + INIT_WORK(&adev->aest_work, aest_node_pool_process);
> + ret = aest_node_pool_init(adev);
> + if (ret) {
> + aest_dev_err(adev, "Failed init aest node pool.\n");
> + return ret;
> + }
> + init_llist_head(&adev->event_list);
> + adev->uid = ahnode->uid;
> + aest_set_name(adev, ahnode);
> +
> + ret = aest_init_nodes(adev, ahnode);
> + if (ret)
> + return ret;
> +
> + ret = aest_setup_irq(pdev, adev);
> + if (ret)
> + return ret;
> +
> + ret = aest_register_irq(adev);
> + if (ret) {
> + aest_dev_err(adev, "register irq failed\n");
> + return ret;
> + }
> +
> + platform_set_drvdata(pdev, adev);
> +
> + if (aest_dev_is_oncore(adev))
> + ret = cpuhp_setup_state(CPUHP_AP_ARM_AEST_STARTING,
> + "drivers/acpi/arm64/aest:starting",
> + aest_starting_cpu, aest_dying_cpu);
> + else
> + aest_online_dev(adev);
> + if (ret)
> + return ret;
> +
> + aest_dev_dbg(adev, "Node cnt: %x, uid: %x, irq: %d, %d\n",
> + adev->node_cnt, adev->uid, adev->irq[0], adev->irq[1]);
> +
> + return 0;
> +}
> +
> +static const struct acpi_device_id acpi_aest_ids[] = {
> + {"ARMHE000", 0},
> + {}
> +};
My understanding is that a platform device named "AEST" is created in
acpi_aest_alloc_pdev, and that name is then used to bind this driver to
the device. So, do we need the ACPI HID definition here? Matching by
name should work for systems both with and without ARMHE000. Or am I
missing something?

I have not yet finished looking at all the parts; I will review them and
the other patches too.

Best Regards,
Tomohiro Misono
> +
> +static struct platform_driver aest_driver = {
> + .driver = {
> + .name = "AEST",
> + .acpi_match_table = acpi_aest_ids,
> + },
> + .probe = aest_device_probe,
> + .remove = aest_device_remove,
> +};
> +
> +static int __init aest_init(void)
> +{
> + return platform_driver_register(&aest_driver);
> +}
> +module_init(aest_init);
> +
> +static void __exit aest_exit(void)
> +{
> + platform_driver_unregister(&aest_driver);
> +}
> +module_exit(aest_exit);
> +
> +MODULE_DESCRIPTION("ARM AEST Driver");
> +MODULE_AUTHOR("Ruidong Tian <tianruidong@linux.alibaba.com>");
> +MODULE_LICENSE("GPL");
> +
> diff --git a/drivers/ras/aest/aest.h b/drivers/ras/aest/aest.h
> new file mode 100644
> index 000000000000..04005aad3617
> --- /dev/null
> +++ b/drivers/ras/aest/aest.h
> @@ -0,0 +1,323 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * ARM Error Source Table Support
> + *
> + * Copyright (c) 2021-2024, Alibaba Group.
> + */
> +
> +#include <linux/acpi_aest.h>
> +#include <asm/ras.h>
> +
> +#define MAX_GSI_PER_NODE 2
> +#define AEST_MAX_PPI 3
> +#define DEFAULT_CE_THRESHOLD 1
> +
> +#define record_read(record, offset) \
> + record->access->read(record->regs_base, offset)
> +#define record_write(record, offset, val) \
> + record->access->write(record->regs_base, offset, val)
> +
> +#define aest_dev_err(__adev, format, ...) \
> + dev_err((__adev)->dev, format, ##__VA_ARGS__)
> +#define aest_dev_info(__adev, format, ...) \
> + dev_info((__adev)->dev, format, ##__VA_ARGS__)
> +#define aest_dev_dbg(__adev, format, ...) \
> + dev_dbg((__adev)->dev, format, ##__VA_ARGS__)
> +
> +#define aest_node_err(__node, format, ...) \
> + dev_err((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
> +#define aest_node_info(__node, format, ...) \
> + dev_info((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
> +#define aest_node_dbg(__node, format, ...) \
> + dev_dbg((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
> +
> +#define aest_record_err(__record, format, ...) \
> + dev_err((__record)->node->adev->dev, "%s: %s: " format, \
> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
> +#define aest_record_info(__record, format, ...) \
> + dev_info((__record)->node->adev->dev, "%s: %s: " format, \
> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
> +#define aest_record_dbg(__record, format, ...) \
> + dev_dbg((__record)->node->adev->dev, "%s: %s: " format, \
> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
> +
> +#define ERXFR 0x0
> +#define ERXCTLR 0x8
> +#define ERXSTATUS 0x10
> +#define ERXADDR 0x18
> +#define ERXMISC0 0x20
> +#define ERXMISC1 0x28
> +#define ERXMISC2 0x30
> +#define ERXMISC3 0x38
> +
> +#define ERXGROUP 0xE00
> +#define GIC_ERRDEVARCH 0xFFBC
> +
> +extern struct xarray *aest_array;
> +
> +struct aest_event {
> + struct llist_node llnode;
> + char *node_name;
> + u32 type;
> + /*
> + * Different nodes have different meanings:
> + * - Processor node : processor number.
> + * - Memory node : SRAT proximity domain.
> + * - SMMU node : IORT node reference.
> + * - GIC node : interface type.
> + */
> + u32 id0;
> + /*
> + * Different nodes have different meanings:
> + * - Processor node : processor resource type.
> + * - Memory node : None.
> + * - SMMU node : subcomponent reference.
> + * - Vendor node : Unique ID.
> + * - GIC node : instance identifier.
> + */
> + u32 id1;
> + char *hid; // Vendor node : hardware ID.
> + u32 index;
> + u64 ce_threshold;
> + int addressing_mode;
> + struct ras_ext_regs regs;
> +
> + void *vendor_data;
> + size_t vendor_data_size;
> +};
> +
> +struct aest_access {
> + u64 (*read)(void *base, u32 offset);
> + void (*write)(void *base, u32 offset, u64 val);
> +};
> +
> +struct ce_threshold_info {
> + const u64 max_count;
> + const u64 mask;
> + const u64 shift;
> +};
> +
> +struct ce_threshold {
> + const struct ce_threshold_info *info;
> + u64 count;
> + u64 threshold;
> + u64 reg_val;
> +};
> +
> +struct aest_record {
> + char *name;
> + int index;
> + void __iomem *regs_base;
> +
> + /*
> + * This bit specifies the addressing mode to populate the ERR_ADDR
> + * register:
> + * 0b: Error record reports System Physical Addresses (SPA) in
> + * the ERR_ADDR register.
> + * 1b: Error record reports error node-specific Logical Addresses (LA)
> + * in the ERR_ADDR register. The OS must use other means to translate
> + * the reported LA into an SPA.
> + */
> + int addressing_mode;
> + u64 fr;
> + struct aest_node *node;
> +
> + struct dentry *debugfs;
> + struct ce_threshold ce;
> + enum ras_ce_threshold threshold_type;
> + const struct aest_access *access;
> +
> + void *vendor_data;
> + size_t vendor_data_size;
> +};
> +
> +struct aest_node {
> + char *name;
> + u8 type;
> + void *errgsr;
> + void *inj;
> + void *irq_config;
> + void *base;
> +
> + /*
> + * This bitmap indicates which of the error records within this error
> + * node must be polled for error status.
> + * Bit[n] of this field pertains to error record corresponding to
> + * index n in this error group.
> + * Bit[n] = 0b: Error record at index n needs to be polled.
> + * Bit[n] = 1b: Error record at index n does not need to be polled.
> + */
> + unsigned long *record_implemented;
> + /*
> + * This bitmap indicates which of the error records within this error
> + * node support error status reporting using ERRGSR register.
> + * Bit[n] of this field pertains to error record corresponding to
> + * index n in this error group.
> + * Bit[n] = 0b: Error record at index n supports error status reporting
> + * through ERRGSR.S.
> + * Bit[n] = 1b: Error record at index n does not support error reporting
> + * through the ERRGSR.S bit. If this error record is
> + * implemented, then it must be polled explicitly for
> + * error events.
> + */
> + unsigned long *status_reporting;
> + int version;
> +
> + struct aest_device *adev;
> + struct acpi_aest_node *info;
> + struct dentry *debugfs;
> +
> + int record_count;
> + struct aest_record *records;
> +
> + struct aest_node __percpu *oncore_node;
> +};
> +
> +struct aest_device {
> + struct device *dev;
> + u32 type;
> + int node_cnt;
> + struct aest_node *nodes;
> +
> + struct work_struct aest_work;
> + struct gen_pool *pool;
> + struct llist_head event_list;
> +
> + int irq[MAX_GSI_PER_NODE];
> + u32 uid;
> + struct aest_device __percpu *adev_oncore;
> +
> + struct dentry *debugfs;
> +};
> +
> +struct aest_node_context {
> + struct aest_node *node;
> + unsigned long *bitmap;
> + void (*func)(struct aest_record *record,
> + void *data);
> + void *data;
> + int ret;
> +};
> +
> +#define CASE_READ(res, x) \
> + case (x): { \
> + res = read_sysreg_s(SYS_##x##_EL1); \
> + break; \
> + }
> +
> +#define CASE_WRITE(val, x) \
> + case (x): { \
> + write_sysreg_s((val), SYS_##x##_EL1); \
> + break; \
> + }
> +
> +static inline u64 aest_sysreg_read(void *__unused, u32 offset)
> +{
> + u64 res;
> +
> + switch (offset) {
> + CASE_READ(res, ERXFR)
> + CASE_READ(res, ERXCTLR)
> + CASE_READ(res, ERXSTATUS)
> + CASE_READ(res, ERXADDR)
> + CASE_READ(res, ERXMISC0)
> + CASE_READ(res, ERXMISC1)
> + CASE_READ(res, ERXMISC2)
> + CASE_READ(res, ERXMISC3)
> + default:
> + res = 0;
> + }
> + return res;
> +}
> +
> +static inline void aest_sysreg_write(void *base, u32 offset, u64 val)
> +{
> + switch (offset) {
> + CASE_WRITE(val, ERXFR)
> + CASE_WRITE(val, ERXCTLR)
> + CASE_WRITE(val, ERXSTATUS)
> + CASE_WRITE(val, ERXADDR)
> + CASE_WRITE(val, ERXMISC0)
> + CASE_WRITE(val, ERXMISC1)
> + CASE_WRITE(val, ERXMISC2)
> + CASE_WRITE(val, ERXMISC3)
> + default:
> + return;
> + }
> +}
> +
> +static inline u64 aest_iomem_read(void *base, u32 offset)
> +{
> + return readq_relaxed(base + offset);
> +}
> +
> +static inline void aest_iomem_write(void *base, u32 offset, u64 val)
> +{
> + writeq_relaxed(val, base + offset);
> +}
> +
> +/* access type is decided by AEST interface type. */
> +static const struct aest_access aest_access[] = {
> + [ACPI_AEST_NODE_SYSTEM_REGISTER] = {
> + .read = aest_sysreg_read,
> + .write = aest_sysreg_write,
> + },
> +
> + [ACPI_AEST_NODE_MEMORY_MAPPED] = {
> + .read = aest_iomem_read,
> + .write = aest_iomem_write,
> + },
> + [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] = {
> + .read = aest_iomem_read,
> + .write = aest_iomem_write,
> + },
> + { }
> +};
> +
> +static inline bool aest_dev_is_oncore(struct aest_device *adev)
> +{
> + return adev->type == ACPI_AEST_PROCESSOR_ERROR_NODE;
> +}
> +
> +/*
> + * Each PE may have multiple error records; an error record must be
> + * selected for access through the Error Record System registers.
> + */
> +static inline void aest_select_record(struct aest_node *node, int index)
> +{
> + if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE) {
> + write_sysreg_s(index, SYS_ERRSELR_EL1);
> + isb();
> + }
> +}
> +
> +/* Ensure all writes have taken effect. */
> +static inline void aest_sync(struct aest_node *node)
> +{
> + if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE)
> + isb();
> +}
> +
> +static const char * const aest_node_name[] = {
> + [ACPI_AEST_PROCESSOR_ERROR_NODE] = "processor",
> + [ACPI_AEST_MEMORY_ERROR_NODE] = "memory",
> + [ACPI_AEST_SMMU_ERROR_NODE] = "smmu",
> + [ACPI_AEST_VENDOR_ERROR_NODE] = "vendor",
> + [ACPI_AEST_GIC_ERROR_NODE] = "gic",
> + [ACPI_AEST_PCIE_ERROR_NODE] = "pcie",
> + [ACPI_AEST_PROXY_ERROR_NODE] = "proxy",
> +};
> +
> +static inline int
> +aest_set_name(struct aest_device *adev, struct aest_hnode *ahnode)
> +{
> + adev->dev->init_name = devm_kasprintf(adev->dev, GFP_KERNEL,
> + "%s%d", aest_node_name[ahnode->type],
> + adev->uid);
> + if (!adev->dev->init_name)
> + return -ENOMEM;
> +
> + return 0;
> +}
> diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h
> new file mode 100644
> index 000000000000..1c2191791504
> --- /dev/null
> +++ b/include/linux/acpi_aest.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ACPI_AEST_H__
> +#define __ACPI_AEST_H__
> +
> +#include <linux/acpi.h>
> +#include <asm/ras.h>
> +
> +/* AEST component */
> +#define ACPI_AEST_PROC_FLAG_GLOBAL (1<<0)
> +#define ACPI_AEST_PROC_FLAG_SHARED (1<<1)
> +
> +#define AEST_ADDREESS_SPA 0
> +#define AEST_ADDREESS_LA 1
> +
> +/* AEST interrupt */
> +#define AEST_INTERRUPT_MODE BIT(0)
> +#define AEST_INTERRUPT_FHI_MODE BIT(1)
> +
> +#define AEST_INTERRUPT_FHI_UE_SUPPORT BIT(0)
> +#define AEST_INTERRUPT_FHI_UE_NO_SUPPORT BIT(1)
> +
> +#define AEST_MAX_INTERRUPT_PER_NODE 3
> +
> +/* AEST interface */
> +
> +#define AEST_XFACE_FLAG_SHARED (1<<0)
> +#define AEST_XFACE_FLAG_CLEAR_MISC (1<<1)
> +#define AEST_XFACE_FLAG_ERROR_DEVICE (1<<2)
> +#define AEST_XFACE_FLAG_AFFINITY (1<<3)
> +#define AEST_XFACE_FLAG_ERROR_GROUP (1<<4)
> +#define AEST_XFACE_FLAG_FAULT_INJECT (1<<5)
> +#define AEST_XFACE_FLAG_INT_CONFIG (1<<6)
> +
> +struct aest_hnode {
> + struct list_head list;
> + int count;
> + u32 uid;
> + int type;
> +};
> +
> +struct acpi_aest_node {
> + struct list_head list;
> + int type;
> + struct acpi_aest_node_interface_header *interface_hdr;
> + unsigned long *record_implemented;
> + unsigned long *status_reporting;
> + unsigned long *addressing_mode;
> + struct acpi_aest_node_interface_common *common;
> + union {
> + struct acpi_aest_processor *processor;
> + struct acpi_aest_memory *memory;
> + struct acpi_aest_smmu *smmu;
> + struct acpi_aest_vendor_v2 *vendor;
> + struct acpi_aest_gic *gic;
> + struct acpi_aest_pcie *pcie;
> + struct acpi_aest_proxy *proxy;
> + void *spec_pointer;
> + };
> + union {
> + struct acpi_aest_processor_cache *cache;
> + struct acpi_aest_processor_tlb *tlb;
> + struct acpi_aest_processor_generic *generic;
> + void *processor_spec_pointer;
> + };
> + struct acpi_aest_node_interrupt_v2 *interrupt;
> + int interrupt_count;
> +};
> +#endif /* __ACPI_AEST_H__ */
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index a04b73c40173..acf0e3957fdd 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -179,6 +179,7 @@ enum cpuhp_state {
> CPUHP_AP_CSKY_TIMER_STARTING,
> CPUHP_AP_TI_GP_TIMER_STARTING,
> CPUHP_AP_HYPERV_TIMER_STARTING,
> + CPUHP_AP_ARM_AEST_STARTING,
> /* Must be the last timer callback */
> CPUHP_AP_DUMMY_TIMER_STARTING,
> CPUHP_AP_ARM_XEN_STARTING,
> diff --git a/include/linux/ras.h b/include/linux/ras.h
> index a64182bc72ad..1c777af6a1af 100644
> --- a/include/linux/ras.h
> +++ b/include/linux/ras.h
> @@ -53,4 +53,12 @@ static inline unsigned long
> amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err) { return -EINVAL; }
> #endif /* CONFIG_AMD_ATL */
>
> +#if IS_ENABLED(CONFIG_AEST)
> +void aest_register_decode_chain(struct notifier_block *nb);
> +void aest_unregister_decode_chain(struct notifier_block *nb);
> +#else
> +static inline void aest_register_decode_chain(struct notifier_block *nb) {}
> +static inline void aest_unregister_decode_chain(struct notifier_block *nb) {}
> +#endif /* CONFIG_AEST */
> +
> #endif /* __RAS_H__ */
> --
> 2.33.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface
2025-01-15 8:42 ` [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface Ruidong Tian
@ 2025-01-18 2:37 ` kernel test robot
2025-02-19 20:55 ` Borislav Petkov
1 sibling, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-01-18 2:37 UTC (permalink / raw)
To: Ruidong Tian, catalin.marinas, will, lpieralisi, guohanjun,
sudeep.holla, xueshuai, baolin.wang, linux-kernel, linux-acpi,
linux-arm-kernel, rafael, lenb, tony.luck, bp, yazen.ghannam
Cc: oe-kbuild-all, tianruidong
Hi Ruidong,
kernel test robot noticed the following build warnings:
[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on rafael-pm/bleeding-edge arm64/for-next/core ras/edac-for-next linus/master tip/smp/core v6.13-rc7 next-20250117]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ruidong-Tian/ACPI-RAS-AEST-Initial-AEST-driver/20250115-164601
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20250115084228.107573-3-tianruidong%40linux.alibaba.com
patch subject: [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface
reproduce: (https://download.01.org/0day-ci/archive/20250118/202501181043.Qi8ohhYk-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501181043.Qi8ohhYk-lkp@intel.com/
All warnings (new ones prefixed by >>):
Warning: Documentation/hwmon/g762.rst references a file that doesn't exist: Documentation/devicetree/bindings/hwmon/g762.txt
Warning: Documentation/hwmon/isl28022.rst references a file that doesn't exist: Documentation/devicetree/bindings/hwmon/isl,isl28022.yaml
Warning: Documentation/translations/ja_JP/SubmittingPatches references a file that doesn't exist: linux-2.6.12-vanilla/Documentation/dontdiff
Warning: MAINTAINERS references a file that doesn't exist: Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
Warning: lib/Kconfig.debug references a file that doesn't exist: Documentation/dev-tools/fault-injection/fault-injection.rst
>> Warning: /sys/kernel/debug/aest/<dev_name>/<node_name>/record<index>/err_* is defined 2 times: ./Documentation/ABI/testing/debugfs-aest:69 ./Documentation/ABI/testing/debugfs-aest:76
Using alabaster theme
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
2025-01-17 10:50 ` Tomohiro Misono (Fujitsu)
@ 2025-02-06 8:32 ` Ruidong Tian
2025-02-14 9:14 ` Tomohiro Misono (Fujitsu)
0 siblings, 1 reply; 16+ messages in thread
From: Ruidong Tian @ 2025-02-06 8:32 UTC (permalink / raw)
To: Tomohiro Misono (Fujitsu), catalin.marinas@arm.com,
will@kernel.org, lpieralisi@kernel.org, guohanjun@huawei.com,
sudeep.holla@arm.com, xueshuai@linux.alibaba.com,
baolin.wang@linux.alibaba.com, linux-kernel@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
rafael@kernel.org, lenb@kernel.org, tony.luck@intel.com,
bp@alien8.de, yazen.ghannam@amd.com
Cc: Tyler Baicar
> Hello, some comments below.
Thank you for your comments! I really appreciate it.
>
>> Subject: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
>>
>> Add support for parsing the ARM Error Source Table and basic handling of
>> errors reported through both memory mapped and system register interfaces.
>>
>> Assume system register interfaces are only registered with private
>> peripheral interrupts (PPIs); otherwise there is no guarantee the
>> core handling the error is the core which took the error and has the
>> syndrome info in its system registers.
>>
>> In kernel-first mode, all configuration is controlled by the kernel,
>> including the CE threshold and interrupt enable/disable.
>>
>> All detected errors will be processed as follows:
>> - CE, DE: use a workqueue to log these errors.
>> - UER, UEO: log it and call memory_failure() in a workqueue.
>> - UC, UEU: panic in irq context.
>>
>> Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
>> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
>> ---
>> MAINTAINERS | 10 +
>> arch/arm64/include/asm/ras.h | 95 ++++
>> drivers/acpi/arm64/Kconfig | 11 +
>> drivers/acpi/arm64/Makefile | 1 +
>> drivers/acpi/arm64/aest.c | 335 ++++++++++++
>> drivers/acpi/arm64/init.c | 2 +
>> drivers/acpi/arm64/init.h | 1 +
>> drivers/ras/Kconfig | 1 +
>> drivers/ras/Makefile | 1 +
>> drivers/ras/aest/Kconfig | 17 +
>> drivers/ras/aest/Makefile | 5 +
>> drivers/ras/aest/aest-core.c | 976 +++++++++++++++++++++++++++++++++++
>> drivers/ras/aest/aest.h | 323 ++++++++++++
>> include/linux/acpi_aest.h | 68 +++
>> include/linux/cpuhotplug.h | 1 +
>> include/linux/ras.h | 8 +
>> 16 files changed, 1855 insertions(+)
>> create mode 100644 arch/arm64/include/asm/ras.h
>> create mode 100644 drivers/acpi/arm64/aest.c
>> create mode 100644 drivers/ras/aest/Kconfig
>> create mode 100644 drivers/ras/aest/Makefile
>> create mode 100644 drivers/ras/aest/aest-core.c
>> create mode 100644 drivers/ras/aest/aest.h
>> create mode 100644 include/linux/acpi_aest.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 637ddd44245f..d757f9339627 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -330,6 +330,16 @@ S: Maintained
>> F: drivers/acpi/arm64
>> F: include/linux/acpi_iort.h
>>
>> +ACPI AEST
>> +M: Ruidong Tian <tianruidong@linux.alibaba.com>
>> +L: linux-acpi@vger.kernel.org
>> +L: linux-arm-kernel@lists.infradead.org
>> +S: Supported
>> +F: arch/arm64/include/asm/ras.h
>> +F: drivers/acpi/arm64/aest.c
>> +F: drivers/ras/aest/
>> +F: include/linux/acpi_aest.h
>> +
>> ACPI FOR RISC-V (ACPI/riscv)
>> M: Sunil V L <sunilvl@ventanamicro.com>
>> L: linux-acpi@vger.kernel.org
>> diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h
>> new file mode 100644
>> index 000000000000..7676add8a0ed
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/ras.h
>> @@ -0,0 +1,95 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __ASM_RAS_H
>> +#define __ASM_RAS_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/bits.h>
>> +
>> +/* ERR<n>FR */
>> +#define ERR_FR_CE GENMASK_ULL(54, 53)
>> +#define ERR_FR_RP BIT(15)
>> +#define ERR_FR_CEC GENMASK_ULL(14, 12)
>> +
>> +#define ERR_FR_RP_SINGLE_COUNTER 0
>> +#define ERR_FR_RP_DOUBLE_COUNTER 1
>> +
>> +#define ERR_FR_CEC_0B_COUNTER 0
>> +#define ERR_FR_CEC_8B_COUNTER BIT(1)
>> +#define ERR_FR_CEC_16B_COUNTER BIT(2)
>> +
>> +/* ERR<n>STATUS */
>> +#define ERR_STATUS_AV BIT(31)
>> +#define ERR_STATUS_V BIT(30)
>> +#define ERR_STATUS_UE BIT(29)
>> +#define ERR_STATUS_ER BIT(28)
>> +#define ERR_STATUS_OF BIT(27)
>> +#define ERR_STATUS_MV BIT(26)
>> +#define ERR_STATUS_CE (BIT(25) | BIT(24))
>> +#define ERR_STATUS_DE BIT(23)
>> +#define ERR_STATUS_PN BIT(22)
>> +#define ERR_STATUS_UET (BIT(21) | BIT(20))
>> +#define ERR_STATUS_CI BIT(19)
>> +#define ERR_STATUS_IERR GENMASK_ULL(15, 8)
>> +#define ERR_STATUS_SERR GENMASK_ULL(7, 0)
>> +
>> +/* These bits are write-one-to-clear */
>> +#define ERR_STATUS_W1TC (ERR_STATUS_AV | ERR_STATUS_V | ERR_STATUS_UE | \
>> + ERR_STATUS_ER | ERR_STATUS_OF | ERR_STATUS_MV | \
>> + ERR_STATUS_CE | ERR_STATUS_DE | ERR_STATUS_PN | \
>> + ERR_STATUS_UET | ERR_STATUS_CI)
>> +
>> +#define ERR_STATUS_UET_UC 0
>> +#define ERR_STATUS_UET_UEU 1
>> +#define ERR_STATUS_UET_UEO 2
>> +#define ERR_STATUS_UET_UER 3
>> +
>> +/* ERR<n>CTLR */
>> +#define ERR_CTLR_CFI BIT(8)
>> +#define ERR_CTLR_FI BIT(3)
>> +#define ERR_CTLR_UI BIT(2)
>> +
>> +/* ERR<n>ADDR */
>> +#define ERR_ADDR_AI BIT(61)
>> +#define ERR_ADDR_PADDR GENMASK_ULL(55, 0)
>> +
>> +/* ERR<n>MISC0 */
>> +
>> +/* ERR<n>FR.CEC == 0b010, ERR<n>FR.RP == 0 */
>> +#define ERR_MISC0_8B_OF BIT(39)
>> +#define ERR_MISC0_8B_CEC GENMASK_ULL(38, 32)
>> +
>> +/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 0 */
>> +#define ERR_MISC0_16B_OF BIT(47)
>> +#define ERR_MISC0_16B_CEC GENMASK_ULL(46, 32)
>> +
>> +#define ERR_MISC0_CEC_SHIFT 31
>> +
>> +#define ERR_8B_CEC_MAX (ERR_MISC0_8B_CEC >> ERR_MISC0_CEC_SHIFT)
>> +#define ERR_16B_CEC_MAX (ERR_MISC0_16B_CEC >> ERR_MISC0_CEC_SHIFT)
>> +
>> +/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 1 */
>> +#define ERR_MISC0_16B_OFO BIT(63)
>> +#define ERR_MISC0_16B_CECO GENMASK_ULL(62, 48)
>> +#define ERR_MISC0_16B_OFR BIT(47)
>> +#define ERR_MISC0_16B_CECR GENMASK_ULL(46, 32)
>> +
>> +/* ERRDEVARCH */
>> +#define ERRDEVARCH_REV GENMASK(19, 16)
>> +
>> +enum ras_ce_threshold {
>> + RAS_CE_THRESHOLD_0B,
>> + RAS_CE_THRESHOLD_8B,
>> + RAS_CE_THRESHOLD_16B,
>> + RAS_CE_THRESHOLD_32B,
>> + UNKNOWN,
>> +};
>> +
>> +struct ras_ext_regs {
>> + u64 err_fr;
>> + u64 err_ctlr;
>> + u64 err_status;
>> + u64 err_addr;
>> + u64 err_misc[4];
>> +};
>> +
>> +#endif /* __ASM_RAS_H */
>> diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig
>> index b3ed6212244c..c8eb6de95733 100644
>> --- a/drivers/acpi/arm64/Kconfig
>> +++ b/drivers/acpi/arm64/Kconfig
>> @@ -21,3 +21,14 @@ config ACPI_AGDI
>>
>> config ACPI_APMT
>> bool
>> +
>> +config ACPI_AEST
>> + bool "ARM Error Source Table Support"
>> + depends on ARM64_RAS_EXTN
>> +
>> + help
>> + The Arm Error Source Table (AEST) provides details on ACPI
>> + extensions that enable kernel-first handling of errors in a
>> + system that supports the Armv8 RAS extensions.
>> +
>> + If set, the kernel will report and log hardware errors.
>> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
>> index 05ecde9eaabe..8e240b281fd1 100644
>> --- a/drivers/acpi/arm64/Makefile
>> +++ b/drivers/acpi/arm64/Makefile
>> @@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
>> obj-$(CONFIG_ACPI_IORT) += iort.o
>> obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
>> obj-$(CONFIG_ARM_AMBA) += amba.o
>> +obj-$(CONFIG_ACPI_AEST) += aest.o
>> obj-y += dma.o init.o
>> obj-y += thermal_cpufreq.o
>> diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c
>> new file mode 100644
>> index 000000000000..6dba9c23e04e
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/aest.c
>> @@ -0,0 +1,335 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * ARM Error Source Table Support
>> + *
>> + * Copyright (c) 2024, Alibaba Group.
>> + */
>> +
>> +#include <linux/xarray.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/acpi_aest.h>
>> +
>> +#include "init.h"
>> +
>> +#undef pr_fmt
>> +#define pr_fmt(fmt) "ACPI AEST: " fmt
>> +
>> +static struct xarray *aest_array;
>> +
>> +static void __init aest_init_interface(struct acpi_aest_hdr *hdr,
>> + struct acpi_aest_node *node)
>> +{
>> + struct acpi_aest_node_interface_header *interface;
>> +
>> + interface = ACPI_ADD_PTR(struct acpi_aest_node_interface_header, hdr,
>> + hdr->node_interface_offset);
>> +
>> + node->type = hdr->type;
>> + node->interface_hdr = interface;
>> +
>> + switch (interface->group_format) {
>> + case ACPI_AEST_NODE_GROUP_FORMAT_4K: {
>> + struct acpi_aest_node_interface_4k *interface_4k =
>> + (struct acpi_aest_node_interface_4k *)(interface + 1);
>> +
>> + node->common = &interface_4k->common;
>> + node->record_implemented =
>> + (unsigned long *)&interface_4k->error_record_implemented;
>> + node->status_reporting =
>> + (unsigned long *)&interface_4k->error_status_reporting;
>> + node->addressing_mode =
>> + (unsigned long *)&interface_4k->addressing_mode;
>> + break;
>> + }
>> + case ACPI_AEST_NODE_GROUP_FORMAT_16K: {
>> + struct acpi_aest_node_interface_16k *interface_16k =
>> + (struct acpi_aest_node_interface_16k *)(interface + 1);
>> +
>> + node->common = &interface_16k->common;
>> + node->record_implemented =
>> + (unsigned long *)interface_16k->error_record_implemented;
>> + node->status_reporting =
>> + (unsigned long *)interface_16k->error_status_reporting;
>> + node->addressing_mode =
>> + (unsigned long *)interface_16k->addressing_mode;
>> + break;
>> + }
>> + case ACPI_AEST_NODE_GROUP_FORMAT_64K: {
>> + struct acpi_aest_node_interface_64k *interface_64k =
>> + (struct acpi_aest_node_interface_64k *)(interface + 1);
>> +
>> + node->common = &interface_64k->common;
>> + node->record_implemented =
>> + (unsigned long *)interface_64k->error_record_implemented;
>> + node->status_reporting =
>> + (unsigned long *)interface_64k->error_status_reporting;
>> + node->addressing_mode =
>> + (unsigned long *)interface_64k->addressing_mode;
>> + break;
>> + }
>> + default:
>> + pr_err("invalid group format: %d\n", interface->group_format);
>> + }
>> +
>> + node->interrupt = ACPI_ADD_PTR(struct acpi_aest_node_interrupt_v2,
>> + hdr, hdr->node_interrupt_offset);
>> +
>> + node->interrupt_count = hdr->node_interrupt_count;
>> +}
>> +
>> +static int __init acpi_aest_init_node_common(struct acpi_aest_hdr *aest_hdr,
>> + struct acpi_aest_node *node)
>> +{
>> + int ret;
>> + struct aest_hnode *hnode;
>> + u64 error_device_id;
>> +
>> + aest_init_interface(aest_hdr, node);
>> +
>> + error_device_id = node->common->error_node_device;
>
> I think I see a problem with this.
> From the spec[1], I understand that error node device is optional and
> error node device field is only valid when error node device valid flag is set.
>
> [1] https://developer.arm.com/documentation/den0085/latest/
>
> Previous versions work well for the system without error node device (i.e. system
> without ARMHE000 definition in DSDT) but this version doesn't.
> Do we need to rely on information from error node device here when
> a system has them? I thought AEST table has necessary information in all case and
> want to know why this version use different approach from v2.
Q: Do we need to rely on information from error node device here when
a system has them?
A: DSDT error device node may include certain ACPI methods, such as
address translation for DDRC. Intel has implemented this approach by
using an ACPI method to translate DIMM addresses into system physical
addresses [0].
[0]:
https://lore.kernel.org/all/20181015202620.23610-1-tony.luck@intel.com/T/#u
Reason for using a different approach in v3
--------------------------------------------
In v3, an abstraction layer named AEST device was introduced on top of
the AEST node. The main reasons are as follows:
1. Some AEST nodes share interrupts, and the AEST device is viewed as
the owner of the interrupt to register interrupt functions.
2. Abstracting the contents of ACPI tables into platform devices is a
common practice on ARM, like MPAM[1] and IORT, and I just followed it.
Which approach do you think is better, v2 or v3?
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.12-rc1&id=8c26b06d7b811d397e672fd3b0d7c10d4965d97a
>
> Also, I wonder if there will be a system that only some nodes have valid flag.
My plan is to create an AEST platform device regardless of whether the
node is valid. In the next version, I will set the error_device_id to a
globally incrementing ID instead of directly assigning it the value of
error_node_device.
>
>> +
>> + hnode = xa_load(aest_array, error_device_id);
>> + if (!hnode) {
>> + hnode = kmalloc(sizeof(*hnode), GFP_KERNEL);
>> + if (!hnode) {
>> + ret = -ENOMEM;
>> + goto free;
>> + }
>> + INIT_LIST_HEAD(&hnode->list);
>> + hnode->uid = error_device_id;
>> + hnode->count = 0;
>> + hnode->type = node->type;
>> + xa_store(aest_array, error_device_id, hnode, GFP_KERNEL);
>> + }
>> +
>> + list_add_tail(&node->list, &hnode->list);
>> + hnode->count++;
>> +
>> + return 0;
>> +
>> +free:
>> + kfree(node);
>> + return ret;
>> +}
>> +
>> +static int __init
>> +acpi_aest_init_node_default(struct acpi_aest_hdr *aest_hdr)
>> +{
>> + struct acpi_aest_node *node;
>> +
>> + node = kzalloc(sizeof(*node), GFP_KERNEL);
>> + if (!node)
>> + return -ENOMEM;
>> +
>> + node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
>> + aest_hdr->node_specific_offset);
>> +
>> + return acpi_aest_init_node_common(aest_hdr, node);
>> +}
>> +
>> +static int __init
>> +acpi_aest_init_processor_node(struct acpi_aest_hdr *aest_hdr)
>> +{
>> + struct acpi_aest_node *node;
>> +
>> + node = kzalloc(sizeof(*node), GFP_KERNEL);
>> + if (!node)
>> + return -ENOMEM;
>> +
>> + node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
>> + aest_hdr->node_specific_offset);
>> +
>> + node->processor_spec_pointer = ACPI_ADD_PTR(void, node->spec_pointer,
>> + sizeof(struct acpi_aest_processor));
>> +
>> + return acpi_aest_init_node_common(aest_hdr, node);
>> +}
>> +
>> +static int __init acpi_aest_init_node(struct acpi_aest_hdr *header)
>> +{
>> + switch (header->type) {
>> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
>> + return acpi_aest_init_processor_node(header);
>> + case ACPI_AEST_VENDOR_ERROR_NODE:
>> + case ACPI_AEST_SMMU_ERROR_NODE:
>> + case ACPI_AEST_GIC_ERROR_NODE:
>> + case ACPI_AEST_PCIE_ERROR_NODE:
>> + case ACPI_AEST_PROXY_ERROR_NODE:
>> + case ACPI_AEST_MEMORY_ERROR_NODE:
>> + return acpi_aest_init_node_default(header);
>> + default:
>> + pr_err("acpi table header type is invalid: %d\n", header->type);
>> + return -EINVAL;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int __init acpi_aest_init_nodes(struct acpi_table_header *aest_table)
>> +{
>> + struct acpi_aest_hdr *aest_node, *aest_end;
>> + struct acpi_table_aest *aest;
>> + int rc;
>> +
>> + aest = (struct acpi_table_aest *)aest_table;
>> + aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
>> + sizeof(struct acpi_table_header));
>> + aest_end = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
>> + aest_table->length);
>> +
>> + while (aest_node < aest_end) {
>> + if (((u64)aest_node + aest_node->length) > (u64)aest_end) {
>> + pr_warn(FW_WARN "AEST node pointer overflow, bad table.\n");
>> + return -EINVAL;
>> + }
>> +
>> + rc = acpi_aest_init_node(aest_node);
>> + if (rc)
>> + return rc;
>> +
>> + aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest_node,
>> + aest_node->length);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int
>> +acpi_aest_parse_irqs(struct platform_device *pdev, struct acpi_aest_node *anode,
>> + struct resource *res, int *res_idx, int irqs[2])
>> +{
>> + int i;
>> + struct acpi_aest_node_interrupt_v2 *interrupt;
>> + int trigger, irq;
>> +
>> + for (i = 0; i < anode->interrupt_count; i++) {
>> + interrupt = &anode->interrupt[i];
>> + if (irqs[interrupt->type])
>> + continue;
>> +
>> + trigger = (interrupt->flags & AEST_INTERRUPT_MODE) ?
>> + ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
>> +
>> + irq = acpi_register_gsi(&pdev->dev, interrupt->gsiv, trigger,
>> + ACPI_ACTIVE_HIGH);
>> + if (irq <= 0) {
>> + pr_err("failed to map AEST GSI %d\n", interrupt->gsiv);
>> + return irq;
>> + }
>> +
>> + res[*res_idx].start = irq;
>> + res[*res_idx].end = irq;
>> + res[*res_idx].flags = IORESOURCE_IRQ;
>> + res[*res_idx].name = interrupt->type ? "eri" : "fhi";
>> +
>> + (*res_idx)++;
>> +
>> + irqs[interrupt->type] = irq;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int __init acpi_aest_alloc_pdev(void)
>> +{
>> + int ret, j, size;
>> + struct aest_hnode *ahnode = NULL;
>> + unsigned long i;
>> + struct platform_device *pdev;
>> + struct acpi_device *companion;
>> + struct acpi_aest_node *anode;
>> + char uid[16];
>> + struct resource *res;
>> +
>> + xa_for_each(aest_array, i, ahnode) {
>> + int irq[2] = { 0 };
>> +
>> + res = kcalloc(ahnode->count + 2, sizeof(*res), GFP_KERNEL);
>
> Why is +2 needed?
Each AEST platform device has at most two IRQ resources: one for the
Error Recovery Interrupt and one for the Fault Handling Interrupt. I
will add a macro for this in the next version.
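The suggested cleanup amounts to naming the bare "+ 2"; `AEST_MAX_IRQS_PER_DEV` and the stub resource type below are hypothetical names used only for illustration:

```c
#include <stdlib.h>

/* One Error Recovery Interrupt plus one Fault Handling Interrupt
 * per AEST platform device. */
#define AEST_MAX_IRQS_PER_DEV 2

struct aest_res { unsigned long start, end, flags; };

/* Allocate room for one MEM resource per node plus the two possible
 * IRQ resources, mirroring the kcalloc(ahnode->count + 2, ...) call
 * in the patch. */
struct aest_res *alloc_aest_resources(int record_count)
{
	return calloc(record_count + AEST_MAX_IRQS_PER_DEV,
		      sizeof(struct aest_res));
}
```

A named constant makes the sizing self-documenting the next time the interrupt layout changes.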
>
>> + if (!res) {
>> + ret = -ENOMEM;
>> + break;
>> + }
>> +
>> + pdev = platform_device_alloc("AEST", i);
>> + if (!pdev) {
>> + ret = -ENOMEM;
>> + break;
>> + }
>> +
>> + ret = snprintf(uid, sizeof(uid), "%u", (u32)i);
>> + companion = acpi_dev_get_first_match_dev("ARMHE000", uid, -1);
>> + if (companion)
>> + ACPI_COMPANION_SET(&pdev->dev, companion);
>> +
>> + j = 0;
>> + list_for_each_entry(anode, &ahnode->list, list) {
>> + if (anode->interface_hdr->type !=
>> + ACPI_AEST_NODE_SYSTEM_REGISTER) {
>> + res[j].name = "AEST:RECORD";
>> + res[j].start = anode->interface_hdr->address;
>> + size = anode->interface_hdr->error_record_count *
>> + sizeof(struct ras_ext_regs);
>> + res[j].end = res[j].start + size;
>> + res[j].flags = IORESOURCE_MEM;
>
> Will these fields be overwritten in below acpi_aest_parse_irqs()?
Yes, it is a bug; I will fix it in the next version.
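The fix is to advance the shared resource index after filling the memory resource, so the IRQ entries land in fresh slots. A minimal stand-in, with illustrative stub names rather than the driver's types:

```c
#include <stdlib.h>
#include <string.h>

struct res_stub { const char *name; };

/* The MEM resource must advance the shared index j before the IRQ
 * resources are appended; otherwise the IRQ entries overwrite it,
 * which is the bug noted in the review. */
int fill_resources(struct res_stub *res, int has_mem, int nr_irqs)
{
	int j = 0, i;

	if (has_mem)
		res[j++].name = "AEST:RECORD"; /* j moves past the MEM slot */

	for (i = 0; i < nr_irqs; i++)
		res[j++].name = "irq";

	return j; /* total resources filled */
}
```

With the increment in place, one memory resource and two IRQs occupy three distinct slots.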
>
>> + }
>> +
>> + ret = acpi_aest_parse_irqs(pdev, anode, res, &j, irq);
>> + if (ret) {
>> + platform_device_put(pdev);
>> + break;
>> + }
>> + }
>> +
>> + ret = platform_device_add_resources(pdev, res, j);
>> + if (ret)
>> + break;
>> +
>> + ret = platform_device_add_data(pdev, &ahnode, sizeof(ahnode));
>> + if (ret)
>> + break;
>> +
>> + ret = platform_device_add(pdev);
>> + if (ret)
>> + break;
>> + }
>> +
>> + kfree(res);
>> + if (ret)
>> + platform_device_put(pdev);
>> +
>> + return ret;
>> +}
>> +
>> +void __init acpi_aest_init(void)
>> +{
>> + acpi_status status;
>> + int ret;
>> + struct acpi_table_header *aest_table;
>> +
>> + status = acpi_get_table(ACPI_SIG_AEST, 0, &aest_table);
>> + if (ACPI_FAILURE(status)) {
>> + if (status != AE_NOT_FOUND) {
>> + const char *msg = acpi_format_exception(status);
>> +
>> + pr_err("Failed to get table, %s\n", msg);
>> + }
>> +
>> + return;
>> + }
>> +
>> + aest_array = kzalloc(sizeof(struct xarray), GFP_KERNEL);
>> + xa_init(aest_array);
>> +
>> + ret = acpi_aest_init_nodes(aest_table);
>> + if (ret) {
>> + pr_err("Failed init aest node %d\n", ret);
>> + goto out;
>> + }
>> +
>> + ret = acpi_aest_alloc_pdev();
>> + if (ret)
>> + pr_err("Failed alloc pdev %d\n", ret);
>> +
>> +out:
>> + acpi_put_table(aest_table);
>> +}
>> diff --git a/drivers/acpi/arm64/init.c b/drivers/acpi/arm64/init.c
>> index 7a47d8095a7d..b0c768923831 100644
>> --- a/drivers/acpi/arm64/init.c
>> +++ b/drivers/acpi/arm64/init.c
>> @@ -12,4 +12,6 @@ void __init acpi_arch_init(void)
>> acpi_iort_init();
>> if (IS_ENABLED(CONFIG_ARM_AMBA))
>> acpi_amba_init();
>> + if (IS_ENABLED(CONFIG_ACPI_AEST))
>> + acpi_aest_init();
>> }
>> diff --git a/drivers/acpi/arm64/init.h b/drivers/acpi/arm64/init.h
>> index dcc277977194..3902d1676068 100644
>> --- a/drivers/acpi/arm64/init.h
>> +++ b/drivers/acpi/arm64/init.h
>> @@ -5,3 +5,4 @@ void __init acpi_agdi_init(void);
>> void __init acpi_apmt_init(void);
>> void __init acpi_iort_init(void);
>> void __init acpi_amba_init(void);
>> +void __init acpi_aest_init(void);
>> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
>> index fc4f4bb94a4c..61a2a05d9c94 100644
>> --- a/drivers/ras/Kconfig
>> +++ b/drivers/ras/Kconfig
>> @@ -33,6 +33,7 @@ if RAS
>>
>> source "arch/x86/ras/Kconfig"
>> source "drivers/ras/amd/atl/Kconfig"
>> +source "drivers/ras/aest/Kconfig"
>>
>> config RAS_FMPM
>> tristate "FRU Memory Poison Manager"
>> diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
>> index 11f95d59d397..72411ee9deaf 100644
>> --- a/drivers/ras/Makefile
>> +++ b/drivers/ras/Makefile
>> @@ -5,3 +5,4 @@ obj-$(CONFIG_RAS_CEC) += cec.o
>>
>> obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
>> obj-y += amd/atl/
>> +obj-y += aest/
>> diff --git a/drivers/ras/aest/Kconfig b/drivers/ras/aest/Kconfig
>> new file mode 100644
>> index 000000000000..6d436d911bea
>> --- /dev/null
>> +++ b/drivers/ras/aest/Kconfig
>> @@ -0,0 +1,17 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# ARM Error Source Table Support
>> +#
>> +# Copyright (c) 2024, Alibaba Group.
>> +#
>> +
>> +config AEST
>> + tristate "ARM AEST Driver"
>> + depends on ACPI_AEST && RAS
>> + help
>> + The Arm Error Source Table (AEST) provides details on ACPI
>> + extensions that enable kernel-first handling of errors in a
>> + system that supports the Armv8 RAS extensions.
>> +
>> + If set, the kernel will report and log hardware errors.
>> diff --git a/drivers/ras/aest/Makefile b/drivers/ras/aest/Makefile
>> new file mode 100644
>> index 000000000000..a6ba7e36fb43
>> --- /dev/null
>> +++ b/drivers/ras/aest/Makefile
>> @@ -0,0 +1,5 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +obj-$(CONFIG_AEST) += aest.o
>> +
>> +aest-y := aest-core.o
>> diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
>> new file mode 100644
>> index 000000000000..060a1eedee0a
>> --- /dev/null
>> +++ b/drivers/ras/aest/aest-core.c
>> @@ -0,0 +1,976 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * ARM Error Source Table Support
>> + *
>> + * Copyright (c) 2021-2024, Alibaba Group.
>> + */
>> +
>> +#include <linux/interrupt.h>
>> +#include <linux/panic.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/xarray.h>
>> +#include <linux/cpuhotplug.h>
>> +#include <linux/genalloc.h>
>> +#include <linux/ras.h>
>> +
>> +#include "aest.h"
>> +
>> +DEFINE_PER_CPU(struct aest_device, percpu_adev);
>> +
>> +#undef pr_fmt
>> +#define pr_fmt(fmt) "AEST: " fmt
>> +
>> +/*
>> + * This memory pool is only to be used to save AEST nodes in AEST irq context.
>> + * There can be 500 AEST nodes at most.
>> + */
>> +#define AEST_NODE_ALLOCED_MAX 500
>> +
>> +#define AEST_LOG_PREFIX_BUFFER 64
>> +
>> +BLOCKING_NOTIFIER_HEAD(aest_decoder_chain);
>> +
>> +static void aest_print(struct aest_event *event)
>> +{
>> + static atomic_t seqno = ATOMIC_INIT(0);
>> + unsigned int curr_seqno;
>> + char pfx_seq[AEST_LOG_PREFIX_BUFFER];
>> + int index;
>> + struct ras_ext_regs *regs;
>> +
>> + curr_seqno = atomic_inc_return(&seqno);
>> + snprintf(pfx_seq, sizeof(pfx_seq), "{%u}" HW_ERR, curr_seqno);
>> + pr_info("%sHardware error from AEST %s\n", pfx_seq, event->node_name);
>> +
>> + switch (event->type) {
>> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
>> + pr_err("%s Error from CPU%d\n", pfx_seq, event->id0);
>> + break;
>> + case ACPI_AEST_MEMORY_ERROR_NODE:
>> + pr_err("%s Error from memory at SRAT proximity domain %#x\n",
>> + pfx_seq, event->id0);
>> + break;
>> + case ACPI_AEST_SMMU_ERROR_NODE:
>> + pr_err("%s Error from SMMU IORT node %#x subcomponent %#x\n",
>> + pfx_seq, event->id0, event->id1);
>> + break;
>> + case ACPI_AEST_VENDOR_ERROR_NODE:
>> + pr_err("%s Error from vendor hid %8.8s uid %#x\n",
>> + pfx_seq, event->hid, event->id1);
>> + break;
>> + case ACPI_AEST_GIC_ERROR_NODE:
>> + pr_err("%s Error from GIC type %#x instance %#x\n",
>> + pfx_seq, event->id0, event->id1);
>> + break;
>> + default:
>> + pr_err("%s Unknown AEST node type\n", pfx_seq);
>> + return;
>> + }
>> +
>> + index = event->index;
>> + regs = &event->regs;
>> +
>> + pr_err("%s ERR%dFR: 0x%llx\n", pfx_seq, index, regs->err_fr);
>> + pr_err("%s ERR%dCTLR: 0x%llx\n", pfx_seq, index, regs->err_ctlr);
>> + pr_err("%s ERR%dSTATUS: 0x%llx\n", pfx_seq, index, regs->err_status);
>> + if (regs->err_status & ERR_STATUS_AV)
>> + pr_err("%s ERR%dADDR: 0x%llx\n", pfx_seq, index,
>> + regs->err_addr);
>> +
>> + if (regs->err_status & ERR_STATUS_MV) {
>> + pr_err("%s ERR%dMISC0: 0x%llx\n", pfx_seq, index,
>> + regs->err_misc[0]);
>> + pr_err("%s ERR%dMISC1: 0x%llx\n", pfx_seq, index,
>> + regs->err_misc[1]);
>> + pr_err("%s ERR%dMISC2: 0x%llx\n", pfx_seq, index,
>> + regs->err_misc[2]);
>> + pr_err("%s ERR%dMISC3: 0x%llx\n", pfx_seq, index,
>> + regs->err_misc[3]);
>> + }
>> +}
>> +
>> +static void aest_handle_memory_failure(u64 addr)
>> +{
>> + unsigned long pfn;
>> +
>> + pfn = PHYS_PFN(addr);
>> +
>> + if (!pfn_valid(pfn)) {
>> + pr_warn(HW_ERR "Invalid physical address: %#llx\n", addr);
>> + return;
>> + }
>> +
>> +#ifdef CONFIG_MEMORY_FAILURE
>> + memory_failure(pfn, 0);
>> +#endif
>> +}
>> +
>> +static void init_aest_event(struct aest_event *event, struct aest_record *record,
>> + struct ras_ext_regs *regs)
>> +{
>> + struct aest_node *node = record->node;
>> + struct acpi_aest_node *info = node->info;
>> +
>> + event->type = node->type;
>> + event->node_name = node->name;
>> + switch (node->type) {
>> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
>> + if (info->processor->flags & (ACPI_AEST_PROC_FLAG_SHARED |
>> + ACPI_AEST_PROC_FLAG_GLOBAL))
>> + event->id0 = smp_processor_id();
>
> In the "else" case, the ACPI processor ID will be set for id0. So, how about
> using get_acpi_id_for_cpu(smp_processor_id()) here for consistency?
The ACPI processor ID may be confusing to users; I will use
get_cpu_for_acpi_id(info->processor->processor_id) in the "else" case.
>
>> + else
>> + event->id0 = info->processor->processor_id;
>> +
>> + event->id1 = info->processor->resource_type;
>> + break;
>> + case ACPI_AEST_MEMORY_ERROR_NODE:
>> + event->id0 = info->memory->srat_proximity_domain;
>> + break;
>> + case ACPI_AEST_SMMU_ERROR_NODE:
>> + event->id0 = info->smmu->iort_node_reference;
>> + event->id1 = info->smmu->subcomponent_reference;
>> + break;
>> + case ACPI_AEST_VENDOR_ERROR_NODE:
>> + event->id0 = 0;
>> + event->id1 = info->vendor->acpi_uid;
>> + event->hid = info->vendor->acpi_hid;
>> + break;
>> + case ACPI_AEST_GIC_ERROR_NODE:
>> + event->id0 = info->gic->interface_type;
>> + event->id1 = info->gic->instance_id;
>> + break;
>> + default:
>> + event->id0 = 0;
>> + event->id1 = 0;
>> + }
>> +
>> + memcpy(&event->regs, regs, sizeof(*regs));
>> + event->index = record->index;
>> + event->addressing_mode = record->addressing_mode;
>> +}
>> +
>> +static int
>> +aest_node_gen_pool_add(struct aest_device *adev, struct aest_record *record,
>> + struct ras_ext_regs *regs)
>> +{
>> + struct aest_event *event;
>> +
>> + if (!adev->pool)
>> + return -EINVAL;
>> +
>> + event = (void *)gen_pool_alloc(adev->pool, sizeof(*event));
>> + if (!event)
>> + return -ENOMEM;
>> +
>> + init_aest_event(event, record, regs);
>> + llist_add(&event->llnode, &adev->event_list);
>> +
>> + return 0;
>> +}
>> +
>> +static void aest_log(struct aest_record *record, struct ras_ext_regs *regs)
>> +{
>> + struct aest_device *adev = record->node->adev;
>> +
>> + if (!aest_node_gen_pool_add(adev, record, regs))
>> + schedule_work(&adev->aest_work);
>> +}
>> +
>> +void aest_register_decode_chain(struct notifier_block *nb)
>> +{
>> + blocking_notifier_chain_register(&aest_decoder_chain, nb);
>> +}
>> +EXPORT_SYMBOL_GPL(aest_register_decode_chain);
>> +
>> +void aest_unregister_decode_chain(struct notifier_block *nb)
>> +{
>> + blocking_notifier_chain_unregister(&aest_decoder_chain, nb);
>> +}
>> +EXPORT_SYMBOL_GPL(aest_unregister_decode_chain);
>> +
>> +static void aest_node_pool_process(struct work_struct *work)
>> +{
>> + struct llist_node *head;
>> + struct aest_event *event;
>> + struct aest_device *adev = container_of(work, struct aest_device,
>> + aest_work);
>> + u64 status, addr;
>> +
>> + head = llist_del_all(&adev->event_list);
>> + if (!head)
>> + return;
>> +
>> + head = llist_reverse_order(head);
>> + llist_for_each_entry(event, head, llnode) {
>> + aest_print(event);
>> +
>> + /* TODO: translate Logical Addresses to System Physical Addresses */
>> + if (event->addressing_mode == AEST_ADDREESS_LA ||
>> + (event->regs.err_addr & ERR_ADDR_AI)) {
>> + pr_notice("Cannot translate LA to SPA\n");
>> + addr = 0;
>> + } else {
>> + addr = event->regs.err_addr & GENMASK(CONFIG_ARM64_PA_BITS - 1, 0);
>> + }
>> +
>> + status = event->regs.err_status;
>> + if (addr && ((status & ERR_STATUS_UE) || (status & ERR_STATUS_DE)))
>> + aest_handle_memory_failure(addr);
>> +
>> + blocking_notifier_call_chain(&aest_decoder_chain, 0, event);
>> + gen_pool_free(adev->pool, (unsigned long)event,
>> + sizeof(*event));
>> + }
>> +}
>> +
>> +static int aest_node_pool_init(struct aest_device *adev)
>> +{
>> + unsigned long addr, size;
>> +
>> + size = ilog2(sizeof(struct aest_event));
>> + adev->pool = devm_gen_pool_create(adev->dev, size, -1,
>> + dev_name(adev->dev));
>> + if (!adev->pool)
>> + return -ENOMEM;
>> +
>> + size = PAGE_ALIGN(sizeof(struct aest_event) * AEST_NODE_ALLOCED_MAX);
>> + addr = (unsigned long)devm_kzalloc(adev->dev, size, GFP_KERNEL);
>> + if (!addr)
>> + return -ENOMEM;
>> +
>> + return gen_pool_add(adev->pool, addr, size, -1);
>> +}
>> +
>> +static void aest_panic(struct aest_record *record, struct ras_ext_regs *regs, char *msg)
>> +{
>> + struct aest_event event = { 0 };
>> +
>> + init_aest_event(&event, record, regs);
>> +
>> + aest_print(&event);
>> +
>> + panic(msg);
>> +}
>> +
>> +static void aest_proc_record(struct aest_record *record, void *data)
>> +{
>> + struct ras_ext_regs regs = {0};
>> + int *count = data;
>> +
>> + regs.err_status = record_read(record, ERXSTATUS);
>> + if (!(regs.err_status & ERR_STATUS_V))
>> + return;
>> +
>> + (*count)++;
>> +
>> + if (regs.err_status & ERR_STATUS_AV)
>> + regs.err_addr = record_read(record, ERXADDR);
>> +
>> + regs.err_fr = record->fr;
>> + regs.err_ctlr = record_read(record, ERXCTLR);
>> +
>> + if (regs.err_status & ERR_STATUS_MV) {
>> + regs.err_misc[0] = record_read(record, ERXMISC0);
>> + regs.err_misc[1] = record_read(record, ERXMISC1);
>> + if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
>> + regs.err_misc[2] = record_read(record, ERXMISC2);
>> + regs.err_misc[3] = record_read(record, ERXMISC3);
>> + }
>> +
>> + if (record->node->info->interface_hdr->flags &
>> + AEST_XFACE_FLAG_CLEAR_MISC) {
>> + record_write(record, ERXMISC0, 0);
>> + record_write(record, ERXMISC1, 0);
>> + if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
>> + record_write(record, ERXMISC2, 0);
>> + record_write(record, ERXMISC3, 0);
>> + }
>> + /* CE count is 0 if the record does not support CE */
>> + } else if (record->ce.count > 0)
>> + record_write(record, ERXMISC0, record->ce.reg_val);
>> + }
>> +
>> + /* panic if unrecoverable and uncontainable error encountered */
>> + if ((regs.err_status & ERR_STATUS_UE) &&
>> + (regs.err_status & ERR_STATUS_UET) > ERR_STATUS_UET_UEU)
>> + aest_panic(record, ®s, "AEST: unrecoverable error encountered");
>
> I think we need to use FIELD_GET to get the correct value.
> u64 ue = FIELD_GET(ERR_STATUS_UET, regs.err_status);
> if ((regs.err_status & ERR_STATUS_UE) &&
> (ue == ERR_STATUS_UET_UC || ue == ERR_STATUS_UET_UEU))
>
OK, I will update it in the next version.
>> +
>> + aest_log(record, ®s);
>> +
>> + /* Write-one-to-clear the bits we've seen */
>> + regs.err_status &= ERR_STATUS_W1TC;
>> +
>> + /* Multi-bit fields need all-ones written to clear them. */
>> + if (regs.err_status & ERR_STATUS_CE)
>> + regs.err_status |= ERR_STATUS_CE;
>> +
>> + /* Multi-bit fields need all-ones written to clear them. */
>> + if (regs.err_status & ERR_STATUS_UET)
>> + regs.err_status |= ERR_STATUS_UET;
>> +
>> + record_write(record, ERXSTATUS, regs.err_status);
>> +}
>> +
>> +static void
>> +aest_node_foreach_record(void (*func)(struct aest_record *, void *),
>> + struct aest_node *node, void *data,
>> + unsigned long *bitmap)
>> +{
>> + int i;
>> +
>> + for_each_clear_bit(i, bitmap, node->record_count) {
>> + aest_select_record(node, i);
>> +
>> + func(&node->records[i], data);
>> +
>> + aest_sync(node);
>> + }
>> +}
>> +
>> +static int aest_proc(struct aest_node *node)
>> +{
>> + int count = 0, i, j, size = node->record_count;
>> + u64 err_group = 0;
>> +
>> + aest_node_dbg(node, "Poll bit %*pb\n", size, node->record_implemented);
>> + aest_node_foreach_record(aest_proc_record, node, &count,
>> + node->record_implemented);
>> +
>> + if (!node->errgsr)
>> + return count;
>> +
>> + aest_node_dbg(node, "Report bit %*pb\n", size, node->status_reporting);
>> + for (i = 0; i < BITS_TO_U64(size); i++) {
>> + err_group = readq_relaxed((void *)node->errgsr + i * 8);
>> + aest_node_dbg(node, "errgsr[%d]: 0x%llx\n", i, err_group);
>> +
>> + for_each_set_bit(j, (unsigned long *)&err_group,
>> + BITS_PER_TYPE(u64)) {
>> + /*
>> + * Error group base is only valid in a memory-mapped node,
>> + * so the driver does not need to write the select register
>> + * and sync.
>> + */
>> + if (test_bit(i * BITS_PER_TYPE(u64) + j, node->status_reporting))
>> + continue;
>> + aest_proc_record(&node->records[j], &count);
>> + }
>> + }
>> +
>> + return count;
>> +}
>> +
>> +static irqreturn_t aest_irq_func(int irq, void *input)
>> +{
>> + struct aest_device *adev = input;
>> + int i;
>> +
>> + for (i = 0; i < adev->node_cnt; i++)
>> + aest_proc(&adev->nodes[i]);
>> +
>> + return IRQ_HANDLED;
>> +}
>> +
>> +static void aest_enable_irq(struct aest_record *record)
>> +{
>> + u64 err_ctlr;
>> + struct aest_device *adev = record->node->adev;
>> +
>> + err_ctlr = record_read(record, ERXCTLR);
>> +
>> + if (adev->irq[ACPI_AEST_NODE_FAULT_HANDLING])
>> + err_ctlr |= (ERR_CTLR_FI | ERR_CTLR_CFI);
>> + if (adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY])
>> + err_ctlr |= ERR_CTLR_UI;
>> +
>> + record_write(record, ERXCTLR, err_ctlr);
>> +}
>> +
>> +static void aest_config_irq(struct aest_node *node)
>> +{
>> + int i;
>> + struct acpi_aest_node_interrupt_v2 *interrupt;
>> +
>> + if (!node->irq_config)
>> + return;
>> +
>> + for (i = 0; i < node->info->interrupt_count; i++) {
>> + interrupt = &node->info->interrupt[i];
>> +
>> + if (interrupt->type == ACPI_AEST_NODE_FAULT_HANDLING)
>> + writeq_relaxed(interrupt->gsiv, node->irq_config);
>> +
>> + if (interrupt->type == ACPI_AEST_NODE_ERROR_RECOVERY)
>> + writeq_relaxed(interrupt->gsiv, node->irq_config + 8);
>> +
>> + aest_node_dbg(node, "config irq type %d gsiv %d at %llx\n",
>> + interrupt->type, interrupt->gsiv,
>> + (u64)node->irq_config);
>> + }
>> +}
>> +
>> +static enum ras_ce_threshold aest_get_ce_threshold(struct aest_record *record)
>> +{
>> + u64 err_fr, err_fr_cec, err_fr_rp = -1;
>> +
>> + err_fr = record->fr;
>> + err_fr_cec = FIELD_GET(ERR_FR_CEC, err_fr);
>> + err_fr_rp = FIELD_GET(ERR_FR_RP, err_fr);
>> +
>> + if (err_fr_cec == ERR_FR_CEC_0B_COUNTER)
>> + return RAS_CE_THRESHOLD_0B;
>> + else if (err_fr_rp == ERR_FR_RP_DOUBLE_COUNTER)
>> + return RAS_CE_THRESHOLD_32B;
>> + else if (err_fr_cec == ERR_FR_CEC_8B_COUNTER)
>> + return RAS_CE_THRESHOLD_8B;
>> + else if (err_fr_cec == ERR_FR_CEC_16B_COUNTER)
>> + return RAS_CE_THRESHOLD_16B;
>> + else
>> + return UNKNOWN;
>> +}
>> +
>> +static const struct ce_threshold_info ce_info[] = {
>> + [RAS_CE_THRESHOLD_0B] = { 0 },
>> + [RAS_CE_THRESHOLD_8B] = {
>> + .max_count = ERR_8B_CEC_MAX,
>> + .mask = ERR_MISC0_8B_CEC,
>> + .shift = ERR_MISC0_CEC_SHIFT,
>> + },
>> + [RAS_CE_THRESHOLD_16B] = {
>> + .max_count = ERR_16B_CEC_MAX,
>> + .mask = ERR_MISC0_16B_CEC,
>> + .shift = ERR_MISC0_CEC_SHIFT,
>> + },
>> + /* TODO: Support 32B CEC threshold. */
>> + [RAS_CE_THRESHOLD_32B] = { 0 },
>> +};
>> +
>> +static void aest_set_ce_threshold(struct aest_record *record)
>> +{
>> + u64 err_misc0, ce_count;
>> + struct ce_threshold *ce = &record->ce;
>> + const struct ce_threshold_info *info;
>> +
>> + record->threshold_type = aest_get_ce_threshold(record);
>> +
>> + switch (record->threshold_type) {
>> + case RAS_CE_THRESHOLD_0B:
>> + aest_record_dbg(record, "CE threshold not supported\n");
>> + return;
>> + case RAS_CE_THRESHOLD_8B:
>> + aest_record_dbg(record, "8-bit CE threshold supported\n");
>> + break;
>> + case RAS_CE_THRESHOLD_16B:
>> + aest_record_dbg(record, "16-bit CE threshold supported\n");
>> + break;
>> + case RAS_CE_THRESHOLD_32B:
>> + aest_record_dbg(record, "32-bit CE threshold not supported yet\n");
>> + break;
>> + default:
>> + aest_record_dbg(record, "Unknown MISC0 CE threshold\n");
>> + return;
>> + }
>> +
>> + err_misc0 = record_read(record, ERXMISC0);
>> + info = &ce_info[record->threshold_type];
>> + ce->info = info;
>> + ce_count = (err_misc0 & info->mask) >> info->shift;
>> + if (ce_count) {
>> + ce->count = ce_count;
>> + ce->threshold = info->max_count - ce_count + 1;
>> + ce->reg_val = err_misc0;
>> + aest_record_dbg(record, "CE threshold is %llx, controlled by FW\n",
>> + ce->threshold);
>> + return;
>> + }
>> +
>> + /* Default CE threshold is 1. */
>> + ce->count = info->max_count;
>> + ce->threshold = DEFAULT_CE_THRESHOLD;
>> + ce->reg_val = err_misc0 | info->mask;
>> +
>> + record_write(record, ERXMISC0, ce->reg_val);
>> + aest_record_dbg(record, "CE threshold is %llx, controlled by Kernel\n",
>> + ce->threshold);
>> +}
>> +
>> +static int aest_register_irq(struct aest_device *adev)
>> +{
>> + int i, irq, ret;
>> + char *irq_desc;
>> +
>> + irq_desc = devm_kasprintf(adev->dev, GFP_KERNEL, "%s.%s.",
>> + dev_driver_string(adev->dev),
>> + dev_name(adev->dev));
>> + if (!irq_desc)
>> + return -ENOMEM;
>> +
>> + for (i = 0; i < MAX_GSI_PER_NODE; i++) {
>> + irq = adev->irq[i];
>> +
>> + if (!irq)
>> + continue;
>> +
>> + if (irq_is_percpu_devid(irq)) {
>> + ret = request_percpu_irq(irq, aest_irq_func,
>> + irq_desc,
>> + adev->adev_oncore);
>> + if (ret)
>> + goto free;
>> + } else {
>> + ret = devm_request_irq(adev->dev, irq, aest_irq_func,
>> + 0, irq_desc, adev);
>> + if (ret)
>> + return ret;
>> + }
>> + }
>> + return 0;
>> +
>> +free:
>> + while (--i >= 0) {
>> + irq = adev->irq[i];
>> +
>> + if (irq && irq_is_percpu_devid(irq))
>> + free_percpu_irq(irq, adev->adev_oncore);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static int
>> +aest_init_record(struct aest_record *record, int i, struct aest_node *node)
>> +{
>> + struct device *dev = node->adev->dev;
>> +
>> + record->name = devm_kasprintf(dev, GFP_KERNEL, "record%d", i);
>> + if (!record->name)
>> + return -ENOMEM;
>> +
>> + if (node->base)
>> + record->regs_base = node->base + sizeof(struct ras_ext_regs) * i;
>> +
>> + record->access = &aest_access[node->info->interface_hdr->type];
>> + record->addressing_mode = test_bit(i, node->info->addressing_mode);
>> + record->index = i;
>> + record->node = node;
>> + record->fr = record_read(record, ERXFR);
>> +
>> + return 0;
>> +}
>> +
>> +static void aest_online_record(struct aest_record *record, void *data)
>> +{
>> + if (record->fr & ERR_FR_CE)
>> + aest_set_ce_threshold(record);
>> +
>> + aest_enable_irq(record);
>> +}
>> +
>> +static void aest_online_oncore_node(struct aest_node *node)
>> +{
>> + int count;
>> +
>> + count = aest_proc(node);
>> + aest_node_dbg(node, "Found %d error(s) on CPU%d before AEST probe\n",
>> + count, smp_processor_id());
>> +
>> + aest_node_foreach_record(aest_online_record, node, NULL,
>> + node->record_implemented);
>> +
>> + aest_node_foreach_record(aest_online_record, node, NULL,
>> + node->status_reporting);
>> +}
>> +
>> +static void aest_online_oncore_dev(void *data)
>> +{
>> + int fhi_irq, eri_irq, i;
>> + struct aest_device *adev = this_cpu_ptr(data);
>> +
>> + for (i = 0; i < adev->node_cnt; i++)
>> + aest_online_oncore_node(&adev->nodes[i]);
>> +
>> + fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
>> + if (fhi_irq > 0)
>> + enable_percpu_irq(fhi_irq, IRQ_TYPE_NONE);
>> + eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
>> + if (eri_irq > 0)
>> + enable_percpu_irq(eri_irq, IRQ_TYPE_NONE);
>> +}
>> +
>> +static void aest_offline_oncore_dev(void *data)
>> +{
>> + int fhi_irq, eri_irq;
>> + struct aest_device *adev = this_cpu_ptr(data);
>> +
>> + fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
>> + if (fhi_irq > 0)
>> + disable_percpu_irq(fhi_irq);
>> + eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
>> + if (eri_irq > 0)
>> + disable_percpu_irq(eri_irq);
>> +}
>> +
>> +static void aest_online_dev(struct aest_device *adev)
>> +{
>> + int count, i;
>> + struct aest_node *node;
>> +
>> + for (i = 0; i < adev->node_cnt; i++) {
>> + node = &adev->nodes[i];
>> +
>> + if (!node->name)
>> + continue;
>> +
>> + count = aest_proc(node);
>> + aest_node_dbg(node, "Found %d error(s) before AEST probe\n", count);
>> +
>> + aest_config_irq(node);
>> +
>> + aest_node_foreach_record(aest_online_record, node, NULL,
>> + node->record_implemented);
>> + aest_node_foreach_record(aest_online_record, node, NULL,
>> + node->status_reporting);
>> + }
>> +}
>> +
>> +static int aest_starting_cpu(unsigned int cpu)
>> +{
>> + pr_debug("CPU%d starting\n", cpu);
>> + aest_online_oncore_dev(&percpu_adev);
>> +
>> + return 0;
>> +}
>> +
>> +static int aest_dying_cpu(unsigned int cpu)
>> +{
>> + pr_debug("CPU%d dying\n", cpu);
>> + aest_offline_oncore_dev(&percpu_adev);
>> +
>> + return 0;
>> +}
>> +
>> +static void aest_device_remove(struct platform_device *pdev)
>> +{
>> + struct aest_device *adev = platform_get_drvdata(pdev);
>> + int i;
>> +
>> + platform_set_drvdata(pdev, NULL);
>> +
>> + if (adev->type != ACPI_AEST_PROCESSOR_ERROR_NODE)
>> + return;
>> +
>> + on_each_cpu(aest_offline_oncore_dev, adev->adev_oncore, 1);
>> +
>> + for (i = 0; i < MAX_GSI_PER_NODE; i++) {
>> + if (adev->irq[i])
>> + free_percpu_irq(adev->irq[i], adev->adev_oncore);
>> + }
>> +}
>> +
>> +static int get_aest_node_ver(struct aest_node *node)
>> +{
>> + u64 reg;
>> + void *devarch_base;
>> +
>> + if (node->type == ACPI_AEST_GIC_ERROR_NODE) {
>> + devarch_base = ioremap(node->info->interface_hdr->address +
>> + GIC_ERRDEVARCH, PAGE_SIZE);
>> + if (!devarch_base)
>> + return 0;
>> +
>> + reg = readl_relaxed(devarch_base);
>> + iounmap(devarch_base);
>> +
>> + return FIELD_GET(ERRDEVARCH_REV, reg);
>> + }
>> +
>> + return FIELD_GET(ID_AA64PFR0_EL1_RAS_MASK, read_cpuid(ID_AA64PFR0_EL1));
>> +}
>> +
>> +static char *alloc_aest_node_name(struct aest_node *node)
>> +{
>> + char *name;
>> +
>> + switch (node->type) {
>> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
>> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%d",
>> + aest_node_name[node->type],
>> + node->info->processor->processor_id);
>> + break;
>> + case ACPI_AEST_MEMORY_ERROR_NODE:
>> + case ACPI_AEST_SMMU_ERROR_NODE:
>> + case ACPI_AEST_VENDOR_ERROR_NODE:
>> + case ACPI_AEST_GIC_ERROR_NODE:
>> + case ACPI_AEST_PCIE_ERROR_NODE:
>> + case ACPI_AEST_PROXY_ERROR_NODE:
>> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%llx",
>> + aest_node_name[node->type],
>> + node->info->interface_hdr->address);
>> + break;
>> + default:
>> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "Unknown");
>> + }
>> +
>> + return name;
>> +}
>> +
>> +static int
>> +aest_node_set_errgsr(struct aest_device *adev, struct aest_node *node)
>> +{
>> + struct acpi_aest_node *anode = node->info;
>> + u64 errgsr_base = anode->common->error_group_register_base;
>> +
>> + if (anode->interface_hdr->type != ACPI_AEST_NODE_MEMORY_MAPPED)
>> + return 0;
>> +
>> + if (!node->base)
>> + return 0;
>> +
>> + if (!(anode->interface_hdr->flags & AEST_XFACE_FLAG_ERROR_GROUP)) {
>> + node->errgsr = node->base + ERXGROUP;
>> + return 0;
>> + }
>> +
>> + if (!errgsr_base)
>> + return -EINVAL;
>> +
>> + node->errgsr = devm_ioremap(adev->dev, errgsr_base, PAGE_SIZE);
>> + if (!node->errgsr)
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>> +
>> +static int aest_init_node(struct aest_device *adev, struct aest_node *node,
>> + struct acpi_aest_node *anode)
>> +{
>> + int i, ret;
>> + u64 address, size, flags;
>> +
>> + node->adev = adev;
>> + node->info = anode;
>> + node->type = anode->type;
>> + node->version = get_aest_node_ver(node);
>> + node->name = alloc_aest_node_name(node);
>> + if (!node->name)
>> + return -ENOMEM;
>> + node->record_implemented = anode->record_implemented;
>> + node->status_reporting = anode->status_reporting;
>> +
>> + address = anode->interface_hdr->address;
>> + size = anode->interface_hdr->error_record_count *
>> + sizeof(struct ras_ext_regs);
>> + if (address) {
>> + node->base = devm_ioremap(adev->dev, address, size);
>> + if (!node->base)
>> + return -ENOMEM;
>> + }
>> +
>> + flags = anode->interface_hdr->flags;
>> + address = node->info->common->fault_inject_register_base;
>> + if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
>> + node->inj = devm_ioremap(adev->dev, address, PAGE_SIZE);
>> + if (!node->inj)
>> + return -ENOMEM;
>> + }
>> +
>> + address = node->info->common->interrupt_config_register_base;
>> + if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
>> + node->irq_config = devm_ioremap(adev->dev, address, PAGE_SIZE);
>> + if (!node->irq_config)
>> + return -ENOMEM;
>> + }
>> +
>> + ret = aest_node_set_errgsr(adev, node);
>> + if (ret)
>> + return ret;
>> +
>> + node->record_count = anode->interface_hdr->error_record_count;
>> + node->records = devm_kcalloc(adev->dev, node->record_count,
>> + sizeof(struct aest_record), GFP_KERNEL);
>> + if (!node->records)
>> + return -ENOMEM;
>> +
>> + for (i = 0; i < node->record_count; i++) {
>> + ret = aest_init_record(&node->records[i], i, node);
>> + if (ret)
>> + return ret;
>> + }
>> + aest_node_dbg(node, "%d records, base: %llx, errgsr: %llx\n",
>> + node->record_count, (u64)node->base, (u64)node->errgsr);
>> + return 0;
>> +}
>> +
>> +static int
>> +aest_init_nodes(struct aest_device *adev, struct aest_hnode *ahnode)
>> +{
>> + struct acpi_aest_node *anode;
>> + struct aest_node *node;
>> + int ret, i = 0;
>> +
>> + adev->node_cnt = ahnode->count;
>> + adev->nodes = devm_kcalloc(adev->dev, adev->node_cnt,
>> + sizeof(struct aest_node), GFP_KERNEL);
>> + if (!adev->nodes)
>> + return -ENOMEM;
>> +
>> + list_for_each_entry(anode, &ahnode->list, list) {
>> + adev->type = anode->type;
>> +
>> + node = &adev->nodes[i++];
>> + ret = aest_init_node(adev, node, anode);
>> + if (ret)
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int __setup_ppi(struct aest_device *adev)
>> +{
>> + int cpu, i;
>> + struct aest_device *oncore_adev;
>> + struct aest_node *oncore_node;
>> + size_t size;
>> +
>> + adev->adev_oncore = &percpu_adev;
>> + for_each_possible_cpu(cpu) {
>> + oncore_adev = per_cpu_ptr(&percpu_adev, cpu);
>> + memcpy(oncore_adev, adev, sizeof(struct aest_device));
>> +
>> + oncore_adev->nodes = devm_kcalloc(adev->dev,
>> + oncore_adev->node_cnt,
>> + sizeof(struct aest_node),
>> + GFP_KERNEL);
>> + if (!oncore_adev->nodes)
>> + return -ENOMEM;
>> +
>> + size = adev->node_cnt * sizeof(struct aest_node);
>> + memcpy(oncore_adev->nodes, adev->nodes, size);
>> + for (i = 0; i < oncore_adev->node_cnt; i++) {
>> + oncore_node = &oncore_adev->nodes[i];
>> + oncore_node->records = devm_kcalloc(adev->dev,
>> + oncore_node->record_count,
>> + sizeof(struct aest_record), GFP_KERNEL);
>> + if (!oncore_node->records)
>> + return -ENOMEM;
>> +
>> + size = oncore_node->record_count *
>> + sizeof(struct aest_record);
>> + memcpy(oncore_node->records, adev->nodes[i].records,
>> + size);
>> + }
>> +
>> + aest_dev_dbg(adev, "Init device on CPU%d.\n", cpu);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int aest_setup_irq(struct platform_device *pdev, struct aest_device *adev)
>> +{
>> + int fhi_irq, eri_irq;
>> +
>> + fhi_irq = platform_get_irq_byname_optional(pdev, "fhi");
>> + if (fhi_irq > 0)
>> + adev->irq[0] = fhi_irq;
>> +
>> + eri_irq = platform_get_irq_byname_optional(pdev, "eri");
>> + if (eri_irq > 0)
>> + adev->irq[1] = eri_irq;
>> +
>> + /* Allocate and initialise the percpu device pointer for PPI */
>> + if (irq_is_percpu(fhi_irq) || irq_is_percpu(eri_irq))
>> + return __setup_ppi(adev);
>> +
>> + return 0;
>> +}
>> +
>> +static int aest_device_probe(struct platform_device *pdev)
>> +{
>> + int ret;
>> + struct aest_device *adev;
>> + struct aest_hnode *ahnode;
>> +
>> + ahnode = *((struct aest_hnode **)pdev->dev.platform_data);
>> + if (!ahnode)
>> + return -ENODEV;
>> +
>> + adev = devm_kzalloc(&pdev->dev, sizeof(*adev), GFP_KERNEL);
>> + if (!adev)
>> + return -ENOMEM;
>> +
>> + adev->dev = &pdev->dev;
>> + INIT_WORK(&adev->aest_work, aest_node_pool_process);
>> + ret = aest_node_pool_init(adev);
>> + if (ret) {
>> + aest_dev_err(adev, "Failed to init AEST node pool.\n");
>> + return ret;
>> + }
>> + init_llist_head(&adev->event_list);
>> + adev->uid = ahnode->uid;
>> + aest_set_name(adev, ahnode);
>> +
>> + ret = aest_init_nodes(adev, ahnode);
>> + if (ret)
>> + return ret;
>> +
>> + ret = aest_setup_irq(pdev, adev);
>> + if (ret)
>> + return ret;
>> +
>> + ret = aest_register_irq(adev);
>> + if (ret) {
>> + aest_dev_err(adev, "register irq failed\n");
>> + return ret;
>> + }
>> +
>> + platform_set_drvdata(pdev, adev);
>> +
>> + if (aest_dev_is_oncore(adev))
>> + ret = cpuhp_setup_state(CPUHP_AP_ARM_AEST_STARTING,
>> + "drivers/acpi/arm64/aest:starting",
>> + aest_starting_cpu, aest_dying_cpu);
>> + else
>> + aest_online_dev(adev);
>> + if (ret)
>> + return ret;
>> +
>> + aest_dev_dbg(adev, "Node cnt: %x, uid: %x, irq: %d, %d\n",
>> + adev->node_cnt, adev->uid, adev->irq[0], adev->irq[1]);
>> +
>> + return 0;
>> +}
>> +
>> +static const struct acpi_device_id acpi_aest_ids[] = {
>> + {"ARMHE000", 0},
>> + {}
>> +};
>
> My understanding is that a platform device named "AEST" is
> created in acpi_aest_alloc_pdev and the name is then used
> to bind this driver to the device. So, do we need the ACPI HID
> definition here? Using the name should work well on systems both
> with and without ARMHE000. Or am I missing something?
>
> I have not yet finished looking through all parts; I will review
> them and the other patches too.
>
> Best Regards,
> Tomohiro Misono
You are right, I will delete this code in the next version.
Best Regards,
Ruidong
>
>> +
>> +static struct platform_driver aest_driver = {
>> + .driver = {
>> + .name = "AEST",
>> + .acpi_match_table = acpi_aest_ids,
>> + },
>> + .probe = aest_device_probe,
>> + .remove = aest_device_remove,
>> +};
>> +
>> +static int __init aest_init(void)
>> +{
>> + return platform_driver_register(&aest_driver);
>> +}
>> +module_init(aest_init);
>> +
>> +static void __exit aest_exit(void)
>> +{
>> + platform_driver_unregister(&aest_driver);
>> +}
>> +module_exit(aest_exit);
>> +
>> +MODULE_DESCRIPTION("ARM AEST Driver");
>> +MODULE_AUTHOR("Ruidong Tian <tianruidong@linux.alibaba.com>");
>> +MODULE_LICENSE("GPL");
>> +
>> diff --git a/drivers/ras/aest/aest.h b/drivers/ras/aest/aest.h
>> new file mode 100644
>> index 000000000000..04005aad3617
>> --- /dev/null
>> +++ b/drivers/ras/aest/aest.h
>> @@ -0,0 +1,323 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * ARM Error Source Table Support
>> + *
>> + * Copyright (c) 2021-2024, Alibaba Group.
>> + */
>> +
>> +#include <linux/acpi_aest.h>
>> +#include <asm/ras.h>
>> +
>> +#define MAX_GSI_PER_NODE 2
>> +#define AEST_MAX_PPI 3
>> +#define DEFAULT_CE_THRESHOLD 1
>> +
>> +#define record_read(record, offset) \
>> + record->access->read(record->regs_base, offset)
>> +#define record_write(record, offset, val) \
>> + record->access->write(record->regs_base, offset, val)
>> +
>> +#define aest_dev_err(__adev, format, ...) \
>> + dev_err((__adev)->dev, format, ##__VA_ARGS__)
>> +#define aest_dev_info(__adev, format, ...) \
>> + dev_info((__adev)->dev, format, ##__VA_ARGS__)
>> +#define aest_dev_dbg(__adev, format, ...) \
>> + dev_dbg((__adev)->dev, format, ##__VA_ARGS__)
>> +
>> +#define aest_node_err(__node, format, ...) \
>> + dev_err((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
>> +#define aest_node_info(__node, format, ...) \
>> + dev_info((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
>> +#define aest_node_dbg(__node, format, ...) \
>> + dev_dbg((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
>> +
>> +#define aest_record_err(__record, format, ...) \
>> + dev_err((__record)->node->adev->dev, "%s: %s: " format, \
>> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
>> +#define aest_record_info(__record, format, ...) \
>> + dev_info((__record)->node->adev->dev, "%s: %s: " format, \
>> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
>> +#define aest_record_dbg(__record, format, ...) \
>> + dev_dbg((__record)->node->adev->dev, "%s: %s: " format, \
>> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
>> +
>> +#define ERXFR 0x0
>> +#define ERXCTLR 0x8
>> +#define ERXSTATUS 0x10
>> +#define ERXADDR 0x18
>> +#define ERXMISC0 0x20
>> +#define ERXMISC1 0x28
>> +#define ERXMISC2 0x30
>> +#define ERXMISC3 0x38
>> +
>> +#define ERXGROUP 0xE00
>> +#define GIC_ERRDEVARCH 0xFFBC
>> +
>> +extern struct xarray *aest_array;
>> +
>> +struct aest_event {
>> + struct llist_node llnode;
>> + char *node_name;
>> + u32 type;
>> + /*
>> + * Different nodes have different meanings:
>> + * - Processor node : processor number.
>> + * - Memory node : SRAT proximity domain.
>> + * - SMMU node : IORT proximity domain.
>> + * - GIC node : interface type.
>> + */
>> + u32 id0;
>> + /*
>> + * Different nodes have different meanings:
>> + * - Processor node : processor resource type.
>> + * - Memory node : None.
>> + * - SMMU node : subcomponent reference.
>> + * - Vendor node : Unique ID.
>> + * - GIC node : instance identifier.
>> + */
>> + u32 id1;
>> + char *hid; // Vendor node : hardware ID.
>> + u32 index;
>> + u64 ce_threshold;
>> + int addressing_mode;
>> + struct ras_ext_regs regs;
>> +
>> + void *vendor_data;
>> + size_t vendor_data_size;
>> +};
>> +
>> +struct aest_access {
>> + u64 (*read)(void *base, u32 offset);
>> + void (*write)(void *base, u32 offset, u64 val);
>> +};
>> +
>> +struct ce_threshold_info {
>> + const u64 max_count;
>> + const u64 mask;
>> + const u64 shift;
>> +};
>> +
>> +struct ce_threshold {
>> + const struct ce_threshold_info *info;
>> + u64 count;
>> + u64 threshold;
>> + u64 reg_val;
>> +};
>> +
>> +struct aest_record {
>> + char *name;
>> + int index;
>> + void __iomem *regs_base;
>> +
>> + /*
>> + * This bit specifies the addressing mode to populate the ERR_ADDR
>> + * register:
>> + * 0b: Error record reports System Physical Addresses (SPA) in
>> + * the ERR_ADDR register.
>> + * 1b: Error record reports error node-specific Logical Addresses (LA)
>> + * in the ERR_ADDR register. The OS must use other means to translate
>> + * the reported LA into an SPA.
>> + */
>> + int addressing_mode;
>> + u64 fr;
>> + struct aest_node *node;
>> +
>> + struct dentry *debugfs;
>> + struct ce_threshold ce;
>> + enum ras_ce_threshold threshold_type;
>> + const struct aest_access *access;
>> +
>> + void *vendor_data;
>> + size_t vendor_data_size;
>> +};
>> +
>> +struct aest_node {
>> + char *name;
>> + u8 type;
>> + void *errgsr;
>> + void *inj;
>> + void *irq_config;
>> + void *base;
>> +
>> + /*
>> + * This bitmap indicates which of the error records within this error
>> + * node must be polled for error status.
>> + * Bit[n] of this field pertains to error record corresponding to
>> + * index n in this error group.
>> + * Bit[n] = 0b: Error record at index n needs to be polled.
>> + * Bit[n] = 1b: Error record at index n does not need to be polled.
>> + */
>> + unsigned long *record_implemented;
>> + /*
>> + * This bitmap indicates which of the error records within this error
>> + * node support error status reporting using ERRGSR register.
>> + * Bit[n] of this field pertains to error record corresponding to
>> + * index n in this error group.
>> + * Bit[n] = 0b: Error record at index n supports error status reporting
>> + * through ERRGSR.S.
>> + * Bit[n] = 1b: Error record at index n does not support error reporting
>> + * through the ERRGSR.S bit. If this error record is
>> + * implemented, then it must be polled explicitly for
>> + * error events.
>> + */
>> + unsigned long *status_reporting;
>> + int version;
>> +
>> + struct aest_device *adev;
>> + struct acpi_aest_node *info;
>> + struct dentry *debugfs;
>> +
>> + int record_count;
>> + struct aest_record *records;
>> +
>> + struct aest_node __percpu *oncore_node;
>> +};
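[Editorial illustration, not part of the patch: the two bitmaps in `struct aest_node` above drive the per-record scan decision on an interrupt. A minimal userspace sketch of predicates matching the bit semantics described in the struct comments — illustrative only, since the real driver operates on multi-word bitmaps via the kernel bitmap API:]

```c
#include <stdbool.h>

/* Per the first bitmap's comment: bit == 0 means the record must be
 * polled, bit == 1 means polling is not needed. */
static bool must_poll(unsigned long record_implemented, int n)
{
	return !((record_implemented >> n) & 1UL);
}

/* Per the second bitmap's comment: bit == 0 means the record reports
 * status via ERRGSR.S, bit == 1 means it must be polled explicitly. */
static bool reports_via_errgsr(unsigned long status_reporting, int n)
{
	return !((status_reporting >> n) & 1UL);
}
```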
>> +
>> +struct aest_device {
>> + struct device *dev;
>> + u32 type;
>> + int node_cnt;
>> + struct aest_node *nodes;
>> +
>> + struct work_struct aest_work;
>> + struct gen_pool *pool;
>> + struct llist_head event_list;
>> +
>> + int irq[MAX_GSI_PER_NODE];
>> + u32 uid;
>> + struct aest_device __percpu *adev_oncore;
>> +
>> + struct dentry *debugfs;
>> +};
>> +
>> +struct aest_node_context {
>> + struct aest_node *node;
>> + unsigned long *bitmap;
>> + void (*func)(struct aest_record *record,
>> + void *data);
>> + void *data;
>> + int ret;
>> +};
>> +
>> +#define CASE_READ(res, x) \
>> + case (x): { \
>> + res = read_sysreg_s(SYS_##x##_EL1); \
>> + break; \
>> + }
>> +
>> +#define CASE_WRITE(val, x) \
>> + case (x): { \
>> + write_sysreg_s((val), SYS_##x##_EL1); \
>> + break; \
>> + }
>> +
>> +static inline u64 aest_sysreg_read(void *__unused, u32 offset)
>> +{
>> + u64 res;
>> +
>> + switch (offset) {
>> + CASE_READ(res, ERXFR)
>> + CASE_READ(res, ERXCTLR)
>> + CASE_READ(res, ERXSTATUS)
>> + CASE_READ(res, ERXADDR)
>> + CASE_READ(res, ERXMISC0)
>> + CASE_READ(res, ERXMISC1)
>> + CASE_READ(res, ERXMISC2)
>> + CASE_READ(res, ERXMISC3)
>> + default:
>> + res = 0;
>> + }
>> + return res;
>> +}
>> +
>> +static inline void aest_sysreg_write(void *base, u32 offset, u64 val)
>> +{
>> + switch (offset) {
>> + CASE_WRITE(val, ERXFR)
>> + CASE_WRITE(val, ERXCTLR)
>> + CASE_WRITE(val, ERXSTATUS)
>> + CASE_WRITE(val, ERXADDR)
>> + CASE_WRITE(val, ERXMISC0)
>> + CASE_WRITE(val, ERXMISC1)
>> + CASE_WRITE(val, ERXMISC2)
>> + CASE_WRITE(val, ERXMISC3)
>> + default:
>> + return;
>> + }
>> +}
>> +
>> +static inline u64 aest_iomem_read(void *base, u32 offset)
>> +{
>> + return readq_relaxed(base + offset);
>> +}
>> +
>> +static inline void aest_iomem_write(void *base, u32 offset, u64 val)
>> +{
>> + writeq_relaxed(val, base + offset);
>> +}
>> +
>> +/* access type is decided by AEST interface type. */
>> +static const struct aest_access aest_access[] = {
>> + [ACPI_AEST_NODE_SYSTEM_REGISTER] = {
>> + .read = aest_sysreg_read,
>> + .write = aest_sysreg_write,
>> + },
>> +
>> + [ACPI_AEST_NODE_MEMORY_MAPPED] = {
>> + .read = aest_iomem_read,
>> + .write = aest_iomem_write,
>> + },
>> + [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] = {
>> + .read = aest_iomem_read,
>> + .write = aest_iomem_write,
>> + },
>> + { }
>> +};
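[Editorial illustration, not part of the patch: the `aest_access` table above is a classic ops-table dispatch — the AEST interface type selects an accessor pair once, and after that `record_read()`/`record_write()` never need to know whether the backing store is the ERX* system registers or a memory-mapped page. A self-contained userspace sketch of the same pattern, with plain arrays standing in for the register files (all names hypothetical):]

```c
#include <stdint.h>
#include <stddef.h>

enum iface_type { IFACE_SYSREG, IFACE_MMIO };

struct access_ops {
	uint64_t (*read)(void *base, uint32_t offset);
	void (*write)(void *base, uint32_t offset, uint64_t val);
};

/* Emulated "register files" backing each interface type. */
static uint64_t sysreg_file[8];
static uint64_t mmio_file[8];

static uint64_t sysreg_read(void *base, uint32_t offset)
{
	(void)base;	/* system-register access ignores the base pointer */
	return sysreg_file[offset / 8];
}

static void sysreg_write(void *base, uint32_t offset, uint64_t val)
{
	(void)base;
	sysreg_file[offset / 8] = val;
}

static uint64_t mmio_read(void *base, uint32_t offset)
{
	return ((uint64_t *)base)[offset / 8];
}

static void mmio_write(void *base, uint32_t offset, uint64_t val)
{
	((uint64_t *)base)[offset / 8] = val;
}

/* Indexed by interface type, mirroring the aest_access[] table. */
static const struct access_ops ops_table[] = {
	[IFACE_SYSREG] = { sysreg_read, sysreg_write },
	[IFACE_MMIO]   = { mmio_read,  mmio_write  },
};

static uint64_t record_read(enum iface_type t, void *base, uint32_t offset)
{
	return ops_table[t].read(base, offset);
}

static void record_write(enum iface_type t, void *base, uint32_t offset,
			 uint64_t val)
{
	ops_table[t].write(base, offset, val);
}
```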
>> +
>> +static inline bool aest_dev_is_oncore(struct aest_device *adev)
>> +{
>> + return adev->type == ACPI_AEST_PROCESSOR_ERROR_NODE;
>> +}
>> +
>> +/*
>> + * Each PE may have multiple error records; an error record must be
>> + * selected before it can be accessed through the Error Record System
>> + * registers.
>> + */
>> +static inline void aest_select_record(struct aest_node *node, int index)
>> +{
>> + if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE) {
>> + write_sysreg_s(index, SYS_ERRSELR_EL1);
>> + isb();
>> + }
>> +}
>> +
>> +/* Ensure all writes have taken effect. */
>> +static inline void aest_sync(struct aest_node *node)
>> +{
>> + if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE)
>> + isb();
>> +}
>> +
>> +static const char * const aest_node_name[] = {
>> + [ACPI_AEST_PROCESSOR_ERROR_NODE] = "processor",
>> + [ACPI_AEST_MEMORY_ERROR_NODE] = "memory",
>> + [ACPI_AEST_SMMU_ERROR_NODE] = "smmu",
>> + [ACPI_AEST_VENDOR_ERROR_NODE] = "vendor",
>> + [ACPI_AEST_GIC_ERROR_NODE] = "gic",
>> + [ACPI_AEST_PCIE_ERROR_NODE] = "pcie",
>> + [ACPI_AEST_PROXY_ERROR_NODE] = "proxy",
>> +};
>> +
>> +static inline int
>> +aest_set_name(struct aest_device *adev, struct aest_hnode *ahnode)
>> +{
>> + adev->dev->init_name = devm_kasprintf(adev->dev, GFP_KERNEL,
>> + "%s%d", aest_node_name[ahnode->type],
>> + adev->uid);
>> + if (!adev->dev->init_name)
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>> diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h
>> new file mode 100644
>> index 000000000000..1c2191791504
>> --- /dev/null
>> +++ b/include/linux/acpi_aest.h
>> @@ -0,0 +1,68 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __ACPI_AEST_H__
>> +#define __ACPI_AEST_H__
>> +
>> +#include <linux/acpi.h>
>> +#include <asm/ras.h>
>> +
>> +/* AEST component */
>> +#define ACPI_AEST_PROC_FLAG_GLOBAL (1<<0)
>> +#define ACPI_AEST_PROC_FLAG_SHARED (1<<1)
>> +
>> +#define AEST_ADDRESS_SPA 0
>> +#define AEST_ADDRESS_LA 1
>> +
>> +/* AEST interrupt */
>> +#define AEST_INTERRUPT_MODE BIT(0)
>> +#define AEST_INTERRUPT_FHI_MODE BIT(1)
>> +
>> +#define AEST_INTERRUPT_FHI_UE_SUPPORT BIT(0)
>> +#define AEST_INTERRUPT_FHI_UE_NO_SUPPORT BIT(1)
>> +
>> +#define AEST_MAX_INTERRUPT_PER_NODE 3
>> +
>> +/* AEST interface */
>> +
>> +#define AEST_XFACE_FLAG_SHARED (1<<0)
>> +#define AEST_XFACE_FLAG_CLEAR_MISC (1<<1)
>> +#define AEST_XFACE_FLAG_ERROR_DEVICE (1<<2)
>> +#define AEST_XFACE_FLAG_AFFINITY (1<<3)
>> +#define AEST_XFACE_FLAG_ERROR_GROUP (1<<4)
>> +#define AEST_XFACE_FLAG_FAULT_INJECT (1<<5)
>> +#define AEST_XFACE_FLAG_INT_CONFIG (1<<6)
>> +
>> +struct aest_hnode {
>> + struct list_head list;
>> + int count;
>> + u32 uid;
>> + int type;
>> +};
>> +
>> +struct acpi_aest_node {
>> + struct list_head list;
>> + int type;
>> + struct acpi_aest_node_interface_header *interface_hdr;
>> + unsigned long *record_implemented;
>> + unsigned long *status_reporting;
>> + unsigned long *addressing_mode;
>> + struct acpi_aest_node_interface_common *common;
>> + union {
>> + struct acpi_aest_processor *processor;
>> + struct acpi_aest_memory *memory;
>> + struct acpi_aest_smmu *smmu;
>> + struct acpi_aest_vendor_v2 *vendor;
>> + struct acpi_aest_gic *gic;
>> + struct acpi_aest_pcie *pcie;
>> + struct acpi_aest_proxy *proxy;
>> + void *spec_pointer;
>> + };
>> + union {
>> + struct acpi_aest_processor_cache *cache;
>> + struct acpi_aest_processor_tlb *tlb;
>> + struct acpi_aest_processor_generic *generic;
>> + void *processor_spec_pointer;
>> + };
>> + struct acpi_aest_node_interrupt_v2 *interrupt;
>> + int interrupt_count;
>> +};
>> +#endif /* __ACPI_AEST_H__ */
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index a04b73c40173..acf0e3957fdd 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -179,6 +179,7 @@ enum cpuhp_state {
>> CPUHP_AP_CSKY_TIMER_STARTING,
>> CPUHP_AP_TI_GP_TIMER_STARTING,
>> CPUHP_AP_HYPERV_TIMER_STARTING,
>> + CPUHP_AP_ARM_AEST_STARTING,
>> /* Must be the last timer callback */
>> CPUHP_AP_DUMMY_TIMER_STARTING,
>> CPUHP_AP_ARM_XEN_STARTING,
>> diff --git a/include/linux/ras.h b/include/linux/ras.h
>> index a64182bc72ad..1c777af6a1af 100644
>> --- a/include/linux/ras.h
>> +++ b/include/linux/ras.h
>> @@ -53,4 +53,12 @@ static inline unsigned long
>> amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err) { return -EINVAL; }
>> #endif /* CONFIG_AMD_ATL */
>>
>> +#if IS_ENABLED(CONFIG_AEST)
>> +void aest_register_decode_chain(struct notifier_block *nb);
>> +void aest_unregister_decode_chain(struct notifier_block *nb);
>> +#else
>> +static inline void aest_register_decode_chain(struct notifier_block *nb) {}
>> +static inline void aest_unregister_decode_chain(struct notifier_block *nb) {}
>> +#endif /* CONFIG_AEST */
>> +
>> #endif /* __RAS_H__ */
>> --
>> 2.33.1
>>
* RE: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
2025-02-06 8:32 ` Ruidong Tian
@ 2025-02-14 9:14 ` Tomohiro Misono (Fujitsu)
0 siblings, 0 replies; 16+ messages in thread
From: Tomohiro Misono (Fujitsu) @ 2025-02-14 9:14 UTC (permalink / raw)
To: 'Ruidong Tian', catalin.marinas@arm.com, will@kernel.org,
lpieralisi@kernel.org, guohanjun@huawei.com, sudeep.holla@arm.com,
xueshuai@linux.alibaba.com, baolin.wang@linux.alibaba.com,
linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, rafael@kernel.org,
lenb@kernel.org, tony.luck@intel.com, bp@alien8.de,
yazen.ghannam@amd.com
Cc: Tyler Baicar
> > Hello, some comments below.
>
> Thank you for your comments! I really appreciate it.
>
> >
> >> Subject: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
> >>
> >> Add support for parsing the ARM Error Source Table and basic handling of
> >> errors reported through both memory mapped and system register interfaces.
> >>
> >> Assume system register interfaces are only registered with private
> >> peripheral interrupts (PPIs); otherwise there is no guarantee the
> >> core handling the error is the core which took the error and has the
> >> syndrome info in its system registers.
> >>
> >> In kernel-first mode, all configuration is controlled by the kernel,
> >> including the CE threshold and interrupt enable/disable.
> >>
> >> All detected errors will be processed as follows:
> >> - CE, DE: use a workqueue to log these hardware errors.
> >> - UER, UEO: log it and call memory_failure() in a workqueue.
> >> - UC, UEU: panic in irq context.
> >>
> >> Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
> >> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
> >> ---
> >> MAINTAINERS | 10 +
> >> arch/arm64/include/asm/ras.h | 95 ++++
> >> drivers/acpi/arm64/Kconfig | 11 +
> >> drivers/acpi/arm64/Makefile | 1 +
> >> drivers/acpi/arm64/aest.c | 335 ++++++++++++
> >> drivers/acpi/arm64/init.c | 2 +
> >> drivers/acpi/arm64/init.h | 1 +
> >> drivers/ras/Kconfig | 1 +
> >> drivers/ras/Makefile | 1 +
> >> drivers/ras/aest/Kconfig | 17 +
> >> drivers/ras/aest/Makefile | 5 +
> >> drivers/ras/aest/aest-core.c | 976 +++++++++++++++++++++++++++++++++++
> >> drivers/ras/aest/aest.h | 323 ++++++++++++
> >> include/linux/acpi_aest.h | 68 +++
> >> include/linux/cpuhotplug.h | 1 +
> >> include/linux/ras.h | 8 +
> >> 16 files changed, 1855 insertions(+)
> >> create mode 100644 arch/arm64/include/asm/ras.h
> >> create mode 100644 drivers/acpi/arm64/aest.c
> >> create mode 100644 drivers/ras/aest/Kconfig
> >> create mode 100644 drivers/ras/aest/Makefile
> >> create mode 100644 drivers/ras/aest/aest-core.c
> >> create mode 100644 drivers/ras/aest/aest.h
> >> create mode 100644 include/linux/acpi_aest.h
> >>
> >> diff --git a/MAINTAINERS b/MAINTAINERS
> >> index 637ddd44245f..d757f9339627 100644
> >> --- a/MAINTAINERS
> >> +++ b/MAINTAINERS
> >> @@ -330,6 +330,16 @@ S: Maintained
> >> F: drivers/acpi/arm64
> >> F: include/linux/acpi_iort.h
> >>
> >> +ACPI AEST
> >> +M: Ruidong Tian <tianruidong@linux.alibaba.com>
> >> +L: linux-acpi@vger.kernel.org
> >> +L: linux-arm-kernel@lists.infradead.org
> >> +S: Supported
> >> +F: arch/arm64/include/asm/ras.h
> >> +F: drivers/acpi/arm64/aest.c
> >> +F: drivers/ras/aest/
> >> +F: include/linux/acpi_aest.h
> >> +
> >> ACPI FOR RISC-V (ACPI/riscv)
> >> M: Sunil V L <sunilvl@ventanamicro.com>
> >> L: linux-acpi@vger.kernel.org
> >> diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h
> >> new file mode 100644
> >> index 000000000000..7676add8a0ed
> >> --- /dev/null
> >> +++ b/arch/arm64/include/asm/ras.h
> >> @@ -0,0 +1,95 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +#ifndef __ASM_RAS_H
> >> +#define __ASM_RAS_H
> >> +
> >> +#include <linux/types.h>
> >> +#include <linux/bits.h>
> >> +
> >> +/* ERR<n>FR */
> >> +#define ERR_FR_CE GENMASK_ULL(54, 53)
> >> +#define ERR_FR_RP BIT(15)
> >> +#define ERR_FR_CEC GENMASK_ULL(14, 12)
> >> +
> >> +#define ERR_FR_RP_SINGLE_COUNTER 0
> >> +#define ERR_FR_RP_DOUBLE_COUNTER 1
> >> +
> >> +#define ERR_FR_CEC_0B_COUNTER 0
> >> +#define ERR_FR_CEC_8B_COUNTER BIT(1)
> >> +#define ERR_FR_CEC_16B_COUNTER BIT(2)
> >> +
> >> +/* ERR<n>STATUS */
> >> +#define ERR_STATUS_AV BIT(31)
> >> +#define ERR_STATUS_V BIT(30)
> >> +#define ERR_STATUS_UE BIT(29)
> >> +#define ERR_STATUS_ER BIT(28)
> >> +#define ERR_STATUS_OF BIT(27)
> >> +#define ERR_STATUS_MV BIT(26)
> >> +#define ERR_STATUS_CE (BIT(25) | BIT(24))
> >> +#define ERR_STATUS_DE BIT(23)
> >> +#define ERR_STATUS_PN BIT(22)
> >> +#define ERR_STATUS_UET (BIT(21) | BIT(20))
> >> +#define ERR_STATUS_CI BIT(19)
> >> +#define ERR_STATUS_IERR GENMASK_ULL(15, 8)
> >> +#define ERR_STATUS_SERR GENMASK_ULL(7, 0)
> >> +
> >> +/* These bits are write-one-to-clear */
> >> +#define ERR_STATUS_W1TC (ERR_STATUS_AV | ERR_STATUS_V |
> ERR_STATUS_UE | \
> >> + ERR_STATUS_ER | ERR_STATUS_OF | ERR_STATUS_MV | \
> >> + ERR_STATUS_CE | ERR_STATUS_DE | ERR_STATUS_PN | \
> >> + ERR_STATUS_UET | ERR_STATUS_CI)
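[Editorial illustration, not part of the patch: ERR_STATUS_W1TC above groups the write-one-to-clear bits — to acknowledge an error the handler writes back the set bits, and hardware clears exactly those, leaving bits written as 0 untouched. A small userspace model of W1TC semantics; the mask value is a placeholder, not the real ERR<n>STATUS layout:]

```c
#include <stdint.h>

/* Placeholder W1TC region (bits [31:20]); bits outside it behave as
 * ordinary read/write bits in this model. */
#define DEMO_W1TC_MASK 0xFFF00000ULL

static uint64_t w1tc_write(uint64_t current, uint64_t written)
{
	/* Writing 1 to a W1TC bit clears it; writing 0 leaves it alone. */
	uint64_t cleared = current & ~(written & DEMO_W1TC_MASK);

	/* Non-W1TC bits simply take the written value. */
	return (cleared & DEMO_W1TC_MASK) | (written & ~DEMO_W1TC_MASK);
}
```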
> >> +
> >> +#define ERR_STATUS_UET_UC 0
> >> +#define ERR_STATUS_UET_UEU 1
> >> +#define ERR_STATUS_UET_UEO 2
> >> +#define ERR_STATUS_UET_UER 3
> >> +
> >> +/* ERR<n>CTLR */
> >> +#define ERR_CTLR_CFI BIT(8)
> >> +#define ERR_CTLR_FI BIT(3)
> >> +#define ERR_CTLR_UI BIT(2)
> >> +
> >> +/* ERR<n>ADDR */
> >> +#define ERR_ADDR_AI BIT(61)
> >> +#define ERR_ADDR_PADDR GENMASK_ULL(55, 0)
> >> +
> >> +/* ERR<n>MISC0 */
> >> +
> >> +/* ERR<n>FR.CEC == 0b010, ERR<n>FR.RP == 0 */
> >> +#define ERR_MISC0_8B_OF BIT(39)
> >> +#define ERR_MISC0_8B_CEC GENMASK_ULL(38, 32)
> >> +
> >> +/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 0 */
> >> +#define ERR_MISC0_16B_OF BIT(47)
> >> +#define ERR_MISC0_16B_CEC GENMASK_ULL(46, 32)
> >> +
> >> +#define ERR_MISC0_CEC_SHIFT 31
> >> +
> >> +#define ERR_8B_CEC_MAX (ERR_MISC0_8B_CEC >> ERR_MISC0_CEC_SHIFT)
> >> +#define ERR_16B_CEC_MAX (ERR_MISC0_16B_CEC >> ERR_MISC0_CEC_SHIFT)
> >> +
> >> +/* ERR<n>FR.CEC == 0b100, ERR<n>FR.RP == 1 */
> >> +#define ERR_MISC0_16B_OFO BIT(63)
> >> +#define ERR_MISC0_16B_CECO GENMASK_ULL(62, 48)
> >> +#define ERR_MISC0_16B_OFR BIT(47)
> >> +#define ERR_MISC0_16B_CECR GENMASK_ULL(46, 32)
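[Editorial illustration, not part of the patch: the ERR_MISC0_* masks above carve the corrected-error counter out of ERR<n>MISC0. A hedged userspace sketch of the extraction for the 8-bit-counter layout — the field occupies bits [38:32], so a shift of 32 aligns it; names and constants here are illustrative stand-ins for the GENMASK definitions above:]

```c
#include <stdint.h>

#define DEMO_MISC0_8B_CEC_MASK	(0x7FULL << 32)	/* CEC, bits [38:32] */
#define DEMO_MISC0_8B_OF	(1ULL << 39)	/* counter overflow flag */

/* Extract the corrected-error count from a MISC0-style value. */
static unsigned int misc0_8b_cec(uint64_t misc0)
{
	return (unsigned int)((misc0 & DEMO_MISC0_8B_CEC_MASK) >> 32);
}

/* Nonzero when the counter has overflowed. */
static int misc0_8b_overflow(uint64_t misc0)
{
	return (misc0 & DEMO_MISC0_8B_OF) != 0;
}
```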
> >> +
> >> +/* ERRDEVARCH */
> >> +#define ERRDEVARCH_REV GENMASK(19, 16)
> >> +
> >> +enum ras_ce_threshold {
> >> + RAS_CE_THRESHOLD_0B,
> >> + RAS_CE_THRESHOLD_8B,
> >> + RAS_CE_THRESHOLD_16B,
> >> + RAS_CE_THRESHOLD_32B,
> >> + UNKNOWN,
> >> +};
> >> +
> >> +struct ras_ext_regs {
> >> + u64 err_fr;
> >> + u64 err_ctlr;
> >> + u64 err_status;
> >> + u64 err_addr;
> >> + u64 err_misc[4];
> >> +};
> >> +
> >> +#endif /* __ASM_RAS_H */
> >> diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig
> >> index b3ed6212244c..c8eb6de95733 100644
> >> --- a/drivers/acpi/arm64/Kconfig
> >> +++ b/drivers/acpi/arm64/Kconfig
> >> @@ -21,3 +21,14 @@ config ACPI_AGDI
> >>
> >> config ACPI_APMT
> >> bool
> >> +
> >> +config ACPI_AEST
> >> + bool "ARM Error Source Table Support"
> >> + depends on ARM64_RAS_EXTN
> >> +
> >> + help
> >> + The Arm Error Source Table (AEST) provides details on ACPI
> >> + extensions that enable kernel-first handling of errors in a
> >> + system that supports the Armv8 RAS extensions.
> >> +
> >> + If set, the kernel will report and log hardware errors.
> >> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> >> index 05ecde9eaabe..8e240b281fd1 100644
> >> --- a/drivers/acpi/arm64/Makefile
> >> +++ b/drivers/acpi/arm64/Makefile
> >> @@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
> >> obj-$(CONFIG_ACPI_IORT) += iort.o
> >> obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
> >> obj-$(CONFIG_ARM_AMBA) += amba.o
> >> +obj-$(CONFIG_ACPI_AEST) += aest.o
> >> obj-y += dma.o init.o
> >> obj-y += thermal_cpufreq.o
> >> diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c
> >> new file mode 100644
> >> index 000000000000..6dba9c23e04e
> >> --- /dev/null
> >> +++ b/drivers/acpi/arm64/aest.c
> >> @@ -0,0 +1,335 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * ARM Error Source Table Support
> >> + *
> >> + * Copyright (c) 2024, Alibaba Group.
> >> + */
> >> +
> >> +#include <linux/xarray.h>
> >> +#include <linux/platform_device.h>
> >> +#include <linux/acpi_aest.h>
> >> +
> >> +#include "init.h"
> >> +
> >> +#undef pr_fmt
> >> +#define pr_fmt(fmt) "ACPI AEST: " fmt
> >> +
> >> +static struct xarray *aest_array;
> >> +
> >> +static void __init aest_init_interface(struct acpi_aest_hdr *hdr,
> >> + struct acpi_aest_node *node)
> >> +{
> >> + struct acpi_aest_node_interface_header *interface;
> >> +
> >> + interface = ACPI_ADD_PTR(struct acpi_aest_node_interface_header, hdr,
> >> + hdr->node_interface_offset);
> >> +
> >> + node->type = hdr->type;
> >> + node->interface_hdr = interface;
> >> +
> >> + switch (interface->group_format) {
> >> + case ACPI_AEST_NODE_GROUP_FORMAT_4K: {
> >> + struct acpi_aest_node_interface_4k *interface_4k =
> >> + (struct acpi_aest_node_interface_4k *)(interface + 1);
> >> +
> >> + node->common = &interface_4k->common;
> >> + node->record_implemented =
> >> + (unsigned long *)&interface_4k->error_record_implemented;
> >> + node->status_reporting =
> >> + (unsigned long *)&interface_4k->error_status_reporting;
> >> + node->addressing_mode =
> >> + (unsigned long *)&interface_4k->addressing_mode;
> >> + break;
> >> + }
> >> + case ACPI_AEST_NODE_GROUP_FORMAT_16K: {
> >> + struct acpi_aest_node_interface_16k *interface_16k =
> >> + (struct acpi_aest_node_interface_16k *)(interface + 1);
> >> +
> >> + node->common = &interface_16k->common;
> >> + node->record_implemented =
> >> + (unsigned long *)interface_16k->error_record_implemented;
> >> + node->status_reporting =
> >> + (unsigned long *)interface_16k->error_status_reporting;
> >> + node->addressing_mode =
> >> + (unsigned long *)interface_16k->addressing_mode;
> >> + break;
> >> + }
> >> + case ACPI_AEST_NODE_GROUP_FORMAT_64K: {
> >> + struct acpi_aest_node_interface_64k *interface_64k =
> >> + (struct acpi_aest_node_interface_64k *)(interface + 1);
> >> +
> >> + node->common = &interface_64k->common;
> >> + node->record_implemented =
> >> + (unsigned long *)interface_64k->error_record_implemented;
> >> + node->status_reporting =
> >> + (unsigned long *)interface_64k->error_status_reporting;
> >> + node->addressing_mode =
> >> + (unsigned long *)interface_64k->addressing_mode;
> >> + break;
> >> + }
> >> + default:
> >> + pr_err("invalid group format: %d\n", interface->group_format);
> >> + }
> >> +
> >> + node->interrupt = ACPI_ADD_PTR(struct acpi_aest_node_interrupt_v2,
> >> + hdr, hdr->node_interrupt_offset);
> >> +
> >> + node->interrupt_count = hdr->node_interrupt_count;
> >> +}
> >> +
> >> +static int __init acpi_aest_init_node_common(struct acpi_aest_hdr *aest_hdr,
> >> + struct acpi_aest_node *node)
> >> +{
> >> + int ret;
> >> + struct aest_hnode *hnode;
> >> + u64 error_device_id;
> >> +
> >> + aest_init_interface(aest_hdr, node);
> >> +
> >> + error_device_id = node->common->error_node_device;
> >
> > I think I see a problem with this.
> > From the spec[1], I understand that error node device is optional and
> > error node device field is only valid when error node device valid flag is set.
> >
> > [1] https://developer.arm.com/documentation/den0085/latest/
> >
> > Previous versions work well for the system without error node device (i.e. system
> > without ARMHE000 definition in DSDT) but this version doesn't.
> > Do we need to rely on information from error node device here when
> > a system has them? I thought the AEST table has the necessary information in all
> > cases and want to know why this version uses a different approach from v2.
>
> Q: Do we need to rely on information from error node device here when
> a system has them?
> A: DSDT error device node may include certain ACPI methods, such as
> address translation for DDRC. Intel has implemented this approach by
> using an ACPI method to translate DIMM addresses into system physical
> addresses [0].
>
> [0]:
> https://lore.kernel.org/all/20181015202620.23610-1-tony.luck@intel.com/T/#u
Hi,
Thanks for the explanation. I see your point now but have one question regarding this.
I think _DSM method requires UUID. So, which UUID do you plan to use for AEST?
I couldn't find any definition in ARM RAS specifications.
>
> Reason for using a different approach in v3
> --------------------------------------------
>
> In v3, an abstraction layer named AEST device was introduced on top of
> the AEST node. The main reasons are as follows:
> 1. Some AEST nodes share interrupts, and the AEST device is viewed as
> the owner of the interrupt to register interrupt functions.
> 2. Abstracting the contents of ACPI tables into platform devices is a
> common practice on ARM, like MPAM[1] and IORT, and I just followed it.
>
> Which approach do you think is better, v2 or v3?
> [1]:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.12-rc
> 1&id=8c26b06d7b811d397e672fd3b0d7c10d4965d97a
>
> >
> > Also, I wonder if there will be a system where only some nodes have the valid flag.
>
> My plan is to create an AEST platform device regardless of whether the
> node is valid. In the next version, I will set the error_device_id to a
> globally incrementing ID instead of directly assigning it the value of
> error_node_device.
Since error node device definitions in DSDT are optional, I believe we need to make
sure the AEST driver works for both cases (i.e., with or without ARMHE000;
I think the current FVP does not use error node devices).
I'm not so sure if creating a platform device even when there is no error node device has benefit,
but if it simplifies the code on the whole, then setting a unique pseudo ID to each platform
device in that case seems a reasonable solution.
Regards,
Tomohiro Misono
>
> >
> >> +
> >> + hnode = xa_load(aest_array, error_device_id);
> >> + if (!hnode) {
> >> + hnode = kmalloc(sizeof(*hnode), GFP_KERNEL);
> >> + if (!hnode) {
> >> + ret = -ENOMEM;
> >> + goto free;
> >> + }
> >> + INIT_LIST_HEAD(&hnode->list);
> >> + hnode->uid = error_device_id;
> >> + hnode->count = 0;
> >> + hnode->type = node->type;
> >> + xa_store(aest_array, error_device_id, hnode, GFP_KERNEL);
> >> + }
> >> +
> >> + list_add_tail(&node->list, &hnode->list);
> >> + hnode->count++;
> >> +
> >> + return 0;
> >> +
> >> +free:
> >> + kfree(node);
> >> + return ret;
> >> +}
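[Editorial illustration, not part of the patch: the xarray lookup-or-create above groups every node sharing an error_device_id under one aest_hnode, which later becomes one platform device. A toy userspace model of the same grouping, with a fixed array standing in for the kernel xarray (all names hypothetical):]

```c
#include <stdint.h>

#define DEMO_MAX_IDS 16

struct demo_group {
	int used;
	uint64_t id;	/* error device id shared by grouped nodes */
	int count;	/* nodes attached under this id */
};

static struct demo_group groups[DEMO_MAX_IDS];

/* Attach a node to its device-id group, creating the group on first
 * use. Returns the group's node count after insertion, -1 if full. */
static int add_node(uint64_t device_id)
{
	for (int i = 0; i < DEMO_MAX_IDS; i++) {
		if (groups[i].used && groups[i].id == device_id)
			return ++groups[i].count;
		if (!groups[i].used) {
			groups[i].used = 1;
			groups[i].id = device_id;
			groups[i].count = 1;
			return 1;
		}
	}
	return -1;
}
```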
> >> +
> >> +static int __init
> >> +acpi_aest_init_node_default(struct acpi_aest_hdr *aest_hdr)
> >> +{
> >> + struct acpi_aest_node *node;
> >> +
> >> + node = kzalloc(sizeof(*node), GFP_KERNEL);
> >> + if (!node)
> >> + return -ENOMEM;
> >> +
> >> + node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
> >> + aest_hdr->node_specific_offset);
> >> +
> >> + return acpi_aest_init_node_common(aest_hdr, node);
> >> +}
> >> +
> >> +static int __init
> >> +acpi_aest_init_processor_node(struct acpi_aest_hdr *aest_hdr)
> >> +{
> >> + struct acpi_aest_node *node;
> >> +
> >> + node = kzalloc(sizeof(*node), GFP_KERNEL);
> >> + if (!node)
> >> + return -ENOMEM;
> >> +
> >> + node->spec_pointer = ACPI_ADD_PTR(void, aest_hdr,
> >> + aest_hdr->node_specific_offset);
> >> +
> >> + node->processor_spec_pointer = ACPI_ADD_PTR(void, node->spec_pointer,
> >> + sizeof(struct acpi_aest_processor));
> >> +
> >> + return acpi_aest_init_node_common(aest_hdr, node);
> >> +}
> >> +
> >> +static int __init acpi_aest_init_node(struct acpi_aest_hdr *header)
> >> +{
> >> + switch (header->type) {
> >> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> >> + return acpi_aest_init_processor_node(header);
> >> + case ACPI_AEST_VENDOR_ERROR_NODE:
> >> + case ACPI_AEST_SMMU_ERROR_NODE:
> >> + case ACPI_AEST_GIC_ERROR_NODE:
> >> + case ACPI_AEST_PCIE_ERROR_NODE:
> >> + case ACPI_AEST_PROXY_ERROR_NODE:
> >> + case ACPI_AEST_MEMORY_ERROR_NODE:
> >> + return acpi_aest_init_node_default(header);
> >> + default:
> >> + pr_err("acpi table header type is invalid: %d\n", header->type);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int __init acpi_aest_init_nodes(struct acpi_table_header *aest_table)
> >> +{
> >> + struct acpi_aest_hdr *aest_node, *aest_end;
> >> + struct acpi_table_aest *aest;
> >> + int rc;
> >> +
> >> + aest = (struct acpi_table_aest *)aest_table;
> >> + aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
> >> + sizeof(struct acpi_table_header));
> >> + aest_end = ACPI_ADD_PTR(struct acpi_aest_hdr, aest,
> >> + aest_table->length);
> >> +
> >> + while (aest_node < aest_end) {
> >> + if (((u64)aest_node + aest_node->length) > (u64)aest_end) {
> >> + pr_warn(FW_WARN "AEST node pointer overflow, bad table.\n");
> >> + return -EINVAL;
> >> + }
> >> +
> >> + rc = acpi_aest_init_node(aest_node);
> >> + if (rc)
> >> + return rc;
> >> +
> >> + aest_node = ACPI_ADD_PTR(struct acpi_aest_hdr, aest_node,
> >> + aest_node->length);
> >> + }
> >> +
> >> + return 0;
> >> +}
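[Editorial illustration, not part of the patch: acpi_aest_init_nodes() above is the standard walk over variable-length table entries, with a truncation check before trusting each node's self-declared length. A self-contained userspace model of the same loop; `demo_hdr` is a hypothetical stand-in for struct acpi_aest_hdr, assuming a little-endian host:]

```c
#include <stdint.h>
#include <stddef.h>

struct demo_hdr {
	uint8_t  type;
	uint16_t length;	/* total size of this node, header included */
} __attribute__((packed));

/* Returns the number of well-formed nodes, or -1 on a bad table. */
static int walk_nodes(const uint8_t *table, size_t table_len)
{
	const uint8_t *p = table;
	const uint8_t *end = table + table_len;
	int count = 0;

	while (p < end) {
		const struct demo_hdr *hdr = (const struct demo_hdr *)p;

		/* Mirror of the "node pointer overflow" check: the header
		 * must fit, the node must be at least header-sized, and it
		 * must not run past the end of the table. */
		if ((size_t)(end - p) < sizeof(*hdr) ||
		    hdr->length < sizeof(*hdr) ||
		    (size_t)(end - p) < hdr->length)
			return -1;

		count++;
		p += hdr->length;
	}
	return count;
}
```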
> >> +
> >> +static int
> >> +acpi_aest_parse_irqs(struct platform_device *pdev, struct acpi_aest_node *anode,
> >> + struct resource *res, int *res_idx, int irqs[2])
> >> +{
> >> + int i;
> >> + struct acpi_aest_node_interrupt_v2 *interrupt;
> >> + int trigger, irq;
> >> +
> >> + for (i = 0; i < anode->interrupt_count; i++) {
> >> + interrupt = &anode->interrupt[i];
> >> + if (irqs[interrupt->type])
> >> + continue;
> >> +
> >> + trigger = (interrupt->flags & AEST_INTERRUPT_MODE) ?
> >> + ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
> >> +
> >> + irq = acpi_register_gsi(&pdev->dev, interrupt->gsiv, trigger,
> >> + ACPI_ACTIVE_HIGH);
> >> + if (irq <= 0) {
> >> + pr_err("failed to map AEST GSI %d\n", interrupt->gsiv);
> >> + return irq;
> >> + }
> >> +
> >> + res[*res_idx].start = irq;
> >> + res[*res_idx].end = irq;
> >> + res[*res_idx].flags = IORESOURCE_IRQ;
> >> + res[*res_idx].name = interrupt->type ? "eri" : "fhi";
> >> +
> >> + (*res_idx)++;
> >> +
> >> + irqs[interrupt->type] = irq;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int __init acpi_aest_alloc_pdev(void)
> >> +{
> >> + int ret, j, size;
> >> + struct aest_hnode *ahnode = NULL;
> >> + unsigned long i;
> >> + struct platform_device *pdev;
> >> + struct acpi_device *companion;
> >> + struct acpi_aest_node *anode;
> >> + char uid[16];
> >> + struct resource *res;
> >> +
> >> + xa_for_each(aest_array, i, ahnode) {
> >> + int irq[2] = { 0 };
> >> +
> >> + res = kcalloc(ahnode->count + 2, sizeof(*res), GFP_KERNEL);
> >
> > Why is +2 needed?
>
> Each AEST platform device has at most 2 IRQ resources, one for the Error
> Recovery Interrupt and one for the Fault Handling Interrupt. I will add a
> macro for this in the next version.
>
> >
> >> + if (!res) {
> >> + ret = -ENOMEM;
> >> + break;
> >> + }
> >> +
> >> + pdev = platform_device_alloc("AEST", i);
> >> + if (IS_ERR(pdev)) {
> >> + ret = PTR_ERR(pdev);
> >> + break;
> >> + }
> >> +
> >> + ret = snprintf(uid, sizeof(uid), "%u", (u32)i);
> >> + companion = acpi_dev_get_first_match_dev("ARMHE000", uid, -1);
> >> + if (companion)
> >> + ACPI_COMPANION_SET(&pdev->dev, companion);
> >> +
> >> + j = 0;
> >> + list_for_each_entry(anode, &ahnode->list, list) {
> >> + if (anode->interface_hdr->type !=
> >> + ACPI_AEST_NODE_SYSTEM_REGISTER) {
> >> + res[j].name = "AEST:RECORD";
> >> + res[j].start = anode->interface_hdr->address;
> >> + size = anode->interface_hdr->error_record_count *
> >> + sizeof(struct ras_ext_regs);
> >> + res[j].end = res[j].start + size;
> >> + res[j].flags = IORESOURCE_MEM;
> >
> > Will these fields be overwritten in below acpi_aest_parse_irqs()?
>
> Yes, it is a bug; I will fix it in the next version.
>
> >
> >> + }
> >> +
> >> + ret = acpi_aest_parse_irqs(pdev, anode, res, &j, irq);
> >> + if (ret) {
> >> + platform_device_put(pdev);
> >> + break;
> >> + }
> >> + }
> >> +
> >> + ret = platform_device_add_resources(pdev, res, j);
> >> + if (ret)
> >> + break;
> >> +
> >> + ret = platform_device_add_data(pdev, &ahnode, sizeof(ahnode));
> >> + if (ret)
> >> + break;
> >> +
> >> + ret = platform_device_add(pdev);
> >> + if (ret)
> >> + break;
> >> + }
> >> +
> >> + kfree(res);
> >> + if (ret)
> >> + platform_device_put(pdev);
> >> +
> >> + return ret;
> >> +}
> >> +
> >> +void __init acpi_aest_init(void)
> >> +{
> >> + acpi_status status;
> >> + int ret;
> >> + struct acpi_table_header *aest_table;
> >> +
> >> + status = acpi_get_table(ACPI_SIG_AEST, 0, &aest_table);
> >> + if (ACPI_FAILURE(status)) {
> >> + if (status != AE_NOT_FOUND) {
> >> + const char *msg = acpi_format_exception(status);
> >> +
> >> + pr_err("Failed to get table, %s\n", msg);
> >> + }
> >> +
> >> + return;
> >> + }
> >> +
> >> + aest_array = kzalloc(sizeof(struct xarray), GFP_KERNEL);
> >> + xa_init(aest_array);
> >> +
> >> + ret = acpi_aest_init_nodes(aest_table);
> >> + if (ret) {
> >> + pr_err("Failed init aest node %d\n", ret);
> >> + goto out;
> >> + }
> >> +
> >> + ret = acpi_aest_alloc_pdev();
> >> + if (ret)
> >> + pr_err("Failed alloc pdev %d\n", ret);
> >> +
> >> +out:
> >> + acpi_put_table(aest_table);
> >> +}
> >> diff --git a/drivers/acpi/arm64/init.c b/drivers/acpi/arm64/init.c
> >> index 7a47d8095a7d..b0c768923831 100644
> >> --- a/drivers/acpi/arm64/init.c
> >> +++ b/drivers/acpi/arm64/init.c
> >> @@ -12,4 +12,6 @@ void __init acpi_arch_init(void)
> >> acpi_iort_init();
> >> if (IS_ENABLED(CONFIG_ARM_AMBA))
> >> acpi_amba_init();
> >> + if (IS_ENABLED(CONFIG_ACPI_AEST))
> >> + acpi_aest_init();
> >> }
> >> diff --git a/drivers/acpi/arm64/init.h b/drivers/acpi/arm64/init.h
> >> index dcc277977194..3902d1676068 100644
> >> --- a/drivers/acpi/arm64/init.h
> >> +++ b/drivers/acpi/arm64/init.h
> >> @@ -5,3 +5,4 @@ void __init acpi_agdi_init(void);
> >> void __init acpi_apmt_init(void);
> >> void __init acpi_iort_init(void);
> >> void __init acpi_amba_init(void);
> >> +void __init acpi_aest_init(void);
> >> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
> >> index fc4f4bb94a4c..61a2a05d9c94 100644
> >> --- a/drivers/ras/Kconfig
> >> +++ b/drivers/ras/Kconfig
> >> @@ -33,6 +33,7 @@ if RAS
> >>
> >> source "arch/x86/ras/Kconfig"
> >> source "drivers/ras/amd/atl/Kconfig"
> >> +source "drivers/ras/aest/Kconfig"
> >>
> >> config RAS_FMPM
> >> tristate "FRU Memory Poison Manager"
> >> diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
> >> index 11f95d59d397..72411ee9deaf 100644
> >> --- a/drivers/ras/Makefile
> >> +++ b/drivers/ras/Makefile
> >> @@ -5,3 +5,4 @@ obj-$(CONFIG_RAS_CEC) += cec.o
> >>
> >> obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
> >> obj-y += amd/atl/
> >> +obj-y += aest/
> >> diff --git a/drivers/ras/aest/Kconfig b/drivers/ras/aest/Kconfig
> >> new file mode 100644
> >> index 000000000000..6d436d911bea
> >> --- /dev/null
> >> +++ b/drivers/ras/aest/Kconfig
> >> @@ -0,0 +1,17 @@
> >> +# SPDX-License-Identifier: GPL-2.0
> >> +#
> >> +# ARM Error Source Table Support
> >> +#
> >> +# Copyright (c) 2024, Alibaba Group.
> >> +#
> >> +
> >> +config AEST
> >> + tristate "ARM AEST Driver"
> >> + depends on ACPI_AEST && RAS
> >> +
> >> + help
> >> + The Arm Error Source Table (AEST) provides details on ACPI
> >> + extensions that enable kernel-first handling of errors in a
> >> + system that supports the Armv8 RAS extensions.
> >> +
> >> + If set, the kernel will report and log hardware errors.
> >> diff --git a/drivers/ras/aest/Makefile b/drivers/ras/aest/Makefile
> >> new file mode 100644
> >> index 000000000000..a6ba7e36fb43
> >> --- /dev/null
> >> +++ b/drivers/ras/aest/Makefile
> >> @@ -0,0 +1,5 @@
> >> +# SPDX-License-Identifier: GPL-2.0-only
> >> +
> >> +obj-$(CONFIG_AEST) += aest.o
> >> +
> >> +aest-y := aest-core.o
> >> diff --git a/drivers/ras/aest/aest-core.c b/drivers/ras/aest/aest-core.c
> >> new file mode 100644
> >> index 000000000000..060a1eedee0a
> >> --- /dev/null
> >> +++ b/drivers/ras/aest/aest-core.c
> >> @@ -0,0 +1,976 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * ARM Error Source Table Support
> >> + *
> >> + * Copyright (c) 2021-2024, Alibaba Group.
> >> + */
> >> +
> >> +#include <linux/interrupt.h>
> >> +#include <linux/panic.h>
> >> +#include <linux/platform_device.h>
> >> +#include <linux/xarray.h>
> >> +#include <linux/cpuhotplug.h>
> >> +#include <linux/genalloc.h>
> >> +#include <linux/ras.h>
> >> +
> >> +#include "aest.h"
> >> +
> >> +DEFINE_PER_CPU(struct aest_device, percpu_adev);
> >> +
> >> +#undef pr_fmt
> >> +#define pr_fmt(fmt) "AEST: " fmt
> >> +
> >> +/*
> >> + * This memory pool is only to be used to save AEST nodes in AEST irq context.
> >> + * There can be at most 500 AEST nodes.
> >> + */
> >> +#define AEST_NODE_ALLOCED_MAX 500
> >> +
> >> +#define AEST_LOG_PREFIX_BUFFER 64
> >> +
> >> +BLOCKING_NOTIFIER_HEAD(aest_decoder_chain);
> >> +
> >> +static void aest_print(struct aest_event *event)
> >> +{
> >> + static atomic_t seqno = { 0 };
> >> + unsigned int curr_seqno;
> >> + char pfx_seq[AEST_LOG_PREFIX_BUFFER];
> >> + int index;
> >> + struct ras_ext_regs *regs;
> >> +
> >> + curr_seqno = atomic_inc_return(&seqno);
> >> + snprintf(pfx_seq, sizeof(pfx_seq), "{%u}" HW_ERR, curr_seqno);
> >> + pr_info("%sHardware error from AEST %s\n", pfx_seq, event->node_name);
> >> +
> >> + switch (event->type) {
> >> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> >> + pr_err("%s Error from CPU%d\n", pfx_seq, event->id0);
> >> + break;
> >> + case ACPI_AEST_MEMORY_ERROR_NODE:
> >> + pr_err("%s Error from memory at SRAT proximity domain %#x\n",
> >> + pfx_seq, event->id0);
> >> + break;
> >> + case ACPI_AEST_SMMU_ERROR_NODE:
> >> + pr_err("%s Error from SMMU IORT node %#x subcomponent %#x\n",
> >> + pfx_seq, event->id0, event->id1);
> >> + break;
> >> + case ACPI_AEST_VENDOR_ERROR_NODE:
> >> + pr_err("%s Error from vendor hid %8.8s uid %#x\n",
> >> + pfx_seq, event->hid, event->id1);
> >> + break;
> >> + case ACPI_AEST_GIC_ERROR_NODE:
> >> + pr_err("%s Error from GIC type %#x instance %#x\n",
> >> + pfx_seq, event->id0, event->id1);
> >> + break;
> >> + default:
> >> + pr_err("%s Unknown AEST node type\n", pfx_seq);
> >> + return;
> >> + }
> >> +
> >> + index = event->index;
> >> + regs = &event->regs;
> >> +
> >> + pr_err("%s ERR%dFR: 0x%llx\n", pfx_seq, index, regs->err_fr);
> >> + pr_err("%s ERR%dCTRL: 0x%llx\n", pfx_seq, index, regs->err_ctlr);
> >> + pr_err("%s ERR%dSTATUS: 0x%llx\n", pfx_seq, index, regs->err_status);
> >> + if (regs->err_status & ERR_STATUS_AV)
> >> + pr_err("%s ERR%dADDR: 0x%llx\n", pfx_seq, index,
> >> + regs->err_addr);
> >> +
> >> + if (regs->err_status & ERR_STATUS_MV) {
> >> + pr_err("%s ERR%dMISC0: 0x%llx\n", pfx_seq, index,
> >> + regs->err_misc[0]);
> >> + pr_err("%s ERR%dMISC1: 0x%llx\n", pfx_seq, index,
> >> + regs->err_misc[1]);
> >> + pr_err("%s ERR%dMISC2: 0x%llx\n", pfx_seq, index,
> >> + regs->err_misc[2]);
> >> + pr_err("%s ERR%dMISC3: 0x%llx\n", pfx_seq, index,
> >> + regs->err_misc[3]);
> >> + }
> >> +}
> >> +
> >> +static void aest_handle_memory_failure(u64 addr)
> >> +{
> >> + unsigned long pfn;
> >> +
> >> + pfn = PHYS_PFN(addr);
> >> +
> >> + if (!pfn_valid(pfn)) {
> >> + pr_warn(HW_ERR "Invalid physical address: %#llx\n", addr);
> >> + return;
> >> + }
> >> +
> >> +#ifdef CONFIG_MEMORY_FAILURE
> >> + memory_failure(pfn, 0);
> >> +#endif
> >> +}
> >> +
> >> +static void init_aest_event(struct aest_event *event, struct aest_record *record,
> >> + struct ras_ext_regs *regs)
> >> +{
> >> + struct aest_node *node = record->node;
> >> + struct acpi_aest_node *info = node->info;
> >> +
> >> + event->type = node->type;
> >> + event->node_name = node->name;
> >> + switch (node->type) {
> >> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> >> + if (info->processor->flags & (ACPI_AEST_PROC_FLAG_SHARED |
> >> + ACPI_AEST_PROC_FLAG_GLOBAL))
> >> + event->id0 = smp_processor_id();
> >
> > In the "else" case, the ACPI processor ID will be set for id0. So how about
> > using get_acpi_id_for_cpu(smp_processor_id()) here for consistency?
>
> The ACPI processor ID may be confusing to users; I will use
> get_cpu_for_acpi_id(info->processor->processor_id) in the "else" case.
>
> >
> >> + else
> >> + event->id0 = info->processor->processor_id;
> >> +
> >> + event->id1 = info->processor->resource_type;
> >> + break;
> >> + case ACPI_AEST_MEMORY_ERROR_NODE:
> >> + event->id0 = info->memory->srat_proximity_domain;
> >> + break;
> >> + case ACPI_AEST_SMMU_ERROR_NODE:
> >> + event->id0 = info->smmu->iort_node_reference;
> >> + event->id1 = info->smmu->subcomponent_reference;
> >> + break;
> >> + case ACPI_AEST_VENDOR_ERROR_NODE:
> >> + event->id0 = 0;
> >> + event->id1 = info->vendor->acpi_uid;
> >> + event->hid = info->vendor->acpi_hid;
> >> + break;
> >> + case ACPI_AEST_GIC_ERROR_NODE:
> >> + event->id0 = info->gic->interface_type;
> >> + event->id1 = info->gic->instance_id;
> >> + break;
> >> + default:
> >> + event->id0 = 0;
> >> + event->id1 = 0;
> >> + }
> >> +
> >> + memcpy(&event->regs, regs, sizeof(*regs));
> >> + event->index = record->index;
> >> + event->addressing_mode = record->addressing_mode;
> >> +}
> >> +
> >> +static int
> >> +aest_node_gen_pool_add(struct aest_device *adev, struct aest_record *record,
> >> + struct ras_ext_regs *regs)
> >> +{
> >> + struct aest_event *event;
> >> +
> >> + if (!adev->pool)
> >> + return -EINVAL;
> >> +
> >> + event = (void *)gen_pool_alloc(adev->pool, sizeof(*event));
> >> + if (!event)
> >> + return -ENOMEM;
> >> +
> >> + init_aest_event(event, record, regs);
> >> + llist_add(&event->llnode, &adev->event_list);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static void aest_log(struct aest_record *record, struct ras_ext_regs *regs)
> >> +{
> >> + struct aest_device *adev = record->node->adev;
> >> +
> >> + if (!aest_node_gen_pool_add(adev, record, regs))
> >> + schedule_work(&adev->aest_work);
> >> +}
> >> +
> >> +void aest_register_decode_chain(struct notifier_block *nb)
> >> +{
> >> + blocking_notifier_chain_register(&aest_decoder_chain, nb);
> >> +}
> >> +EXPORT_SYMBOL_GPL(aest_register_decode_chain);
> >> +
> >> +void aest_unregister_decode_chain(struct notifier_block *nb)
> >> +{
> >> + blocking_notifier_chain_unregister(&aest_decoder_chain, nb);
> >> +}
> >> +EXPORT_SYMBOL_GPL(aest_unregister_decode_chain);
> >> +
> >> +static void aest_node_pool_process(struct work_struct *work)
> >> +{
> >> + struct llist_node *head;
> >> + struct aest_event *event;
> >> + struct aest_device *adev = container_of(work, struct aest_device,
> >> + aest_work);
> >> + u64 status, addr;
> >> +
> >> + head = llist_del_all(&adev->event_list);
> >> + if (!head)
> >> + return;
> >> +
> >> + head = llist_reverse_order(head);
> >> + llist_for_each_entry(event, head, llnode) {
> >> + aest_print(event);
> >> +
> >> + /* TODO: translate Logical Addresses to System Physical Addresses */
> >> + if (event->addressing_mode == AEST_ADDREESS_LA ||
> >> + (event->regs.err_addr & ERR_ADDR_AI)) {
> >> + pr_notice("Can not translate LA to SPA\n");
> >> + addr = 0;
> >> + } else
> >> + addr = event->regs.err_addr & (1UL << CONFIG_ARM64_PA_BITS);
> >> +
> >> + status = event->regs.err_status;
> >> + if (addr && ((status & ERR_STATUS_UE) || (status & ERR_STATUS_DE)))
> >> + aest_handle_memory_failure(addr);
> >> +
> >> + blocking_notifier_call_chain(&aest_decoder_chain, 0, event);
> >> + gen_pool_free(adev->pool, (unsigned long)event,
> >> + sizeof(*event));
> >> + }
> >> +}
> >> +
> >> +static int aest_node_pool_init(struct aest_device *adev)
> >> +{
> >> + unsigned long addr, size;
> >> +
> >> + size = ilog2(sizeof(struct aest_event));
> >> + adev->pool = devm_gen_pool_create(adev->dev, size, -1,
> >> + dev_name(adev->dev));
> >> + if (!adev->pool)
> >> + return -ENOMEM;
> >> +
> >> + size = PAGE_ALIGN(size * AEST_NODE_ALLOCED_MAX);
> >> + addr = (unsigned long)devm_kzalloc(adev->dev, size, GFP_KERNEL);
> >> + if (!addr)
> >> + return -ENOMEM;
> >> +
> >> + return gen_pool_add(adev->pool, addr, size, -1);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static void aest_panic(struct aest_record *record, struct ras_ext_regs *regs, char *msg)
> >> +{
> >> + struct aest_event event = { 0 };
> >> +
> >> + init_aest_event(&event, record, regs);
> >> +
> >> + aest_print(&event);
> >> +
> >> + panic(msg);
> >> +}
> >> +
> >> +static void aest_proc_record(struct aest_record *record, void *data)
> >> +{
> >> + struct ras_ext_regs regs = {0};
> >> + int *count = data;
> >> +
> >> + regs.err_status = record_read(record, ERXSTATUS);
> >> + if (!(regs.err_status & ERR_STATUS_V))
> >> + return;
> >> +
> >> + (*count)++;
> >> +
> >> + if (regs.err_status & ERR_STATUS_AV)
> >> + regs.err_addr = record_read(record, ERXADDR);
> >> +
> >> + regs.err_fr = record->fr;
> >> + regs.err_ctlr = record_read(record, ERXCTLR);
> >> +
> >> + if (regs.err_status & ERR_STATUS_MV) {
> >> + regs.err_misc[0] = record_read(record, ERXMISC0);
> >> + regs.err_misc[1] = record_read(record, ERXMISC1);
> >> + if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
> >> + regs.err_misc[2] = record_read(record, ERXMISC2);
> >> + regs.err_misc[3] = record_read(record, ERXMISC3);
> >> + }
> >> +
> >> + if (record->node->info->interface_hdr->flags &
> >> + AEST_XFACE_FLAG_CLEAR_MISC) {
> >> + record_write(record, ERXMISC0, 0);
> >> + record_write(record, ERXMISC1, 0);
> >> + if (record->node->version >= ID_AA64PFR0_EL1_RAS_V1P1) {
> >> + record_write(record, ERXMISC2, 0);
> >> + record_write(record, ERXMISC3, 0);
> >> + }
> >> + /* ce count is 0 if the record does not support ce */
> >> + } else if (record->ce.count > 0)
> >> + record_write(record, ERXMISC0, record->ce.reg_val);
> >> + }
> >> +
> >> + /* panic if unrecoverable and uncontainable error encountered */
> >> + if ((regs.err_status & ERR_STATUS_UE) &&
> >> + (regs.err_status & ERR_STATUS_UET) > ERR_STATUS_UET_UEU)
> >> + aest_panic(record, ®s, "AEST: unrecoverable error encountered");
> >
> > I think we need to use FIELD_GET to get the correct value.
> > u64 ue = FIELD_GET(ERR_STATUS_UET, regs.err_status);
> > if ((regs.err_status & ERR_STATUS_UE) &&
> > (ue == ERR_STATUS_UET_UC || ue == ERR_STATUS_UET_UEU))
> >
>
> OK, I will update it in the next version.
>
> >> +
> >> + aest_log(record, ®s);
> >> +
> >> + /* Write-one-to-clear the bits we've seen */
> >> + regs.err_status &= ERR_STATUS_W1TC;
> >> +
> >> + /* Multi-bit fields need all-ones written to clear. */
> >> + if (regs.err_status & ERR_STATUS_CE)
> >> + regs.err_status |= ERR_STATUS_CE;
> >> +
> >> + /* Multi-bit fields need all-ones written to clear. */
> >> + if (regs.err_status & ERR_STATUS_UET)
> >> + regs.err_status |= ERR_STATUS_UET;
> >> +
> >> + record_write(record, ERXSTATUS, regs.err_status);
> >> +}
> >> +
> >> +static void
> >> +aest_node_foreach_record(void (*func)(struct aest_record *, void *),
> >> + struct aest_node *node, void *data,
> >> + unsigned long *bitmap)
> >> +{
> >> + int i;
> >> +
> >> + for_each_clear_bit(i, bitmap, node->record_count) {
> >> + aest_select_record(node, i);
> >> +
> >> + func(&node->records[i], data);
> >> +
> >> + aest_sync(node);
> >> + }
> >> +}
> >> +
> >> +static int aest_proc(struct aest_node *node)
> >> +{
> >> + int count = 0, i, j, size = node->record_count;
> >> + u64 err_group = 0;
> >> +
> >> + aest_node_dbg(node, "Poll bit %*pb\n", size, node->record_implemented);
> >> + aest_node_foreach_record(aest_proc_record, node, &count,
> >> + node->record_implemented);
> >> +
> >> + if (!node->errgsr)
> >> + return count;
> >> +
> >> + aest_node_dbg(node, "Report bit %*pb\n", size, node->status_reporting);
> >> + for (i = 0; i < BITS_TO_U64(size); i++) {
> >> + err_group = readq_relaxed((void *)node->errgsr + i * 8);
> >> + aest_node_dbg(node, "errgsr[%d]: 0x%llx\n", i, err_group);
> >> +
> >> + for_each_set_bit(j, (unsigned long *)&err_group,
> >> + BITS_PER_TYPE(u64)) {
> >> + /*
> >> + * The error group base is only valid for Memory Mapped
> >> + * nodes, so the driver does not need to write the select
> >> + * register and sync.
> >> + */
> >> + if (test_bit(i * BITS_PER_TYPE(u64) + j, node->status_reporting))
> >> + continue;
> >> + aest_proc_record(&node->records[j], &count);
> >> + }
> >> + }
> >> +
> >> + return count;
> >> +}
> >> +
> >> +static irqreturn_t aest_irq_func(int irq, void *input)
> >> +{
> >> + struct aest_device *adev = input;
> >> + int i;
> >> +
> >> + for (i = 0; i < adev->node_cnt; i++)
> >> + aest_proc(&adev->nodes[i]);
> >> +
> >> + return IRQ_HANDLED;
> >> +}
> >> +
> >> +static void aest_enable_irq(struct aest_record *record)
> >> +{
> >> + u64 err_ctlr;
> >> + struct aest_device *adev = record->node->adev;
> >> +
> >> + err_ctlr = record_read(record, ERXCTLR);
> >> +
> >> + if (adev->irq[ACPI_AEST_NODE_FAULT_HANDLING])
> >> + err_ctlr |= (ERR_CTLR_FI | ERR_CTLR_CFI);
> >> + if (adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY])
> >> + err_ctlr |= ERR_CTLR_UI;
> >> +
> >> + record_write(record, ERXCTLR, err_ctlr);
> >> +}
> >> +
> >> +static void aest_config_irq(struct aest_node *node)
> >> +{
> >> + int i;
> >> + struct acpi_aest_node_interrupt_v2 *interrupt;
> >> +
> >> + if (!node->irq_config)
> >> + return;
> >> +
> >> + for (i = 0; i < node->info->interrupt_count; i++) {
> >> + interrupt = &node->info->interrupt[i];
> >> +
> >> + if (interrupt->type == ACPI_AEST_NODE_FAULT_HANDLING)
> >> + writeq_relaxed(interrupt->gsiv, node->irq_config);
> >> +
> >> + if (interrupt->type == ACPI_AEST_NODE_ERROR_RECOVERY)
> >> + writeq_relaxed(interrupt->gsiv, node->irq_config + 8);
> >> +
> >> + aest_node_dbg(node, "config irq type %d gsiv %d at %llx",
> >> + interrupt->type, interrupt->gsiv,
> >> + (u64)node->irq_config);
> >> + }
> >> +}
> >> +
> >> +static enum ras_ce_threshold aest_get_ce_threshold(struct aest_record *record)
> >> +{
> >> + u64 err_fr, err_fr_cec, err_fr_rp = -1;
> >> +
> >> + err_fr = record->fr;
> >> + err_fr_cec = FIELD_GET(ERR_FR_CEC, err_fr);
> >> + err_fr_rp = FIELD_GET(ERR_FR_RP, err_fr);
> >> +
> >> + if (err_fr_cec == ERR_FR_CEC_0B_COUNTER)
> >> + return RAS_CE_THRESHOLD_0B;
> >> + else if (err_fr_rp == ERR_FR_RP_DOUBLE_COUNTER)
> >> + return RAS_CE_THRESHOLD_32B;
> >> + else if (err_fr_cec == ERR_FR_CEC_8B_COUNTER)
> >> + return RAS_CE_THRESHOLD_8B;
> >> + else if (err_fr_cec == ERR_FR_CEC_16B_COUNTER)
> >> + return RAS_CE_THRESHOLD_16B;
> >> + else
> >> + return UNKNOWN;
> >> +
> >> +}
> >> +
> >> +static const struct ce_threshold_info ce_info[] = {
> >> + [RAS_CE_THRESHOLD_0B] = { 0 },
> >> + [RAS_CE_THRESHOLD_8B] = {
> >> + .max_count = ERR_8B_CEC_MAX,
> >> + .mask = ERR_MISC0_8B_CEC,
> >> + .shift = ERR_MISC0_CEC_SHIFT,
> >> + },
> >> + [RAS_CE_THRESHOLD_16B] = {
> >> + .max_count = ERR_16B_CEC_MAX,
> >> + .mask = ERR_MISC0_16B_CEC,
> >> + .shift = ERR_MISC0_CEC_SHIFT,
> >> + },
> >> + //TODO: Support 32B CEC threshold.
> >> + [RAS_CE_THRESHOLD_32B] = { 0 },
> >> +};
> >> +
> >> +static void aest_set_ce_threshold(struct aest_record *record)
> >> +{
> >> + u64 err_misc0, ce_count;
> >> + struct ce_threshold *ce = &record->ce;
> >> + const struct ce_threshold_info *info;
> >> +
> >> + record->threshold_type = aest_get_ce_threshold(record);
> >> +
> >> + switch (record->threshold_type) {
> >> + case RAS_CE_THRESHOLD_0B:
> >> + aest_record_dbg(record, "do not support CE threshold!\n");
> >> + return;
> >> + case RAS_CE_THRESHOLD_8B:
> >> + aest_record_dbg(record, "support 8 bit CE threshold!\n");
> >> + break;
> >> + case RAS_CE_THRESHOLD_16B:
> >> + aest_record_dbg(record, "support 16 bit CE threshold!\n");
> >> + break;
> >> + case RAS_CE_THRESHOLD_32B:
> >> + aest_record_dbg(record, "not support 32 bit CE threshold!\n");
> >> + break;
> >> + default:
> >> + aest_record_dbg(record, "Unknown misc0 ce threshold!\n");
> >> + }
> >> +
> >> + err_misc0 = record_read(record, ERXMISC0);
> >> + info = &ce_info[record->threshold_type];
> >> + ce->info = info;
> >> + ce_count = (err_misc0 & info->mask) >> info->shift;
> >> + if (ce_count) {
> >> + ce->count = ce_count;
> >> + ce->threshold = info->max_count - ce_count + 1;
> >> + ce->reg_val = err_misc0;
> >> + aest_record_dbg(record, "CE threshold is %llx, controlled by FW",
> >> + ce->threshold);
> >> + return;
> >> + }
> >> +
> >> + // Default CE threshold is 1.
> >> + ce->count = info->max_count;
> >> + ce->threshold = DEFAULT_CE_THRESHOLD;
> >> + ce->reg_val = err_misc0 | info->mask;
> >> +
> >> + record_write(record, ERXMISC0, ce->reg_val);
> >> + aest_record_dbg(record, "CE threshold is %llx, controlled by Kernel",
> >> + ce->threshold);
> >> +}
> >> +
> >> +static int aest_register_irq(struct aest_device *adev)
> >> +{
> >> + int i, irq, ret;
> >> + char *irq_desc;
> >> +
> >> + irq_desc = devm_kasprintf(adev->dev, GFP_KERNEL, "%s.%s.",
> >> + dev_driver_string(adev->dev),
> >> + dev_name(adev->dev));
> >> + if (!irq_desc)
> >> + return -ENOMEM;
> >> +
> >> + for (i = 0; i < MAX_GSI_PER_NODE; i++) {
> >> + irq = adev->irq[i];
> >> +
> >> + if (!irq)
> >> + continue;
> >> +
> >> + if (irq_is_percpu_devid(irq)) {
> >> + ret = request_percpu_irq(irq, aest_irq_func,
> >> + irq_desc,
> >> + adev->adev_oncore);
> >> + if (ret)
> >> + goto free;
> >> + } else {
> >> + ret = devm_request_irq(adev->dev, irq, aest_irq_func,
> >> + 0, irq_desc, adev);
> >> + if (ret)
> >> + return ret;
> >> + }
> >> + }
> >> + return 0;
> >> +
> >> +free:
> >> + for (; i >= 0; i--) {
> >> + irq = adev->irq[i];
> >> +
> >> + if (irq_is_percpu_devid(irq))
> >> + free_percpu_irq(irq, adev->adev_oncore);
> >> + }
> >> +
> >> + return ret;
> >> +}
> >> +
> >> +static int
> >> +aest_init_record(struct aest_record *record, int i, struct aest_node *node)
> >> +{
> >> + struct device *dev = node->adev->dev;
> >> +
> >> + record->name = devm_kasprintf(dev, GFP_KERNEL, "record%d", i);
> >> + if (!record->name)
> >> + return -ENOMEM;
> >> +
> >> + if (node->base)
> >> + record->regs_base = node->base + sizeof(struct ras_ext_regs) * i;
> >> +
> >> + record->access = &aest_access[node->info->interface_hdr->type];
> >> + record->addressing_mode = test_bit(i, node->info->addressing_mode);
> >> + record->index = i;
> >> + record->node = node;
> >> + record->fr = record_read(record, ERXFR);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static void aest_online_record(struct aest_record *record, void *data)
> >> +{
> >> + if (record->fr & ERR_FR_CE)
> >> + aest_set_ce_threshold(record);
> >> +
> >> + aest_enable_irq(record);
> >> +}
> >> +
> >> +static void aest_online_oncore_node(struct aest_node *node)
> >> +{
> >> + int count;
> >> +
> >> + count = aest_proc(node);
> >> + aest_node_dbg(node, "Find %d error on CPU%d before AEST probe\n",
> >> + count, smp_processor_id());
> >> +
> >> + aest_node_foreach_record(aest_online_record, node, NULL,
> >> + node->record_implemented);
> >> +
> >> + aest_node_foreach_record(aest_online_record, node, NULL,
> >> + node->status_reporting);
> >> +}
> >> +
> >> +static void aest_online_oncore_dev(void *data)
> >> +{
> >> + int fhi_irq, eri_irq, i;
> >> + struct aest_device *adev = this_cpu_ptr(data);
> >> +
> >> + for (i = 0; i < adev->node_cnt; i++)
> >> + aest_online_oncore_node(&adev->nodes[i]);
> >> +
> >> + fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
> >> + if (fhi_irq > 0)
> >> + enable_percpu_irq(fhi_irq, IRQ_TYPE_NONE);
> >> + eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
> >> + if (eri_irq > 0)
> >> + enable_percpu_irq(eri_irq, IRQ_TYPE_NONE);
> >> +}
> >> +
> >> +static void aest_offline_oncore_dev(void *data)
> >> +{
> >> + int fhi_irq, eri_irq;
> >> + struct aest_device *adev = this_cpu_ptr(data);
> >> +
> >> + fhi_irq = adev->irq[ACPI_AEST_NODE_FAULT_HANDLING];
> >> + if (fhi_irq > 0)
> >> + disable_percpu_irq(fhi_irq);
> >> + eri_irq = adev->irq[ACPI_AEST_NODE_ERROR_RECOVERY];
> >> + if (eri_irq > 0)
> >> + disable_percpu_irq(eri_irq);
> >> +}
> >> +
> >> +static void aest_online_dev(struct aest_device *adev)
> >> +{
> >> + int count, i;
> >> + struct aest_node *node;
> >> +
> >> + for (i = 0; i < adev->node_cnt; i++) {
> >> + node = &adev->nodes[i];
> >> +
> >> + if (!node->name)
> >> + continue;
> >> +
> >> + count = aest_proc(node);
> >> + aest_node_dbg(node, "Find %d error before AEST probe\n", count);
> >> +
> >> + aest_config_irq(node);
> >> +
> >> + aest_node_foreach_record(aest_online_record, node, NULL,
> >> + node->record_implemented);
> >> + aest_node_foreach_record(aest_online_record, node, NULL,
> >> + node->status_reporting);
> >> + }
> >> +}
> >> +
> >> +static int aest_starting_cpu(unsigned int cpu)
> >> +{
> >> + pr_debug("CPU%d starting\n", cpu);
> >> + aest_online_oncore_dev(&percpu_adev);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int aest_dying_cpu(unsigned int cpu)
> >> +{
> >> + pr_debug("CPU%d dying\n", cpu);
> >> + aest_offline_oncore_dev(&percpu_adev);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static void aest_device_remove(struct platform_device *pdev)
> >> +{
> >> + struct aest_device *adev = platform_get_drvdata(pdev);
> >> + int i;
> >> +
> >> + platform_set_drvdata(pdev, NULL);
> >> +
> >> + if (adev->type != ACPI_AEST_PROCESSOR_ERROR_NODE)
> >> + return;
> >> +
> >> + on_each_cpu(aest_offline_oncore_dev, adev->adev_oncore, 1);
> >> +
> >> + for (i = 0; i < MAX_GSI_PER_NODE; i++) {
> >> + if (adev->irq[i])
> >> + free_percpu_irq(adev->irq[i], adev->adev_oncore);
> >> + }
> >> +}
> >> +
> >> +
> >> +static int get_aest_node_ver(struct aest_node *node)
> >> +{
> >> + u64 reg;
> >> + void *devarch_base;
> >> +
> >> + if (node->type == ACPI_AEST_GIC_ERROR_NODE) {
> >> + devarch_base = ioremap(node->info->interface_hdr->address +
> >> + GIC_ERRDEVARCH, PAGE_SIZE);
> >> + if (!devarch_base)
> >> + return 0;
> >> +
> >> + reg = readl_relaxed(devarch_base);
> >> + iounmap(devarch_base);
> >> +
> >> + return FIELD_GET(ERRDEVARCH_REV, reg);
> >> + }
> >> +
> >> + return FIELD_GET(ID_AA64PFR0_EL1_RAS_MASK, read_cpuid(ID_AA64PFR0_EL1));
> >> +}
> >> +
> >> +static char *alloc_aest_node_name(struct aest_node *node)
> >> +{
> >> + char *name;
> >> +
> >> + switch (node->type) {
> >> + case ACPI_AEST_PROCESSOR_ERROR_NODE:
> >> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%d",
> >> + aest_node_name[node->type],
> >> + node->info->processor->processor_id);
> >> + break;
> >> + case ACPI_AEST_MEMORY_ERROR_NODE:
> >> + case ACPI_AEST_SMMU_ERROR_NODE:
> >> + case ACPI_AEST_VENDOR_ERROR_NODE:
> >> + case ACPI_AEST_GIC_ERROR_NODE:
> >> + case ACPI_AEST_PCIE_ERROR_NODE:
> >> + case ACPI_AEST_PROXY_ERROR_NODE:
> >> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "%s.%llx",
> >> + aest_node_name[node->type],
> >> + node->info->interface_hdr->address);
> >> + break;
> >> + default:
> >> + name = devm_kasprintf(node->adev->dev, GFP_KERNEL, "Unknown");
> >> + }
> >> +
> >> + return name;
> >> +}
> >> +
> >> +static int
> >> +aest_node_set_errgsr(struct aest_device *adev, struct aest_node *node)
> >> +{
> >> + struct acpi_aest_node *anode = node->info;
> >> + u64 errgsr_base = anode->common->error_group_register_base;
> >> +
> >> + if (anode->interface_hdr->type != ACPI_AEST_NODE_MEMORY_MAPPED)
> >> + return 0;
> >> +
> >> + if (!node->base)
> >> + return 0;
> >> +
> >> + if (!(anode->interface_hdr->flags & AEST_XFACE_FLAG_ERROR_GROUP)) {
> >> + node->errgsr = node->base + ERXGROUP;
> >> + return 0;
> >> + }
> >> +
> >> + if (!errgsr_base)
> >> + return -EINVAL;
> >> +
> >> + node->errgsr = devm_ioremap(adev->dev, errgsr_base, PAGE_SIZE);
> >> + if (!node->errgsr)
> >> + return -ENOMEM;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int aest_init_node(struct aest_device *adev, struct aest_node *node,
> >> + struct acpi_aest_node *anode)
> >> +{
> >> + int i, ret;
> >> + u64 address, size, flags;
> >> +
> >> + node->adev = adev;
> >> + node->info = anode;
> >> + node->type = anode->type;
> >> + node->version = get_aest_node_ver(node);
> >> + node->name = alloc_aest_node_name(node);
> >> + if (!node->name)
> >> + return -ENOMEM;
> >> + node->record_implemented = anode->record_implemented;
> >> + node->status_reporting = anode->status_reporting;
> >> +
> >> + address = anode->interface_hdr->address;
> >> + size = anode->interface_hdr->error_record_count *
> >> + sizeof(struct ras_ext_regs);
> >> + if (address) {
> >> + node->base = devm_ioremap(adev->dev, address, size);
> >> + if (!node->base)
> >> + return -ENOMEM;
> >> + }
> >> +
> >> + flags = anode->interface_hdr->flags;
> >> + address = node->info->common->fault_inject_register_base;
> >> + if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
> >> + node->inj = devm_ioremap(adev->dev, address, PAGE_SIZE);
> >> + if (!node->inj)
> >> + return -ENOMEM;
> >> + }
> >> +
> >> + address = node->info->common->interrupt_config_register_base;
> >> + if ((flags & AEST_XFACE_FLAG_FAULT_INJECT) && address) {
> >> + node->irq_config = devm_ioremap(adev->dev, address, PAGE_SIZE);
> >> + if (!node->irq_config)
> >> + return -ENOMEM;
> >> + }
> >> +
> >> + ret = aest_node_set_errgsr(adev, node);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + node->record_count = anode->interface_hdr->error_record_count;
> >> + node->records = devm_kcalloc(adev->dev, node->record_count,
> >> + sizeof(struct aest_record), GFP_KERNEL);
> >> + if (!node->records)
> >> + return -ENOMEM;
> >> +
> >> + for (i = 0; i < node->record_count; i++) {
> >> + ret = aest_init_record(&node->records[i], i, node);
> >> + if (ret)
> >> + return ret;
> >> + }
> >> + aest_node_dbg(node, "%d records, base: %llx, errgsr: %llx\n",
> >> + node->record_count, (u64)node->base, (u64)node->errgsr);
> >> + return 0;
> >> +}
> >> +
> >> +static int
> >> +aest_init_nodes(struct aest_device *adev, struct aest_hnode *ahnode)
> >> +{
> >> + struct acpi_aest_node *anode;
> >> + struct aest_node *node;
> >> + int ret, i = 0;
> >> +
> >> + adev->node_cnt = ahnode->count;
> >> + adev->nodes = devm_kcalloc(adev->dev, adev->node_cnt,
> >> + sizeof(struct aest_node), GFP_KERNEL);
> >> + if (!adev->nodes)
> >> + return -ENOMEM;
> >> +
> >> + list_for_each_entry(anode, &ahnode->list, list) {
> >> + adev->type = anode->type;
> >> +
> >> + node = &adev->nodes[i++];
> >> + ret = aest_init_node(adev, node, anode);
> >> + if (ret)
> >> + return ret;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int __setup_ppi(struct aest_device *adev)
> >> +{
> >> + int cpu, i;
> >> + struct aest_device *oncore_adev;
> >> + struct aest_node *oncore_node;
> >> + size_t size;
> >> +
> >> + adev->adev_oncore = &percpu_adev;
> >> + for_each_possible_cpu(cpu) {
> >> + oncore_adev = per_cpu_ptr(&percpu_adev, cpu);
> >> + memcpy(oncore_adev, adev, sizeof(struct aest_device));
> >> +
> >> + oncore_adev->nodes = devm_kcalloc(adev->dev,
> >> + oncore_adev->node_cnt,
> >> + sizeof(struct aest_node),
> >> + GFP_KERNEL);
> >> + if (!oncore_adev->nodes)
> >> + return -ENOMEM;
> >> +
> >> + size = adev->node_cnt * sizeof(struct aest_node);
> >> + memcpy(oncore_adev->nodes, adev->nodes, size);
> >> + for (i = 0; i < oncore_adev->node_cnt; i++) {
> >> + oncore_node = &oncore_adev->nodes[i];
> >> + oncore_node->records = devm_kcalloc(adev->dev,
> >> + oncore_node->record_count,
> >> + sizeof(struct aest_record), GFP_KERNEL);
> >> + if (!oncore_node->records)
> >> + return -ENOMEM;
> >> +
> >> + size = oncore_node->record_count *
> >> + sizeof(struct aest_record);
> >> + memcpy(oncore_node->records, adev->nodes[i].records,
> >> + size);
> >> + }
> >> +
> >> + aest_dev_dbg(adev, "Init device on CPU%d.\n", cpu);
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int aest_setup_irq(struct platform_device *pdev, struct aest_device *adev)
> >> +{
> >> + int fhi_irq, eri_irq;
> >> +
> >> + fhi_irq = platform_get_irq_byname_optional(pdev, "fhi");
> >> + if (fhi_irq > 0)
> >> + adev->irq[0] = fhi_irq;
> >> +
> >> + eri_irq = platform_get_irq_byname_optional(pdev, "eri");
> >> + if (eri_irq > 0)
> >> + adev->irq[1] = eri_irq;
> >> +
> >> + /* Allocate and initialise the percpu device pointer for PPI */
> >> + if (irq_is_percpu(fhi_irq) || irq_is_percpu(eri_irq))
> >> + return __setup_ppi(adev);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int aest_device_probe(struct platform_device *pdev)
> >> +{
> >> + int ret;
> >> + struct aest_device *adev;
> >> + struct aest_hnode *ahnode;
> >> +
> >> + ahnode = *((struct aest_hnode **)pdev->dev.platform_data);
> >> + if (!ahnode)
> >> + return -ENODEV;
> >> +
> >> + adev = devm_kzalloc(&pdev->dev, sizeof(*adev), GFP_KERNEL);
> >> + if (!adev)
> >> + return -ENOMEM;
> >> +
> >> + adev->dev = &pdev->dev;
> >> + INIT_WORK(&adev->aest_work, aest_node_pool_process);
> >> + ret = aest_node_pool_init(adev);
> >> + if (ret) {
> >> + aest_dev_err(adev, "Failed init aest node pool.\n");
> >> + return ret;
> >> + }
> >> + init_llist_head(&adev->event_list);
> >> + adev->uid = ahnode->uid;
> >> + aest_set_name(adev, ahnode);
> >> +
> >> + ret = aest_init_nodes(adev, ahnode);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + ret = aest_setup_irq(pdev, adev);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + ret = aest_register_irq(adev);
> >> + if (ret) {
> >> + aest_dev_err(adev, "register irq failed\n");
> >> + return ret;
> >> + }
> >> +
> >> + platform_set_drvdata(pdev, adev);
> >> +
> >> + if (aest_dev_is_oncore(adev))
> >> + ret = cpuhp_setup_state(CPUHP_AP_ARM_AEST_STARTING,
> >> + "drivers/acpi/arm64/aest:starting",
> >> + aest_starting_cpu, aest_dying_cpu);
> >> + else
> >> + aest_online_dev(adev);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + aest_dev_dbg(adev, "Node cnt: %x, uid: %x, irq: %d, %d\n",
> >> + adev->node_cnt, adev->uid, adev->irq[0], adev->irq[1]);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static const struct acpi_device_id acpi_aest_ids[] = {
> >> + {"ARMHE000", 0},
> >> + {}
> >> +};
> >
> > My understanding is that platform device with name "AEST" is
> > created in acpi_aest_alloc_pdev and then the name will be used
> > to bind this driver for the dev. So, do we need ACPI HID definition
> > here? Using name should work well for both systems with or without
> > ARMHE000. Or, am I missing something?
> >
> > I have not yet finish to look all parts and will look them and
> > other patches too.
> >
> > Best Regards,
> > Tomohiro Misono
>
> You are right, I will delete this code in the next version.
>
> Best Regards,
> Ruidong
>
> >
> >> +
> >> +static struct platform_driver aest_driver = {
> >> + .driver = {
> >> + .name = "AEST",
> >> + .acpi_match_table = acpi_aest_ids,
> >> + },
> >> + .probe = aest_device_probe,
> >> + .remove = aest_device_remove,
> >> +};
> >> +
> >> +static int __init aest_init(void)
> >> +{
> >> + return platform_driver_register(&aest_driver);
> >> +}
> >> +module_init(aest_init);
> >> +
> >> +static void __exit aest_exit(void)
> >> +{
> >> + platform_driver_unregister(&aest_driver);
> >> +}
> >> +module_exit(aest_exit);
> >> +
> >> +MODULE_DESCRIPTION("ARM AEST Driver");
> >> +MODULE_AUTHOR("Ruidong Tian <tianruidong@linux.alibaba.com>");
> >> +MODULE_LICENSE("GPL");
> >> +
> >> diff --git a/drivers/ras/aest/aest.h b/drivers/ras/aest/aest.h
> >> new file mode 100644
> >> index 000000000000..04005aad3617
> >> --- /dev/null
> >> +++ b/drivers/ras/aest/aest.h
> >> @@ -0,0 +1,323 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/*
> >> + * ARM Error Source Table Support
> >> + *
> >> + * Copyright (c) 2021-2024, Alibaba Group.
> >> + */
> >> +
> >> +#include <linux/acpi_aest.h>
> >> +#include <asm/ras.h>
> >> +
> >> +#define MAX_GSI_PER_NODE 2
> >> +#define AEST_MAX_PPI 3
> >> +#define DEFAULT_CE_THRESHOLD 1
> >> +
> >> +#define record_read(record, offset) \
> >> + record->access->read(record->regs_base, offset)
> >> +#define record_write(record, offset, val) \
> >> + record->access->write(record->regs_base, offset, val)
> >> +
> >> +#define aest_dev_err(__adev, format, ...) \
> >> + dev_err((__adev)->dev, format, ##__VA_ARGS__)
> >> +#define aest_dev_info(__adev, format, ...) \
> >> + dev_info((__adev)->dev, format, ##__VA_ARGS__)
> >> +#define aest_dev_dbg(__adev, format, ...) \
> >> + dev_dbg((__adev)->dev, format, ##__VA_ARGS__)
> >> +
> >> +#define aest_node_err(__node, format, ...) \
> >> + dev_err((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
> >> +#define aest_node_info(__node, format, ...) \
> >> + dev_info((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
> >> +#define aest_node_dbg(__node, format, ...) \
> >> + dev_dbg((__node)->adev->dev, "%s: " format, (__node)->name, ##__VA_ARGS__)
> >> +
> >> +#define aest_record_err(__record, format, ...) \
> >> + dev_err((__record)->node->adev->dev, "%s: %s: " format, \
> >> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
> >> +#define aest_record_info(__record, format, ...) \
> >> + dev_info((__record)->node->adev->dev, "%s: %s: " format, \
> >> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
> >> +#define aest_record_dbg(__record, format, ...) \
> >> + dev_dbg((__record)->node->adev->dev, "%s: %s: " format, \
> >> + (__record)->node->name, (__record)->name, ##__VA_ARGS__)
> >> +
> >> +#define ERXFR 0x0
> >> +#define ERXCTLR 0x8
> >> +#define ERXSTATUS 0x10
> >> +#define ERXADDR 0x18
> >> +#define ERXMISC0 0x20
> >> +#define ERXMISC1 0x28
> >> +#define ERXMISC2 0x30
> >> +#define ERXMISC3 0x38
> >> +
> >> +#define ERXGROUP 0xE00
> >> +#define GIC_ERRDEVARCH 0xFFBC
> >> +
> >> +extern struct xarray *aest_array;
> >> +
> >> +struct aest_event {
> >> + struct llist_node llnode;
> >> + char *node_name;
> >> + u32 type;
> >> + /*
> >> + * Different nodes have different meanings:
> >> + * - Processor node : processor number.
> >> + * - Memory node : SRAT proximity domain.
> >> + * - SMMU node : IORT proximity domain.
> >> + * - GIC node : interface type.
> >> + */
> >> + u32 id0;
> >> + /*
> >> + * Different nodes have different meanings:
> >> + * - Processor node : processor resource type.
> >> + * - Memory node : None.
> >> + * - SMMU node : subcomponent reference.
> >> + * - Vendor node : Unique ID.
> >> + * - GIC node : instance identifier.
> >> + */
> >> + u32 id1;
> >> + char *hid; // Vendor node : hardware ID.
> >> + u32 index;
> >> + u64 ce_threshold;
> >> + int addressing_mode;
> >> + struct ras_ext_regs regs;
> >> +
> >> + void *vendor_data;
> >> + size_t vendor_data_size;
> >> +};
> >> +
> >> +struct aest_access {
> >> + u64 (*read)(void *base, u32 offset);
> >> + void (*write)(void *base, u32 offset, u64 val);
> >> +};
> >> +
> >> +struct ce_threshold_info {
> >> + const u64 max_count;
> >> + const u64 mask;
> >> + const u64 shift;
> >> +};
> >> +
> >> +struct ce_threshold {
> >> + const struct ce_threshold_info *info;
> >> + u64 count;
> >> + u64 threshold;
> >> + u64 reg_val;
> >> +};
> >> +
> >> +struct aest_record {
> >> + char *name;
> >> + int index;
> >> + void __iomem *regs_base;
> >> +
> >> + /*
> >> + * This bit specifies the addressing mode to populate the ERR_ADDR
> >> + * register:
> >> + * 0b: Error record reports System Physical Addresses (SPA) in
> >> + * the ERR_ADDR register.
> >> + * 1b: Error record reports error node-specific Logical Addresses (LA)
> >> + * in the ERR_ADDR register. The OS must use other means to translate
> >> + * the reported LA into an SPA.
> >> + */
> >> + int addressing_mode;
> >> + u64 fr;
> >> + struct aest_node *node;
> >> +
> >> + struct dentry *debugfs;
> >> + struct ce_threshold ce;
> >> + enum ras_ce_threshold threshold_type;
> >> + const struct aest_access *access;
> >> +
> >> + void *vendor_data;
> >> + size_t vendor_data_size;
> >> +};
> >> +
> >> +struct aest_node {
> >> + char *name;
> >> + u8 type;
> >> + void *errgsr;
> >> + void *inj;
> >> + void *irq_config;
> >> + void *base;
> >> +
> >> + /*
> >> + * This bitmap indicates which of the error records within this error
> >> + * node must be polled for error status.
> >> + * Bit[n] of this field pertains to error record corresponding to
> >> + * index n in this error group.
> >> + * Bit[n] = 0b: Error record at index n needs to be polled.
> >> + * Bit[n] = 1b: Error record at index n does not need to be polled.
> >> + */
> >> + unsigned long *record_implemented;
> >> + /*
> >> + * This bitmap indicates which of the error records within this error
> >> + * node support error status reporting using ERRGSR register.
> >> + * Bit[n] of this field pertains to error record corresponding to
> >> + * index n in this error group.
> >> + * Bit[n] = 0b: Error record at index n supports error status reporting
> >> + * through ERRGSR.S.
> >> + * Bit[n] = 1b: Error record at index n does not support error reporting
> >> + * through the ERRGSR.S bit. If this error record is
> >> + * implemented, then it must be polled explicitly for
> >> + * error events.
> >> + */
> >> + unsigned long *status_reporting;
> >> + int version;
> >> +
> >> + struct aest_device *adev;
> >> + struct acpi_aest_node *info;
> >> + struct dentry *debugfs;
> >> +
> >> + int record_count;
> >> + struct aest_record *records;
> >> +
> >> + struct aest_node __percpu *oncore_node;
> >> +};
> >> +
> >> +struct aest_device {
> >> + struct device *dev;
> >> + u32 type;
> >> + int node_cnt;
> >> + struct aest_node *nodes;
> >> +
> >> + struct work_struct aest_work;
> >> + struct gen_pool *pool;
> >> + struct llist_head event_list;
> >> +
> >> + int irq[MAX_GSI_PER_NODE];
> >> + u32 uid;
> >> + struct aest_device __percpu *adev_oncore;
> >> +
> >> + struct dentry *debugfs;
> >> +};
> >> +
> >> +struct aest_node_context {
> >> + struct aest_node *node;
> >> + unsigned long *bitmap;
> >> + void (*func)(struct aest_record *record,
> >> + void *data);
> >> + void *data;
> >> + int ret;
> >> +};
> >> +
> >> +#define CASE_READ(res, x) \
> >> + case (x): { \
> >> + res = read_sysreg_s(SYS_##x##_EL1); \
> >> + break; \
> >> + }
> >> +
> >> +#define CASE_WRITE(val, x) \
> >> + case (x): { \
> >> + write_sysreg_s((val), SYS_##x##_EL1); \
> >> + break; \
> >> + }
> >> +
> >> +static inline u64 aest_sysreg_read(void *__unused, u32 offset)
> >> +{
> >> + u64 res;
> >> +
> >> + switch (offset) {
> >> + CASE_READ(res, ERXFR)
> >> + CASE_READ(res, ERXCTLR)
> >> + CASE_READ(res, ERXSTATUS)
> >> + CASE_READ(res, ERXADDR)
> >> + CASE_READ(res, ERXMISC0)
> >> + CASE_READ(res, ERXMISC1)
> >> + CASE_READ(res, ERXMISC2)
> >> + CASE_READ(res, ERXMISC3)
> >> + default:
> >> + res = 0;
> >> + }
> >> + return res;
> >> +}
> >> +
> >> +static inline void aest_sysreg_write(void *base, u32 offset, u64 val)
> >> +{
> >> + switch (offset) {
> >> + CASE_WRITE(val, ERXFR)
> >> + CASE_WRITE(val, ERXCTLR)
> >> + CASE_WRITE(val, ERXSTATUS)
> >> + CASE_WRITE(val, ERXADDR)
> >> + CASE_WRITE(val, ERXMISC0)
> >> + CASE_WRITE(val, ERXMISC1)
> >> + CASE_WRITE(val, ERXMISC2)
> >> + CASE_WRITE(val, ERXMISC3)
> >> + default:
> >> + return;
> >> + }
> >> +}
> >> +
> >> +static inline u64 aest_iomem_read(void *base, u32 offset)
> >> +{
> >> + return readq_relaxed(base + offset);
> >> +}
> >> +
> >> +static inline void aest_iomem_write(void *base, u32 offset, u64 val)
> >> +{
> >> + writeq_relaxed(val, base + offset);
> >> +}
> >> +
> >> +/* access type is decided by AEST interface type. */
> >> +static const struct aest_access aest_access[] = {
> >> + [ACPI_AEST_NODE_SYSTEM_REGISTER] = {
> >> + .read = aest_sysreg_read,
> >> + .write = aest_sysreg_write,
> >> + },
> >> +
> >> + [ACPI_AEST_NODE_MEMORY_MAPPED] = {
> >> + .read = aest_iomem_read,
> >> + .write = aest_iomem_write,
> >> + },
> >> + [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] = {
> >> + .read = aest_iomem_read,
> >> + .write = aest_iomem_write,
> >> + },
> >> + { }
> >> +};
> >> +
> >> +static inline bool aest_dev_is_oncore(struct aest_device *adev)
> >> +{
> >> + return adev->type == ACPI_AEST_PROCESSOR_ERROR_NODE;
> >> +}
> >> +
> >> +/*
> >> + * Each PE may have multiple error records; software must select an
> >> + * error record before accessing it through the Error Record System
> >> + * registers.
> >> + */
> >> +static inline void aest_select_record(struct aest_node *node, int index)
> >> +{
> >> + if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE) {
> >> + write_sysreg_s(index, SYS_ERRSELR_EL1);
> >> + isb();
> >> + }
> >> +}
> >> +
> >> +/* Ensure all writes have taken effect. */
> >> +static inline void aest_sync(struct aest_node *node)
> >> +{
> >> + if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE)
> >> + isb();
> >> +}
> >> +
> >> +static const char * const aest_node_name[] = {
> >> + [ACPI_AEST_PROCESSOR_ERROR_NODE] = "processor",
> >> + [ACPI_AEST_MEMORY_ERROR_NODE] = "memory",
> >> + [ACPI_AEST_SMMU_ERROR_NODE] = "smmu",
> >> + [ACPI_AEST_VENDOR_ERROR_NODE] = "vendor",
> >> + [ACPI_AEST_GIC_ERROR_NODE] = "gic",
> >> + [ACPI_AEST_PCIE_ERROR_NODE] = "pcie",
> >> + [ACPI_AEST_PROXY_ERROR_NODE] = "proxy",
> >> +};
> >> +
> >> +static inline int
> >> +aest_set_name(struct aest_device *adev, struct aest_hnode *ahnode)
> >> +{
> >> + adev->dev->init_name = devm_kasprintf(adev->dev, GFP_KERNEL,
> >> + "%s%d", aest_node_name[ahnode->type],
> >> + adev->uid);
> >> + if (!adev->dev->init_name)
> >> + return -ENOMEM;
> >> +
> >> + return 0;
> >> +}
> >> diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h
> >> new file mode 100644
> >> index 000000000000..1c2191791504
> >> --- /dev/null
> >> +++ b/include/linux/acpi_aest.h
> >> @@ -0,0 +1,68 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +#ifndef __ACPI_AEST_H__
> >> +#define __ACPI_AEST_H__
> >> +
> >> +#include <linux/acpi.h>
> >> +#include <asm/ras.h>
> >> +
> >> +/* AEST component */
> >> +#define ACPI_AEST_PROC_FLAG_GLOBAL (1<<0)
> >> +#define ACPI_AEST_PROC_FLAG_SHARED (1<<1)
> >> +
> >> +#define AEST_ADDREESS_SPA 0
> >> +#define AEST_ADDREESS_LA 1
> >> +
> >> +/* AEST interrupt */
> >> +#define AEST_INTERRUPT_MODE BIT(0)
> >> +#define AEST_INTERRUPT_FHI_MODE BIT(1)
> >> +
> >> +#define AEST_INTERRUPT_FHI_UE_SUPPORT BIT(0)
> >> +#define AEST_INTERRUPT_FHI_UE_NO_SUPPORT BIT(1)
> >> +
> >> +#define AEST_MAX_INTERRUPT_PER_NODE 3
> >> +
> >> +/* AEST interface */
> >> +
> >> +#define AEST_XFACE_FLAG_SHARED (1<<0)
> >> +#define AEST_XFACE_FLAG_CLEAR_MISC (1<<1)
> >> +#define AEST_XFACE_FLAG_ERROR_DEVICE (1<<2)
> >> +#define AEST_XFACE_FLAG_AFFINITY (1<<3)
> >> +#define AEST_XFACE_FLAG_ERROR_GROUP (1<<4)
> >> +#define AEST_XFACE_FLAG_FAULT_INJECT (1<<5)
> >> +#define AEST_XFACE_FLAG_INT_CONFIG (1<<6)
> >> +
> >> +struct aest_hnode {
> >> + struct list_head list;
> >> + int count;
> >> + u32 uid;
> >> + int type;
> >> +};
> >> +
> >> +struct acpi_aest_node {
> >> + struct list_head list;
> >> + int type;
> >> + struct acpi_aest_node_interface_header *interface_hdr;
> >> + unsigned long *record_implemented;
> >> + unsigned long *status_reporting;
> >> + unsigned long *addressing_mode;
> >> + struct acpi_aest_node_interface_common *common;
> >> + union {
> >> + struct acpi_aest_processor *processor;
> >> + struct acpi_aest_memory *memory;
> >> + struct acpi_aest_smmu *smmu;
> >> + struct acpi_aest_vendor_v2 *vendor;
> >> + struct acpi_aest_gic *gic;
> >> + struct acpi_aest_pcie *pcie;
> >> + struct acpi_aest_proxy *proxy;
> >> + void *spec_pointer;
> >> + };
> >> + union {
> >> + struct acpi_aest_processor_cache *cache;
> >> + struct acpi_aest_processor_tlb *tlb;
> >> + struct acpi_aest_processor_generic *generic;
> >> + void *processor_spec_pointer;
> >> + };
> >> + struct acpi_aest_node_interrupt_v2 *interrupt;
> >> + int interrupt_count;
> >> +};
> >> +#endif /* __ACPI_AEST_H__ */
> >> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> >> index a04b73c40173..acf0e3957fdd 100644
> >> --- a/include/linux/cpuhotplug.h
> >> +++ b/include/linux/cpuhotplug.h
> >> @@ -179,6 +179,7 @@ enum cpuhp_state {
> >> CPUHP_AP_CSKY_TIMER_STARTING,
> >> CPUHP_AP_TI_GP_TIMER_STARTING,
> >> CPUHP_AP_HYPERV_TIMER_STARTING,
> >> + CPUHP_AP_ARM_AEST_STARTING,
> >> /* Must be the last timer callback */
> >> CPUHP_AP_DUMMY_TIMER_STARTING,
> >> CPUHP_AP_ARM_XEN_STARTING,
> >> diff --git a/include/linux/ras.h b/include/linux/ras.h
> >> index a64182bc72ad..1c777af6a1af 100644
> >> --- a/include/linux/ras.h
> >> +++ b/include/linux/ras.h
> >> @@ -53,4 +53,12 @@ static inline unsigned long
> >> amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err) { return -EINVAL; }
> >> #endif /* CONFIG_AMD_ATL */
> >>
> >> +#if IS_ENABLED(CONFIG_AEST)
> >> +void aest_register_decode_chain(struct notifier_block *nb);
> >> +void aest_unregister_decode_chain(struct notifier_block *nb);
> >> +#else
> >> +static inline void aest_register_decode_chain(struct notifier_block *nb) {}
> >> +static inline void aest_unregister_decode_chain(struct notifier_block *nb) {}
> >> +#endif /* CONFIG_AEST */
> >> +
> >> #endif /* __RAS_H__ */
> >> --
> >> 2.33.1
> >>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/5] ARM Error Source Table V2 Support
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
` (4 preceding siblings ...)
2025-01-15 8:42 ` [PATCH v3 5/5] trace, ras: add ARM RAS extension trace event Ruidong Tian
@ 2025-02-19 20:30 ` Borislav Petkov
5 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-02-19 20:30 UTC (permalink / raw)
To: Ruidong Tian
Cc: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, yazen.ghannam
On Wed, Jan 15, 2025 at 04:42:23PM +0800, Ruidong Tian wrote:
> AEST provides a mechanism for hardware to directly notify Kernel to
> handle RAS errors through interrupts, which is also known as Kernel-first
> mode.
Kernel first? Srsly? No? Oh.
https://www.youtube.com/watch?v=pFjSDM6D500
So what, folks realized finally that firmware-first is simply a stinking pile,
after a decade or so.
> AEST's Advantage
> ========================
>
> 1. AEST uses EL1 interrupts to report CE/DE, making it more lightweight
> than GHES (the Firmware First solution on Arm).
ROTFL.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver
2025-01-15 8:42 ` [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver Ruidong Tian
2025-01-17 10:50 ` Tomohiro Misono (Fujitsu)
@ 2025-02-19 20:49 ` Borislav Petkov
1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-02-19 20:49 UTC (permalink / raw)
To: Ruidong Tian
Cc: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, yazen.ghannam, Tyler Baicar
Just some cursory review...
On Wed, Jan 15, 2025 at 04:42:24PM +0800, Ruidong Tian wrote:
> Add support for parsing the ARM Error Source Table and basic handling of
> errors reported through both memory mapped and system register interfaces.
>
> Assume system register interfaces are only registered with private
> peripheral interrupts (PPIs); otherwise there is no guarantee the
> core handling the error is the core which took the error and has the
> syndrome info in its system registers.
>
> In kernel-first mode, all configuration is controlled by kernel, include
> CE ce_threshold and interrupt enable/disable.
>
> All detected errors will be processed as follow:
> - CE, DE: use a workqueue to log this hare errors.
> - UER, UEO: log it and call memory_failun workquee.
> - UC, UEU: panic in irq context.
Use a spellchecker for all your text.
In addition, use AI to check your English formulations.
> Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Who's the author: Tyler or you?
That's denoted with the From: field.
Make sure you go over Documentation/process/submitting-patches.rst for basic
mistakes.
> ---
> MAINTAINERS | 10 +
> arch/arm64/include/asm/ras.h | 95 ++++
> drivers/acpi/arm64/Kconfig | 11 +
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/aest.c | 335 ++++++++++++
> drivers/acpi/arm64/init.c | 2 +
> drivers/acpi/arm64/init.h | 1 +
> drivers/ras/Kconfig | 1 +
> drivers/ras/Makefile | 1 +
> drivers/ras/aest/Kconfig | 17 +
> drivers/ras/aest/Makefile | 5 +
> drivers/ras/aest/aest-core.c | 976 +++++++++++++++++++++++++++++++++++
> drivers/ras/aest/aest.h | 323 ++++++++++++
> include/linux/acpi_aest.h | 68 +++
> include/linux/cpuhotplug.h | 1 +
> include/linux/ras.h | 8 +
> 16 files changed, 1855 insertions(+)
This patch is huuge and unreviewable: split it.
Also, I see issues like:
check_for_todos: WARNING: drivers/ras/aest/aest-core.c:207: Hunk contains unfinished TODO:
+ /* TODO: translate Logical Addresses to System Physical Addresses */
check_for_todos: WARNING: drivers/ras/aest/aest-core.c:446: Hunk contains unfinished TODO:
+ //TODO: Support 32B CEC threshold.
A TODO tells me that patch is not ready for upstream.
Also, get rid of all // comments in drivers/ras/ and use normal /* style.
Enough for now.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface
2025-01-15 8:42 ` [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface Ruidong Tian
2025-01-18 2:37 ` kernel test robot
@ 2025-02-19 20:55 ` Borislav Petkov
1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-02-19 20:55 UTC (permalink / raw)
To: Ruidong Tian
Cc: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, yazen.ghannam
On Wed, Jan 15, 2025 at 04:42:25PM +0800, Ruidong Tian wrote:
> Exposes certain AEST driver information to userspace.
Why?
Why do we need to support a debug interface indefinitely.
If "no real reason", then drop it.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD
2025-01-15 8:42 ` [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD Ruidong Tian
2025-01-17 6:14 ` kernel test robot
@ 2025-02-19 21:00 ` Borislav Petkov
1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-02-19 21:00 UTC (permalink / raw)
To: Ruidong Tian
Cc: catalin.marinas, will, lpieralisi, guohanjun, sudeep.holla,
xueshuai, baolin.wang, linux-kernel, linux-acpi, linux-arm-kernel,
rafael, lenb, tony.luck, yazen.ghannam
On Wed, Jan 15, 2025 at 04:42:27PM +0800, Ruidong Tian wrote:
> Subject: Re: [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD
The condensed patch description in the subject line should start with an
uppercase letter and should be written in imperative tone.
> Translate device normalize address in AMD, also named logical address,
> to system physical address is a common interface in RAS. Provides common
> interface both for AMD and ARM.
This needs a lot more explanation.
> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
> ---
> drivers/edac/amd64_edac.c | 2 +-
> drivers/ras/aest/aest-core.c | 12 ++++++------
> drivers/ras/amd/atl/core.c | 4 ++--
> drivers/ras/amd/atl/internal.h | 2 +-
> drivers/ras/amd/atl/umc.c | 3 ++-
> drivers/ras/ras.c | 24 +++++++++++-------------
> include/linux/ras.h | 9 ++++-----
> 7 files changed, 27 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index ddfbdb66b794..1e9c96e4daa8 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -2832,7 +2832,7 @@ static void decode_umc_error(int node_id, struct mce *m)
> a_err.ipid = m->ipid;
> a_err.cpu = m->extcpu;
>
> - sys_addr = amd_convert_umc_mca_addr_to_sys_addr(&a_err);
> + sys_addr = convert_ras_la_to_spa(&a_err);
No, this is not how all this is done. You don't rename functions and make them
generic - you *extract* generic functionality into generic functions and have
other functions which use them, call them.
And you do that when there are users, not before.
Ok, that should be enough feedback for now.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
end of thread, other threads:[~2025-02-19 21:02 UTC | newest]
Thread overview: 16+ messages
-- links below jump to the message on this page --
2025-01-15 8:42 [PATCH v3 0/5] ARM Error Source Table V2 Support Ruidong Tian
2025-01-15 8:42 ` [PATCH v3 1/5] ACPI/RAS/AEST: Initial AEST driver Ruidong Tian
2025-01-17 10:50 ` Tomohiro Misono (Fujitsu)
2025-02-06 8:32 ` Ruidong Tian
2025-02-14 9:14 ` Tomohiro Misono (Fujitsu)
2025-02-19 20:49 ` Borislav Petkov
2025-01-15 8:42 ` [PATCH v3 2/5] RAS/AEST: Introduce AEST driver sysfs interface Ruidong Tian
2025-01-18 2:37 ` kernel test robot
2025-02-19 20:55 ` Borislav Petkov
2025-01-15 8:42 ` [PATCH v3 3/5] RAS/AEST: Introduce AEST inject interface to test AEST driver Ruidong Tian
2025-01-17 7:07 ` kernel test robot
2025-01-15 8:42 ` [PATCH v3 4/5] RAS/ATL: Unified ATL interface for ARM64 and AMD Ruidong Tian
2025-01-17 6:14 ` kernel test robot
2025-02-19 21:00 ` Borislav Petkov
2025-01-15 8:42 ` [PATCH v3 5/5] trace, ras: add ARM RAS extension trace event Ruidong Tian
2025-02-19 20:30 ` [PATCH v3 0/5] ARM Error Source Table V2 Support Borislav Petkov