* [PATCH v10 0/7] Enable EINJv2 Support
@ 2025-06-17 19:30 Zaid Alali
2025-06-17 19:30 ` [PATCH v10 1/7] ACPI: APEI: EINJ: Fix kernel test sparse warnings Zaid Alali
` (7 more replies)
0 siblings, 8 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
The goal of this update is to allow the driver to simultaneously
support EINJ and EINJv2. The implementation follows the ACPI 6.6
specification[1], which enables the driver to discover system
capabilities through GET_ERROR_TYPE.
Link: https://uefi.org/specs/ACPI/6.6/18_Platform_Error_Interfaces.html#error-injection [1]
V5:
*Users no longer input the component array size; instead it
is counted by parsing the component array itself.
V6:
*Fix memory leak.
*If EINJv2 initialization fails, EINJv1 still works, and the
probe function continues with EINJv2 disabled.
V7:
*Update component array to take 128-bit values to match ACPI specs.
*Enable Vendor EINJv2 injections
*Moved component array parsing and validating to a separate
function to improve readability.
V8:
*Update UI to use single value files for component array.
*Update links to point to recent ACPI 6.6 spec release.
*Updated commit messages and documentation patch.
*Dropped the first two patches as they were merged via
ACPICA project.
V9:
*Fix commit messages signed-off/reviewed-by order.
*Fix sparse warning by defining syndrome_data as a
static struct.
V10:
*Use a defined value instead of a hard-coded one for the component
array size
*Unset EINJv2 flag for EINJv1 injections
Tony Luck (1):
ACPI: APEI: EINJ: Create debugfs files to enter device id and syndrome
Zaid Alali (6):
ACPI: APEI: EINJ: Fix kernel test sparse warnings
ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities
ACPI: APEI: EINJ: Add einjv2 extension struct
ACPI: APEI: EINJ: Discover EINJv2 parameters
ACPI: APEI: EINJ: Enable EINJv2 error injections
ACPI: APEI: EINJ: Update the documentation for EINJv2 support
.../firmware-guide/acpi/apei/einj.rst | 33 ++
drivers/acpi/apei/apei-internal.h | 2 +-
drivers/acpi/apei/einj-core.c | 374 ++++++++++++++----
drivers/acpi/apei/einj-cxl.c | 2 +-
4 files changed, 342 insertions(+), 69 deletions(-)
--
2.43.0
* [PATCH v10 1/7] ACPI: APEI: EINJ: Fix kernel test sparse warnings
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-07-03 20:04 ` [PATCH] ACPI: APEI: EINJ: Fix trigger actions Tony Luck
2025-06-17 19:30 ` [PATCH v10 2/7] ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities Zaid Alali
` (6 subsequent siblings)
7 siblings, 1 reply; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
This patch fixes the kernel test robot warning reported here:
Link: https://lore.kernel.org/all/202410241620.oApALow5-lkp@intel.com/
Use pointers annotated with the __iomem marker for all iomem map calls,
and create a local copy of the mapped IO memory for future access in
the code. memcpy_fromio() and memcpy_toio() are used to read/write data
from/to mapped IO memory.
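
The copy-in, modify, copy-out pattern described above can be sketched in
plain user-space C. This is only an illustration under stated assumptions:
struct demo_param and the demo_copy_fromio()/demo_copy_toio() helpers are
hypothetical stand-ins for the driver's parameter structures and for the
kernel's memcpy_fromio()/memcpy_toio(); they are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block; a stand-in, not the driver's real layout. */
struct demo_param {
	uint64_t reserved1;
	uint64_t reserved2;
	uint64_t param1;
	uint64_t param2;
};

/* User-space stand-in for memcpy_fromio(): byte-wise volatile reads. */
static void demo_copy_fromio(void *dst, const volatile void *src, size_t n)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/* User-space stand-in for memcpy_toio(): byte-wise volatile writes. */
static void demo_copy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/* Read the whole block into a local copy, update it, then write it back. */
static void demo_set_params(volatile void *io, uint64_t p1, uint64_t p2)
{
	struct demo_param local;

	demo_copy_fromio(&local, io, sizeof(local));
	local.param1 = p1;
	local.param2 = p2;
	demo_copy_toio(io, &local, sizeof(local));
}
```

The mapped pointer is never dereferenced directly; all field accesses go
through the local copy, which is the property sparse checks for.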
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
drivers/acpi/apei/einj-core.c | 106 +++++++++++++++++++---------------
1 file changed, 60 insertions(+), 46 deletions(-)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index 9b041415a9d0..e4fb4405deae 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -151,7 +151,7 @@ static DEFINE_MUTEX(einj_mutex);
*/
bool einj_initialized __ro_after_init;
-static void *einj_param;
+static void __iomem *einj_param;
static void einj_exec_ctx_init(struct apei_exec_context *ctx)
{
@@ -216,24 +216,26 @@ static void check_vendor_extension(u64 paddr,
struct set_error_type_with_address *v5param)
{
int offset = v5param->vendor_extension;
- struct vendor_error_type_extension *v;
+ struct vendor_error_type_extension v;
+ struct vendor_error_type_extension __iomem *p;
u32 sbdf;
if (!offset)
return;
- v = acpi_os_map_iomem(paddr + offset, sizeof(*v));
- if (!v)
+ p = acpi_os_map_iomem(paddr + offset, sizeof(*p));
+ if (!p)
return;
- get_oem_vendor_struct(paddr, offset, v);
- sbdf = v->pcie_sbdf;
+ memcpy_fromio(&v, p, sizeof(v));
+ get_oem_vendor_struct(paddr, offset, &v);
+ sbdf = v.pcie_sbdf;
sprintf(vendor_dev, "%x:%x:%x.%x vendor_id=%x device_id=%x rev_id=%x\n",
sbdf >> 24, (sbdf >> 16) & 0xff,
(sbdf >> 11) & 0x1f, (sbdf >> 8) & 0x7,
- v->vendor_id, v->device_id, v->rev_id);
- acpi_os_unmap_iomem(v, sizeof(*v));
+ v.vendor_id, v.device_id, v.rev_id);
+ acpi_os_unmap_iomem(p, sizeof(v));
}
-static void *einj_get_parameter_address(void)
+static void __iomem *einj_get_parameter_address(void)
{
int i;
u64 pa_v4 = 0, pa_v5 = 0;
@@ -254,26 +256,30 @@ static void *einj_get_parameter_address(void)
entry++;
}
if (pa_v5) {
- struct set_error_type_with_address *v5param;
+ struct set_error_type_with_address v5param;
+ struct set_error_type_with_address __iomem *p;
- v5param = acpi_os_map_iomem(pa_v5, sizeof(*v5param));
- if (v5param) {
+ p = acpi_os_map_iomem(pa_v5, sizeof(*p));
+ if (p) {
+ memcpy_fromio(&v5param, p, sizeof(v5param));
acpi5 = 1;
- check_vendor_extension(pa_v5, v5param);
- return v5param;
+ check_vendor_extension(pa_v5, &v5param);
+ return p;
}
}
if (param_extension && pa_v4) {
- struct einj_parameter *v4param;
+ struct einj_parameter v4param;
+ struct einj_parameter __iomem *p;
- v4param = acpi_os_map_iomem(pa_v4, sizeof(*v4param));
- if (!v4param)
+ p = acpi_os_map_iomem(pa_v4, sizeof(*p));
+ if (!p)
return NULL;
- if (v4param->reserved1 || v4param->reserved2) {
- acpi_os_unmap_iomem(v4param, sizeof(*v4param));
+ memcpy_fromio(&v4param, p, sizeof(v4param));
+ if (v4param.reserved1 || v4param.reserved2) {
+ acpi_os_unmap_iomem(p, sizeof(v4param));
return NULL;
}
- return v4param;
+ return p;
}
return NULL;
@@ -319,7 +325,7 @@ static struct acpi_generic_address *einj_get_trigger_parameter_region(
static int __einj_error_trigger(u64 trigger_paddr, u32 type,
u64 param1, u64 param2)
{
- struct acpi_einj_trigger *trigger_tab = NULL;
+ struct acpi_einj_trigger trigger_tab;
struct apei_exec_context trigger_ctx;
struct apei_resources trigger_resources;
struct acpi_whea_header *trigger_entry;
@@ -327,54 +333,57 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
u32 table_size;
int rc = -EIO;
struct acpi_generic_address *trigger_param_region = NULL;
+ struct acpi_einj_trigger __iomem *p = NULL;
- r = request_mem_region(trigger_paddr, sizeof(*trigger_tab),
+ r = request_mem_region(trigger_paddr, sizeof(trigger_tab),
"APEI EINJ Trigger Table");
if (!r) {
pr_err("Can not request [mem %#010llx-%#010llx] for Trigger table\n",
(unsigned long long)trigger_paddr,
(unsigned long long)trigger_paddr +
- sizeof(*trigger_tab) - 1);
+ sizeof(trigger_tab) - 1);
goto out;
}
- trigger_tab = ioremap_cache(trigger_paddr, sizeof(*trigger_tab));
- if (!trigger_tab) {
+ p = ioremap_cache(trigger_paddr, sizeof(*p));
+ if (!p) {
pr_err("Failed to map trigger table!\n");
goto out_rel_header;
}
- rc = einj_check_trigger_header(trigger_tab);
+ memcpy_fromio(&trigger_tab, p, sizeof(trigger_tab));
+ rc = einj_check_trigger_header(&trigger_tab);
if (rc) {
pr_warn(FW_BUG "Invalid trigger error action table.\n");
goto out_rel_header;
}
/* No action structures in the TRIGGER_ERROR table, nothing to do */
- if (!trigger_tab->entry_count)
+ if (!trigger_tab.entry_count)
goto out_rel_header;
rc = -EIO;
- table_size = trigger_tab->table_size;
- r = request_mem_region(trigger_paddr + sizeof(*trigger_tab),
- table_size - sizeof(*trigger_tab),
+ table_size = trigger_tab.table_size;
+ r = request_mem_region(trigger_paddr + sizeof(trigger_tab),
+ table_size - sizeof(trigger_tab),
"APEI EINJ Trigger Table");
if (!r) {
pr_err("Can not request [mem %#010llx-%#010llx] for Trigger Table Entry\n",
- (unsigned long long)trigger_paddr + sizeof(*trigger_tab),
+ (unsigned long long)trigger_paddr + sizeof(trigger_tab),
(unsigned long long)trigger_paddr + table_size - 1);
goto out_rel_header;
}
- iounmap(trigger_tab);
- trigger_tab = ioremap_cache(trigger_paddr, table_size);
- if (!trigger_tab) {
+ iounmap(p);
+ p = ioremap_cache(trigger_paddr, table_size);
+ if (!p) {
pr_err("Failed to map trigger table!\n");
goto out_rel_entry;
}
+ memcpy_fromio(&trigger_tab, p, sizeof(trigger_tab));
trigger_entry = (struct acpi_whea_header *)
- ((char *)trigger_tab + sizeof(struct acpi_einj_trigger));
+ ((char *)&trigger_tab + sizeof(struct acpi_einj_trigger));
apei_resources_init(&trigger_resources);
apei_exec_ctx_init(&trigger_ctx, einj_ins_type,
ARRAY_SIZE(einj_ins_type),
- trigger_entry, trigger_tab->entry_count);
+ trigger_entry, trigger_tab.entry_count);
rc = apei_exec_collect_resources(&trigger_ctx, &trigger_resources);
if (rc)
goto out_fini;
@@ -392,7 +401,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
apei_resources_init(&addr_resources);
trigger_param_region = einj_get_trigger_parameter_region(
- trigger_tab, param1, param2);
+ &trigger_tab, param1, param2);
if (trigger_param_region) {
rc = apei_resources_add(&addr_resources,
trigger_param_region->address,
@@ -421,13 +430,13 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
out_fini:
apei_resources_fini(&trigger_resources);
out_rel_entry:
- release_mem_region(trigger_paddr + sizeof(*trigger_tab),
- table_size - sizeof(*trigger_tab));
+ release_mem_region(trigger_paddr + sizeof(trigger_tab),
+ table_size - sizeof(trigger_tab));
out_rel_header:
- release_mem_region(trigger_paddr, sizeof(*trigger_tab));
+ release_mem_region(trigger_paddr, sizeof(trigger_tab));
out:
- if (trigger_tab)
- iounmap(trigger_tab);
+ if (p)
+ iounmap(p);
return rc;
}
@@ -446,8 +455,10 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
return rc;
apei_exec_ctx_set_input(&ctx, type);
if (acpi5) {
- struct set_error_type_with_address *v5param = einj_param;
+ struct set_error_type_with_address *v5param, v5_struct;
+ v5param = &v5_struct;
+ memcpy_fromio(v5param, einj_param, sizeof(*v5param));
v5param->type = type;
if (type & ACPI5_VENDOR_BIT) {
switch (vendor_flags) {
@@ -492,15 +503,18 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
break;
}
}
+ memcpy_toio(einj_param, v5param, sizeof(*v5param));
} else {
rc = apei_exec_run(&ctx, ACPI_EINJ_SET_ERROR_TYPE);
if (rc)
return rc;
if (einj_param) {
- struct einj_parameter *v4param = einj_param;
+ struct einj_parameter v4param;
- v4param->param1 = param1;
- v4param->param2 = param2;
+ memcpy_fromio(&v4param, einj_param, sizeof(v4param));
+ v4param.param1 = param1;
+ v4param.param2 = param2;
+ memcpy_toio(einj_param, &v4param, sizeof(v4param));
}
}
rc = apei_exec_run(&ctx, ACPI_EINJ_EXECUTE_OPERATION);
--
2.43.0
* [PATCH v10 2/7] ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
2025-06-17 19:30 ` [PATCH v10 1/7] ACPI: APEI: EINJ: Fix kernel test sparse warnings Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-06-17 19:30 ` [PATCH v10 3/7] ACPI: APEI: EINJ: Add einjv2 extension struct Zaid Alali
` (5 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
Enable the driver to show all supported error injections for EINJ
and EINJv2 at the same time. EINJv2 capabilities can be discovered
by checking the return value of get_error_type, where bit 30 set
indicates EINJv2 support.
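
A minimal user-space sketch of this two-step discovery follows. The names
demo_query_fn, demo_fake_firmware() and demo_get_error_types() are
hypothetical, and the masks returned by the fake firmware are made up for
illustration; only the use of bit 30 as the EINJv2 indicator comes from the
description above.

```c
#include <stdint.h>

#define DEMO_EINJV2_SUPP (UINT32_C(1) << 30)	/* same bit as ACPI65_EINJV2_SUPP */

enum demo_action { DEMO_GET_ERROR_TYPE, DEMO_V2_GET_ERROR_TYPE };

/* Hypothetical firmware query: returns 0 on success and stores a mask. */
typedef int (*demo_query_fn)(enum demo_action action, uint32_t *mask);

/* Made-up firmware reporting one EINJv1 type plus EINJv2 support. */
static int demo_fake_firmware(enum demo_action action, uint32_t *mask)
{
	if (action == DEMO_GET_ERROR_TYPE)
		*mask = DEMO_EINJV2_SUPP | (UINT32_C(1) << 3);
	else
		*mask = UINT32_C(1) << 1;	/* "EINJV2 Memory Error" */
	return 0;
}

/* Query EINJv1 types first; ask for EINJv2 types only if bit 30 is set. */
static int demo_get_error_types(demo_query_fn query, uint32_t *v1, uint32_t *v2)
{
	int rc = query(DEMO_GET_ERROR_TYPE, v1);

	if (rc)
		return rc;
	*v2 = 0;
	if (*v1 & DEMO_EINJV2_SUPP)
		rc = query(DEMO_V2_GET_ERROR_TYPE, v2);
	return rc;
}
```

On a platform without bit 30 set, the second query is skipped and the
EINJv2 mask stays zero, so only EINJv1 types would be listed.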
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
drivers/acpi/apei/apei-internal.h | 2 +-
drivers/acpi/apei/einj-core.c | 75 +++++++++++++++++++++++++------
drivers/acpi/apei/einj-cxl.c | 2 +-
3 files changed, 63 insertions(+), 16 deletions(-)
diff --git a/drivers/acpi/apei/apei-internal.h b/drivers/acpi/apei/apei-internal.h
index cd2766c69d78..77c10a7a7a9f 100644
--- a/drivers/acpi/apei/apei-internal.h
+++ b/drivers/acpi/apei/apei-internal.h
@@ -131,7 +131,7 @@ static inline u32 cper_estatus_len(struct acpi_hest_generic_status *estatus)
int apei_osc_setup(void);
-int einj_get_available_error_type(u32 *type);
+int einj_get_available_error_type(u32 *type, int einj_action);
int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, u64 param3,
u64 param4);
int einj_cxl_rch_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index e4fb4405deae..a1ff42c226fb 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -33,6 +33,7 @@
#define SLEEP_UNIT_MAX 5000 /* 5ms */
/* Firmware should respond within 1 seconds */
#define FIRMWARE_TIMEOUT (1 * USEC_PER_SEC)
+#define ACPI65_EINJV2_SUPP BIT(30)
#define ACPI5_VENDOR_BIT BIT(31)
#define MEM_ERROR_MASK (ACPI_EINJ_MEMORY_CORRECTABLE | \
ACPI_EINJ_MEMORY_UNCORRECTABLE | \
@@ -84,6 +85,7 @@ static struct debugfs_blob_wrapper vendor_errors;
static char vendor_dev[64];
static u32 available_error_type;
+static u32 available_error_type_v2;
/*
* Some BIOSes allow parameters to the SET_ERROR_TYPE entries in the
@@ -159,13 +161,13 @@ static void einj_exec_ctx_init(struct apei_exec_context *ctx)
EINJ_TAB_ENTRY(einj_tab), einj_tab->entries);
}
-static int __einj_get_available_error_type(u32 *type)
+static int __einj_get_available_error_type(u32 *type, int einj_action)
{
struct apei_exec_context ctx;
int rc;
einj_exec_ctx_init(&ctx);
- rc = apei_exec_run(&ctx, ACPI_EINJ_GET_ERROR_TYPE);
+ rc = apei_exec_run(&ctx, einj_action);
if (rc)
return rc;
*type = apei_exec_ctx_get_output(&ctx);
@@ -174,17 +176,34 @@ static int __einj_get_available_error_type(u32 *type)
}
/* Get error injection capabilities of the platform */
-int einj_get_available_error_type(u32 *type)
+int einj_get_available_error_type(u32 *type, int einj_action)
{
int rc;
mutex_lock(&einj_mutex);
- rc = __einj_get_available_error_type(type);
+ rc = __einj_get_available_error_type(type, einj_action);
mutex_unlock(&einj_mutex);
return rc;
}
+static int einj_get_available_error_types(u32 *type1, u32 *type2)
+{
+ int rc;
+
+ rc = einj_get_available_error_type(type1, ACPI_EINJ_GET_ERROR_TYPE);
+ if (rc)
+ return rc;
+ if (*type1 & ACPI65_EINJV2_SUPP) {
+ rc = einj_get_available_error_type(type2,
+ ACPI_EINJV2_GET_ERROR_TYPE);
+ if (rc)
+ return rc;
+ }
+
+ return 0;
+}
+
static int einj_timedout(u64 *t)
{
if ((s64)*t < SLEEP_UNIT_MIN) {
@@ -646,6 +665,7 @@ static u64 error_param2;
static u64 error_param3;
static u64 error_param4;
static struct dentry *einj_debug_dir;
+static char einj_buf[32];
static struct { u32 mask; const char *str; } const einj_error_type_string[] = {
{ BIT(0), "Processor Correctable" },
{ BIT(1), "Processor Uncorrectable non-fatal" },
@@ -662,6 +682,12 @@ static struct { u32 mask; const char *str; } const einj_error_type_string[] = {
{ BIT(31), "Vendor Defined Error Types" },
};
+static struct { u32 mask; const char *str; } const einjv2_error_type_string[] = {
+ { BIT(0), "EINJV2 Processor Error" },
+ { BIT(1), "EINJV2 Memory Error" },
+ { BIT(2), "EINJV2 PCI Express Error" },
+};
+
static int available_error_type_show(struct seq_file *m, void *v)
{
@@ -669,17 +695,22 @@ static int available_error_type_show(struct seq_file *m, void *v)
if (available_error_type & einj_error_type_string[pos].mask)
seq_printf(m, "0x%08x\t%s\n", einj_error_type_string[pos].mask,
einj_error_type_string[pos].str);
-
+ if (available_error_type & ACPI65_EINJV2_SUPP) {
+ for (int pos = 0; pos < ARRAY_SIZE(einjv2_error_type_string); pos++) {
+ if (available_error_type_v2 & einjv2_error_type_string[pos].mask)
+ seq_printf(m, "V2_0x%08x\t%s\n", einjv2_error_type_string[pos].mask,
+ einjv2_error_type_string[pos].str);
+ }
+ }
return 0;
}
DEFINE_SHOW_ATTRIBUTE(available_error_type);
-static int error_type_get(void *data, u64 *val)
+static ssize_t error_type_get(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
{
- *val = error_type;
-
- return 0;
+ return simple_read_from_buffer(buf, count, ppos, einj_buf, strlen(einj_buf));
}
bool einj_is_cxl_error_type(u64 type)
@@ -712,9 +743,23 @@ int einj_validate_error_type(u64 type)
return 0;
}
-static int error_type_set(void *data, u64 val)
+static ssize_t error_type_set(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
{
int rc;
+ u64 val;
+
+ memset(einj_buf, 0, sizeof(einj_buf));
+ if (copy_from_user(einj_buf, buf, count))
+ return -EFAULT;
+
+ if (strncmp(einj_buf, "V2_", 3) == 0) {
+ if (!sscanf(einj_buf, "V2_%llx", &val))
+ return -EINVAL;
+ } else {
+ if (!sscanf(einj_buf, "%llx", &val))
+ return -EINVAL;
+ }
rc = einj_validate_error_type(val);
if (rc)
@@ -722,11 +767,13 @@ static int error_type_set(void *data, u64 val)
error_type = val;
- return 0;
+ return count;
}
-DEFINE_DEBUGFS_ATTRIBUTE(error_type_fops, error_type_get, error_type_set,
- "0x%llx\n");
+static const struct file_operations error_type_fops = {
+ .read = error_type_get,
+ .write = error_type_set,
+};
static int error_inject_set(void *data, u64 val)
{
@@ -778,7 +825,7 @@ static int __init einj_probe(struct faux_device *fdev)
goto err_put_table;
}
- rc = einj_get_available_error_type(&available_error_type);
+ rc = einj_get_available_error_types(&available_error_type, &available_error_type_v2);
if (rc)
goto err_put_table;
diff --git a/drivers/acpi/apei/einj-cxl.c b/drivers/acpi/apei/einj-cxl.c
index 78da9ae543a2..e70a416ec925 100644
--- a/drivers/acpi/apei/einj-cxl.c
+++ b/drivers/acpi/apei/einj-cxl.c
@@ -30,7 +30,7 @@ int einj_cxl_available_error_type_show(struct seq_file *m, void *v)
int cxl_err, rc;
u32 available_error_type = 0;
- rc = einj_get_available_error_type(&available_error_type);
+ rc = einj_get_available_error_type(&available_error_type, ACPI_EINJ_GET_ERROR_TYPE);
if (rc)
return rc;
--
2.43.0
* [PATCH v10 3/7] ACPI: APEI: EINJ: Add einjv2 extension struct
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
2025-06-17 19:30 ` [PATCH v10 1/7] ACPI: APEI: EINJ: Fix kernel test sparse warnings Zaid Alali
2025-06-17 19:30 ` [PATCH v10 2/7] ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-06-17 19:30 ` [PATCH v10 4/7] ACPI: APEI: EINJ: Discover EINJv2 parameters Zaid Alali
` (4 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
Add einjv2 extension struct and EINJv2 error types to prepare
the driver for EINJv2 support. The ACPI specification[1] enables
EINJv2 by extending the set_error_type_with_address struct.
Link: https://uefi.org/specs/ACPI/6.6/18_Platform_Error_Interfaces.html#einjv2-extension-structure [1]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
drivers/acpi/apei/einj-core.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index a1ff42c226fb..1ffe8270634c 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -33,6 +33,7 @@
#define SLEEP_UNIT_MAX 5000 /* 5ms */
/* Firmware should respond within 1 seconds */
#define FIRMWARE_TIMEOUT (1 * USEC_PER_SEC)
+#define COMPONENT_LEN 16
#define ACPI65_EINJV2_SUPP BIT(30)
#define ACPI5_VENDOR_BIT BIT(31)
#define MEM_ERROR_MASK (ACPI_EINJ_MEMORY_CORRECTABLE | \
@@ -50,6 +51,28 @@
*/
static int acpi5;
+struct syndrome_array {
+ union {
+ u8 acpi_id[COMPONENT_LEN];
+ u8 device_id[COMPONENT_LEN];
+ u8 pcie_sbdf[COMPONENT_LEN];
+ u8 vendor_id[COMPONENT_LEN];
+ } comp_id;
+ union {
+ u8 proc_synd[COMPONENT_LEN];
+ u8 mem_synd[COMPONENT_LEN];
+ u8 pcie_synd[COMPONENT_LEN];
+ u8 vendor_synd[COMPONENT_LEN];
+ } comp_synd;
+};
+
+struct einjv2_extension_struct {
+ u32 length;
+ u16 revision;
+ u16 component_arr_count;
+ struct syndrome_array component_arr[] __counted_by(component_arr_count);
+};
+
struct set_error_type_with_address {
u32 type;
u32 vendor_extension;
@@ -58,6 +81,7 @@ struct set_error_type_with_address {
u64 memory_address;
u64 memory_address_range;
u32 pcie_sbdf;
+ struct einjv2_extension_struct einjv2_struct;
};
enum {
SETWA_FLAGS_APICID = 1,
--
2.43.0
* [PATCH v10 4/7] ACPI: APEI: EINJ: Discover EINJv2 parameters
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
` (2 preceding siblings ...)
2025-06-17 19:30 ` [PATCH v10 3/7] ACPI: APEI: EINJ: Add einjv2 extension struct Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-06-17 19:30 ` [PATCH v10 5/7] ACPI: APEI: EINJ: Create debugfs files to enter device id and syndrome Zaid Alali
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
The EINJv2 set_error_type_with_address structure has a flex array
to hold the component IDs and syndrome values used when injecting
multiple errors at once.
Discover the size of this array by taking the address from the
ACPI_EINJ_SET_ERROR_TYPE_WITH_ADDRESS entry in the EINJ table
and reading the BIOS copy of the structure.
Derive the maximum number of components from the length field
in the einjv2_extension_struct at the end of the BIOS copy.
Map the whole of the structure into kernel memory (and unmap
on module unload).
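
The size arithmetic can be sketched in user-space C as below. The demo_*
types mirror the layout of the structs added earlier in this series but are
hypothetical stand-ins, and the guard against a length shorter than the
header is an assumption of this sketch, not something the patch does.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_COMPONENT_LEN 16

/* One component entry: a 128-bit ID followed by a 128-bit syndrome. */
struct demo_syndrome {
	uint8_t comp_id[DEMO_COMPONENT_LEN];
	uint8_t comp_synd[DEMO_COMPONENT_LEN];
};

/* Stand-in for the EINJv2 extension structure with its flex array. */
struct demo_einjv2_ext {
	uint32_t length;
	uint16_t revision;
	uint16_t component_arr_count;
	struct demo_syndrome component_arr[];
};

/* Entries that fit in 'length' bytes; 0 if length is below the header size. */
static uint32_t demo_max_components(uint32_t length)
{
	size_t hdr = offsetof(struct demo_einjv2_ext, component_arr);

	if (length < hdr)
		return 0;
	return (uint32_t)((length - hdr) / sizeof(struct demo_syndrome));
}

/* Bytes needed for the header plus n component entries. */
static size_t demo_ext_size(uint32_t n)
{
	return offsetof(struct demo_einjv2_ext, component_arr) +
	       (size_t)n * sizeof(struct demo_syndrome);
}
```

For example, a length field of 136 bytes leaves room for four 32-byte
entries after the 8-byte header, and demo_ext_size(4) gives 136 back, which
is the size the expanded mapping has to cover for the extension structure.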
[Tony: Code unchanged from Zaid's original. New commit message]
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
drivers/acpi/apei/einj-core.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index 1ffe8270634c..ea6fd4343e63 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -108,6 +108,7 @@ static struct debugfs_blob_wrapper vendor_blob;
static struct debugfs_blob_wrapper vendor_errors;
static char vendor_dev[64];
+static u32 max_nr_components;
static u32 available_error_type;
static u32 available_error_type_v2;
@@ -178,6 +179,7 @@ static DEFINE_MUTEX(einj_mutex);
bool einj_initialized __ro_after_init;
static void __iomem *einj_param;
+static u32 v5param_size;
static void einj_exec_ctx_init(struct apei_exec_context *ctx)
{
@@ -302,11 +304,31 @@ static void __iomem *einj_get_parameter_address(void)
struct set_error_type_with_address v5param;
struct set_error_type_with_address __iomem *p;
+ v5param_size = sizeof(v5param);
p = acpi_os_map_iomem(pa_v5, sizeof(*p));
if (p) {
- memcpy_fromio(&v5param, p, sizeof(v5param));
+ int offset, len;
+
+ memcpy_fromio(&v5param, p, v5param_size);
acpi5 = 1;
check_vendor_extension(pa_v5, &v5param);
+ if (available_error_type & ACPI65_EINJV2_SUPP) {
+ len = v5param.einjv2_struct.length;
+ offset = offsetof(struct einjv2_extension_struct, component_arr);
+ max_nr_components = (len - offset) /
+ sizeof(v5param.einjv2_struct.component_arr[0]);
+ /*
+ * The first call to acpi_os_map_iomem above does not include the
+ * component array, instead it is used to read and calculate maximum
+ * number of components supported by the system. Below, the mapping
+ * is expanded to include the component array.
+ */
+ acpi_os_unmap_iomem(p, v5param_size);
+ offset = offsetof(struct set_error_type_with_address, einjv2_struct);
+ v5param_size = offset + struct_size(&v5param.einjv2_struct,
+ component_arr, max_nr_components);
+ p = acpi_os_map_iomem(pa_v5, v5param_size);
+ }
return p;
}
}
@@ -933,7 +955,7 @@ static void __exit einj_remove(struct faux_device *fdev)
if (einj_param) {
acpi_size size = (acpi5) ?
- sizeof(struct set_error_type_with_address) :
+ v5param_size :
sizeof(struct einj_parameter);
acpi_os_unmap_iomem(einj_param, size);
--
2.43.0
* [PATCH v10 5/7] ACPI: APEI: EINJ: Create debugfs files to enter device id and syndrome
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
` (3 preceding siblings ...)
2025-06-17 19:30 ` [PATCH v10 4/7] ACPI: APEI: EINJ: Discover EINJv2 parameters Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-06-17 19:30 ` [PATCH v10 6/7] ACPI: APEI: EINJ: Enable EINJv2 error injections Zaid Alali
` (2 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
From: Tony Luck <tony.luck@intel.com>
EINJv2 allows users to inject multiple errors at the same time by
specifying the device id and syndrome bits for each error in a flex
array.
Create files in the einj debugfs directory to enter data for each
device id and syndrome value. Note that the specification says these
are 128-bit little-endian values. Linux doesn't have a handy helper
to manage objects of this type.
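
As a rough illustration of what handling such a value involves, the
user-space sketch below parses a hex string into a 16-byte array stored
least significant byte first. demo_parse_u128() and demo_hex_nibble() are
hypothetical names, and rejecting input longer than 32 hex digits is an
assumption of this sketch rather than the behavior of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_COMPONENT_LEN 16

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int demo_hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse 1..32 hex digits (optional "0x" prefix) into out[], least
 * significant byte first. Returns 0 on success, -1 on bad input; out[]
 * is only written on success.
 */
static int demo_parse_u128(const char *s, uint8_t out[DEMO_COMPONENT_LEN])
{
	uint8_t tmp[DEMO_COMPONENT_LEN] = { 0 };
	size_t len, i;

	if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
		s += 2;
	len = strlen(s);
	if (len == 0 || len > 2 * DEMO_COMPONENT_LEN)
		return -1;

	for (i = 0; i < len; i++) {
		int nib = demo_hex_nibble(s[len - 1 - i]);

		if (nib < 0)
			return -1;
		/* Digit i from the right lands in byte i / 2, low nibble first. */
		tmp[i / 2] |= (uint8_t)(nib << (4 * (i % 2)));
	}
	memcpy(out, tmp, DEMO_COMPONENT_LEN);
	return 0;
}
```

Walking the digits from the right keeps the byte order independent of how
many leading zeros the user typed.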
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
drivers/acpi/apei/einj-core.c | 97 +++++++++++++++++++++++++++++++++++
1 file changed, 97 insertions(+)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index ea6fd4343e63..87f1b8718387 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -111,6 +111,7 @@ static char vendor_dev[64];
static u32 max_nr_components;
static u32 available_error_type;
static u32 available_error_type_v2;
+static struct syndrome_array *syndrome_data;
/*
* Some BIOSes allow parameters to the SET_ERROR_TYPE entries in the
@@ -712,6 +713,7 @@ static u64 error_param3;
static u64 error_param4;
static struct dentry *einj_debug_dir;
static char einj_buf[32];
+static bool einj_v2_enabled;
static struct { u32 mask; const char *str; } const einj_error_type_string[] = {
{ BIT(0), "Processor Correctable" },
{ BIT(1), "Processor Uncorrectable non-fatal" },
@@ -848,6 +850,98 @@ static int einj_check_table(struct acpi_table_einj *einj_tab)
return 0;
}
+static ssize_t u128_read(struct file *f, char __user *buf, size_t count, loff_t *off)
+{
+ char output[2 * COMPONENT_LEN + 1];
+ u8 *data = f->f_inode->i_private;
+ int i;
+
+ if (*off >= sizeof(output))
+ return 0;
+
+ for (i = 0; i < COMPONENT_LEN; i++)
+ sprintf(output + 2 * i, "%.02x", data[COMPONENT_LEN - i - 1]);
+ output[2 * COMPONENT_LEN] = '\n';
+
+ return simple_read_from_buffer(buf, count, off, output, sizeof(output));
+}
+
+static ssize_t u128_write(struct file *f, const char __user *buf, size_t count, loff_t *off)
+{
+ char input[2 + 2 * COMPONENT_LEN + 2];
+ u8 *save = f->f_inode->i_private;
+ u8 tmp[COMPONENT_LEN];
+ char byte[3] = {};
+ char *s, *e;
+ ssize_t c;
+ long val;
+ int i;
+
+ /* Require that user supply whole input line in one write(2) syscall */
+ if (*off)
+ return -EINVAL;
+
+ c = simple_write_to_buffer(input, sizeof(input), off, buf, count);
+ if (c < 0)
+ return c;
+
+ if (c < 1 || input[c - 1] != '\n')
+ return -EINVAL;
+
+ /* Empty line means invalidate this entry */
+ if (c == 1) {
+ memset(save, 0xff, COMPONENT_LEN);
+ return c;
+ }
+
+ if (input[0] == '0' && (input[1] == 'x' || input[1] == 'X'))
+ s = input + 2;
+ else
+ s = input;
+ e = input + c - 1;
+
+ for (i = 0; i < COMPONENT_LEN; i++) {
+ byte[1] = *--e;
+ byte[0] = e > s ? *--e : '0';
+ if (kstrtol(byte, 16, &val))
+ return -EINVAL;
+ tmp[i] = val;
+ if (e <= s)
+ break;
+ }
+ while (++i < COMPONENT_LEN)
+ tmp[i] = 0;
+
+ memcpy(save, tmp, COMPONENT_LEN);
+
+ return c;
+}
+
+static const struct file_operations u128_fops = {
+ .read = u128_read,
+ .write = u128_write,
+};
+
+static bool setup_einjv2_component_files(void)
+{
+ char name[32];
+
+ syndrome_data = kcalloc(max_nr_components, sizeof(syndrome_data[0]), GFP_KERNEL);
+ if (!syndrome_data)
+ return false;
+
+ for (int i = 0; i < max_nr_components; i++) {
+ sprintf(name, "component_id%d", i);
+ debugfs_create_file(name, 0600, einj_debug_dir,
+ &syndrome_data[i].comp_id, &u128_fops);
+ sprintf(name, "component_syndrome%d", i);
+ debugfs_create_file(name, 0600, einj_debug_dir,
+ &syndrome_data[i].comp_synd, &u128_fops);
+ }
+
+ return true;
+}
+
static int __init einj_probe(struct faux_device *fdev)
{
int rc;
@@ -919,6 +1013,8 @@ static int __init einj_probe(struct faux_device *fdev)
&error_param4);
debugfs_create_x32("notrigger", S_IRUSR | S_IWUSR,
einj_debug_dir, &notrigger);
+ if (available_error_type & ACPI65_EINJV2_SUPP)
+ einj_v2_enabled = setup_einjv2_component_files();
}
if (vendor_dev[0]) {
@@ -967,6 +1063,7 @@ static void __exit einj_remove(struct faux_device *fdev)
apei_resources_release(&einj_resources);
apei_resources_fini(&einj_resources);
debugfs_remove_recursive(einj_debug_dir);
+ kfree(syndrome_data);
acpi_put_table((struct acpi_table_header *)einj_tab);
}
--
2.43.0
* [PATCH v10 6/7] ACPI: APEI: EINJ: Enable EINJv2 error injections
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
` (4 preceding siblings ...)
2025-06-17 19:30 ` [PATCH v10 5/7] ACPI: APEI: EINJ: Create debugfs files to enter device id and syndrome Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-06-17 19:30 ` [PATCH v10 7/7] ACPI: APEI: EINJ: Update the documentation for EINJv2 support Zaid Alali
2025-06-18 18:51 ` [PATCH v10 0/7] Enable EINJv2 Support Rafael J. Wysocki
7 siblings, 0 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
Enable injection using EINJv2 mode of operation.
[Tony: Mostly Zaid's original code. I just changed how the error ID
and syndrome bits are implemented. Also swapped out some camelcase
variable names]
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
drivers/acpi/apei/einj-core.c | 58 ++++++++++++++++++++++++++++-------
1 file changed, 47 insertions(+), 11 deletions(-)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index 87f1b8718387..d6d7e36e3647 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -87,6 +87,7 @@ enum {
SETWA_FLAGS_APICID = 1,
SETWA_FLAGS_MEM = 2,
SETWA_FLAGS_PCIE_SBDF = 4,
+ SETWA_FLAGS_EINJV2 = 8,
};
/*
@@ -181,6 +182,7 @@ bool einj_initialized __ro_after_init;
static void __iomem *einj_param;
static u32 v5param_size;
+static bool is_v2;
static void einj_exec_ctx_init(struct apei_exec_context *ctx)
{
@@ -507,12 +509,20 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
return rc;
}
+static bool is_end_of_list(u8 *val)
+{
+ for (int i = 0; i < COMPONENT_LEN; ++i) {
+ if (val[i] != 0xFF)
+ return false;
+ }
+ return true;
+}
static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
u64 param3, u64 param4)
{
struct apei_exec_context ctx;
u64 val, trigger_paddr, timeout = FIRMWARE_TIMEOUT;
- int rc;
+ int i, rc;
einj_exec_ctx_init(&ctx);
@@ -521,10 +531,10 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
return rc;
apei_exec_ctx_set_input(&ctx, type);
if (acpi5) {
- struct set_error_type_with_address *v5param, v5_struct;
+ struct set_error_type_with_address *v5param;
- v5param = &v5_struct;
- memcpy_fromio(v5param, einj_param, sizeof(*v5param));
+ v5param = kmalloc(v5param_size, GFP_KERNEL);
+ memcpy_fromio(v5param, einj_param, v5param_size);
v5param->type = type;
if (type & ACPI5_VENDOR_BIT) {
switch (vendor_flags) {
@@ -544,8 +554,21 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
v5param->flags = flags;
v5param->memory_address = param1;
v5param->memory_address_range = param2;
- v5param->apicid = param3;
- v5param->pcie_sbdf = param4;
+
+ if (is_v2) {
+ for (i = 0; i < max_nr_components; i++) {
+ if (is_end_of_list(syndrome_data[i].comp_id.acpi_id))
+ break;
+ v5param->einjv2_struct.component_arr[i].comp_id =
+ syndrome_data[i].comp_id;
+ v5param->einjv2_struct.component_arr[i].comp_synd =
+ syndrome_data[i].comp_synd;
+ }
+ v5param->einjv2_struct.component_arr_count = i;
+ } else {
+ v5param->apicid = param3;
+ v5param->pcie_sbdf = param4;
+ }
} else {
switch (type) {
case ACPI_EINJ_PROCESSOR_CORRECTABLE:
@@ -569,7 +592,8 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
break;
}
}
- memcpy_toio(einj_param, v5param, sizeof(*v5param));
+ memcpy_toio(einj_param, v5param, v5param_size);
+ kfree(v5param);
} else {
rc = apei_exec_run(&ctx, ACPI_EINJ_SET_ERROR_TYPE);
if (rc)
@@ -631,10 +655,15 @@ int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, u64 param3,
u64 base_addr, size;
/* If user manually set "flags", make sure it is legal */
- if (flags && (flags &
- ~(SETWA_FLAGS_APICID|SETWA_FLAGS_MEM|SETWA_FLAGS_PCIE_SBDF)))
+ if (flags && (flags & ~(SETWA_FLAGS_APICID | SETWA_FLAGS_MEM |
+ SETWA_FLAGS_PCIE_SBDF | SETWA_FLAGS_EINJV2)))
return -EINVAL;
+ /* check if type is a valid EINJv2 error type */
+ if (is_v2) {
+ if (!(type & available_error_type_v2))
+ return -EINVAL;
+ }
/*
* We need extra sanity checks for memory errors.
* Other types leap directly to injection.
@@ -743,7 +772,7 @@ static int available_error_type_show(struct seq_file *m, void *v)
if (available_error_type & einj_error_type_string[pos].mask)
seq_printf(m, "0x%08x\t%s\n", einj_error_type_string[pos].mask,
einj_error_type_string[pos].str);
- if (available_error_type & ACPI65_EINJV2_SUPP) {
+ if ((available_error_type & ACPI65_EINJV2_SUPP) && einj_v2_enabled) {
for (int pos = 0; pos < ARRAY_SIZE(einjv2_error_type_string); pos++) {
if (available_error_type_v2 & einjv2_error_type_string[pos].mask)
seq_printf(m, "V2_0x%08x\t%s\n", einjv2_error_type_string[pos].mask,
@@ -785,7 +814,7 @@ int einj_validate_error_type(u64 type)
if (tval & (tval - 1))
return -EINVAL;
if (!vendor)
- if (!(type & available_error_type))
+ if (!(type & (available_error_type | available_error_type_v2)))
return -EINVAL;
return 0;
@@ -804,9 +833,11 @@ static ssize_t error_type_set(struct file *file, const char __user *buf,
if (strncmp(einj_buf, "V2_", 3) == 0) {
if (!sscanf(einj_buf, "V2_%llx", &val))
return -EINVAL;
+ is_v2 = true;
} else {
if (!sscanf(einj_buf, "%llx", &val))
return -EINVAL;
+ is_v2 = false;
}
rc = einj_validate_error_type(val);
@@ -828,6 +859,11 @@ static int error_inject_set(void *data, u64 val)
if (!error_type)
return -EINVAL;
+ if (is_v2)
+ error_flags |= SETWA_FLAGS_EINJV2;
+ else
+ error_flags &= ~SETWA_FLAGS_EINJV2;
+
return einj_error_inject(error_type, error_flags, error_param1, error_param2,
error_param3, error_param4);
}
--
2.43.0
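Note for readers: the error_type_set() hunk in this patch selects the
injection mode from the "V2_" prefix of the written string. Below is a
small user-space sketch of that parsing step. The function name
parse_error_type() and its return convention are hypothetical. Unlike the
hunk, which tests !sscanf(...), the sketch compares the return value with
1, so an EOF return on empty input is also treated as a failure. That
difference is a choice made for the sketch, not something taken from the
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse "V2_<hex>" (EINJv2) or "<hex>" (EINJv1).
 * Returns 0 and fills *val and *is_v2 on success, -1 on malformed input.
 */
static int parse_error_type(const char *buf, unsigned long long *val,
			    bool *is_v2)
{
	bool v2;

	assert(buf && val && is_v2);

	v2 = strncmp(buf, "V2_", 3) == 0;
	if (v2)
		buf += 3; /* skip the prefix, the rest is a plain hex value */

	/* Exactly one successful conversion is required. */
	if (sscanf(buf, "%llx", val) != 1)
		return -1;

	*is_v2 = v2;
	return 0;
}
```

Writing "V2_0x2" selects an EINJv2 memory error, while "0x8" keeps the
EINJv1 path. In the sketch a bare "V2_" with no value is rejected.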
* [PATCH v10 7/7] ACPI: APEI: EINJ: Update the documentation for EINJv2 support
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
` (5 preceding siblings ...)
2025-06-17 19:30 ` [PATCH v10 6/7] ACPI: APEI: EINJ: Enable EINJv2 error injections Zaid Alali
@ 2025-06-17 19:30 ` Zaid Alali
2025-06-18 18:51 ` [PATCH v10 0/7] Enable EINJv2 Support Rafael J. Wysocki
7 siblings, 0 replies; 11+ messages in thread
From: Zaid Alali @ 2025-06-17 19:30 UTC (permalink / raw)
To: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
zaidal, ira.weiny, Jonathan.Cameron, viro, sudeep.holla,
dan.carpenter, jonathanh, sthanneeru.opensrc, gregkh, peterz,
dan.j.williams, dave.jiang, benjamin.cheatham, linux-acpi,
linux-kernel, linux-hardening
Add documentation based on implementation of EINJv2 as described in ACPI
6.5.A specification.
Link: https://uefi.org/specs/ACPI/6.5_A/18_Platform_Error_Interfaces.html#error-injection
[Tony: New user interface for device id and syndrome]
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Zaid Alali <zaidal@os.amperecomputing.com>
---
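Note for readers: the flags value 0xa in the documentation example is
bit 1 (memory address and mask valid) combined with the new bit 3 (EINJv2
extension structure valid). The sketch below reproduces that arithmetic
with the SETWA_FLAGS_* values from the einj-core.c hunk in patch 6/7. The
helper names flags_are_legal() and apply_mode() are hypothetical. They
only mirror the checks in einj_error_inject() and error_inject_set().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as defined in the einj-core.c enum touched by patch 6/7. */
enum {
	SETWA_FLAGS_APICID = 1,
	SETWA_FLAGS_MEM = 2,
	SETWA_FLAGS_PCIE_SBDF = 4,
	SETWA_FLAGS_EINJV2 = 8,
};

#define SETWA_FLAGS_ALL (SETWA_FLAGS_APICID | SETWA_FLAGS_MEM | \
			 SETWA_FLAGS_PCIE_SBDF | SETWA_FLAGS_EINJV2)

/* A flags value is legal when it sets no bit outside the known ones. */
static bool flags_are_legal(uint32_t flags)
{
	return (flags & ~(uint32_t)SETWA_FLAGS_ALL) == 0;
}

/* Set the EINJv2 bit for v2 injections and clear it for v1 injections. */
static uint32_t apply_mode(uint32_t flags, bool is_v2)
{
	assert(flags_are_legal(flags));

	if (is_v2)
		return flags | SETWA_FLAGS_EINJV2;
	return flags & ~(uint32_t)SETWA_FLAGS_EINJV2;
}
```

Here SETWA_FLAGS_MEM | SETWA_FLAGS_EINJV2 evaluates to 0xa. Calling
apply_mode(0xa, false) gives 0x2, which matches the v10 change that
unsets the EINJv2 flag for EINJv1 injections.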
.../firmware-guide/acpi/apei/einj.rst | 33 +++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
index c52b9da08fa9..7d8435d35a18 100644
--- a/Documentation/firmware-guide/acpi/apei/einj.rst
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -59,6 +59,9 @@ The following files belong to it:
0x00000200 Platform Correctable
0x00000400 Platform Uncorrectable non-fatal
0x00000800 Platform Uncorrectable fatal
+ V2_0x00000001 EINJV2 Processor Error
+ V2_0x00000002 EINJV2 Memory Error
+ V2_0x00000004 EINJV2 PCI Express Error
================ ===================================
The format of the file contents are as above, except present are only
@@ -88,6 +91,8 @@ The following files belong to it:
Memory address and mask valid (param1 and param2).
Bit 2
PCIe (seg,bus,dev,fn) valid (see param4 below).
+ Bit 3
+ EINJv2 extension structure is valid
If set to zero, legacy behavior is mimicked where the type of
injection specifies just one bit set, and param1 is multiplexed.
@@ -122,6 +127,13 @@ The following files belong to it:
this actually works depends on what operations the BIOS actually
includes in the trigger phase.
+- component_id0 .. component_idN, component_syndrome0 .. component_syndromeN
+
+ These files are used to set the "Component Array" field
+ of the EINJv2 Extension Structure. Each holds a 128-bit
+ hex value. Writing just a newline to any of these files
+ sets an invalid (all-ones) value.
+
CXL error types are supported from ACPI 6.5 onwards (given a CXL port
is present). The EINJ user interface for CXL error types is at
<debugfs mount point>/cxl. The following files belong to it:
@@ -194,6 +206,27 @@ An error injection example::
# echo 0x8 > error_type # Choose correctable memory error
# echo 1 > error_inject # Inject now
+An EINJv2 error injection example::
+
+ # cd /sys/kernel/debug/apei/einj
+ # cat available_error_type # See which errors can be injected
+ 0x00000002 Processor Uncorrectable non-fatal
+ 0x00000008 Memory Correctable
+ 0x00000010 Memory Uncorrectable non-fatal
+ V2_0x00000001 EINJV2 Processor Error
+ V2_0x00000002 EINJV2 Memory Error
+
+ # echo 0x12345000 > param1 # Set memory address for injection
+ # echo 0xfffffffffffff000 > param2 # Range - anywhere in this page
+ # echo 0x1 > component_id0 # First device ID
+ # echo 0x4 > component_syndrome0 # First error syndrome
+ # echo 0x2 > component_id1 # Second device ID
+ # echo 0x4 > component_syndrome1 # Second error syndrome
+ # echo '' > component_id2 # Mark id2 invalid to terminate list
+ # echo V2_0x2 > error_type # Choose EINJv2 memory error
+ # echo 0xa > flags # set flags to indicate EINJv2
+ # echo 1 > error_inject # Inject now
+
You should see something like this in dmesg::
[22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
--
2.43.0
* Re: [PATCH v10 0/7] Enable EINJv2 Support
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
` (6 preceding siblings ...)
2025-06-17 19:30 ` [PATCH v10 7/7] ACPI: APEI: EINJ: Update the documentation for EINJv2 support Zaid Alali
@ 2025-06-18 18:51 ` Rafael J. Wysocki
7 siblings, 0 replies; 11+ messages in thread
From: Rafael J. Wysocki @ 2025-06-18 18:51 UTC (permalink / raw)
To: Zaid Alali
Cc: rafael, lenb, james.morse, tony.luck, bp, kees, gustavoars,
ira.weiny, Jonathan.Cameron, viro, sudeep.holla, dan.carpenter,
jonathanh, sthanneeru.opensrc, gregkh, peterz, dan.j.williams,
dave.jiang, benjamin.cheatham, linux-acpi, linux-kernel,
linux-hardening
On Tue, Jun 17, 2025 at 9:30 PM Zaid Alali
<zaidal@os.amperecomputing.com> wrote:
>
> The goal of this update is to allow the driver to simultaneously
> support EINJ and EINJv2. The implementation follows ACPI 6.6
> specs[1] that enables the driver to discover system capabilities
> through GET_ERROR_TYPE.
>
> Link: https://uefi.org/specs/ACPI/6.6/18_Platform_Error_Interfaces.html#error-injection [1]
>
> V5:
> *Users no longer input component array size, instead it
> is counted by parsing the component array itself.
> V6:
> *Fix memory leak.
> *If EINJv2 initialization failed, EINJv1 will still work, and
> probe function will continue with disabled EINJv2.
> V7:
> *Update component array to take 128-bit values to match ACPI specs.
> *Enable Vendor EINJv2 injections
> *Moved component array parsing and validating to a separate
> function to improve readability.
> V8:
> *Update UI to use single value files for component array.
> *Update links to point to recent ACPI 6.6 spec release.
> *Updated commit messages and documentation patch.
> *Dropped the first two patches as they were merged via
> ACPICA project.
> V9:
> *Fix commit messages signed-off/reviewed-by order.
> *Fix sparse warning by defining syndrom_data as a
> static struct.
> V10:
> *Use defined value instead of hard coded for component
> array size
> *Unset EINJv2 flag for EINJv1 injections
>
> Tony Luck (1):
> ACPI: APEI: EINJ: Create debugfs files to enter device id and syndrome
>
> Zaid Alali (6):
> ACPI: APEI: EINJ: Fix kernel test sparse warnings
> ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities
> ACPI: APEI: EINJ: Add einjv2 extension struct
> ACPI: APEI: EINJ: Discover EINJv2 parameters
> ACPI: APEI: EINJ: Enable EINJv2 error injections
> ACPI: APEI: EINJ: Update the documentation for EINJv2 support
>
> .../firmware-guide/acpi/apei/einj.rst | 33 ++
> drivers/acpi/apei/apei-internal.h | 2 +-
> drivers/acpi/apei/einj-core.c | 374 ++++++++++++++----
> drivers/acpi/apei/einj-cxl.c | 2 +-
> 4 files changed, 342 insertions(+), 69 deletions(-)
>
> --
Whole series applied as 6.17 material, thanks!
* [PATCH] ACPI: APEI: EINJ: Fix trigger actions
2025-06-17 19:30 ` [PATCH v10 1/7] ACPI: APEI: EINJ: Fix kernel test sparse warnings Zaid Alali
@ 2025-07-03 20:04 ` Tony Luck
2025-07-07 16:24 ` Rafael J. Wysocki
0 siblings, 1 reply; 11+ messages in thread
From: Tony Luck @ 2025-07-03 20:04 UTC (permalink / raw)
To: zaidal
Cc: Jonathan.Cameron, benjamin.cheatham, bp, dan.carpenter,
dan.j.williams, dave.jiang, gregkh, gustavoars, ira.weiny,
james.morse, jonathanh, kees, lenb, linux-acpi, linux-hardening,
linux-kernel, peterz, rafael, sthanneeru.opensrc, sudeep.holla,
tony.luck, viro, Yi1 Lai
The trigger events are in BIOS memory immediately following the
acpi_einj_trigger structure. These were not copied to regular
kernel memory for use by apei_exec_ctx_init() so injections in
"notrigger=0" mode failed with a message like this:
APEI: Invalid action table, unknown instruction type: 123
Fix by allocating a "table_size" block of memory and copying the whole
table for use in the rest of the trigger flow.
Fixes: 1a35c88302a3 ("ACPI: APEI: EINJ: Fix kernel test sparse warnings")
Reported-by: Yi1 Lai <yi1.lai@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
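Note for readers: the fix relies on the trigger table being a fixed
header followed directly by its action entries, so a copy that is only
sizeof(header) bytes long leaves the entries behind. Below is a minimal
user-space sketch of the "copy the whole table" idea. The struct
trigger_header layout is a hypothetical stand-in for struct
acpi_einj_trigger, plain memcpy() and malloc() stand in for
memcpy_fromio() and kmalloc(), and the helper names are invented for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fixed-size trigger table header. */
struct trigger_header {
	uint32_t header_size;
	uint32_t revision;
	uint32_t table_size; /* header plus all entries, in bytes */
	uint32_t entry_count;
};

/*
 * Read the header first to learn table_size, then copy the complete
 * table (header and entries) into one heap block. Returns NULL when the
 * size is implausible or the allocation fails. The caller frees the
 * result.
 */
static struct trigger_header *copy_full_table(const void *fw_mem)
{
	struct trigger_header hdr;
	struct trigger_header *full;

	assert(fw_mem);
	memcpy(&hdr, fw_mem, sizeof(hdr));
	if (hdr.table_size < sizeof(hdr))
		return NULL;

	full = malloc(hdr.table_size);
	if (!full)
		return NULL;
	memcpy(full, fw_mem, hdr.table_size);
	return full;
}

/* The entries start right behind the header inside the copied block. */
static const uint8_t *first_entry(const struct trigger_header *full)
{
	return (const uint8_t *)full + sizeof(*full);
}
```

With only the header copied, a pointer computed as in first_entry() would
point past the copied data, which fits the "unknown instruction type"
message quoted in the commit log.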
drivers/acpi/apei/einj-core.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index 3d37978418e8..bf8dc92a373a 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -394,6 +394,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
u64 param1, u64 param2)
{
struct acpi_einj_trigger trigger_tab;
+ struct acpi_einj_trigger *full_trigger_tab;
struct apei_exec_context trigger_ctx;
struct apei_resources trigger_resources;
struct acpi_whea_header *trigger_entry;
@@ -430,6 +431,9 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
rc = -EIO;
table_size = trigger_tab.table_size;
+ full_trigger_tab = kmalloc(table_size, GFP_KERNEL);
+ if (!full_trigger_tab)
+ goto out_rel_header;
r = request_mem_region(trigger_paddr + sizeof(trigger_tab),
table_size - sizeof(trigger_tab),
"APEI EINJ Trigger Table");
@@ -437,7 +441,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
pr_err("Can not request [mem %#010llx-%#010llx] for Trigger Table Entry\n",
(unsigned long long)trigger_paddr + sizeof(trigger_tab),
(unsigned long long)trigger_paddr + table_size - 1);
- goto out_rel_header;
+ goto out_free_trigger_tab;
}
iounmap(p);
p = ioremap_cache(trigger_paddr, table_size);
@@ -445,9 +449,9 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
pr_err("Failed to map trigger table!\n");
goto out_rel_entry;
}
- memcpy_fromio(&trigger_tab, p, sizeof(trigger_tab));
+ memcpy_fromio(full_trigger_tab, p, table_size);
trigger_entry = (struct acpi_whea_header *)
- ((char *)&trigger_tab + sizeof(struct acpi_einj_trigger));
+ ((char *)full_trigger_tab + sizeof(struct acpi_einj_trigger));
apei_resources_init(&trigger_resources);
apei_exec_ctx_init(&trigger_ctx, einj_ins_type,
ARRAY_SIZE(einj_ins_type),
@@ -469,7 +473,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
apei_resources_init(&addr_resources);
trigger_param_region = einj_get_trigger_parameter_region(
- &trigger_tab, param1, param2);
+ full_trigger_tab, param1, param2);
if (trigger_param_region) {
rc = apei_resources_add(&addr_resources,
trigger_param_region->address,
@@ -500,6 +504,8 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
out_rel_entry:
release_mem_region(trigger_paddr + sizeof(trigger_tab),
table_size - sizeof(trigger_tab));
+out_free_trigger_tab:
+ kfree(full_trigger_tab);
out_rel_header:
release_mem_region(trigger_paddr, sizeof(trigger_tab));
out:
--
2.50.0
* Re: [PATCH] ACPI: APEI: EINJ: Fix trigger actions
2025-07-03 20:04 ` [PATCH] ACPI: APEI: EINJ: Fix trigger actions Tony Luck
@ 2025-07-07 16:24 ` Rafael J. Wysocki
0 siblings, 0 replies; 11+ messages in thread
From: Rafael J. Wysocki @ 2025-07-07 16:24 UTC (permalink / raw)
To: Tony Luck
Cc: zaidal, Jonathan.Cameron, benjamin.cheatham, bp, dan.carpenter,
dan.j.williams, dave.jiang, gregkh, gustavoars, ira.weiny,
james.morse, jonathanh, kees, lenb, linux-acpi, linux-hardening,
linux-kernel, peterz, rafael, sthanneeru.opensrc, sudeep.holla,
viro, Yi1 Lai
On Thu, Jul 3, 2025 at 10:05 PM Tony Luck <tony.luck@intel.com> wrote:
>
> The trigger events are in BIOS memory immediately following the
> acpi_einj_trigger structure. These were not copied to regular
> kernel memory for use by apei_exec_ctx_init() so injections in
> "notrigger=0" mode failed with a message like this:
>
> APEI: Invalid action table, unknown instruction type: 123
>
> Fix by allocating a "table_size" block of memory and copying the whole
> table for use in the rest of the trigger flow.
>
> Fixes: 1a35c88302a3 ("ACPI: APEI: EINJ: Fix kernel test sparse warnings")
> Reported-by: Yi1 Lai <yi1.lai@intel.com>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> drivers/acpi/apei/einj-core.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
> index 3d37978418e8..bf8dc92a373a 100644
> --- a/drivers/acpi/apei/einj-core.c
> +++ b/drivers/acpi/apei/einj-core.c
> @@ -394,6 +394,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
> u64 param1, u64 param2)
> {
> struct acpi_einj_trigger trigger_tab;
> + struct acpi_einj_trigger *full_trigger_tab;
> struct apei_exec_context trigger_ctx;
> struct apei_resources trigger_resources;
> struct acpi_whea_header *trigger_entry;
> @@ -430,6 +431,9 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
>
> rc = -EIO;
> table_size = trigger_tab.table_size;
> + full_trigger_tab = kmalloc(table_size, GFP_KERNEL);
> + if (!full_trigger_tab)
> + goto out_rel_header;
> r = request_mem_region(trigger_paddr + sizeof(trigger_tab),
> table_size - sizeof(trigger_tab),
> "APEI EINJ Trigger Table");
> @@ -437,7 +441,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
> pr_err("Can not request [mem %#010llx-%#010llx] for Trigger Table Entry\n",
> (unsigned long long)trigger_paddr + sizeof(trigger_tab),
> (unsigned long long)trigger_paddr + table_size - 1);
> - goto out_rel_header;
> + goto out_free_trigger_tab;
> }
> iounmap(p);
> p = ioremap_cache(trigger_paddr, table_size);
> @@ -445,9 +449,9 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
> pr_err("Failed to map trigger table!\n");
> goto out_rel_entry;
> }
> - memcpy_fromio(&trigger_tab, p, sizeof(trigger_tab));
> + memcpy_fromio(full_trigger_tab, p, table_size);
> trigger_entry = (struct acpi_whea_header *)
> - ((char *)&trigger_tab + sizeof(struct acpi_einj_trigger));
> + ((char *)full_trigger_tab + sizeof(struct acpi_einj_trigger));
> apei_resources_init(&trigger_resources);
> apei_exec_ctx_init(&trigger_ctx, einj_ins_type,
> ARRAY_SIZE(einj_ins_type),
> @@ -469,7 +473,7 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
>
> apei_resources_init(&addr_resources);
> trigger_param_region = einj_get_trigger_parameter_region(
> - &trigger_tab, param1, param2);
> + full_trigger_tab, param1, param2);
> if (trigger_param_region) {
> rc = apei_resources_add(&addr_resources,
> trigger_param_region->address,
> @@ -500,6 +504,8 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type,
> out_rel_entry:
> release_mem_region(trigger_paddr + sizeof(trigger_tab),
> table_size - sizeof(trigger_tab));
> +out_free_trigger_tab:
> + kfree(full_trigger_tab);
> out_rel_header:
> release_mem_region(trigger_paddr, sizeof(trigger_tab));
> out:
> --
Applied, thanks!
Thread overview: 11+ messages
2025-06-17 19:30 [PATCH v10 0/7] Enable EINJv2 Support Zaid Alali
2025-06-17 19:30 ` [PATCH v10 1/7] ACPI: APEI: EINJ: Fix kernel test sparse warnings Zaid Alali
2025-07-03 20:04 ` [PATCH] ACPI: APEI: EINJ: Fix trigger actions Tony Luck
2025-07-07 16:24 ` Rafael J. Wysocki
2025-06-17 19:30 ` [PATCH v10 2/7] ACPI: APEI: EINJ: Enable the discovery of EINJv2 capabilities Zaid Alali
2025-06-17 19:30 ` [PATCH v10 3/7] ACPI: APEI: EINJ: Add einjv2 extension struct Zaid Alali
2025-06-17 19:30 ` [PATCH v10 4/7] ACPI: APEI: EINJ: Discover EINJv2 parameters Zaid Alali
2025-06-17 19:30 ` [PATCH v10 5/7] ACPI: APEI: EINJ: Create debugfs files to enter device id and syndrome Zaid Alali
2025-06-17 19:30 ` [PATCH v10 6/7] ACPI: APEI: EINJ: Enable EINJv2 error injections Zaid Alali
2025-06-17 19:30 ` [PATCH v10 7/7] ACPI: APEI: EINJ: Update the documentation for EINJv2 support Zaid Alali
2025-06-18 18:51 ` [PATCH v10 0/7] Enable EINJv2 Support Rafael J. Wysocki