* [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Ani Sinha,
Cleber Rosa, Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
Now that the ghes preparation patches were merged, let's add support
for error injection.
In this series, the first 6 patches change the math used to calculate offsets in the HEST
table and the hardware_error firmware file, together with its migration code. Migration was tested
with both the latest QEMU released kernel and upstream, in both directions.
The next patches add a new QAPI to allow injecting GHESv2 errors, and a script that uses this QAPI
to inject ARM Processor Error records.
---
v4:
- added an extra comment for AcpiGhesState structure;
- patches reordered;
- no functional changes, just code shift between the patches in this series.
v3:
- addressed more nits;
- hest_add_le now points to the beginning of HEST table;
- removed HEST from tests/data/acpi;
- added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
v2:
- address some nits;
- improved ags cleanup patch and removed ags.present field;
- added some missing le*_to_cpu() calls;
- update date at copyright for new files to 2024-2025;
- qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
- added HEST and DSDT tables after the changes to make check target happy.
(two patches: first one whitelisting such tables; second one removing from
whitelist and updating/adding such tables to tests/data/acpi)
Mauro Carvalho Chehab (14):
acpi/ghes: prepare to change the way HEST offsets are calculated
acpi/ghes: add a firmware file with HEST address
acpi/ghes: Use HEST table offsets when preparing GHES records
acpi/ghes: don't hard-code the number of sources for HEST table
acpi/ghes: add a notifier to notify when error data is ready
acpi/ghes: create an ancillary acpi_ghes_get_state() function
acpi/generic_event_device: Update GHES migration to cover hest addr
acpi/generic_event_device: add logic to detect if HEST addr is
available
acpi/generic_event_device: add an APEI error device
tests/acpi: virt: allow acpi table changes for a new table: HEST
arm/virt: Wire up a GED error device for ACPI / GHES
tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
qapi/acpi-hest: add an interface to do generic CPER error injection
scripts/ghes_inject: add a script to generate GHES error inject
MAINTAINERS | 10 +
hw/acpi/Kconfig | 5 +
hw/acpi/aml-build.c | 10 +
hw/acpi/generic_event_device.c | 43 ++
hw/acpi/ghes-stub.c | 7 +-
hw/acpi/ghes.c | 231 ++++--
hw/acpi/ghes_cper.c | 38 +
hw/acpi/ghes_cper_stub.c | 19 +
hw/acpi/meson.build | 2 +
hw/arm/virt-acpi-build.c | 37 +-
hw/arm/virt.c | 19 +-
hw/core/machine.c | 2 +
include/hw/acpi/acpi_dev_interface.h | 1 +
include/hw/acpi/aml-build.h | 2 +
include/hw/acpi/generic_event_device.h | 1 +
include/hw/acpi/ghes.h | 54 +-
include/hw/arm/virt.h | 2 +
qapi/acpi-hest.json | 35 +
qapi/meson.build | 1 +
qapi/qapi-schema.json | 1 +
scripts/arm_processor_error.py | 476 ++++++++++++
scripts/ghes_inject.py | 51 ++
scripts/qmp_helper.py | 702 ++++++++++++++++++
target/arm/kvm.c | 7 +-
tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
.../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
29 files changed, 1677 insertions(+), 79 deletions(-)
create mode 100644 hw/acpi/ghes_cper.c
create mode 100644 hw/acpi/ghes_cper_stub.c
create mode 100644 qapi/acpi-hest.json
create mode 100644 scripts/arm_processor_error.py
create mode 100755 scripts/ghes_inject.py
create mode 100755 scripts/qmp_helper.py
--
2.48.1
* [PATCH v4 01/14] acpi/ghes: prepare to change the way HEST offsets are calculated
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Peter Maydell,
Shannon Zhao, linux-kernel
Add a new ags flag to change the way HEST offsets are calculated.
Currently, the offsets needed to store ACPI HEST offsets and the read ack
are calculated based on prior knowledge of the logic
which creates the HEST table.
Such logic is not generic: it neither allows easily adding more HEST
entries nor replicates what OSPM does.
As the next patches will add a more generic logic, add a
new use_hest_addr flag, set to false, in preparation for such changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/ghes.c | 46 ++++++++++++++++++++++++----------------
hw/arm/virt-acpi-build.c | 15 ++++++++++---
include/hw/acpi/ghes.h | 14 ++++++++++--
3 files changed, 52 insertions(+), 23 deletions(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index b709c177cdea..e49a03fdb94e 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -206,7 +206,8 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
* Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
* See docs/specs/acpi_hest_ghes.rst for blobs format.
*/
-static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
+static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
+ BIOSLinker *linker)
{
int i, error_status_block_offset;
@@ -251,13 +252,15 @@ static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
}
- /*
- * tell firmware to write hardware_errors GPA into
- * hardware_errors_addr fw_cfg, once the former has been initialized.
- */
- bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, 0,
- sizeof(uint64_t),
- ACPI_HW_ERROR_FW_CFG_FILE, 0);
+ if (!ags->use_hest_addr) {
+ /*
+ * Tell firmware to write hardware_errors GPA into
+ * hardware_errors_addr fw_cfg, once the former has been initialized.
+ */
+ bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE,
+ 0, sizeof(uint64_t),
+ ACPI_HW_ERROR_FW_CFG_FILE, 0);
+ }
}
/* Build Generic Hardware Error Source version 2 (GHESv2) */
@@ -331,14 +334,15 @@ static void build_ghes_v2(GArray *table_data,
}
/* Build Hardware Error Source Table */
-void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
+void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+ GArray *hardware_errors,
BIOSLinker *linker,
const char *oem_id, const char *oem_table_id)
{
AcpiTable table = { .sig = "HEST", .rev = 1,
.oem_id = oem_id, .oem_table_id = oem_table_id };
- build_ghes_error_table(hardware_errors, linker);
+ build_ghes_error_table(ags, hardware_errors, linker);
acpi_table_begin(&table, table_data);
@@ -357,11 +361,11 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
hardware_error->len);
- /* Create a read-write fw_cfg file for Address */
- fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
- NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
-
- ags->present = true;
+ if (!ags->use_hest_addr) {
+ /* Create a read-write fw_cfg file for Address */
+ fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
+ NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
+ }
}
static void get_hw_error_offsets(uint64_t ghes_addr,
@@ -411,8 +415,11 @@ void ghes_record_cper_errors(const void *cper, size_t len,
ags = &acpi_ged_state->ghes_state;
assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
- get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
- &cper_addr, &read_ack_register_addr);
+
+ if (!ags->use_hest_addr) {
+ get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
+ &cper_addr, &read_ack_register_addr);
+ }
if (!cper_addr) {
error_setg(errp, "can not find Generic Error Status Block");
@@ -494,5 +501,8 @@ bool acpi_ghes_present(void)
return false;
}
ags = &acpi_ged_state->ghes_state;
- return ags->present;
+ if (!ags->hw_error_le)
+ return false;
+
+ return true;
}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3ac8f8e17861..8ab8d11b6536 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -946,9 +946,18 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
build_dbg2(tables_blob, tables->linker, vms);
if (vms->ras) {
- acpi_add_table(table_offsets, tables_blob);
- acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
- vms->oem_id, vms->oem_table_id);
+ AcpiGedState *acpi_ged_state;
+ AcpiGhesState *ags;
+
+ acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
+ NULL));
+ if (acpi_ged_state) {
+ ags = &acpi_ged_state->ghes_state;
+
+ acpi_add_table(table_offsets, tables_blob);
+ acpi_build_hest(ags, tables_blob, tables->hardware_errors,
+ tables->linker, vms->oem_id, vms->oem_table_id);
+ }
}
if (ms->numa_state->num_nodes > 0) {
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 39619a2457cb..a3d62b96584f 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -64,12 +64,22 @@ enum {
ACPI_GHES_ERROR_SOURCE_COUNT
};
+/*
+ * AcpiGhesState stores an offset that will be used to fill HEST entries.
+ *
+ * When use_hest_addr is false, the stored offset is placed at hw_error_le,
+ * meaning an offset from the etc/hardware_errors firmware address. This
+ * is the default on QEMU 9.x.
+ *
+ * An offset value equal to zero means that GHES is not present.
+ */
typedef struct AcpiGhesState {
uint64_t hw_error_le;
- bool present; /* True if GHES is present at all on this board */
+ bool use_hest_addr; /* Currently, always false */
} AcpiGhesState;
-void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
+void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+ GArray *hardware_errors,
BIOSLinker *linker,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
--
2.48.1
* [PATCH v4 02/14] acpi/ghes: add a firmware file with HEST address
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
Store the HEST table address at GPA, placing the start of the table at the
hest_addr_le variable.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes.c | 22 ++++++++++++++++++++--
include/hw/acpi/ghes.h | 7 ++++++-
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index e49a03fdb94e..ba37be9e7022 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -30,6 +30,7 @@
#define ACPI_HW_ERROR_FW_CFG_FILE "etc/hardware_errors"
#define ACPI_HW_ERROR_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
+#define ACPI_HEST_ADDR_FW_CFG_FILE "etc/acpi_table_hest_addr"
/* The max size in bytes for one error block */
#define ACPI_GHES_MAX_RAW_DATA_LENGTH (1 * KiB)
@@ -341,6 +342,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
{
AcpiTable table = { .sig = "HEST", .rev = 1,
.oem_id = oem_id, .oem_table_id = oem_table_id };
+ uint32_t hest_offset;
+
+ hest_offset = table_data->len;
build_ghes_error_table(ags, hardware_errors, linker);
@@ -352,6 +356,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
acpi_table_end(linker, &table);
+
+ if (ags->use_hest_addr) {
+ /*
+ * Tell firmware to write into GPA the address of HEST via fw_cfg,
+ * once initialized.
+ */
+ bios_linker_loader_write_pointer(linker,
+ ACPI_HEST_ADDR_FW_CFG_FILE, 0,
+ sizeof(uint64_t),
+ ACPI_BUILD_TABLE_FILE, hest_offset);
+ }
}
void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
@@ -361,7 +376,10 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
hardware_error->len);
- if (!ags->use_hest_addr) {
+ if (ags->use_hest_addr) {
+ fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
+ NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
+ } else {
/* Create a read-write fw_cfg file for Address */
fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
@@ -501,7 +519,7 @@ bool acpi_ghes_present(void)
return false;
}
ags = &acpi_ged_state->ghes_state;
- if (!ags->hw_error_le)
+ if (!ags->hw_error_le && !ags->hest_addr_le)
return false;
return true;
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index a3d62b96584f..454e97b5341c 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -71,9 +71,14 @@ enum {
* meaning an offset from the etc/hardware_errors firmware address. This
* is the default on QEMU 9.x.
*
- * An offset value equal to zero means that GHES is not present.
+ * When use_hest_addr is true, the stored offset is placed at hest_addr_le,
+ * meaning an offset from the HEST table address from etc/acpi/tables firmware.
+ * This is the default for QEMU 10.x and above.
+ *
+ * If both offset values are equal to zero, it means that GHES is not present.
*/
typedef struct AcpiGhesState {
+ uint64_t hest_addr_le;
uint64_t hw_error_le;
bool use_hest_addr; /* Currently, always false */
} AcpiGhesState;
--
2.48.1
* [PATCH v4 03/14] acpi/ghes: Use HEST table offsets when preparing GHES records
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
There are two pointers that are needed during error injection:
1. The start address of the CPER block to be stored;
2. The address of the ack.
It is preferable to calculate them from the HEST table. This allows
checking the source ID, the size of the table and the type of the
HEST error block structures.
Yet, keep the old code, as this is needed for migration purposes.
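As a rough illustration of the lookup described above, the sketch below walks a HEST image held in a plain byte buffer instead of guest memory. The offsets (36, 92, 20, 64 and 4) are the constants this patch defines, and the value 10 for the GHESv2 type comes from the ACPI specification. The helper names (rd_le16, hest_find_source and so on) are hypothetical and are not part of QEMU. The sketch stops at the two addresses stored in the HEST entry; the patch does one more read through the error status address to reach the CPER block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants as defined in this patch. */
#define ACPI_DESC_HEADER_OFFSET   36  /* ACPI table header size */
#define HEST_GHES_V2_ENTRY_SIZE   92  /* one GHESv2 source entry */
#define GHES_ERR_STATUS_ADDR_OFF  20  /* Error Status Address (GAS) */
#define GHES_READ_ACK_ADDR_OFF    64  /* Read Ack Register (GAS) */
#define GAS_ADDR_OFFSET            4  /* address field inside a GAS */
#define GHES_V2_TYPE              10  /* GHESv2 type value per the ACPI spec */

/* Hypothetical little-endian readers over a byte buffer. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Return the byte offset of the GHESv2 entry whose source ID matches,
 * or -1 if it is missing, truncated, or an unknown entry type is hit.
 */
static long hest_find_source(const uint8_t *hest, size_t len,
                             uint16_t source_id)
{
    size_t entry = ACPI_DESC_HEADER_OFFSET + sizeof(uint32_t);
    uint32_t num_sources, i;

    assert(hest != NULL);
    if (len < entry) {
        return -1;
    }
    num_sources = rd_le32(hest + ACPI_DESC_HEADER_OFFSET);

    for (i = 0; i < num_sources; i++) {
        if (len - entry < HEST_GHES_V2_ENTRY_SIZE) {
            return -1;                      /* truncated table */
        }
        if (rd_le16(hest + entry) != GHES_V2_TYPE) {
            return -1;                      /* only GHESv2 size is known */
        }
        if (rd_le16(hest + entry + 2) == source_id) {
            return (long)entry;
        }
        entry += HEST_GHES_V2_ENTRY_SIZE;
    }
    return -1;
}

/* Address stored in the Error Status Address GAS of an entry. */
static uint64_t hest_err_status_addr(const uint8_t *hest, long entry)
{
    return rd_le64(hest + entry + GHES_ERR_STATUS_ADDR_OFF + GAS_ADDR_OFFSET);
}

/* Address stored in the Read Ack Register GAS of an entry. */
static uint64_t hest_read_ack_addr(const uint8_t *hest, long entry)
{
    return rd_le64(hest + entry + GHES_READ_ACK_ADDR_OFF + GAS_ADDR_OFFSET);
}
```

Unlike the legacy path, this kind of walk can reject an unknown source ID or entry type instead of assuming a fixed layout.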
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
hw/acpi/ghes.c | 100 +++++++++++++++++++++++++++++++++++++++++
include/hw/acpi/ghes.h | 2 +-
2 files changed, 101 insertions(+), 1 deletion(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index ba37be9e7022..7efea519f766 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -41,6 +41,12 @@
/* Address offset in Generic Address Structure(GAS) */
#define GAS_ADDR_OFFSET 4
+/*
+ * ACPI spec 1.0b
+ * 5.2.3 System Description Table Header
+ */
+#define ACPI_DESC_HEADER_OFFSET 36
+
/*
* The total size of Generic Error Data Entry
* ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
@@ -61,6 +67,30 @@
*/
#define ACPI_GHES_GESB_SIZE 20
+/*
+ * See the memory layout map at docs/specs/acpi_hest_ghes.rst.
+ */
+
+/*
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
+ * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
+ */
+#define HEST_GHES_V2_ENTRY_SIZE 92
+
+/*
+ * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
+ * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
+ * Read Ack Register
+ */
+#define GHES_READ_ACK_ADDR_OFF 64
+
+/*
+ * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
+ * Table 18-341 Generic Hardware Error Source Structure
+ * Error Status Address
+ */
+#define GHES_ERR_STATUS_ADDR_OFF 20
+
/*
* Values for error_severity field
*/
@@ -412,6 +442,73 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
*read_ack_register_addr = ghes_addr + sizeof(uint64_t);
}
+static void get_ghes_source_offsets(uint16_t source_id,
+ uint64_t hest_addr,
+ uint64_t *cper_addr,
+ uint64_t *read_ack_start_addr,
+ Error **errp)
+{
+ uint64_t hest_err_block_addr, hest_read_ack_addr;
+ uint64_t err_source_entry, error_block_addr;
+ uint32_t num_sources, i;
+
+ hest_addr += ACPI_DESC_HEADER_OFFSET;
+
+ cpu_physical_memory_read(hest_addr, &num_sources,
+ sizeof(num_sources));
+ num_sources = le32_to_cpu(num_sources);
+
+ err_source_entry = hest_addr + sizeof(num_sources);
+
+ /*
+ * Currently, HEST Error source navigates only for GHESv2 tables
+ */
+ for (i = 0; i < num_sources; i++) {
+ uint64_t addr = err_source_entry;
+ uint16_t type, src_id;
+
+ cpu_physical_memory_read(addr, &type, sizeof(type));
+ type = le16_to_cpu(type);
+
+ /* For now, we only know the size of GHESv2 table */
+ if (type != ACPI_GHES_SOURCE_GENERIC_ERROR_V2) {
+ error_setg(errp, "HEST: type %d not supported.", type);
+ return;
+ }
+
+ /* Compare CPER source address at the GHESv2 structure */
+ addr += sizeof(type);
+ cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
+ if (le16_to_cpu(src_id) == source_id) {
+ break;
+ }
+
+ err_source_entry += HEST_GHES_V2_ENTRY_SIZE;
+ }
+ if (i == num_sources) {
+ error_setg(errp, "HEST: Source %d not found.", source_id);
+ return;
+ }
+
+ /* Navigate through table address pointers */
+ hest_err_block_addr = err_source_entry + GHES_ERR_STATUS_ADDR_OFF +
+ GAS_ADDR_OFFSET;
+
+ cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
+ sizeof(error_block_addr));
+ error_block_addr = le64_to_cpu(error_block_addr);
+
+ cpu_physical_memory_read(error_block_addr, cper_addr,
+ sizeof(*cper_addr));
+ *cper_addr = le64_to_cpu(*cper_addr);
+
+ hest_read_ack_addr = err_source_entry + GHES_READ_ACK_ADDR_OFF +
+ GAS_ADDR_OFFSET;
+ cpu_physical_memory_read(hest_read_ack_addr, read_ack_start_addr,
+ sizeof(*read_ack_start_addr));
+ *read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
+}
+
void ghes_record_cper_errors(const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
@@ -437,6 +534,9 @@ void ghes_record_cper_errors(const void *cper, size_t len,
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
+ } else {
+ get_ghes_source_offsets(source_id, le64_to_cpu(ags->hest_addr_le),
+ &cper_addr, &read_ack_register_addr, errp);
}
if (!cper_addr) {
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 454e97b5341c..2f06e433ce04 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -80,7 +80,7 @@ enum {
typedef struct AcpiGhesState {
uint64_t hest_addr_le;
uint64_t hw_error_le;
- bool use_hest_addr; /* Currently, always false */
+ bool use_hest_addr; /* True if HEST address is present */
} AcpiGhesState;
void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
--
2.48.1
* [PATCH v4 04/14] acpi/ghes: don't hard-code the number of sources for HEST table
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Peter Maydell,
Shannon Zhao, linux-kernel
The current code actually depends on having just one error
structure with a single source, as any change there would cause
migration issues.
The number of sources should be arch-dependent, as it depends on
what kinds of notifications exist and how many errors can be
reported at the same time. Change the logic to be more flexible,
allowing the caller to define the number of sources when building the
HEST table.
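To make the effect of a variable source count concrete, here is a minimal sketch of how offsets inside the etc/hardware_errors blob scale with num_sources. The first two formulas mirror the expressions used in this patch (index * sizeof(uint64_t) and (num_sources + index) * sizeof(uint64_t)). The start of the status blocks right after the two arrays is an assumption inferred from the order in which the blob is appended, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 KiB per error block, as defined in hw/acpi/ghes.c. */
#define ACPI_GHES_MAX_RAW_DATA_LENGTH 1024

/* Offset of the error_block_address entry for slot `index`. */
static size_t err_block_addr_off(int index)
{
    assert(index >= 0);
    return (size_t)index * sizeof(uint64_t);
}

/* Offset of the read_ack_register entry for slot `index`. */
static size_t read_ack_off(int num_sources, int index)
{
    assert(index >= 0 && index < num_sources);
    return (size_t)(num_sources + index) * sizeof(uint64_t);
}

/*
 * Offset of the Generic Error Status Block for slot `index`.
 * Assumption: the blocks start right after the two 64-bit arrays.
 */
static size_t status_block_off(int num_sources, int index)
{
    assert(index >= 0 && index < num_sources);
    return 2 * (size_t)num_sources * sizeof(uint64_t) +
           (size_t)index * ACPI_GHES_MAX_RAW_DATA_LENGTH;
}

/* Total blob size under the same assumption. */
static size_t hardware_errors_size(int num_sources)
{
    assert(num_sources > 0);
    return (size_t)num_sources *
           (2 * sizeof(uint64_t) + ACPI_GHES_MAX_RAW_DATA_LENGTH);
}
```

With one source this yields the layout the old code relied on; with more sources every read_ack offset shifts, which is why the offsets can no longer be hard-coded.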
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes.c | 38 +++++++++++++++++++++-----------------
hw/arm/virt-acpi-build.c | 8 +++++++-
include/hw/acpi/ghes.h | 17 ++++++++++++-----
3 files changed, 40 insertions(+), 23 deletions(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 7efea519f766..4a4ea8f4be90 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -238,17 +238,17 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
* See docs/specs/acpi_hest_ghes.rst for blobs format.
*/
static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
- BIOSLinker *linker)
+ BIOSLinker *linker, int num_sources)
{
int i, error_status_block_offset;
/* Build error_block_address */
- for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+ for (i = 0; i < num_sources; i++) {
build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
}
/* Build read_ack_register */
- for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+ for (i = 0; i < num_sources; i++) {
/*
* Initialize the value of read_ack_register to 1, so GHES can be
* writable after (re)boot.
@@ -263,13 +263,13 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
/* Reserve space for Error Status Data Block */
acpi_data_push(hardware_errors,
- ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
+ ACPI_GHES_MAX_RAW_DATA_LENGTH * num_sources);
/* Tell guest firmware to place hardware_errors blob into RAM */
bios_linker_loader_alloc(linker, ACPI_HW_ERROR_FW_CFG_FILE,
hardware_errors, sizeof(uint64_t), false);
- for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+ for (i = 0; i < num_sources; i++) {
/*
* Tell firmware to patch error_block_address entries to point to
* corresponding "Generic Error Status Block"
@@ -295,12 +295,14 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
}
/* Build Generic Hardware Error Source version 2 (GHESv2) */
-static void build_ghes_v2(GArray *table_data,
- BIOSLinker *linker,
- enum AcpiGhesNotifyType notify,
- uint16_t source_id)
+static void build_ghes_v2_entry(GArray *table_data,
+ BIOSLinker *linker,
+ const AcpiNotificationSourceId *notif_src,
+ uint16_t index, int num_sources)
{
uint64_t address_offset;
+ const uint16_t notify = notif_src->notify;
+ const uint16_t source_id = notif_src->source_id;
/*
* Type:
@@ -331,7 +333,7 @@ static void build_ghes_v2(GArray *table_data,
address_offset + GAS_ADDR_OFFSET,
sizeof(uint64_t),
ACPI_HW_ERROR_FW_CFG_FILE,
- source_id * sizeof(uint64_t));
+ index * sizeof(uint64_t));
/* Notification Structure */
build_ghes_hw_error_notification(table_data, notify);
@@ -351,8 +353,7 @@ static void build_ghes_v2(GArray *table_data,
address_offset + GAS_ADDR_OFFSET,
sizeof(uint64_t),
ACPI_HW_ERROR_FW_CFG_FILE,
- (ACPI_GHES_ERROR_SOURCE_COUNT + source_id)
- * sizeof(uint64_t));
+ (num_sources + index) * sizeof(uint64_t));
/*
* Read Ack Preserve field
@@ -368,22 +369,26 @@ static void build_ghes_v2(GArray *table_data,
void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
GArray *hardware_errors,
BIOSLinker *linker,
+ const AcpiNotificationSourceId *notif_source,
+ int num_sources,
const char *oem_id, const char *oem_table_id)
{
AcpiTable table = { .sig = "HEST", .rev = 1,
.oem_id = oem_id, .oem_table_id = oem_table_id };
uint32_t hest_offset;
+ int i;
hest_offset = table_data->len;
- build_ghes_error_table(ags, hardware_errors, linker);
+ build_ghes_error_table(ags, hardware_errors, linker, num_sources);
acpi_table_begin(&table, table_data);
/* Error Source Count */
- build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
- build_ghes_v2(table_data, linker,
- ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
+ build_append_int_noprefix(table_data, num_sources, 4);
+ for (i = 0; i < num_sources; i++) {
+ build_ghes_v2_entry(table_data, linker, &notif_source[i], i, num_sources);
+ }
acpi_table_end(linker, &table);
@@ -529,7 +534,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
}
ags = &acpi_ged_state->ghes_state;
- assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 8ab8d11b6536..4439252e1a75 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -893,6 +893,10 @@ static void acpi_align_size(GArray *blob, unsigned align)
g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
}
+static const AcpiNotificationSourceId hest_ghes_notify[] = {
+ { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+};
+
static
void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
{
@@ -956,7 +960,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
acpi_add_table(table_offsets, tables_blob);
acpi_build_hest(ags, tables_blob, tables->hardware_errors,
- tables->linker, vms->oem_id, vms->oem_table_id);
+ tables->linker, hest_ghes_notify,
+ ARRAY_SIZE(hest_ghes_notify),
+ vms->oem_id, vms->oem_table_id);
}
}
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 2f06e433ce04..51c6b6b33327 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -57,13 +57,18 @@ enum AcpiGhesNotifyType {
ACPI_GHES_NOTIFY_RESERVED = 12
};
-enum {
- ACPI_HEST_SRC_ID_SEA = 0,
- /* future ids go here */
-
- ACPI_GHES_ERROR_SOURCE_COUNT
+/*
+ * ID numbers used to fill HEST source ID field
+ */
+enum AcpiGhesSourceID {
+ ACPI_HEST_SRC_ID_SYNC,
};
+typedef struct AcpiNotificationSourceId {
+ enum AcpiGhesSourceID source_id;
+ enum AcpiGhesNotifyType notify;
+} AcpiNotificationSourceId;
+
/*
* AcpiGhesState stores an offset that will be used to fill HEST entries.
*
@@ -86,6 +91,8 @@ typedef struct AcpiGhesState {
void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
GArray *hardware_errors,
BIOSLinker *linker,
+ const AcpiNotificationSourceId * const notif_source,
+ int num_sources,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
GArray *hardware_errors);
--
2.48.1
* [PATCH v4 05/14] acpi/ghes: add a notifier to notify when error data is ready
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
Some error injection notify methods are async, like GPIO
notify. Add a notifier to be used when the error record is
ready to be sent to the guest OS.
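The notifier pattern mentioned above can be sketched in isolation as follows. This is a self-contained stand-in and not QEMU's qemu/notify.h implementation: the types ErrNotifier and ErrNotifierList and every function name are hypothetical, chosen only to show how a recorder can signal listeners once the error record has been written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical notifier: a callback plus a link to the next listener. */
typedef struct ErrNotifier ErrNotifier;
struct ErrNotifier {
    void (*notify)(ErrNotifier *n, void *data);
    ErrNotifier *next;
};

typedef struct ErrNotifierList {
    ErrNotifier *head;
} ErrNotifierList;

/* Register a listener at the front of the list. */
static void err_notifier_add(ErrNotifierList *list, ErrNotifier *n)
{
    assert(n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

/* Call every registered listener with the same payload. */
static void err_notifier_notify(ErrNotifierList *list, void *data)
{
    for (ErrNotifier *n = list->head; n; n = n->next) {
        n->notify(n, data);
    }
}

/* Example listener: stands in for a device raising a GPIO event. */
static int pending_events;

static void raise_error_event(ErrNotifier *n, void *data)
{
    (void)n;
    (void)data;
    pending_events++;
}

/* Stand-in for the recorder: write the record, then notify. */
static void record_error(ErrNotifierList *list)
{
    /* ... the CPER record would be written to guest memory here ... */
    err_notifier_notify(list, NULL);
}
```

The recorder does not need to know which notification mechanism is in use; a device that wants to be told about new records only has to register a listener.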
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
hw/acpi/ghes.c | 5 ++++-
include/hw/acpi/ghes.h | 3 +++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 4a4ea8f4be90..f2d1cc7369f4 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -514,6 +514,9 @@ static void get_ghes_source_offsets(uint16_t source_id,
*read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
}
+NotifierList acpi_generic_error_notifiers =
+ NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
+
void ghes_record_cper_errors(const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
@@ -570,7 +573,7 @@ void ghes_record_cper_errors(const void *cper, size_t len,
/* Write the generic error data entry into guest memory */
cpu_physical_memory_write(cper_addr, cper, len);
- return;
+ notifier_list_notify(&acpi_generic_error_notifiers, NULL);
}
int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 51c6b6b33327..219aa7ab4fe0 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -24,6 +24,9 @@
#include "hw/acpi/bios-linker-loader.h"
#include "qapi/error.h"
+#include "qemu/notify.h"
+
+extern NotifierList acpi_generic_error_notifiers;
/*
* Values for Hardware Error Notification Type field
--
2.48.1
* [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Paolo Bonzini,
Peter Maydell, kvm, linux-kernel
Instead of having a function to check if ACPI is enabled
(acpi_ghes_present), change its logic to be more generic,
returning a pointer to AcpiGhesState.
Such a change allows cleaning up the ghes GED state code, avoiding
reading it multiple times and simplifying the code.
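The calling pattern this enables might look like the sketch below. It is a stand-alone model and not the QEMU code: the demo_* names and the global demo_ged_state are hypothetical, and it assumes that the zero-address check from acpi_ghes_present() carries over to the new getter, so that a NULL return covers both "no device" and "GHES not present".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as AcpiGhesState at this point in the series. */
typedef struct AcpiGhesState {
    uint64_t hest_addr_le;
    uint64_t hw_error_le;
    bool use_hest_addr;
} AcpiGhesState;

/* Hypothetical device lookup result; NULL models a missing GED device. */
static AcpiGhesState *demo_ged_state;

/* Return the state only when GHES is usable, otherwise NULL. */
static AcpiGhesState *demo_ghes_get_state(void)
{
    AcpiGhesState *ags = demo_ged_state;

    if (!ags) {
        return NULL;
    }
    /* Assumed: both addresses equal to zero means GHES is not present. */
    if (!ags->hw_error_le && !ags->hest_addr_le) {
        return NULL;
    }
    return ags;
}

/* Caller: resolve the state once, then hand it to the recorder. */
static int demo_report_memory_error(uint64_t paddr, int *injected)
{
    AcpiGhesState *ags = demo_ghes_get_state();

    assert(injected != NULL);
    if (!ags) {
        return -1;
    }
    /* Real code would pass ags on to acpi_ghes_memory_errors(). */
    (void)paddr;
    (*injected)++;
    return 0;
}
```

The caller performs one lookup and one NULL check, where the old flow checked for presence first and then resolved the device state again inside the recorder.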
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes-stub.c | 7 ++++---
hw/acpi/ghes.c | 38 ++++++++++----------------------------
include/hw/acpi/ghes.h | 14 ++++++++------
target/arm/kvm.c | 7 +++++--
4 files changed, 27 insertions(+), 39 deletions(-)
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..40f660c246fe 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -11,12 +11,13 @@
#include "qemu/osdep.h"
#include "hw/acpi/ghes.h"
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address)
{
return -1;
}
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
{
- return false;
+ return NULL;
}
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index f2d1cc7369f4..401789259f60 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -425,10 +425,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
uint64_t *cper_addr,
uint64_t *read_ack_register_addr)
{
- if (!ghes_addr) {
- return;
- }
-
/*
* non-HEST version supports only one source, so no need to change
* the start offset based on the source ID. Also, we can't validate
@@ -517,27 +513,16 @@ static void get_ghes_source_offsets(uint16_t source_id,
NotifierList acpi_generic_error_notifiers =
NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
-void ghes_record_cper_errors(const void *cper, size_t len,
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
- AcpiGedState *acpi_ged_state;
- AcpiGhesState *ags;
if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
error_setg(errp, "GHES CPER record is too big: %zd", len);
return;
}
- acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
- NULL));
- if (!acpi_ged_state) {
- error_setg(errp, "Can't find ACPI_GED object");
- return;
- }
- ags = &acpi_ged_state->ghes_state;
-
-
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
@@ -546,11 +531,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
&cper_addr, &read_ack_register_addr, errp);
}
- if (!cper_addr) {
- error_setg(errp, "can not find Generic Error Status Block");
- return;
- }
-
cpu_physical_memory_read(read_ack_register_addr,
&read_ack_register, sizeof(read_ack_register));
@@ -576,7 +556,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
notifier_list_notify(&acpi_generic_error_notifiers, NULL);
}
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address)
{
/* Memory Error Section Type */
const uint8_t guid[] =
@@ -602,7 +583,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
acpi_ghes_build_append_mem_cper(block, physical_address);
/* Report the error */
- ghes_record_cper_errors(block->data, block->len, source_id, &errp);
+ ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
g_array_free(block, true);
@@ -614,7 +595,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
return 0;
}
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
{
AcpiGedState *acpi_ged_state;
AcpiGhesState *ags;
@@ -623,11 +604,12 @@ bool acpi_ghes_present(void)
NULL));
if (!acpi_ged_state) {
- return false;
+ return NULL;
}
ags = &acpi_ged_state->ghes_state;
- if (!ags->hw_error_le && !ags->hest_addr_le)
- return false;
- return true;
+ if (!ags->hw_error_le && !ags->hest_addr_le) {
+ return NULL;
+ }
+ return ags;
}
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 219aa7ab4fe0..276f9dc076d9 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -99,15 +99,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
GArray *hardware_errors);
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
-void ghes_record_cper_errors(const void *cper, size_t len,
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t error_physical_addr);
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp);
/**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer to the ACPI GHES state
*
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to ghes state if the system has an ACPI GHES table,
+ * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
+ * a memory error. Returns NULL otherwise.
*/
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
#endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index da30bdbb2349..80ca7779797b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
{
ram_addr_t ram_addr;
hwaddr paddr;
+ AcpiGhesState *ags;
assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
- if (acpi_ghes_present() && addr) {
+ ags = acpi_ghes_get_state();
+ if (ags && addr) {
ram_addr = qemu_ram_addr_from_host(addr);
if (ram_addr != RAM_ADDR_INVALID &&
kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
@@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
*/
if (code == BUS_MCEERR_AR) {
kvm_cpu_synchronize_state(c);
- if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+ if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
+ paddr)) {
kvm_inject_arm_sea(c);
} else {
error_report("failed to record the error");
--
2.48.1
* [PATCH v4 07/14] acpi/generic_event_device: Update GHES migration to cover hest addr
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (5 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
` (8 subsequent siblings)
15 siblings, 0 replies; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
The GHES migration logic should now support HEST table location too.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/generic_event_device.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index c85d97ca3776..5346cae573b7 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -386,6 +386,34 @@ static const VMStateDescription vmstate_ghes_state = {
}
};
+static const VMStateDescription vmstate_hest = {
+ .name = "acpi-hest",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .fields = (const VMStateField[]) {
+ VMSTATE_UINT64(hest_addr_le, AcpiGhesState),
+ VMSTATE_END_OF_LIST()
+ },
+};
+
+static bool hest_needed(void *opaque)
+{
+ AcpiGedState *s = opaque;
+ return s->ghes_state.hest_addr_le;
+}
+
+static const VMStateDescription vmstate_hest_state = {
+ .name = "acpi-ged/hest",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .needed = hest_needed,
+ .fields = (const VMStateField[]) {
+ VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
+ vmstate_hest, AcpiGhesState),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
static const VMStateDescription vmstate_acpi_ged = {
.name = "acpi-ged",
.version_id = 1,
@@ -398,6 +426,7 @@ static const VMStateDescription vmstate_acpi_ged = {
&vmstate_memhp_state,
&vmstate_cpuhp_state,
&vmstate_ghes_state,
+ &vmstate_hest_state,
NULL
}
};
--
2.48.1
* [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (6 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 07/14] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-26 15:52 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 09/14] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
` (7 subsequent siblings)
15 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
Create a new property (x-has-hest-addr) and use it to detect whether
the GHES table offsets can be calculated from the HEST address
(QEMU 10.0 and later) or must be calculated the legacy way, from an
offset obtained from the hardware_errors firmware file.
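The selection between the two source tables can be sketched as follows. This is only an illustration under stated assumptions: all Demo* and demo_* names are hypothetical, and the extra QMP/GPIO entry in the current table is an assumption taken from the later error injection patch in this series; in this patch itself both tables hold the same single SYNC/SEA entry.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the source IDs and notification types. */
enum demo_source_id { DEMO_SRC_ID_SYNC, DEMO_SRC_ID_QMP };
enum demo_notify_type { DEMO_NOTIFY_SEA, DEMO_NOTIFY_GPIO };

typedef struct {
    enum demo_source_id source_id;
    enum demo_notify_type notify;
} DemoNotifySource;

/* Table for machine types that locate GHES data through the HEST address. */
static const DemoNotifySource demo_notify_current[] = {
    { DEMO_SRC_ID_QMP, DEMO_NOTIFY_GPIO },  /* assumed, from a later patch */
    { DEMO_SRC_ID_SYNC, DEMO_NOTIFY_SEA },
};

/* Frozen table for 9.2 and older machine types. */
static const DemoNotifySource demo_notify_9_2[] = {
    { DEMO_SRC_ID_SYNC, DEMO_NOTIFY_SEA },
};

/* Pick the table and its length from the compat flag. */
static const DemoNotifySource *demo_pick_sources(bool use_hest_addr,
                                                 size_t *count)
{
    if (!use_hest_addr) {
        *count = DEMO_ARRAY_SIZE(demo_notify_9_2);
        return demo_notify_9_2;
    }
    *count = DEMO_ARRAY_SIZE(demo_notify_current);
    return demo_notify_current;
}
```

Keeping a separate frozen table for old machine types means new sources can be appended to the current table without changing the HEST layout that a migrated 9.2 guest already sees.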
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
hw/acpi/generic_event_device.c | 1 +
hw/arm/virt-acpi-build.c | 18 ++++++++++++++++--
hw/core/machine.c | 2 ++
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 5346cae573b7..14d8513a5440 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
static const Property acpi_ged_properties[] = {
DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
+ DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
};
static const VMStateDescription vmstate_memhp_state = {
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 4439252e1a75..9de51105a513 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -897,6 +897,10 @@ static const AcpiNotificationSourceId hest_ghes_notify[] = {
{ ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
};
+static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
+ { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+};
+
static
void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
{
@@ -950,7 +954,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
build_dbg2(tables_blob, tables->linker, vms);
if (vms->ras) {
+ static const AcpiNotificationSourceId *notify;
AcpiGedState *acpi_ged_state;
+ unsigned int notify_sz;
AcpiGhesState *ags;
acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
@@ -959,9 +965,17 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
ags = &acpi_ged_state->ghes_state;
acpi_add_table(table_offsets, tables_blob);
+
+ if (!ags->use_hest_addr) {
+ notify = hest_ghes_notify_9_2;
+ notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
+ } else {
+ notify = hest_ghes_notify;
+ notify_sz = ARRAY_SIZE(hest_ghes_notify);
+ }
+
acpi_build_hest(ags, tables_blob, tables->hardware_errors,
- tables->linker, hest_ghes_notify,
- ARRAY_SIZE(hest_ghes_notify),
+ tables->linker, notify, notify_sz,
vms->oem_id, vms->oem_table_id);
}
}
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 02cff735b3fb..7a11e0f87b11 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -34,6 +34,7 @@
#include "hw/virtio/virtio-pci.h"
#include "hw/virtio/virtio-net.h"
#include "hw/virtio/virtio-iommu.h"
+#include "hw/acpi/generic_event_device.h"
#include "audio/audio.h"
GlobalProperty hw_compat_9_2[] = {
@@ -43,6 +44,7 @@ GlobalProperty hw_compat_9_2[] = {
{ "virtio-balloon-pci-non-transitional", "vectors", "0" },
{ "virtio-mem-pci", "vectors", "0" },
{ "migration", "multifd-clean-tls-termination", "false" },
+ { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
};
const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
--
2.48.1
* [PATCH v4 09/14] acpi/generic_event_device: add an APEI error device
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (7 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST Mauro Carvalho Chehab
` (6 subsequent siblings)
15 siblings, 0 replies; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
Add a generic error device to handle generic hardware error
events, as specified in section 18.3.2.7.2 of the ACPI 6.5 specification:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
The device uses HID PNP0C33.
The PNP0C33 device is used to report hardware errors to
the guest via ACPI APEI Generic Hardware Error Source (GHES).
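The path from a QEMU-side event bit to the guest-side notification can be sketched in plain C. The numeric values below are the ones used in this patch (ACPI_POWER_DOWN_STATUS = 64, ACPI_GENERIC_ERROR = 128, ACPI_GED_PWR_DOWN_EVT = 0x2, ACPI_GED_ERROR_EVT = 0x10); the demo_* functions and the "return 0 for unknown events" convention are assumptions made for the sketch, not the QEMU implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Event status bits, with the values used in acpi_dev_interface.h. */
enum {
    DEMO_POWER_DOWN_STATUS = 64,
    DEMO_GENERIC_ERROR = 128,
};

/* GED selector bits, with the values used in generic_event_device.h. */
#define DEMO_GED_PWR_DOWN_EVT 0x2
#define DEMO_GED_ERROR_EVT    0x10

/*
 * Device side: translate an event status bit into the selector value
 * the guest will read. Returns 0 for events this sketch does not know.
 */
static uint32_t demo_event_to_selector(uint32_t ev)
{
    if (ev & DEMO_POWER_DOWN_STATUS) {
        return DEMO_GED_PWR_DOWN_EVT;
    } else if (ev & DEMO_GENERIC_ERROR) {
        return DEMO_GED_ERROR_EVT;
    }
    return 0;
}

/*
 * Guest side: mirrors the generated AML test
 *   If (((Local0 & 0x10) == 0x10)) { Notify (GEDD, 0x80) }
 * and reports whether the error device would be notified.
 */
static bool demo_guest_notifies_error_device(uint32_t selector)
{
    return (selector & DEMO_GED_ERROR_EVT) == DEMO_GED_ERROR_EVT;
}
```

Feeding DEMO_GENERIC_ERROR through demo_event_to_selector() yields 0x10, which is the only selector value for which the guest-side check fires.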
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/aml-build.c | 10 ++++++++++
hw/acpi/generic_event_device.c | 13 +++++++++++++
include/hw/acpi/acpi_dev_interface.h | 1 +
include/hw/acpi/aml-build.h | 2 ++
include/hw/acpi/generic_event_device.h | 1 +
5 files changed, 27 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index f8f93a9f66c8..e4bd7b611372 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2614,3 +2614,13 @@ Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source)
return var;
}
+
+/* ACPI 5.0b: 18.3.2.6.2 Event Notification For Generic Error Sources */
+Aml *aml_error_device(void)
+{
+ Aml *dev = aml_device(ACPI_APEI_ERROR_DEVICE);
+ aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C33")));
+ aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+
+ return dev;
+}
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 14d8513a5440..180eebbce1cd 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -26,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
ACPI_GED_PWR_DOWN_EVT,
ACPI_GED_NVDIMM_HOTPLUG_EVT,
ACPI_GED_CPU_HOTPLUG_EVT,
+ ACPI_GED_ERROR_EVT,
};
/*
@@ -116,6 +117,16 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
aml_int(0x80)));
break;
+ case ACPI_GED_ERROR_EVT:
+ /*
+ * ACPI 5.0b: 5.6.6 Device Object Notifications
+ * Table 5-135 Error Device Notification Values
+ * Defines 0x80 as the value to be used on notifications
+ */
+ aml_append(if_ctx,
+ aml_notify(aml_name(ACPI_APEI_ERROR_DEVICE),
+ aml_int(0x80)));
+ break;
case ACPI_GED_NVDIMM_HOTPLUG_EVT:
aml_append(if_ctx,
aml_notify(aml_name("\\_SB.NVDR"),
@@ -295,6 +306,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
sel = ACPI_GED_MEM_HOTPLUG_EVT;
} else if (ev & ACPI_POWER_DOWN_STATUS) {
sel = ACPI_GED_PWR_DOWN_EVT;
+ } else if (ev & ACPI_GENERIC_ERROR) {
+ sel = ACPI_GED_ERROR_EVT;
} else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
} else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
index 68d9d15f50aa..8294f8f0ccca 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -13,6 +13,7 @@ typedef enum {
ACPI_NVDIMM_HOTPLUG_STATUS = 16,
ACPI_VMGENID_CHANGE_STATUS = 32,
ACPI_POWER_DOWN_STATUS = 64,
+ ACPI_GENERIC_ERROR = 128,
} AcpiEventStatusBits;
#define TYPE_ACPI_DEVICE_IF "acpi-device-interface"
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index c18f68134246..f38e12971932 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -252,6 +252,7 @@ struct CrsRangeSet {
/* Consumer/Producer */
#define AML_SERIAL_BUS_FLAG_CONSUME_ONLY (1 << 1)
+#define ACPI_APEI_ERROR_DEVICE "GEDD"
/**
* init_aml_allocator:
*
@@ -382,6 +383,7 @@ Aml *aml_dma(AmlDmaType typ, AmlDmaBusMaster bm, AmlTransferSize sz,
uint8_t channel);
Aml *aml_sleep(uint64_t msec);
Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source);
+Aml *aml_error_device(void);
/* Block AML object primitives */
Aml *aml_scope(const char *name_format, ...) G_GNUC_PRINTF(1, 2);
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index d2dac87b4a9f..1c18ac296fcb 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -101,6 +101,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
#define ACPI_GED_PWR_DOWN_EVT 0x2
#define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
#define ACPI_GED_CPU_HOTPLUG_EVT 0x8
+#define ACPI_GED_ERROR_EVT 0x10
typedef struct GEDState {
MemoryRegion evt;
--
2.48.1
* [PATCH v4 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (8 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 09/14] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-26 15:55 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 11/14] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
` (5 subsequent siblings)
15 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
The DSDT table will also be affected by such change.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
tests/qtest/bios-tables-test-allowed-diff.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8bf4..1a4c2277bd5a 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
/* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/aarch64/virt/DSDT",
--
2.48.1
* [PATCH v4 11/14] arm/virt: Wire up a GED error device for ACPI / GHES
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (9 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-26 15:58 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 12/14] tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT Mauro Carvalho Chehab
` (4 subsequent siblings)
15 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Peter Maydell, Shannon Zhao,
linux-kernel
Add support to the ARM virt machine for handling the generic
error ACPI event via GED and the error source device.
This is aligned with the Linux kernel patch:
https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/
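The wiring relies on a notifier embedded in the machine state, with the callback recovering the machine through container_of. A minimal self-contained sketch of that pattern follows; DemoNotifier, DemoNotifierList, DemoMachine and the demo_* helpers are hypothetical simplifications, not QEMU's Notifier API, and a counter stands in for the call that raises the GED event.

```c
#include <stddef.h>

typedef struct DemoNotifier DemoNotifier;

/* Simplified notifier: a callback plus a singly linked list link. */
struct DemoNotifier {
    void (*notify)(DemoNotifier *n, void *data);
    DemoNotifier *next;
};

typedef struct {
    DemoNotifier *head;
} DemoNotifierList;

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the machine state that embeds the notifier. */
typedef struct {
    int events_sent;
    DemoNotifier generic_error_notifier;
} DemoMachine;

static void demo_list_add(DemoNotifierList *list, DemoNotifier *n)
{
    n->next = list->head;
    list->head = n;
}

static void demo_list_notify(DemoNotifierList *list, void *data)
{
    DemoNotifier *n;

    for (n = list->head; n; n = n->next) {
        n->notify(n, data);
    }
}

/* Callback: find the owning machine and "send" the error event. */
static void demo_generic_error_req(DemoNotifier *n, void *opaque)
{
    DemoMachine *m = demo_container_of(n, DemoMachine,
                                       generic_error_notifier);

    (void)opaque;
    m->events_sent++;
}
```

The producer of the error record only needs the list; it does not have to know which machine type, if any, registered a callback.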
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
---
Changes from v8:
- Added a call to the function that produces GHES generic
records, as this is now added earlier in this series.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/generic_event_device.c | 2 +-
hw/arm/virt-acpi-build.c | 1 +
hw/arm/virt.c | 12 +++++++++++-
include/hw/arm/virt.h | 1 +
4 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 180eebbce1cd..f5e899155d34 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -331,7 +331,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
static const Property acpi_ged_properties[] = {
DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
- DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
+ DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, true),
};
static const VMStateDescription vmstate_memhp_state = {
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9de51105a513..4f174795ed60 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -861,6 +861,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
}
acpi_dsdt_add_power_button(scope);
+ aml_append(scope, aml_error_device());
#ifdef CONFIG_TPM
acpi_dsdt_add_tpm(scope, vms);
#endif
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4a5a9666e916..3faf32f900b5 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -678,7 +678,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
DeviceState *dev;
MachineState *ms = MACHINE(vms);
int irq = vms->irqmap[VIRT_ACPI_GED];
- uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+ uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_ERROR_EVT;
if (ms->ram_slots) {
event |= ACPI_GED_MEM_HOTPLUG_EVT;
@@ -1010,6 +1010,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
}
}
+static void virt_generic_error_req(Notifier *n, void *opaque)
+{
+ VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
+
+ acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
+}
+
static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
uint32_t phandle)
{
@@ -2404,6 +2411,9 @@ static void machvirt_init(MachineState *machine)
if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
vms->acpi_dev = create_acpi_ged(vms);
+ vms->generic_error_notifier.notify = virt_generic_error_req;
+ notifier_list_add(&acpi_generic_error_notifiers,
+ &vms->generic_error_notifier);
} else {
create_gpio_devices(vms, VIRT_GPIO, sysmem);
}
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index c8e94e6aedc9..f3cf28436770 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -176,6 +176,7 @@ struct VirtMachineState {
DeviceState *gic;
DeviceState *acpi_dev;
Notifier powerdown_notifier;
+ Notifier generic_error_notifier;
PCIBus *bus;
char *oem_id;
char *oem_table_id;
--
2.48.1
* [PATCH v4 12/14] tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (10 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 11/14] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 13/14] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
` (3 subsequent siblings)
15 siblings, 0 replies; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
--- a/DSDT.dsl 2025-01-28 09:38:15.155347858 +0100
+++ b/DSDT.dsl 2025-01-28 09:39:01.684836954 +0100
@@ -9,9 +9,9 @@
*
* Original Table Header:
* Signature "DSDT"
- * Length 0x00001516 (5398)
+ * Length 0x00001542 (5442)
* Revision 0x02
- * Checksum 0x0F
+ * Checksum 0xE9
* OEM ID "BOCHS "
* OEM Table ID "BXPC "
* OEM Revision 0x00000001 (1)
@@ -1931,6 +1931,11 @@
{
Notify (PWRB, 0x80) // Status Change
}
+
+ If (((Local0 & 0x10) == 0x10))
+ {
+ Notify (GEDD, 0x80) // Status Change
+ }
}
}
@@ -1939,6 +1944,12 @@
Name (_HID, "PNP0C0C" /* Power Button Device */) // _HID: Hardware ID
Name (_UID, Zero) // _UID: Unique ID
}
+
+ Device (GEDD)
+ {
+ Name (_HID, "PNP0C33" /* Error Device */) // _HID: Hardware ID
+ Name (_UID, Zero) // _UID: Unique ID
+ }
}
}
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
.../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
tests/qtest/bios-tables-test-allowed-diff.h | 1 -
6 files changed, 1 deletion(-)
diff --git a/tests/data/acpi/aarch64/virt/DSDT b/tests/data/acpi/aarch64/virt/DSDT
index 36d3e5d5a5e47359b6dcb3706f98b4f225677591..a182bd9d7182dccdf63c650d048c58f18505d001 100644
GIT binary patch
delta 109
zcmX@3@k4{lCD<jTLWF^ViDe>}G*h$dM)euOOwJsW4+;nC=*7E+g>V+Q2D|zsED)Gn
zoxsJ!z{S)S5FX^j)c_F?VBivHb9Z%dnXE4&D;?b=31V}^dw9C=2KWUSI2#)?aKwjt
Hx-b9$X;vI^
delta 64
zcmeyNaYlp7CD<jzM}&caNqQoeG*i3NM)euOOit{R4+;lM%f`Egg>V+Q2D|zsED)Gn
UoxsJ!z{S)S5FX?-*+E1W06%jPR{#J2
diff --git a/tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt b/tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt
index e6154d0355f84fdcc51387b4db8f9ee63acae4e9..af1f2b0eb0b77a80c5bd74f201d24f71e486627f 100644
GIT binary patch
delta 110
zcmZ3ac}|ndCD<k8oCpI0)4_>c(oCIR8`a+lGdXii78eO-)SH|wBICY5U~+W=mjDBo
yK%2X(iwjpnbdzL2c#soEyoaX?Z-8HbfwO@#14n$Qrwc=LlO#wDl9aJAR0;r(tsHj%
delta 66
zcmX@7xk!`CCD<iokq83=(~XH-(oDVX8`a+lGdZzO78eO-l%1R{A|oB$BpDDM<irv0
W;pxH~;1^)vY~akm5g+R5!T<noi4jWx
diff --git a/tests/data/acpi/aarch64/virt/DSDT.memhp b/tests/data/acpi/aarch64/virt/DSDT.memhp
index 33f011d6b635035a04c0b39ce9b4e219f7ae74b7..10436ec87c4859fb84b3ecb7bba5788f38112e59 100644
GIT binary patch
delta 88
zcmbPheA1Z9CD<k8q$C3algUIbX{MH08`WnBGdXcjJ}4Z_<jXo)OvH<SfxzVI1TFyv
qE`c_8R~MJfaU%At($P(lAPz^oho=i~fM0-tv#~J)M|`NK3j+W#;TF9B
delta 44
zcmX?UJlB}ZCD<iot|S8klg&gfX{L_p8`WnBGdXfiJ}4Z_<ij#qOvGz*p@=Oj039?8
AE&u=k
diff --git a/tests/data/acpi/aarch64/virt/DSDT.pxb b/tests/data/acpi/aarch64/virt/DSDT.pxb
index c0fdc6e9c1396cc2259dc4bc665ba023adcf4c9b..0524b3cbe00bfe552de824dd1090bd00a208c527 100644
GIT binary patch
delta 110
zcmexwz1oJ$CD<iITaJN&sbC_PG*jDyjq2XAOwJsWOJsu?^(LQ?m2qDnFu6K`OMrn(
ypv~RY#f7UOx=Au1JjjV7-ow*{H^48zz}di=fg?WD(}f|rNfM+6Ny^w5Dg^+WYaFrw
delta 66
zcmZ2&^WU1wCD<k8zbpd-Q^!OuX{N5b8`ZsKnVi@sm&gV)%1%BZD<d7<BpDDM<irv0
W;pxH~;1^)vY~akm5g+R5!T<oNArgiF
diff --git a/tests/data/acpi/aarch64/virt/DSDT.topology b/tests/data/acpi/aarch64/virt/DSDT.topology
index 029d03eecc4efddc001e5377e85ac8e831294362..8c0423fe62d6950f9098983d86bfee256d7d003a 100644
GIT binary patch
delta 86
zcmbQHbx4cLCD<jzNtA(s>E%Q&X{O%5jp|7vOwJsWyG4Q-^(NmJk>Ot;Fu6K`OMrn(
opv~RY#bxqO5n1WzCP@&RBi_T)g*U)2z`)tqn1Lfc)YF9l01l28<p2Nx
delta 42
ycmX@4HBF1lCD<iIOq79viGL!OG*hGhM)f2SCMWjE-6Fw^vXk$N$V}!Dl?DLb(h64q
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 1a4c2277bd5a..dfb8523c8bf4 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
/* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/aarch64/virt/DSDT",
--
2.48.1
* [PATCH v4 13/14] qapi/acpi-hest: add an interface to do generic CPER error injection
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (11 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 12/14] tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 14/14] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
` (2 subsequent siblings)
15 siblings, 0 replies; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Eric Blake,
Markus Armbruster, Michael Roth, Paolo Bonzini, Peter Maydell,
Shannon Zhao, linux-kernel
Create a QMP command to be used for generic ACPI APEI hardware error
injection (HEST) via GHESv2, and add support for it on ARM guests.
Error injection uses the ACPI_HEST_SRC_ID_QMP source ID to be platform
independent. It is mapped in the arch virt bindings, depending on the
notification types supported by QEMU and by the BIOS. On ARM, it is
supported via the ACPI_GHES_NOTIFY_GPIO notification type.
This patch is co-authored:
- original ghes logic to inject a simple ARM record by Shiju Jose;
- generic logic to handle block addresses by Jonathan Cameron;
- generic GHESv2 error inject by Mauro Carvalho Chehab;
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-authored-by: Shiju Jose <shiju.jose@huawei.com>
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
Changes since v9:
- ARM source IDs renamed to reflect SYNC/ASYNC;
- command name changed to better reflect what it does;
- some improvements at JSON documentation;
- add a check for QMP source at the notification logic.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
MAINTAINERS | 7 +++++++
hw/acpi/Kconfig | 5 +++++
hw/acpi/ghes.c | 2 +-
hw/acpi/ghes_cper.c | 38 ++++++++++++++++++++++++++++++++++++++
hw/acpi/ghes_cper_stub.c | 19 +++++++++++++++++++
hw/acpi/meson.build | 2 ++
hw/arm/virt-acpi-build.c | 1 +
hw/arm/virt.c | 7 +++++++
include/hw/acpi/ghes.h | 1 +
include/hw/arm/virt.h | 1 +
qapi/acpi-hest.json | 35 +++++++++++++++++++++++++++++++++++
qapi/meson.build | 1 +
qapi/qapi-schema.json | 1 +
13 files changed, 119 insertions(+), 1 deletion(-)
create mode 100644 hw/acpi/ghes_cper.c
create mode 100644 hw/acpi/ghes_cper_stub.c
create mode 100644 qapi/acpi-hest.json
diff --git a/MAINTAINERS b/MAINTAINERS
index 3848d37a38d2..aed0f4cc62cd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2080,6 +2080,13 @@ F: hw/acpi/ghes.c
F: include/hw/acpi/ghes.h
F: docs/specs/acpi_hest_ghes.rst
+ACPI/HEST/GHES/ARM processor CPER
+R: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+S: Maintained
+F: hw/acpi/ghes_cper.c
+F: hw/acpi/ghes_cper_stub.c
+F: qapi/acpi-hest.json
+
ppc4xx
L: qemu-ppc@nongnu.org
S: Orphan
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1d4e9f0845c0..daabbe6cd11e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -51,6 +51,11 @@ config ACPI_APEI
bool
depends on ACPI
+config GHES_CPER
+ bool
+ depends on ACPI_APEI
+ default y
+
config ACPI_PCI
bool
depends on ACPI && PCI
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 401789259f60..3bea55e2e8e9 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -553,7 +553,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
/* Write the generic error data entry into guest memory */
cpu_physical_memory_write(cper_addr, cper, len);
- notifier_list_notify(&acpi_generic_error_notifiers, NULL);
+ notifier_list_notify(&acpi_generic_error_notifiers, &source_id);
}
int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
new file mode 100644
index 000000000000..0a2d95dd8b27
--- /dev/null
+++ b/hw/acpi/ghes_cper.c
@@ -0,0 +1,38 @@
+/*
+ * CPER payload parser for error injection
+ *
+ * Copyright(C) 2024-2025 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/base64.h"
+#include "qemu/error-report.h"
+#include "qemu/uuid.h"
+#include "qapi/qapi-commands-acpi-hest.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_inject_ghes_v2_error(const char *qmp_cper, Error **errp)
+{
+ AcpiGhesState *ags;
+
+ ags = acpi_ghes_get_state();
+ if (!ags) {
+ return;
+ }
+
+ uint8_t *cper;
+ size_t len;
+
+ cper = qbase64_decode(qmp_cper, -1, &len, errp);
+ if (!cper) {
+ error_prepend(errp, "invalid GHES CPER payload: ");
+ return;
+ }
+
+ ghes_record_cper_errors(ags, cper, len, ACPI_HEST_SRC_ID_QMP, errp);
+}
diff --git a/hw/acpi/ghes_cper_stub.c b/hw/acpi/ghes_cper_stub.c
new file mode 100644
index 000000000000..5ebc61970a78
--- /dev/null
+++ b/hw/acpi/ghes_cper_stub.c
@@ -0,0 +1,19 @@
+/*
+ * Stub interface for CPER payload parser for error injection
+ *
+ * Copyright(C) 2024-2025 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-acpi-hest.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_inject_ghes_v2_error(const char *cper, Error **errp)
+{
+ error_setg(errp, "GHES QMP error inject is not compiled in");
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index 73f02b96912b..56b5d1ec9691 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -34,4 +34,6 @@ endif
system_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c', 'ghes-stub.c', 'acpi_interface.c'))
system_ss.add(when: 'CONFIG_ACPI_PCI_BRIDGE', if_false: files('pci-bridge-stub.c'))
system_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
+system_ss.add(when: 'CONFIG_GHES_CPER', if_true: files('ghes_cper.c'))
+system_ss.add(when: 'CONFIG_GHES_CPER', if_false: files('ghes_cper_stub.c'))
system_ss.add(files('acpi-qmp-cmds.c'))
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 4f174795ed60..7b6e90d69298 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -896,6 +896,7 @@ static void acpi_align_size(GArray *blob, unsigned align)
static const AcpiNotificationSourceId hest_ghes_notify[] = {
{ ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+ { ACPI_HEST_SRC_ID_QMP, ACPI_GHES_NOTIFY_GPIO },
};
static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3faf32f900b5..116428ab582e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1012,6 +1012,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
static void virt_generic_error_req(Notifier *n, void *opaque)
{
+ uint16_t *source_id = opaque;
+
+ /* Currently, only QMP source ID is async */
+ if (*source_id != ACPI_HEST_SRC_ID_QMP) {
+ return;
+ }
+
VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 276f9dc076d9..47f30fec724a 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -65,6 +65,7 @@ enum AcpiGhesNotifyType {
*/
enum AcpiGhesSourceID {
ACPI_HEST_SRC_ID_SYNC,
+ ACPI_HEST_SRC_ID_QMP, /* Use it only for QMP injected errors */
};
typedef struct AcpiNotificationSourceId {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f3cf28436770..56f270f61cf5 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -33,6 +33,7 @@
#include "exec/hwaddr.h"
#include "qemu/notify.h"
#include "hw/boards.h"
+#include "hw/acpi/ghes.h"
#include "hw/arm/boot.h"
#include "hw/arm/bsa.h"
#include "hw/block/flash.h"
diff --git a/qapi/acpi-hest.json b/qapi/acpi-hest.json
new file mode 100644
index 000000000000..fff5018c7ec1
--- /dev/null
+++ b/qapi/acpi-hest.json
@@ -0,0 +1,35 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+
+##
+# == GHESv2 CPER Error Injection
+#
+# Defined since ACPI Specification 6.1,
+# section 18.3.2.8 Generic Hardware Error Source version 2. See:
+#
+# https://uefi.org/sites/default/files/resources/ACPI_6_1.pdf
+##
+
+
+##
+# @inject-ghes-v2-error:
+#
+# Inject an error with additional ACPI 6.1 GHESv2 error information
+#
+# @cper: contains a base64 encoded string with raw data for a single
+# CPER record with Generic Error Status Block, Generic Error Data
+# Entry and generic error data payload, as described at
+# https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#format
+#
+# Features:
+#
+# @unstable: This command is experimental.
+#
+# Since: 10.0
+##
+{ 'command': 'inject-ghes-v2-error',
+ 'data': {
+ 'cper': 'str'
+ },
+ 'features': [ 'unstable' ]
+}
diff --git a/qapi/meson.build b/qapi/meson.build
index e7bc54e5d047..35cea6147262 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -59,6 +59,7 @@ qapi_all_modules = [
if have_system
qapi_all_modules += [
'acpi',
+ 'acpi-hest',
'audio',
'cryptodev',
'qdev',
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index b1581988e4eb..baf19ab73afe 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -75,6 +75,7 @@
{ 'include': 'misc-target.json' }
{ 'include': 'audio.json' }
{ 'include': 'acpi.json' }
+{ 'include': 'acpi-hest.json' }
{ 'include': 'pci.json' }
{ 'include': 'stats.json' }
{ 'include': 'virtio.json' }
--
2.48.1
* [PATCH v4 14/14] scripts/ghes_inject: add a script to generate GHES error inject
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (12 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 13/14] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for " Igor Mammedov
2025-02-27 9:54 ` Igor Mammedov
15 siblings, 0 replies; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Cleber Rosa, John Snow, linux-kernel
Using the QMP GHESv2 API requires preparing a raw data array
containing a CPER record.
Add a helper script with subcommands to prepare such data.
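As a rough illustration of what such a helper has to do at the end,
the sketch below base64-encodes an already-built CPER buffer and wraps
it in the QMP message for inject-ghes-v2-error. Only the command name
and its 'cper' argument come from the QAPI schema added earlier in this
series; the function name build_inject_msg and the dummy payload are
hypothetical and are not part of the script.

```python
import base64
import json


def build_inject_msg(cper_bytes):
    """Wrap raw CPER bytes in an inject-ghes-v2-error QMP message.

    Hypothetical helper, for illustration only.
    """
    return {
        "execute": "inject-ghes-v2-error",
        "arguments": {
            # The schema expects a base64-encoded string, not raw bytes
            "cper": base64.b64encode(cper_bytes).decode("ascii"),
        },
    }


# Four zero bytes stand in for a real CPER record here
msg = build_inject_msg(bytes(4))
print(json.dumps(msg))
```

A real payload must contain a complete Generic Error Status Block,
Generic Error Data Entry and section data; the dummy bytes above only
demonstrate the message framing.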
Currently, only the ARM Processor Error CPER record is supported, by
running:
$ ghes_inject.py arm
which produces the following messages on a Linux guest:
[ 705.032426] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
[ 774.866308] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 774.866583] {4}[Hardware Error]: event severity: recoverable
[ 774.866738] {4}[Hardware Error]: Error 0, type: recoverable
[ 774.866889] {4}[Hardware Error]: section_type: ARM processor error
[ 774.867048] {4}[Hardware Error]: MIDR: 0x00000000000f0510
[ 774.867189] {4}[Hardware Error]: running state: 0x0
[ 774.867321] {4}[Hardware Error]: Power State Coordination Interface state: 0
[ 774.867511] {4}[Hardware Error]: Error info structure 0:
[ 774.867679] {4}[Hardware Error]: num errors: 2
[ 774.867801] {4}[Hardware Error]: error_type: 0x02: cache error
[ 774.867962] {4}[Hardware Error]: error_info: 0x000000000091000f
[ 774.868124] {4}[Hardware Error]: transaction type: Data Access
[ 774.868280] {4}[Hardware Error]: cache error, operation type: Data write
[ 774.868465] {4}[Hardware Error]: cache level: 2
[ 774.868592] {4}[Hardware Error]: processor context not corrupted
[ 774.868774] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
The script allows customizing the error data: every field of the
record can be changed. For more details about its usage, run:
$ ghes_inject.py arm -h
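To give an idea of the fields that can be customized, the sketch below
packs one Processor Error Information entry in the same field order and
widths used by the data_add() calls in arm_processor_error.py. It is
only an approximation under stated assumptions: the script itself
builds the entry with util.data_add() rather than struct, and the names
PEI_FORMAT and pack_pei are hypothetical.

```python
import struct

# Field order and sizes, all little-endian and unpadded:
# version(1) length(1) valid(2) type(1) multiple-error(2) flags(1)
# error-info(8) virt-addr(8) phy-addr(8)
PEI_FORMAT = "<BBHBHBQQQ"


def pack_pei(valid, err_type, multiple_error, flags,
             error_info, virt_addr, phy_addr):
    """Pack a single PEI entry (hypothetical helper, illustration only)."""
    length = struct.calcsize(PEI_FORMAT)
    return struct.pack(PEI_FORMAT, 0, length, valid, err_type,
                       multiple_error, flags, error_info,
                       virt_addr, phy_addr)


# Values taken from the script defaults for a cache error:
# type bit 1, flags first|last, error-info valid bit set
pei = pack_pei(valid=0x4, err_type=0x02, multiple_error=1, flags=0x3,
               error_info=0x0091000F, virt_addr=0xDEADBEEF,
               phy_addr=0xABBA0BAD)
print(len(pei), pei.hex())
```

The resulting entry has the same size as the script's
ACPI_GHES_ARM_CPER_PEI_LENGTH constant.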
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
MAINTAINERS | 3 +
scripts/arm_processor_error.py | 476 ++++++++++++++++++++++
scripts/ghes_inject.py | 51 +++
scripts/qmp_helper.py | 702 +++++++++++++++++++++++++++++++++
4 files changed, 1232 insertions(+)
create mode 100644 scripts/arm_processor_error.py
create mode 100755 scripts/ghes_inject.py
create mode 100755 scripts/qmp_helper.py
diff --git a/MAINTAINERS b/MAINTAINERS
index aed0f4cc62cd..203baee63712 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2086,6 +2086,9 @@ S: Maintained
F: hw/acpi/ghes_cper.c
F: hw/acpi/ghes_cper_stub.c
F: qapi/acpi-hest.json
+F: scripts/ghes_inject.py
+F: scripts/arm_processor_error.py
+F: scripts/qmp_helper.py
ppc4xx
L: qemu-ppc@nongnu.org
diff --git a/scripts/arm_processor_error.py b/scripts/arm_processor_error.py
new file mode 100644
index 000000000000..1dd42e42a877
--- /dev/null
+++ b/scripts/arm_processor_error.py
@@ -0,0 +1,476 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0301,C0114,R0903,R0912,R0913,R0914,R0915,W0511
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+# TODO: current implementation has dummy defaults.
+#
+# For a better implementation, a QMP addition/call is needed to
+# retrieve some data for ARM Processor Error injection:
+#
+# - ARM registers: power_state, mpidr.
+
+"""
+Generates an ARM processor error CPER, compatible with
+UEFI 2.9A Errata.
+
+Injecting such errors can be done using:
+
+ $ ./scripts/ghes_inject.py arm
+ Error injected.
+
+Produces a simple CPER record, as detected on a Linux guest:
+
+[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
+[Hardware Error]: event severity: recoverable
+[Hardware Error]: Error 0, type: recoverable
+[Hardware Error]: section_type: ARM processor error
+[Hardware Error]: MIDR: 0x0000000000000000
+[Hardware Error]: running state: 0x0
+[Hardware Error]: Power State Coordination Interface state: 0
+[Hardware Error]: Error info structure 0:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x02: cache error
+[Hardware Error]: error_info: 0x000000000091000f
+[Hardware Error]: transaction type: Data Access
+[Hardware Error]: cache error, operation type: Data write
+[Hardware Error]: cache level: 2
+[Hardware Error]: processor context not corrupted
+[Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
+
+The ARM Processor Error message can be customized via command line
+parameters. For instance:
+
+ $ ./scripts/ghes_inject.py arm --mpidr 0x444 --running --affinity 1 \
+ --error-info 12345678 --vendor 0x13,123,4,5,1 --ctx-array 0,1,2,3,4,5 \
+ -t cache tlb bus micro-arch tlb,micro-arch
+ Error injected.
+
+Injects this error, as detected on a Linux guest:
+
+[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
+[Hardware Error]: event severity: recoverable
+[Hardware Error]: Error 0, type: recoverable
+[Hardware Error]: section_type: ARM processor error
+[Hardware Error]: MIDR: 0x0000000000000000
+[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000000000000
+[Hardware Error]: error affinity level: 0
+[Hardware Error]: running state: 0x1
+[Hardware Error]: Power State Coordination Interface state: 0
+[Hardware Error]: Error info structure 0:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x02: cache error
+[Hardware Error]: error_info: 0x0000000000bc614e
+[Hardware Error]: cache level: 2
+[Hardware Error]: processor context not corrupted
+[Hardware Error]: Error info structure 1:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x04: TLB error
+[Hardware Error]: error_info: 0x000000000054007f
+[Hardware Error]: transaction type: Instruction
+[Hardware Error]: TLB error, operation type: Instruction fetch
+[Hardware Error]: TLB level: 1
+[Hardware Error]: processor context not corrupted
+[Hardware Error]: the error has not been corrected
+[Hardware Error]: PC is imprecise
+[Hardware Error]: Error info structure 2:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x08: bus error
+[Hardware Error]: error_info: 0x00000080d6460fff
+[Hardware Error]: transaction type: Generic
+[Hardware Error]: bus error, operation type: Generic read (type of instruction or data request cannot be determined)
+[Hardware Error]: affinity level at which the bus error occurred: 1
+[Hardware Error]: processor context corrupted
+[Hardware Error]: the error has been corrected
+[Hardware Error]: PC is imprecise
+[Hardware Error]: Program execution can be restarted reliably at the PC associated with the error.
+[Hardware Error]: participation type: Local processor observed
+[Hardware Error]: request timed out
+[Hardware Error]: address space: External Memory Access
+[Hardware Error]: memory access attributes:0x20
+[Hardware Error]: access mode: secure
+[Hardware Error]: Error info structure 3:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x10: micro-architectural error
+[Hardware Error]: error_info: 0x0000000078da03ff
+[Hardware Error]: Error info structure 4:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x14: TLB error|micro-architectural error
+[Hardware Error]: Context info structure 0:
+[Hardware Error]: register context type: AArch64 EL1 context registers
+[Hardware Error]: 00000000: 00000000 00000000
+[Hardware Error]: Vendor specific error info has 5 bytes:
+[Hardware Error]: 00000000: 13 7b 04 05 01 .{...
+[Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
+[Firmware Warn]: GHES: Unhandled processor error type 0x04: TLB error
+[Firmware Warn]: GHES: Unhandled processor error type 0x08: bus error
+[Firmware Warn]: GHES: Unhandled processor error type 0x10: micro-architectural error
+[Firmware Warn]: GHES: Unhandled processor error type 0x14: TLB error|micro-architectural error
+"""
+
+import argparse
+import re
+
+from qmp_helper import qmp, util, cper_guid
+
+
+class ArmProcessorEinj:
+ """
+ Implements ARM Processor Error injection via GHES
+ """
+
+ DESC = """
+ Generates an ARM processor error CPER, compatible with
+ UEFI 2.9A Errata.
+ """
+
+ ACPI_GHES_ARM_CPER_LENGTH = 40
+ ACPI_GHES_ARM_CPER_PEI_LENGTH = 32
+
+ # Context types
+ CONTEXT_AARCH32_EL1 = 1
+ CONTEXT_AARCH64_EL1 = 5
+ CONTEXT_MISC_REG = 8
+
+ def __init__(self, subparsers):
+ """Initialize the error injection class and add subparser"""
+
+ # Valid choice values
+ self.arm_valid_bits = {
+ "mpidr": util.bit(0),
+ "affinity": util.bit(1),
+ "running": util.bit(2),
+ "vendor": util.bit(3),
+ }
+
+ self.pei_flags = {
+ "first": util.bit(0),
+ "last": util.bit(1),
+ "propagated": util.bit(2),
+ "overflow": util.bit(3),
+ }
+
+ self.pei_error_types = {
+ "cache": util.bit(1),
+ "tlb": util.bit(2),
+ "bus": util.bit(3),
+ "micro-arch": util.bit(4),
+ }
+
+ self.pei_valid_bits = {
+ "multiple-error": util.bit(0),
+ "flags": util.bit(1),
+ "error-info": util.bit(2),
+ "virt-addr": util.bit(3),
+ "phy-addr": util.bit(4),
+ }
+
+ self.data = bytearray()
+
+ parser = subparsers.add_parser("arm", description=self.DESC)
+
+ arm_valid_bits = ",".join(self.arm_valid_bits.keys())
+ flags = ",".join(self.pei_flags.keys())
+ error_types = ",".join(self.pei_error_types.keys())
+ pei_valid_bits = ",".join(self.pei_valid_bits.keys())
+
+ # UEFI N.16 ARM Validation bits
+ g_arm = parser.add_argument_group("ARM processor")
+ g_arm.add_argument("--arm", "--arm-valid",
+ help=f"ARM valid bits: {arm_valid_bits}")
+ g_arm.add_argument("-a", "--affinity", "--level", "--affinity-level",
+ type=lambda x: int(x, 0),
+ help="Affinity level (when multiple levels apply)")
+ g_arm.add_argument("-l", "--mpidr", type=lambda x: int(x, 0),
+ help="Multiprocessor Affinity Register")
+ g_arm.add_argument("-i", "--midr", type=lambda x: int(x, 0),
+ help="Main ID Register")
+ g_arm.add_argument("-r", "--running",
+ action=argparse.BooleanOptionalAction,
+ default=None,
+ help="Indicates if the processor is running or not")
+ g_arm.add_argument("--psci", "--psci-state",
+ type=lambda x: int(x, 0),
+ help="Power State Coordination Interface - PSCI state")
+
+ # TODO: Add vendor-specific support
+
+ # UEFI N.17 bitmaps (type and flags)
+ g_pei = parser.add_argument_group("ARM Processor Error Info (PEI)")
+ g_pei.add_argument("-t", "--type", nargs="+",
+ help=f"one or more error types: {error_types}")
+ g_pei.add_argument("-f", "--flags", nargs="*",
+ help=f"zero or more error flags: {flags}")
+ g_pei.add_argument("-V", "--pei-valid", "--error-valid", nargs="*",
+ help=f"zero or more PEI valid bits: {pei_valid_bits}")
+
+ # UEFI N.17 Integer values
+ g_pei.add_argument("-m", "--multiple-error", nargs="+",
+ help="Number of errors: 0: Single error, 1: Multiple errors, 2-65535: Error count if known")
+ g_pei.add_argument("-e", "--error-info", nargs="+",
+ help="Error information (UEFI 2.10 tables N.18 to N.20)")
+ g_pei.add_argument("-p", "--physical-address", nargs="+",
+ help="Physical address")
+ g_pei.add_argument("-v", "--virtual-address", nargs="+",
+ help="Virtual address")
+
+ # UEFI N.21 Context
+ g_ctx = parser.add_argument_group("Processor Context")
+ g_ctx.add_argument("--ctx-type", "--context-type", nargs="*",
+ help="Type of the context (0=ARM32 GPR, 5=ARM64 EL1, other values supported)")
+ g_ctx.add_argument("--ctx-size", "--context-size", nargs="*",
+ help="Minimal size of the context")
+ g_ctx.add_argument("--ctx-array", "--context-array", nargs="*",
+ help="Comma-separated arrays for each context")
+
+ # Vendor-specific data
+ g_vendor = parser.add_argument_group("Vendor-specific data")
+ g_vendor.add_argument("--vendor", "--vendor-specific", nargs="+",
+ help="Vendor-specific byte arrays of data")
+
+ # Add arguments for Generic Error Data
+ qmp.argparse(parser)
+
+ parser.set_defaults(func=self.send_cper)
+
+ def send_cper(self, args):
+ """Parse subcommand arguments and send a CPER via QMP"""
+
+ qmp_cmd = qmp(args.host, args.port, args.debug)
+
+ # Handle Generic Error Data arguments if any
+ qmp_cmd.set_args(args)
+
+ is_cpu_type = re.compile(r"^([\w+]+\-)?arm\-cpu$")
+ cpus = qmp_cmd.search_qom("/machine/unattached/device",
+ "type", is_cpu_type)
+
+ cper = {}
+ pei = {}
+ ctx = {}
+ vendor = {}
+
+ arg = vars(args)
+
+ # Handle global parameters
+ if args.arm:
+ arm_valid_init = False
+ cper["valid"] = util.get_choice(name="valid",
+ value=args.arm,
+ choices=self.arm_valid_bits,
+ suffixes=["-error", "-err"])
+ else:
+ cper["valid"] = 0
+ arm_valid_init = True
+
+ if "running" in arg:
+ if args.running:
+ cper["running-state"] = util.bit(0)
+ else:
+ cper["running-state"] = 0
+ else:
+ cper["running-state"] = 0
+
+ if arm_valid_init:
+ if args.affinity:
+ cper["valid"] |= self.arm_valid_bits["affinity"]
+
+ if args.mpidr:
+ cper["valid"] |= self.arm_valid_bits["mpidr"]
+
+ if "running-state" in cper:
+ cper["valid"] |= self.arm_valid_bits["running"]
+
+ if args.psci:
+ cper["valid"] |= self.arm_valid_bits["running"]
+
+ # Handle PEI
+ if not args.type:
+ args.type = ["cache-error"]
+
+ util.get_mult_choices(
+ pei,
+ name="valid",
+ values=args.pei_valid,
+ choices=self.pei_valid_bits,
+ suffixes=["-valid", "--addr"],
+ )
+ util.get_mult_choices(
+ pei,
+ name="type",
+ values=args.type,
+ choices=self.pei_error_types,
+ suffixes=["-error", "-err"],
+ )
+ util.get_mult_choices(
+ pei,
+ name="flags",
+ values=args.flags,
+ choices=self.pei_flags,
+ suffixes=["-error", "-cap"],
+ )
+ util.get_mult_int(pei, "error-info", args.error_info)
+ util.get_mult_int(pei, "multiple-error", args.multiple_error)
+ util.get_mult_int(pei, "phy-addr", args.physical_address)
+ util.get_mult_int(pei, "virt-addr", args.virtual_address)
+
+ # Handle context
+ util.get_mult_int(ctx, "type", args.ctx_type, allow_zero=True)
+ util.get_mult_int(ctx, "minimal-size", args.ctx_size, allow_zero=True)
+ util.get_mult_array(ctx, "register", args.ctx_array, allow_zero=True)
+
+ util.get_mult_array(vendor, "bytes", args.vendor, max_val=255)
+
+ # Store PEI
+ pei_data = bytearray()
+ default_flags = self.pei_flags["first"]
+ default_flags |= self.pei_flags["last"]
+
+ error_info_num = 0
+
+ for i, p in pei.items(): # pylint: disable=W0612
+ error_info_num += 1
+
+ # UEFI 2.10 doesn't define how to encode error information
+ # when multiple types are raised. So, provide a default only
+ # if a single type is there
+ if "error-info" not in p:
+ if p["type"] == util.bit(1):
+ p["error-info"] = 0x0091000F
+ if p["type"] == util.bit(2):
+ p["error-info"] = 0x0054007F
+ if p["type"] == util.bit(3):
+ p["error-info"] = 0x80D6460FFF
+ if p["type"] == util.bit(4):
+ p["error-info"] = 0x78DA03FF
+
+ if "valid" not in p:
+ p["valid"] = 0
+ if "multiple-error" in p:
+ p["valid"] |= self.pei_valid_bits["multiple-error"]
+
+ if "flags" in p:
+ p["valid"] |= self.pei_valid_bits["flags"]
+
+ if "error-info" in p:
+ p["valid"] |= self.pei_valid_bits["error-info"]
+
+ if "phy-addr" in p:
+ p["valid"] |= self.pei_valid_bits["phy-addr"]
+
+ if "virt-addr" in p:
+ p["valid"] |= self.pei_valid_bits["virt-addr"]
+
+ # Version
+ util.data_add(pei_data, 0, 1)
+
+ util.data_add(pei_data,
+ self.ACPI_GHES_ARM_CPER_PEI_LENGTH, 1)
+
+ util.data_add(pei_data, p["valid"], 2)
+ util.data_add(pei_data, p["type"], 1)
+ util.data_add(pei_data, p.get("multiple-error", 1), 2)
+ util.data_add(pei_data, p.get("flags", default_flags), 1)
+ util.data_add(pei_data, p.get("error-info", 0), 8)
+ util.data_add(pei_data, p.get("virt-addr", 0xDEADBEEF), 8)
+ util.data_add(pei_data, p.get("phy-addr", 0xABBA0BAD), 8)
+
+ # Store Context
+ ctx_data = bytearray()
+ context_info_num = 0
+
+ if ctx:
+ ret = qmp_cmd.send_cmd("query-target", may_open=True)
+
+ default_ctx = self.CONTEXT_MISC_REG
+
+ if "arch" in ret:
+ if ret["arch"] == "aarch64":
+ default_ctx = self.CONTEXT_AARCH64_EL1
+ elif ret["arch"] == "arm":
+ default_ctx = self.CONTEXT_AARCH32_EL1
+
+ for k in sorted(ctx.keys()):
+ context_info_num += 1
+
+ if "type" not in ctx[k]:
+ ctx[k]["type"] = default_ctx
+
+ if "register" not in ctx[k]:
+ ctx[k]["register"] = []
+
+ reg_size = len(ctx[k]["register"])
+ size = 0
+
+ if "minimal-size" in ctx[k]:
+ size = ctx[k]["minimal-size"]
+
+ size = max(size, reg_size)
+
+ size = (size + 1) % 0xFFFE
+
+ # Version
+ util.data_add(ctx_data, 0, 2)
+
+ util.data_add(ctx_data, ctx[k]["type"], 2)
+
+ util.data_add(ctx_data, 8 * size, 4)
+
+ for r in ctx[k]["register"]:
+ util.data_add(ctx_data, r, 8)
+
+ for i in range(reg_size, size): # pylint: disable=W0612
+ util.data_add(ctx_data, 0, 8)
+
+ # Vendor-specific bytes are not grouped
+ vendor_data = bytearray()
+ if vendor:
+ for k in sorted(vendor.keys()):
+ for b in vendor[k]["bytes"]:
+ util.data_add(vendor_data, b, 1)
+
+ # Encode ARM Processor Error
+ data = bytearray()
+
+ util.data_add(data, cper["valid"], 4)
+
+ util.data_add(data, error_info_num, 2)
+ util.data_add(data, context_info_num, 2)
+
+ # Calculate the length of the CPER data
+ cper_length = self.ACPI_GHES_ARM_CPER_LENGTH
+ cper_length += len(pei_data)
+ cper_length += len(vendor_data)
+ cper_length += len(ctx_data)
+ util.data_add(data, cper_length, 4)
+
+ util.data_add(data, arg.get("affinity-level", 0), 1)
+
+ # Reserved
+ util.data_add(data, 0, 3)
+
+ if "midr-el1" not in arg:
+ if cpus:
+ cmd_arg = {
+ 'path': cpus[0],
+ 'property': "midr"
+ }
+ ret = qmp_cmd.send_cmd("qom-get", cmd_arg, may_open=True)
+ if isinstance(ret, int):
+ arg["midr-el1"] = ret
+
+ util.data_add(data, arg.get("mpidr-el1", 0), 8)
+ util.data_add(data, arg.get("midr-el1", 0), 8)
+ util.data_add(data, cper["running-state"], 4)
+ util.data_add(data, arg.get("psci-state", 0), 4)
+
+ # Add PEI
+ data.extend(pei_data)
+ data.extend(ctx_data)
+ data.extend(vendor_data)
+
+ self.data = data
+
+ qmp_cmd.send_cper(cper_guid.CPER_PROC_ARM, self.data)
diff --git a/scripts/ghes_inject.py b/scripts/ghes_inject.py
new file mode 100755
index 000000000000..9a235201418b
--- /dev/null
+++ b/scripts/ghes_inject.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python3
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+"""
+Handle ACPI GHESv2 error injection logic via the QEMU QMP interface.
+"""
+
+import argparse
+import sys
+
+from arm_processor_error import ArmProcessorEinj
+
+EINJ_DESC = """
+Handle ACPI GHESv2 error injection logic via the QEMU QMP interface.
+
+It allows using UEFI BIOS EINJ features to generate GHES records.
+
+It helps testing CPER and GHES drivers at the guest OS and how
+userspace applications at the guest handle them.
+"""
+
+def main():
+ """Main program"""
+
+ # Main parser - handle generic args like QEMU QMP TCP socket options
+ parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter,
+ usage="%(prog)s [options]",
+ description=EINJ_DESC)
+
+ g_options = parser.add_argument_group("QEMU QMP socket options")
+ g_options.add_argument("-H", "--host", default="localhost", type=str,
+ help="host name")
+ g_options.add_argument("-P", "--port", default=4445, type=int,
+ help="TCP port number")
+ g_options.add_argument('-d', '--debug', action='store_true')
+
+ subparsers = parser.add_subparsers()
+
+ ArmProcessorEinj(subparsers)
+
+ args = parser.parse_args()
+ if "func" in args:
+ args.func(args)
+ else:
+ sys.exit(f"Please specify a valid command for {sys.argv[0]}")
+
+if __name__ == "__main__":
+ main()
diff --git a/scripts/qmp_helper.py b/scripts/qmp_helper.py
new file mode 100755
index 000000000000..d7e6aabce8fe
--- /dev/null
+++ b/scripts/qmp_helper.py
@@ -0,0 +1,702 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0103,E0213,E1135,E1136,E1137,R0902,R0903,R0912,R0913,R0917
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+"""
+Helper classes to be used by ghes_inject command classes.
+"""
+
+import json
+import sys
+
+from datetime import datetime
+from os import path as os_path
+
+try:
+ qemu_dir = os_path.abspath(os_path.dirname(os_path.dirname(__file__)))
+ sys.path.append(os_path.join(qemu_dir, 'python'))
+
+ from qemu.qmp.legacy import QEMUMonitorProtocol
+
+except ModuleNotFoundError as exc:
+ print(f"Module '{exc.name}' not found.")
+ print("Try export PYTHONPATH=top-qemu-dir/python or run from top-qemu-dir")
+ sys.exit(1)
+
+from base64 import b64encode
+
+class util:
+ """
+ Ancillary functions to deal with bitmaps, parse arguments,
+ generate GUID and encode data on a bytearray buffer.
+ """
+
+ #
+ # Helper routines to handle multiple choice arguments
+ #
+ def get_choice(name, value, choices, suffixes=None, bitmask=True):
+ """Produce a list from multiple choice argument"""
+
+ new_values = 0
+
+ if not value:
+ return new_values
+
+ for val in value.split(","):
+ val = val.lower()
+
+ if suffixes:
+ for suffix in suffixes:
+ val = val.removesuffix(suffix)
+
+ if val not in choices.keys():
+ if suffixes:
+ for suffix in suffixes:
+ if val + suffix in choices.keys():
+ val += suffix
+ break
+
+ if val not in choices.keys():
+ sys.exit(f"Error on '{name}': choice '{val}' is invalid.")
+
+ val = choices[val]
+
+ if bitmask:
+ new_values |= val
+ else:
+ if new_values:
+ sys.exit(f"Error on '{name}': only one value is accepted.")
+
+ new_values = val
+
+ return new_values
+
+ def get_array(name, values, max_val=None):
+ """Add numbered hashes from integer lists into an array"""
+
+ array = []
+
+ for value in values:
+ for val in value.split(","):
+ try:
+ val = int(val, 0)
+ except ValueError:
+ sys.exit(f"Error on '{name}': {val} is not an integer")
+
+ if val < 0:
+ sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+ if max_val and val > max_val:
+ sys.exit(f"Error on '{name}': {val} is too big")
+
+ array.append(val)
+
+ return array
+
+ def get_mult_array(mult, name, values, allow_zero=False, max_val=None):
+ """Add numbered hashes from integer lists"""
+
+ if not allow_zero:
+ if not values:
+ return
+ else:
+ if values is None:
+ return
+
+ if not values:
+ i = 0
+ if i not in mult:
+ mult[i] = {}
+
+ mult[i][name] = []
+ return
+
+ i = 0
+ for value in values:
+ for val in value.split(","):
+ try:
+ val = int(val, 0)
+ except ValueError:
+ sys.exit(f"Error on '{name}': {val} is not an integer")
+
+ if val < 0:
+ sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+ if max_val and val > max_val:
+ sys.exit(f"Error on '{name}': {val} is too big")
+
+ if i not in mult:
+ mult[i] = {}
+
+ if name not in mult[i]:
+ mult[i][name] = []
+
+ mult[i][name].append(val)
+
+ i += 1
+
+
+ def get_mult_choices(mult, name, values, choices,
+ suffixes=None, allow_zero=False):
+ """Add numbered hashes from multiple choice arguments"""
+
+ if not allow_zero:
+ if not values:
+ return
+ else:
+ if values is None:
+ return
+
+ i = 0
+ for val in values:
+ new_values = util.get_choice(name, val, choices, suffixes)
+
+ if i not in mult:
+ mult[i] = {}
+
+ mult[i][name] = new_values
+ i += 1
+
+
+ def get_mult_int(mult, name, values, allow_zero=False):
+ """Add numbered hashes from integer arguments"""
+ if not allow_zero:
+ if not values:
+ return
+ else:
+ if values is None:
+ return
+
+ i = 0
+ for val in values:
+ try:
+ val = int(val, 0)
+ except ValueError:
+ sys.exit(f"Error on '{name}': {val} is not an integer")
+
+ if val < 0:
+ sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+ if i not in mult:
+ mult[i] = {}
+
+ mult[i][name] = val
+ i += 1
+
+
+ #
+ # Data encode helper functions
+ #
+ def bit(b):
+ """Simple macro to define a bit on a bitmask"""
+ return 1 << b
+
+
+ def data_add(data, value, num_bytes):
+ """Adds bytes from value inside a bytearray"""
+
+ data.extend(value.to_bytes(num_bytes, byteorder="little")) # pylint: disable=E1101
+
+ def dump_bytearray(name, data):
+ """Does an hexdump of a byte array, grouping in bytes"""
+
+ print(f"{name} ({len(data)} bytes):")
+
+ for ln_start in range(0, len(data), 16):
+ ln_end = min(ln_start + 16, len(data))
+ print(f" {ln_start:08x} ", end="")
+ for i in range(ln_start, ln_end):
+ print(f"{data[i]:02x} ", end="")
+ for i in range(ln_end, ln_start + 16):
+ print(" ", end="")
+ print(" ", end="")
+ for i in range(ln_start, ln_end):
+ if data[i] >= 32 and data[i] < 127:
+ print(chr(data[i]), end="")
+ else:
+ print(".", end="")
+
+ print()
+ print()
+
+ def time(string):
+ """Handle BCD timestamps used on Generic Error Data Block"""
+
+ time = None
+
+ # Formats to be used when parsing time stamps
+ formats = [
+ "%Y-%m-%d %H:%M:%S",
+ ]
+
+ if string == "now":
+ time = datetime.now()
+
+ if time is None:
+ for fmt in formats:
+ try:
+ time = datetime.strptime(string, fmt)
+ break
+ except ValueError:
+ pass
+
+ if time is None:
+ raise ValueError("Invalid time format")
+
+ return time
+
+class guid:
+ """
+ Simple class to handle GUID fields.
+ """
+
+ def __init__(self, time_low, time_mid, time_high, nodes):
+ """Initialize a GUID value"""
+
+ assert len(nodes) == 8
+
+ self.time_low = time_low
+ self.time_mid = time_mid
+ self.time_high = time_high
+ self.nodes = nodes
+
+ @classmethod
+ def UUID(cls, guid_str):
+ """Initialize a GUID using a string on its standard format"""
+
+ if len(guid_str) != 36:
+ print("Size not 36")
+ raise ValueError('Invalid GUID size')
+
+ # It is easier to parse without separators. So, drop them
+ guid_str = guid_str.replace('-', '')
+
+ if len(guid_str) != 32:
+ print("Size not 32", guid_str, len(guid_str))
+ raise ValueError('Invalid GUID hex size')
+
+ time_low = 0
+ time_mid = 0
+ time_high = 0
+ nodes = []
+
+ for i in reversed(range(16, 32, 2)):
+ h = guid_str[i:i + 2]
+ value = int(h, 16)
+ nodes.insert(0, value)
+
+ time_high = int(guid_str[12:16], 16)
+ time_mid = int(guid_str[8:12], 16)
+ time_low = int(guid_str[0:8], 16)
+
+ return cls(time_low, time_mid, time_high, nodes)
+
+ def __str__(self):
+ """Output a GUID value on its default string representation"""
+
+ clock = self.nodes[0] << 8 | self.nodes[1]
+
+ node = 0
+ for i in range(2, len(self.nodes)):
+ node = node << 8 | self.nodes[i]
+
+ s = f"{self.time_low:08x}-{self.time_mid:04x}-"
+ s += f"{self.time_high:04x}-{clock:04x}-{node:012x}"
+ return s
+
+ def to_bytes(self):
+ """Output a GUID value in bytes"""
+
+ data = bytearray()
+
+ util.data_add(data, self.time_low, 4)
+ util.data_add(data, self.time_mid, 2)
+ util.data_add(data, self.time_high, 2)
+ data.extend(bytearray(self.nodes))
+
+ return data
+
+class qmp:
+ """
+ Opens a connection and send/receive QMP commands.
+ """
+
+ def send_cmd(self, command, args=None, may_open=False, return_error=True):
+ """Send a command to QMP, optionally opening a connection"""
+
+ if may_open:
+ self._connect()
+ elif not self.connected:
+ return False
+
+ msg = { 'execute': command }
+ if args:
+ msg['arguments'] = args
+
+ try:
+ obj = self.qmp_monitor.cmd_obj(msg)
+ # Can we use some other exception class here?
+ except Exception as e: # pylint: disable=W0718
+ print(f"Command: {command}")
+ print(f"Failed to inject error: {e}.")
+ return None
+
+ if "return" in obj:
+ if isinstance(obj.get("return"), dict):
+ if obj["return"]:
+ return obj["return"]
+ return "OK"
+
+ return obj["return"]
+
+ if isinstance(obj.get("error"), dict):
+ error = obj["error"]
+ if return_error:
+ print(f"Command: {msg}")
+ print(f'{error["class"]}: {error["desc"]}')
+ else:
+ print(json.dumps(obj))
+
+ return None
+
+ def _close(self):
+ """Shutdown and close the socket, if opened"""
+ if not self.connected:
+ return
+
+ self.qmp_monitor.close()
+ self.connected = False
+
+ def _connect(self):
+ """Connect to a QMP TCP/IP port, if not connected yet"""
+
+ if self.connected:
+ return True
+
+ try:
+ self.qmp_monitor.connect(negotiate=True)
+ except ConnectionError:
+ sys.exit(f"Can't connect to QMP host {self.host}:{self.port}")
+
+ self.connected = True
+
+ return True
+
+ BLOCK_STATUS_BITS = {
+ "uncorrectable": util.bit(0),
+ "correctable": util.bit(1),
+ "multi-uncorrectable": util.bit(2),
+ "multi-correctable": util.bit(3),
+ }
+
+ ERROR_SEVERITY = {
+ "recoverable": 0,
+ "fatal": 1,
+ "corrected": 2,
+ "none": 3,
+ }
+
+ VALIDATION_BITS = {
+ "fru-id": util.bit(0),
+ "fru-text": util.bit(1),
+ "timestamp": util.bit(2),
+ }
+
+ GEDB_FLAGS_BITS = {
+ "recovered": util.bit(0),
+ "prev-error": util.bit(1),
+ "simulated": util.bit(2),
+ }
+
+ GENERIC_DATA_SIZE = 72
+
+ def argparse(parser):
+ """Prepare a parser group to query generic error data"""
+
+ block_status_bits = ",".join(qmp.BLOCK_STATUS_BITS.keys())
+ error_severity_enum = ",".join(qmp.ERROR_SEVERITY.keys())
+ validation_bits = ",".join(qmp.VALIDATION_BITS.keys())
+ gedb_flags_bits = ",".join(qmp.GEDB_FLAGS_BITS.keys())
+
+ g_gen = parser.add_argument_group("Generic Error Data") # pylint: disable=E1101
+ g_gen.add_argument("--block-status",
+ help=f"block status bits: {block_status_bits}")
+ g_gen.add_argument("--raw-data", nargs="+",
+ help="Raw data inside the Error Status Block")
+ g_gen.add_argument("--error-severity", "--severity",
+ help=f"error severity: {error_severity_enum}")
+ g_gen.add_argument("--gen-err-valid-bits",
+ "--generic-error-validation-bits",
+ help=f"validation bits: {validation_bits}")
+ g_gen.add_argument("--fru-id", type=guid.UUID,
+ help="GUID representing a physical device")
+ g_gen.add_argument("--fru-text",
+ help="ASCII string identifying the FRU hardware")
+ g_gen.add_argument("--timestamp", type=util.time,
+ help="Time when the error info was collected")
+ g_gen.add_argument("--precise", "--precise-timestamp",
+ action='store_true',
+ help="Marks the timestamp as precise if --timestamp is used")
+ g_gen.add_argument("--gedb-flags",
+ help=f"General Error Data Block flags: {gedb_flags_bits}")
+
+ def set_args(self, args):
+ """Set the arguments optionally defined via self.argparse()"""
+
+ if args.block_status:
+ self.block_status = util.get_choice(name="block-status",
+ value=args.block_status,
+ choices=self.BLOCK_STATUS_BITS,
+ bitmask=False)
+ if args.raw_data:
+ self.raw_data = util.get_array("raw-data", args.raw_data,
+ max_val=255)
+ print(self.raw_data)
+
+ if args.error_severity:
+ self.error_severity = util.get_choice(name="error-severity",
+ value=args.error_severity,
+ choices=self.ERROR_SEVERITY,
+ bitmask=False)
+
+ if args.fru_id:
+ self.fru_id = args.fru_id.to_bytes()
+ if not args.gen_err_valid_bits:
+ self.validation_bits |= self.VALIDATION_BITS["fru-id"]
+
+ if args.fru_text:
+ text = bytearray(args.fru_text.encode('ascii'))
+ if len(text) > 20:
+ sys.exit("FRU text is too big to fit")
+
+ self.fru_text = text
+ if not args.gen_err_valid_bits:
+ self.validation_bits |= self.VALIDATION_BITS["fru-text"]
+
+ if args.timestamp:
+ time = args.timestamp
+ century = int(time.year / 100)
+
+ bcd = bytearray()
+ util.data_add(bcd, (time.second // 10) << 4 | (time.second % 10), 1)
+ util.data_add(bcd, (time.minute // 10) << 4 | (time.minute % 10), 1)
+ util.data_add(bcd, (time.hour // 10) << 4 | (time.hour % 10), 1)
+
+ if args.precise:
+ util.data_add(bcd, 1, 1)
+ else:
+ util.data_add(bcd, 0, 1)
+
+ util.data_add(bcd, (time.day // 10) << 4 | (time.day % 10), 1)
+ util.data_add(bcd, (time.month // 10) << 4 | (time.month % 10), 1)
+ util.data_add(bcd,
+ ((time.year % 100) // 10) << 4 | (time.year % 10), 1)
+ util.data_add(bcd, ((century % 100) // 10) << 4 | (century % 10), 1)
+
+ self.timestamp = bcd
+ if not args.gen_err_valid_bits:
+ self.validation_bits |= self.VALIDATION_BITS["timestamp"]
+
+ if args.gen_err_valid_bits:
+ self.validation_bits = util.get_choice(name="validation",
+ value=args.gen_err_valid_bits,
+ choices=self.VALIDATION_BITS)
+
+ def __init__(self, host, port, debug=False):
+ """Initialize variables used by the QMP send logic"""
+
+ self.connected = False
+ self.host = host
+ self.port = port
+ self.debug = debug
+
+ # ACPI 6.1: 18.3.2.7.1 Generic Error Data: Generic Error Status Block
+ self.block_status = self.BLOCK_STATUS_BITS["uncorrectable"]
+ self.raw_data = []
+ self.error_severity = self.ERROR_SEVERITY["recoverable"]
+
+ # ACPI 6.1: 18.3.2.7.1 Generic Error Data: Generic Error Data Entry
+ self.validation_bits = 0
+ self.flags = 0
+ self.fru_id = bytearray(16)
+ self.fru_text = bytearray(20)
+ self.timestamp = bytearray(8)
+
+ self.qmp_monitor = QEMUMonitorProtocol(address=(self.host, self.port))
+
+ #
+ # Socket QMP send command
+ #
+ def send_cper_raw(self, cper_data):
+ """Send raw CPER data to QEMU through QMP TCP socket"""
+
+ data = b64encode(bytes(cper_data)).decode('ascii')
+
+ cmd_arg = {
+ 'cper': data
+ }
+
+ self._connect()
+
+ if self.send_cmd("inject-ghes-v2-error", cmd_arg):
+ print("Error injected.")
+
+ def send_cper(self, notif_type, payload):
+ """Send commands to QEMU through QMP TCP socket"""
+
+ # Fill CPER record header
+
+ # NOTE: bits 4 to 13 of block status contain the number of
+ # data entries in the data section. This is currently unsupported.
+
+ cper_length = len(payload)
+ data_length = cper_length + len(self.raw_data) + self.GENERIC_DATA_SIZE
+
+ # Generic Error Data Entry
+ gede = bytearray()
+
+ gede.extend(notif_type.to_bytes())
+ util.data_add(gede, self.error_severity, 4)
+ util.data_add(gede, 0x300, 2)
+ util.data_add(gede, self.validation_bits, 1)
+ util.data_add(gede, self.flags, 1)
+ util.data_add(gede, cper_length, 4)
+ gede.extend(self.fru_id)
+ gede.extend(self.fru_text)
+ gede.extend(self.timestamp)
+
+ # Generic Error Status Block
+ gebs = bytearray()
+
+ if self.raw_data:
+ raw_data_offset = len(gebs)
+ else:
+ raw_data_offset = 0
+
+ util.data_add(gebs, self.block_status, 4)
+ util.data_add(gebs, raw_data_offset, 4)
+ util.data_add(gebs, len(self.raw_data), 4)
+ util.data_add(gebs, data_length, 4)
+ util.data_add(gebs, self.error_severity, 4)
+
+ cper_data = bytearray()
+ cper_data.extend(gebs)
+ cper_data.extend(gede)
+ cper_data.extend(bytearray(self.raw_data))
+ cper_data.extend(bytearray(payload))
+
+ if self.debug:
+ print(f"GUID: {notif_type}")
+
+ util.dump_bytearray("Generic Error Status Block", gebs)
+ util.dump_bytearray("Generic Error Data Entry", gede)
+
+ if self.raw_data:
+ util.dump_bytearray("Raw data", bytearray(self.raw_data))
+
+ util.dump_bytearray("Payload", payload)
+
+ self.send_cper_raw(cper_data)
+
+
+ def search_qom(self, path, prop, regex):
+ """
+ Return a list of devices that match path array like:
+
+ /machine/unattached/device
+ /machine/peripheral-anon/device
+ ...
+ """
+
+ found = []
+
+ i = 0
+ while 1:
+ dev = f"{path}[{i}]"
+ args = {
+ 'path': dev,
+ 'property': prop
+ }
+ ret = self.send_cmd("qom-get", args, may_open=True, return_error=False)
+ if not ret:
+ break
+
+ if isinstance(ret, str):
+ if regex.search(ret):
+ found.append(dev)
+
+ i += 1
+ if i > 10000:
+ print("Too many objects returned by qom-get!")
+ break
+
+ return found
+
+class cper_guid:
+ """
+ Contains CPER GUID, as per:
+ https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html
+ """
+
+ CPER_PROC_GENERIC = guid(0x9876CCAD, 0x47B4, 0x4bdb,
+ [0xB6, 0x5E, 0x16, 0xF1,
+ 0x93, 0xC4, 0xF3, 0xDB])
+
+ CPER_PROC_X86 = guid(0xDC3EA0B0, 0xA144, 0x4797,
+ [0xB9, 0x5B, 0x53, 0xFA,
+ 0x24, 0x2B, 0x6E, 0x1D])
+
+ CPER_PROC_ITANIUM = guid(0xe429faf1, 0x3cb7, 0x11d4,
+ [0xbc, 0xa7, 0x00, 0x80,
+ 0xc7, 0x3c, 0x88, 0x81])
+
+ CPER_PROC_ARM = guid(0xE19E3D16, 0xBC11, 0x11E4,
+ [0x9C, 0xAA, 0xC2, 0x05,
+ 0x1D, 0x5D, 0x46, 0xB0])
+
+ CPER_PLATFORM_MEM = guid(0xA5BC1114, 0x6F64, 0x4EDE,
+ [0xB8, 0x63, 0x3E, 0x83,
+ 0xED, 0x7C, 0x83, 0xB1])
+
+ CPER_PLATFORM_MEM2 = guid(0x61EC04FC, 0x48E6, 0xD813,
+ [0x25, 0xC9, 0x8D, 0xAA,
+ 0x44, 0x75, 0x0B, 0x12])
+
+ CPER_PCIE = guid(0xD995E954, 0xBBC1, 0x430F,
+ [0xAD, 0x91, 0xB4, 0x4D,
+ 0xCB, 0x3C, 0x6F, 0x35])
+
+ CPER_PCI_BUS = guid(0xC5753963, 0x3B84, 0x4095,
+ [0xBF, 0x78, 0xED, 0xDA,
+ 0xD3, 0xF9, 0xC9, 0xDD])
+
+ CPER_PCI_DEV = guid(0xEB5E4685, 0xCA66, 0x4769,
+ [0xB6, 0xA2, 0x26, 0x06,
+ 0x8B, 0x00, 0x13, 0x26])
+
+ CPER_FW_ERROR = guid(0x81212A96, 0x09ED, 0x4996,
+ [0x94, 0x71, 0x8D, 0x72,
+ 0x9C, 0x8E, 0x69, 0xED])
+
+ CPER_DMA_GENERIC = guid(0x5B51FEF7, 0xC79D, 0x4434,
+ [0x8F, 0x1B, 0xAA, 0x62,
+ 0xDE, 0x3E, 0x2C, 0x64])
+
+ CPER_DMA_VT = guid(0x71761D37, 0x32B2, 0x45cd,
+ [0xA7, 0xD0, 0xB0, 0xFE,
+ 0xDD, 0x93, 0xE8, 0xCF])
+
+ CPER_DMA_IOMMU = guid(0x036F84E1, 0x7F37, 0x428c,
+ [0xA7, 0x9E, 0x57, 0x5F,
+ 0xDF, 0xAA, 0x84, 0xEC])
+
+ CPER_CCIX_PER = guid(0x91335EF6, 0xEBFB, 0x4478,
+ [0xA6, 0xA6, 0x88, 0xB7,
+ 0x28, 0xCF, 0x75, 0xD7])
+
+ CPER_CXL_PROT_ERR = guid(0x80B9EFB4, 0x52B5, 0x4DE3,
+ [0xA7, 0x77, 0x68, 0x78,
+ 0x4B, 0x77, 0x10, 0x48])
--
2.48.1
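For readers following the encoding done by send_cper(), guid.to_bytes() and the --timestamp handling in the patch above, the byte layout can be reproduced outside the script. What follows is only a minimal sketch, assuming no raw data is attached to the status block; the helper names (bcd, guid_bytes, bcd_timestamp, build_gede, build_gesb) are hypothetical and are not part of qmp_helper.py:

```python
import struct
from datetime import datetime


def bcd(value):
    """Encode an integer in the range 0..99 as one packed-BCD byte."""
    return (value // 10) << 4 | (value % 10)


def guid_bytes(time_low, time_mid, time_high, nodes):
    """Serialize a GUID: first three fields little-endian, last 8 bytes as-is."""
    assert len(nodes) == 8
    return struct.pack("<IHH", time_low, time_mid, time_high) + bytes(nodes)


def bcd_timestamp(when, precise=False):
    """8-byte timestamp: sec, min, hour, precise flag, day, month, year, century."""
    return bytes([
        bcd(when.second), bcd(when.minute), bcd(when.hour),
        1 if precise else 0,
        bcd(when.day), bcd(when.month),
        bcd(when.year % 100), bcd(when.year // 100),
    ])


def build_gede(section_type, severity, payload_len, validation_bits=0,
               flags=0, fru_id=bytes(16), fru_text=bytes(20),
               timestamp=bytes(8)):
    """Generic Error Data Entry header, using revision 0x300 as the script does."""
    return (section_type
            + struct.pack("<IHBBI", severity, 0x300,
                          validation_bits, flags, payload_len)
            + fru_id + fru_text + timestamp)


def build_gesb(block_status, severity, data_length):
    """Generic Error Status Block with no raw data (offset and length are 0)."""
    return struct.pack("<IIIII", block_status, 0, 0, data_length, severity)


# Section type GUID for an ARM Processor Error, same values as CPER_PROC_ARM.
arm = guid_bytes(0xE19E3D16, 0xBC11, 0x11E4,
                 [0x9C, 0xAA, 0xC2, 0x05, 0x1D, 0x5D, 0x46, 0xB0])
payload = bytes(4)                      # placeholder payload, not a real record
gede = build_gede(arm, 0, len(payload),
                  timestamp=bcd_timestamp(datetime(2025, 2, 21, 14, 35, 9)))
gesb = build_gesb(1, 0, len(gede) + len(payload))
cper = gesb + gede + payload
```

With these helpers the entry header comes out at 72 bytes, matching GENERIC_DATA_SIZE, and the status block at 20 bytes.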
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (13 preceding siblings ...)
2025-02-21 14:35 ` [PATCH v4 14/14] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
@ 2025-02-26 14:16 ` Igor Mammedov
2025-02-26 14:39 ` Mauro Carvalho Chehab
2025-02-27 9:54 ` Igor Mammedov
15 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:16 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Fri, 21 Feb 2025 15:35:09 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Now that the ghes preparation patches were merged, let's add support
> for error injection.
>
> In this series, the first 6 patches change the math used to calculate offsets in the HEST
> table and hardware_error firmware file, together with its migration code. Migration tested
> with both latest QEMU released kernel and upstream, in both directions.
>
> The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> to inject ARM Processor Error records.
Please run ./scripts/checkpatch on the patches before submitting them.
As it stands now, the series cannot be merged due to failing checkpatch.
>
> ---
> v4:
> - added an extra comment for AcpiGhesState structure;
> - patches reordered;
> - no functional changes, just code shift between the patches in this series.
>
> v3:
> - addressed more nits;
> - hest_add_le now points to the beginning of HEST table;
> - removed HEST from tests/data/acpi;
> - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
>
> v2:
> - address some nits;
> - improved ags cleanup patch and removed ags.present field;
> - added some missing le*_to_cpu() calls;
> - update date at copyright for new files to 2024-2025;
> - qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
> - added HEST and DSDT tables after the changes to make check target happy.
> (two patches: first one whitelisting such tables; second one removing from
> whitelist and updating/adding such tables to tests/data/acpi)
>
>
>
> Mauro Carvalho Chehab (14):
> acpi/ghes: prepare to change the way HEST offsets are calculated
> acpi/ghes: add a firmware file with HEST address
> acpi/ghes: Use HEST table offsets when preparing GHES records
> acpi/ghes: don't hard-code the number of sources for HEST table
> acpi/ghes: add a notifier to notify when error data is ready
> acpi/ghes: create an ancillary acpi_ghes_get_state() function
> acpi/generic_event_device: Update GHES migration to cover hest addr
> acpi/generic_event_device: add logic to detect if HEST addr is
> available
> acpi/generic_event_device: add an APEI error device
> tests/acpi: virt: allow acpi table changes for a new table: HEST
> arm/virt: Wire up a GED error device for ACPI / GHES
> tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> qapi/acpi-hest: add an interface to do generic CPER error injection
> scripts/ghes_inject: add a script to generate GHES error inject
>
> MAINTAINERS | 10 +
> hw/acpi/Kconfig | 5 +
> hw/acpi/aml-build.c | 10 +
> hw/acpi/generic_event_device.c | 43 ++
> hw/acpi/ghes-stub.c | 7 +-
> hw/acpi/ghes.c | 231 ++++--
> hw/acpi/ghes_cper.c | 38 +
> hw/acpi/ghes_cper_stub.c | 19 +
> hw/acpi/meson.build | 2 +
> hw/arm/virt-acpi-build.c | 37 +-
> hw/arm/virt.c | 19 +-
> hw/core/machine.c | 2 +
> include/hw/acpi/acpi_dev_interface.h | 1 +
> include/hw/acpi/aml-build.h | 2 +
> include/hw/acpi/generic_event_device.h | 1 +
> include/hw/acpi/ghes.h | 54 +-
> include/hw/arm/virt.h | 2 +
> qapi/acpi-hest.json | 35 +
> qapi/meson.build | 1 +
> qapi/qapi-schema.json | 1 +
> scripts/arm_processor_error.py | 476 ++++++++++++
> scripts/ghes_inject.py | 51 ++
> scripts/qmp_helper.py | 702 ++++++++++++++++++
> target/arm/kvm.c | 7 +-
> tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> 29 files changed, 1677 insertions(+), 79 deletions(-)
> create mode 100644 hw/acpi/ghes_cper.c
> create mode 100644 hw/acpi/ghes_cper_stub.c
> create mode 100644 qapi/acpi-hest.json
> create mode 100644 scripts/arm_processor_error.py
> create mode 100755 scripts/ghes_inject.py
> create mode 100755 scripts/qmp_helper.py
>
* Re: [PATCH v4 01/14] acpi/ghes: prepare to change the way HEST offsets are calculated
2025-02-21 14:35 ` [PATCH v4 01/14] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
@ 2025-02-26 14:37 ` Igor Mammedov
2025-02-27 11:45 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:37 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
linux-kernel
On Fri, 21 Feb 2025 15:35:10 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Add a new ags flag to change the way HEST offsets are calculated.
> Currently, the offsets needed to store ACPI HEST offsets and read ack
> are calculated based on prior knowledge of the logic
> which creates the HEST table.
>
> Such logic is not generic: it neither allows easily adding more HEST
> entries nor replicates what OSPM does.
>
> As the next patches will be adding a more generic logic, add a
> new use_hest_addr flag, set to false, in preparation for such changes.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> hw/acpi/ghes.c | 46 ++++++++++++++++++++++++----------------
> hw/arm/virt-acpi-build.c | 15 ++++++++++---
> include/hw/acpi/ghes.h | 14 ++++++++++--
> 3 files changed, 52 insertions(+), 23 deletions(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index b709c177cdea..e49a03fdb94e 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -206,7 +206,8 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
> * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
> * See docs/specs/acpi_hest_ghes.rst for blobs format.
> */
> -static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
> +static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
> + BIOSLinker *linker)
> {
> int i, error_status_block_offset;
>
> @@ -251,13 +252,15 @@ static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
> i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
> }
>
> - /*
> - * tell firmware to write hardware_errors GPA into
> - * hardware_errors_addr fw_cfg, once the former has been initialized.
> - */
> - bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, 0,
> - sizeof(uint64_t),
> - ACPI_HW_ERROR_FW_CFG_FILE, 0);
> + if (!ags->use_hest_addr) {
> + /*
> + * Tell firmware to write hardware_errors GPA into
> + * hardware_errors_addr fw_cfg, once the former has been initialized.
> + */
> + bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE,
> + 0, sizeof(uint64_t),
> + ACPI_HW_ERROR_FW_CFG_FILE, 0);
> + }
> }
>
> /* Build Generic Hardware Error Source version 2 (GHESv2) */
> @@ -331,14 +334,15 @@ static void build_ghes_v2(GArray *table_data,
> }
>
> /* Build Hardware Error Source Table */
> -void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
> +void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> + GArray *hardware_errors,
> BIOSLinker *linker,
> const char *oem_id, const char *oem_table_id)
> {
> AcpiTable table = { .sig = "HEST", .rev = 1,
> .oem_id = oem_id, .oem_table_id = oem_table_id };
>
> - build_ghes_error_table(hardware_errors, linker);
> + build_ghes_error_table(ags, hardware_errors, linker);
>
> acpi_table_begin(&table, table_data);
>
> @@ -357,11 +361,11 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
> hardware_error->len);
>
> - /* Create a read-write fw_cfg file for Address */
> - fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> - NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
> -
> - ags->present = true;
> + if (!ags->use_hest_addr) {
> + /* Create a read-write fw_cfg file for Address */
> + fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> + NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
> + }
> }
>
> static void get_hw_error_offsets(uint64_t ghes_addr,
> @@ -411,8 +415,11 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> ags = &acpi_ged_state->ghes_state;
>
> assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
> - get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> - &cper_addr, &read_ack_register_addr);
> +
> + if (!ags->use_hest_addr) {
> + get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> + &cper_addr, &read_ack_register_addr);
> + }
>
> if (!cper_addr) {
> error_setg(errp, "can not find Generic Error Status Block");
> @@ -494,5 +501,8 @@ bool acpi_ghes_present(void)
> return false;
> }
> ags = &acpi_ged_state->ghes_state;
> - return ags->present;
> + if (!ags->hw_error_le)
> + return false;
> +
> + return true;
> }
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 3ac8f8e17861..8ab8d11b6536 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -946,9 +946,18 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> build_dbg2(tables_blob, tables->linker, vms);
>
> if (vms->ras) {
> - acpi_add_table(table_offsets, tables_blob);
> - acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
> - vms->oem_id, vms->oem_table_id);
> + AcpiGedState *acpi_ged_state;
> + AcpiGhesState *ags;
> +
> + acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
^^^ will explode if object_resolve_path_type() returns NULL
> + NULL));
it's also expensive load-wise.
You have access to vms with ged pointer here, use that
(search for 'acpi_ged_state = ACPI_GED' example)
> + if (acpi_ged_state) {
hence, this check is not really needed:
we must have a GED at this point or abort.
Earlier code that instantiates the GED should take care of
exiting cleanly if it failed to create it, so we would never get
to a missing GED here.
> + ags = &acpi_ged_state->ghes_state;
> +
> + acpi_add_table(table_offsets, tables_blob);
> + acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> + tables->linker, vms->oem_id, vms->oem_table_id);
> + }
> }
>
> if (ms->numa_state->num_nodes > 0) {
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 39619a2457cb..a3d62b96584f 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -64,12 +64,22 @@ enum {
> ACPI_GHES_ERROR_SOURCE_COUNT
> };
>
> +/*
> + * AcpiGhesState stores an offset that will be used to fill HEST entries.
> + *
> + * When use_hest_addr is false, the stored offset is placed at hw_error_le,
> + * meaning an offset from the etc/hardware_errors firmware address. This
> + * is the default on QEMU 9.x.
> + *
> + * An offset value equal to zero means that GHES is not present.
> + */
> typedef struct AcpiGhesState {
> uint64_t hw_error_le;
> - bool present; /* True if GHES is present at all on this board */
> + bool use_hest_addr; /* Currently, always false */
> } AcpiGhesState;
>
> -void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
> +void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> + GArray *hardware_errors,
> BIOSLinker *linker,
> const char *oem_id, const char *oem_table_id);
> void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for " Igor Mammedov
@ 2025-02-26 14:39 ` Mauro Carvalho Chehab
2025-02-26 14:51 ` Igor Mammedov
0 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-26 14:39 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Wed, 26 Feb 2025 15:16:56 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Fri, 21 Feb 2025 15:35:09 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Now that the ghes preparation patches were merged, let's add support
> > for error injection.
> >
> > In this series, the first 6 patches change the math used to calculate offsets in the HEST
> > table and hardware_error firmware file, together with its migration code. Migration tested
> > with both latest QEMU released kernel and upstream, in both directions.
> >
> > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> > to inject ARM Processor Error records.
>
> Please run ./scripts/checkpatch on the patches before submitting them.
> As it stands now, the series cannot be merged due to failing checkpatch.
Weird... checkpatch is in my pre-commit hook, as recommended in the QEMU
documentation. It is actually a little harder to manage this way, as it
sometimes causes trouble with binary files.
Anyway, I'll run it by hand before sending the next version.
>
> >
> > ---
> > v4:
> > - added an extra comment for AcpiGhesState structure;
> > - patches reordered;
> > - no functional changes, just code shift between the patches in this series.
> >
> > v3:
> > - addressed more nits;
> > - hest_add_le now points to the beginning of HEST table;
> > - removed HEST from tests/data/acpi;
> > - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> >
> > v2:
> > - address some nits;
> > - improved ags cleanup patch and removed ags.present field;
> > - added some missing le*_to_cpu() calls;
> > - update date at copyright for new files to 2024-2025;
> > - qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
> > - added HEST and DSDT tables after the changes to make check target happy.
> > (two patches: first one whitelisting such tables; second one removing from
> > whitelist and updating/adding such tables to tests/data/acpi)
> >
> >
> >
> > Mauro Carvalho Chehab (14):
> > acpi/ghes: prepare to change the way HEST offsets are calculated
> > acpi/ghes: add a firmware file with HEST address
> > acpi/ghes: Use HEST table offsets when preparing GHES records
> > acpi/ghes: don't hard-code the number of sources for HEST table
> > acpi/ghes: add a notifier to notify when error data is ready
> > acpi/ghes: create an ancillary acpi_ghes_get_state() function
> > acpi/generic_event_device: Update GHES migration to cover hest addr
> > acpi/generic_event_device: add logic to detect if HEST addr is
> > available
> > acpi/generic_event_device: add an APEI error device
> > tests/acpi: virt: allow acpi table changes for a new table: HEST
> > arm/virt: Wire up a GED error device for ACPI / GHES
> > tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> > qapi/acpi-hest: add an interface to do generic CPER error injection
> > scripts/ghes_inject: add a script to generate GHES error inject
> >
> > MAINTAINERS | 10 +
> > hw/acpi/Kconfig | 5 +
> > hw/acpi/aml-build.c | 10 +
> > hw/acpi/generic_event_device.c | 43 ++
> > hw/acpi/ghes-stub.c | 7 +-
> > hw/acpi/ghes.c | 231 ++++--
> > hw/acpi/ghes_cper.c | 38 +
> > hw/acpi/ghes_cper_stub.c | 19 +
> > hw/acpi/meson.build | 2 +
> > hw/arm/virt-acpi-build.c | 37 +-
> > hw/arm/virt.c | 19 +-
> > hw/core/machine.c | 2 +
> > include/hw/acpi/acpi_dev_interface.h | 1 +
> > include/hw/acpi/aml-build.h | 2 +
> > include/hw/acpi/generic_event_device.h | 1 +
> > include/hw/acpi/ghes.h | 54 +-
> > include/hw/arm/virt.h | 2 +
> > qapi/acpi-hest.json | 35 +
> > qapi/meson.build | 1 +
> > qapi/qapi-schema.json | 1 +
> > scripts/arm_processor_error.py | 476 ++++++++++++
> > scripts/ghes_inject.py | 51 ++
> > scripts/qmp_helper.py | 702 ++++++++++++++++++
> > target/arm/kvm.c | 7 +-
> > tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> > .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> > tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> > tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> > tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> > 29 files changed, 1677 insertions(+), 79 deletions(-)
> > create mode 100644 hw/acpi/ghes_cper.c
> > create mode 100644 hw/acpi/ghes_cper_stub.c
> > create mode 100644 qapi/acpi-hest.json
> > create mode 100644 scripts/arm_processor_error.py
> > create mode 100755 scripts/ghes_inject.py
> > create mode 100755 scripts/qmp_helper.py
> >
>
* Re: [PATCH v4 02/14] acpi/ghes: add a firmware file with HEST address
2025-02-21 14:35 ` [PATCH v4 02/14] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
@ 2025-02-26 14:48 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:48 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, linux-kernel
On Fri, 21 Feb 2025 15:35:11 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Store the GPA of the HEST table, placing the address of the start of the
> table in the hest_addr_le variable.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes.c | 22 ++++++++++++++++++++--
> include/hw/acpi/ghes.h | 7 ++++++-
> 2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index e49a03fdb94e..ba37be9e7022 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -30,6 +30,7 @@
>
> #define ACPI_HW_ERROR_FW_CFG_FILE "etc/hardware_errors"
> #define ACPI_HW_ERROR_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
> +#define ACPI_HEST_ADDR_FW_CFG_FILE "etc/acpi_table_hest_addr"
>
> /* The max size in bytes for one error block */
> #define ACPI_GHES_MAX_RAW_DATA_LENGTH (1 * KiB)
> @@ -341,6 +342,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> {
> AcpiTable table = { .sig = "HEST", .rev = 1,
> .oem_id = oem_id, .oem_table_id = oem_table_id };
> + uint32_t hest_offset;
> +
> + hest_offset = table_data->len;
>
> build_ghes_error_table(ags, hardware_errors, linker);
>
> @@ -352,6 +356,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
>
> acpi_table_end(linker, &table);
> +
> + if (ags->use_hest_addr) {
> + /*
> + * Tell firmware to write into GPA the address of HEST via fw_cfg,
> + * once initialized.
> + */
> + bios_linker_loader_write_pointer(linker,
> + ACPI_HEST_ADDR_FW_CFG_FILE, 0,
> + sizeof(uint64_t),
> + ACPI_BUILD_TABLE_FILE, hest_offset);
> + }
> }
>
> void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> @@ -361,7 +376,10 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
> hardware_error->len);
>
> - if (!ags->use_hest_addr) {
> + if (ags->use_hest_addr) {
> + fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
> + NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
> + } else {
> /* Create a read-write fw_cfg file for Address */
> fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
> @@ -501,7 +519,7 @@ bool acpi_ghes_present(void)
> return false;
> }
> ags = &acpi_ged_state->ghes_state;
> - if (!ags->hw_error_le)
> + if (!ags->hw_error_le && !ags->hest_addr_le)
> return false;
>
> return true;
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index a3d62b96584f..454e97b5341c 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -71,9 +71,14 @@ enum {
> * meaning an offset from the etc/hardware_errors firmware address. This
> * is the default on QEMU 9.x.
> *
> - * An offset value equal to zero means that GHES is not present.
> + * When use_hest_addr is true, the stored offset is placed at hest_addr_le,
> + * meaning an offset from theHEST table address from etc/acpi/tables firmware.
^^^^^^ missing whitespace
The 'offset' language is confusing here; it raises the question: offset from what?
What is kept in hest_addr_le is the GPA of the HEST table, so it would be better to
fix the wording here.
The same applies to the similar comment in the previous patch.
> + * This is the default for QEMU 10.x and above.
> + *
> + * If both offset values are equal to zero, it means that GHES is not present
> */
> typedef struct AcpiGhesState {
> + uint64_t hest_addr_le;
> uint64_t hw_error_le;
> bool use_hest_addr; /* Currently, always false */
> } AcpiGhesState;
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-26 14:39 ` Mauro Carvalho Chehab
@ 2025-02-26 14:51 ` Igor Mammedov
2025-02-26 16:00 ` Igor Mammedov
0 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:51 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Wed, 26 Feb 2025 15:39:13 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Em Wed, 26 Feb 2025 15:16:56 +0100
> Igor Mammedov <imammedo@redhat.com> escreveu:
>
> > On Fri, 21 Feb 2025 15:35:09 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
> > > Now that the ghes preparation patches were merged, let's add support
> > > for error injection.
> > >
> > > On this series, the first 6 patches chang to the math used to calculate offsets at HEST
> > > table and hardware_error firmware file, together with its migration code. Migration tested
> > > with both latest QEMU released kernel and upstream, on both directions.
> > >
> > > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> > > to inject ARM Processor Error records.
> >
> > please, run ./scripts/checkpatch on patches before submitting them.
> > as it stands now series cannot be merged due to failing checkpatch
>
> Weird... checkpatch is at pre-commit hook, as recommended at QEMU
> documentation. It is actually a little harder to manage this way, as it
> sometimes cause troubles with binary files.
>
> Anyway, I'll run it by hand before sending the next version.
I've just applied v4 => format-patch => checkpatch
Maybe I did something wrong (I don't see how), but it complains over here.
PS: do not respin until I've finished this review.
> >
> > >
> > > ---
> > > v4:
> > > - added an extra comment for AcpiGhesState structure;
> > > - patches reordered;
> > > - no functional changes, just code shift between the patches in this series.
> > >
> > > v3:
> > > - addressed more nits;
> > > - hest_add_le now points to the beginning of HEST table;
> > > - removed HEST from tests/data/acpi;
> > > - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> > >
> > > v2:
> > > - address some nits;
> > > - improved ags cleanup patch and removed ags.present field;
> > > - added some missing le*_to_cpu() calls;
> > > - update date at copyright for new files to 2024-2025;
> > > - qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
> > > - added HEST and DSDT tables after the changes to make check target happy.
> > > (two patches: first one whitelisting such tables; second one removing from
> > > whitelist and updating/adding such tables to tests/data/acpi)
> > >
> > >
> > >
> > > Mauro Carvalho Chehab (14):
> > > acpi/ghes: prepare to change the way HEST offsets are calculated
> > > acpi/ghes: add a firmware file with HEST address
> > > acpi/ghes: Use HEST table offsets when preparing GHES records
> > > acpi/ghes: don't hard-code the number of sources for HEST table
> > > acpi/ghes: add a notifier to notify when error data is ready
> > > acpi/ghes: create an ancillary acpi_ghes_get_state() function
> > > acpi/generic_event_device: Update GHES migration to cover hest addr
> > > acpi/generic_event_device: add logic to detect if HEST addr is
> > > available
> > > acpi/generic_event_device: add an APEI error device
> > > tests/acpi: virt: allow acpi table changes for a new table: HEST
> > > arm/virt: Wire up a GED error device for ACPI / GHES
> > > tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> > > qapi/acpi-hest: add an interface to do generic CPER error injection
> > > scripts/ghes_inject: add a script to generate GHES error inject
> > >
> > > MAINTAINERS | 10 +
> > > hw/acpi/Kconfig | 5 +
> > > hw/acpi/aml-build.c | 10 +
> > > hw/acpi/generic_event_device.c | 43 ++
> > > hw/acpi/ghes-stub.c | 7 +-
> > > hw/acpi/ghes.c | 231 ++++--
> > > hw/acpi/ghes_cper.c | 38 +
> > > hw/acpi/ghes_cper_stub.c | 19 +
> > > hw/acpi/meson.build | 2 +
> > > hw/arm/virt-acpi-build.c | 37 +-
> > > hw/arm/virt.c | 19 +-
> > > hw/core/machine.c | 2 +
> > > include/hw/acpi/acpi_dev_interface.h | 1 +
> > > include/hw/acpi/aml-build.h | 2 +
> > > include/hw/acpi/generic_event_device.h | 1 +
> > > include/hw/acpi/ghes.h | 54 +-
> > > include/hw/arm/virt.h | 2 +
> > > qapi/acpi-hest.json | 35 +
> > > qapi/meson.build | 1 +
> > > qapi/qapi-schema.json | 1 +
> > > scripts/arm_processor_error.py | 476 ++++++++++++
> > > scripts/ghes_inject.py | 51 ++
> > > scripts/qmp_helper.py | 702 ++++++++++++++++++
> > > target/arm/kvm.c | 7 +-
> > > tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> > > .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> > > tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> > > tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> > > tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> > > 29 files changed, 1677 insertions(+), 79 deletions(-)
> > > create mode 100644 hw/acpi/ghes_cper.c
> > > create mode 100644 hw/acpi/ghes_cper_stub.c
> > > create mode 100644 qapi/acpi-hest.json
> > > create mode 100644 scripts/arm_processor_error.py
> > > create mode 100755 scripts/ghes_inject.py
> > > create mode 100755 scripts/qmp_helper.py
> > >
> >
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 03/14] acpi/ghes: Use HEST table offsets when preparing GHES records
2025-02-21 14:35 ` [PATCH v4 03/14] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
@ 2025-02-26 15:16 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:16 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, linux-kernel
On Fri, 21 Feb 2025 15:35:12 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> There are two pointers that are needed during error injection:
>
> 1. The start address of the CPER block to be stored;
> 2. The address of the ack.
s/ack/read_ack/
>
> It is preferable to calculate them from the HEST table. This allows
> checking the source ID, the size of the table and the type of the
> HEST error block structures.
>
> Yet, keep the old code, as this is needed for migration purposes
+ from older QEMU versions
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> hw/acpi/ghes.c | 100 +++++++++++++++++++++++++++++++++++++++++
> include/hw/acpi/ghes.h | 2 +-
> 2 files changed, 101 insertions(+), 1 deletion(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index ba37be9e7022..7efea519f766 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -41,6 +41,12 @@
> /* Address offset in Generic Address Structure(GAS) */
> #define GAS_ADDR_OFFSET 4
>
> +/*
> + * ACPI spec 1.0b
> + * 5.2.3 System Description Table Header
> + */
> +#define ACPI_DESC_HEADER_OFFSET 36
> +
> /*
> * The total size of Generic Error Data Entry
> * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> @@ -61,6 +67,30 @@
> */
> #define ACPI_GHES_GESB_SIZE 20
>
> +/*
> + * See the memory layout map at docs/specs/acpi_hest_ghes.rst.
> + */
> +
> +/*
> + * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
> + * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
> + */
> +#define HEST_GHES_V2_ENTRY_SIZE 92
> +
> +/*
> + * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
wrong chapter, read ack can't be in v1 GHES
> + * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
> + * Read Ack Register
> + */
> +#define GHES_READ_ACK_ADDR_OFF 64
> +
> +/*
> + * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
> + * Table 18-341 Generic Hardware Error Source Structure
> + * Error Status Address
> + */
> +#define GHES_ERR_STATUS_ADDR_OFF 20
> +
> /*
> * Values for error_severity field
> */
> @@ -412,6 +442,73 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
> *read_ack_register_addr = ghes_addr + sizeof(uint64_t);
> }
>
> +static void get_ghes_source_offsets(uint16_t source_id,
> + uint64_t hest_addr,
> + uint64_t *cper_addr,
> + uint64_t *read_ack_start_addr,
> + Error **errp)
> +{
> + uint64_t hest_err_block_addr, hest_read_ack_addr;
> + uint64_t err_source_entry, error_block_addr;
> + uint32_t num_sources, i;
> +
> + hest_addr += ACPI_DESC_HEADER_OFFSET;
> +
> + cpu_physical_memory_read(hest_addr, &num_sources,
> + sizeof(num_sources));
> + num_sources = le32_to_cpu(num_sources);
> +
> + err_source_entry = hest_addr + sizeof(num_sources);
> +
> + /*
> + * Currently, HEST Error source navigates only for GHESv2 tables
> + */
> + for (i = 0; i < num_sources; i++) {
> + uint64_t addr = err_source_entry;
> + uint16_t type, src_id;
> +
> + cpu_physical_memory_read(addr, &type, sizeof(type));
> + type = le16_to_cpu(type);
> +
> + /* For now, we only know the size of GHESv2 table */
> + if (type != ACPI_GHES_SOURCE_GENERIC_ERROR_V2) {
> + error_setg(errp, "HEST: type %d not supported.", type);
> + return;
> + }
> +
> + /* Compare CPER source address at the GHESv2 structure */
^^^^^ typo?
> + addr += sizeof(type);
> + cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
> + if (le16_to_cpu(src_id) == source_id) {
> + break;
> + }
> +
> + err_source_entry += HEST_GHES_V2_ENTRY_SIZE;
> + }
> + if (i == num_sources) {
> + error_setg(errp, "HEST: Source %d not found.", source_id);
> + return;
> + }
> +
> + /* Navigate though table address pointers */
^^^^^ typo
> + hest_err_block_addr = err_source_entry + GHES_ERR_STATUS_ADDR_OFF +
> + GAS_ADDR_OFFSET;
> +
> + cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
> + sizeof(error_block_addr));
> + error_block_addr = le64_to_cpu(error_block_addr);
> +
> + cpu_physical_memory_read(error_block_addr, cper_addr,
> + sizeof(*cper_addr));
> + *cper_addr = le64_to_cpu(*cper_addr);
> +
> + hest_read_ack_addr = err_source_entry + GHES_READ_ACK_ADDR_OFF +
> + GAS_ADDR_OFFSET;
> + cpu_physical_memory_read(hest_read_ack_addr, read_ack_start_addr,
> + sizeof(*read_ack_start_addr));
> + *read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
> +}
> +
> void ghes_record_cper_errors(const void *cper, size_t len,
> uint16_t source_id, Error **errp)
> {
> @@ -437,6 +534,9 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> if (!ags->use_hest_addr) {
> get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> &cper_addr, &read_ack_register_addr);
> + } else {
> + get_ghes_source_offsets(source_id, le64_to_cpu(ags->hest_addr_le),
> + &cper_addr, &read_ack_register_addr, errp);
> }
>
> if (!cper_addr) {
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 454e97b5341c..2f06e433ce04 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -80,7 +80,7 @@ enum {
> typedef struct AcpiGhesState {
> uint64_t hest_addr_le;
> uint64_t hw_error_le;
> - bool use_hest_addr; /* Currently, always false */
> + bool use_hest_addr; /* True if HEST address is present */
> } AcpiGhesState;
>
> void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
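To make the table walk in get_ghes_source_offsets() easier to follow, here is a minimal sketch of the same navigation done over an in-memory copy of a HEST image. It is not the QEMU code: the real function reads guest memory with cpu_physical_memory_read() and then dereferences the pointers it finds, while this sketch stops at the field offsets. The names rd_le(), hest_find_source() and SOURCE_GENERIC_ERROR_V2 are hypothetical; the numeric offsets are the ones defined in the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the patch under review. */
#define ACPI_DESC_HEADER_OFFSET   36  /* size of the ACPI table header */
#define HEST_GHES_V2_ENTRY_SIZE   92  /* size of one GHESv2 entry */
#define GHES_ERR_STATUS_ADDR_OFF  20  /* Error Status Address (GAS) */
#define GHES_READ_ACK_ADDR_OFF    64  /* Read Ack Register (GAS) */
#define GAS_ADDR_OFFSET           4   /* address field inside a GAS */
#define SOURCE_GENERIC_ERROR_V2   10  /* HEST source type for GHESv2 */

/* Read an n-byte little-endian integer from a byte buffer (n <= 8). */
static uint64_t rd_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;

    assert(n <= sizeof(v));
    for (size_t i = 0; i < n; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/*
 * Hypothetical helper: find the GHESv2 entry for source_id and return the
 * offsets, from the start of the table, of the 64-bit address fields of the
 * Error Status Address and Read Ack Register structures.
 * Returns 0 on success, -1 if the source is missing or the table is
 * truncated or contains a non-GHESv2 entry.
 */
static int hest_find_source(const uint8_t *hest, size_t len,
                            uint16_t source_id,
                            size_t *err_status_off, size_t *read_ack_off)
{
    size_t entry = ACPI_DESC_HEADER_OFFSET + 4; /* skip Error Source Count */
    uint32_t num_sources;

    if (len < entry) {
        return -1;
    }
    num_sources = (uint32_t)rd_le(hest + ACPI_DESC_HEADER_OFFSET, 4);

    for (uint32_t i = 0; i < num_sources; i++) {
        if (len - entry < HEST_GHES_V2_ENTRY_SIZE) {
            return -1;                  /* entry would run past the table */
        }
        if (rd_le(hest + entry, 2) != SOURCE_GENERIC_ERROR_V2) {
            return -1;                  /* only the GHESv2 size is known */
        }
        if ((uint16_t)rd_le(hest + entry + 2, 2) == source_id) {
            *err_status_off = entry + GHES_ERR_STATUS_ADDR_OFF +
                              GAS_ADDR_OFFSET;
            *read_ack_off = entry + GHES_READ_ACK_ADDR_OFF + GAS_ADDR_OFFSET;
            return 0;
        }
        entry += HEST_GHES_V2_ENTRY_SIZE;
    }
    return -1;
}
```

Unlike the loop in the patch, the sketch also checks that each entry fits inside the table, which is possible here only because the table length is passed in explicitly.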
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-26 15:27 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:27 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Paolo Bonzini, Peter Maydell,
kvm, linux-kernel
On Fri, 21 Feb 2025 15:35:15 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Instead of having a function to check if ACPI is enabled
> (acpi_ghes_present), change its logic to be more generic,
> returning a pointer to AcpiGhesState.
>
> Such change allows cleaning up the ghes GED state code, avoiding
> reading it multiple times, and simplifying the code.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes-stub.c | 7 ++++---
> hw/acpi/ghes.c | 38 ++++++++++----------------------------
> include/hw/acpi/ghes.h | 14 ++++++++------
> target/arm/kvm.c | 7 +++++--
> 4 files changed, 27 insertions(+), 39 deletions(-)
>
> diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
> index 7cec1812dad9..40f660c246fe 100644
> --- a/hw/acpi/ghes-stub.c
> +++ b/hw/acpi/ghes-stub.c
> @@ -11,12 +11,13 @@
> #include "qemu/osdep.h"
> #include "hw/acpi/ghes.h"
>
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> + uint64_t physical_address)
> {
> return -1;
> }
>
> -bool acpi_ghes_present(void)
> +AcpiGhesState *acpi_ghes_get_state(void)
> {
> - return false;
> + return NULL;
> }
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index f2d1cc7369f4..401789259f60 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -425,10 +425,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
> uint64_t *cper_addr,
> uint64_t *read_ack_register_addr)
> {
> - if (!ghes_addr) {
> - return;
> - }
> -
> /*
> * non-HEST version supports only one source, so no need to change
> * the start offset based on the source ID. Also, we can't validate
> @@ -517,27 +513,16 @@ static void get_ghes_source_offsets(uint16_t source_id,
> NotifierList acpi_generic_error_notifiers =
> NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
>
> -void ghes_record_cper_errors(const void *cper, size_t len,
> +void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> uint16_t source_id, Error **errp)
> {
> uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
> - AcpiGedState *acpi_ged_state;
> - AcpiGhesState *ags;
>
> if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
> error_setg(errp, "GHES CPER record is too big: %zd", len);
> return;
> }
>
> - acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> - NULL));
> - if (!acpi_ged_state) {
> - error_setg(errp, "Can't find ACPI_GED object");
> - return;
> - }
> - ags = &acpi_ged_state->ghes_state;
> -
> -
> if (!ags->use_hest_addr) {
> get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> &cper_addr, &read_ack_register_addr);
> @@ -546,11 +531,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> &cper_addr, &read_ack_register_addr, errp);
> }
>
> - if (!cper_addr) {
> - error_setg(errp, "can not find Generic Error Status Block");
> - return;
> - }
> -
> cpu_physical_memory_read(read_ack_register_addr,
> &read_ack_register, sizeof(read_ack_register));
>
> @@ -576,7 +556,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> notifier_list_notify(&acpi_generic_error_notifiers, NULL);
> }
>
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> + uint64_t physical_address)
> {
> /* Memory Error Section Type */
> const uint8_t guid[] =
> @@ -602,7 +583,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> acpi_ghes_build_append_mem_cper(block, physical_address);
>
> /* Report the error */
> - ghes_record_cper_errors(block->data, block->len, source_id, &errp);
> + ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
>
> g_array_free(block, true);
>
> @@ -614,7 +595,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> return 0;
> }
>
> -bool acpi_ghes_present(void)
> +AcpiGhesState *acpi_ghes_get_state(void)
> {
> AcpiGedState *acpi_ged_state;
> AcpiGhesState *ags;
> @@ -623,11 +604,12 @@ bool acpi_ghes_present(void)
> NULL));
>
> if (!acpi_ged_state) {
> - return false;
> + return NULL;
> }
> ags = &acpi_ged_state->ghes_state;
> - if (!ags->hw_error_le && !ags->hest_addr_le)
> - return false;
>
> - return true;
> + if (!ags->hw_error_le && !ags->hest_addr_le) {
> + return NULL;
> + }
> + return ags;
> }
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 219aa7ab4fe0..276f9dc076d9 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -99,15 +99,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> const char *oem_id, const char *oem_table_id);
> void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
> GArray *hardware_errors);
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
> -void ghes_record_cper_errors(const void *cper, size_t len,
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> + uint64_t error_physical_addr);
> +void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> uint16_t source_id, Error **errp);
>
> /**
> - * acpi_ghes_present: Report whether ACPI GHES table is present
> + * acpi_ghes_get_state: Get a pointer for ACPI ghes state
> *
> - * Returns: true if the system has an ACPI GHES table and it is
> - * safe to call acpi_ghes_memory_errors() to record a memory error.
> + * Returns: a pointer to ghes state if the system has an ACPI GHES table,
> + * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
^^^^^^^^^^^^^ can't link 'it' with anything, I'd drop this
> + * a memory error. Returns false, otherwise.
^^^ NULL ??
> */
> -bool acpi_ghes_present(void);
> +AcpiGhesState *acpi_ghes_get_state(void);
> #endif
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index da30bdbb2349..80ca7779797b 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
> {
> ram_addr_t ram_addr;
> hwaddr paddr;
> + AcpiGhesState *ags;
>
> assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
>
> - if (acpi_ghes_present() && addr) {
> + ags = acpi_ghes_get_state();
> + if (ags && addr) {
> ram_addr = qemu_ram_addr_from_host(addr);
> if (ram_addr != RAM_ADDR_INVALID &&
> kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
> @@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
> */
> if (code == BUS_MCEERR_AR) {
> kvm_cpu_synchronize_state(c);
> - if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
> + if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
> + paddr)) {
> kvm_inject_arm_sea(c);
> } else {
> error_report("failed to record the error");
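The "look up once, pass the pointer down" pattern this patch introduces can be sketched in isolation as follows. This is only an illustration under stated assumptions: GhesState is a stand-in for AcpiGhesState, and ghes_get_state(), record_memory_error() and handle_sigbus() are hypothetical names, not QEMU functions. The getter returns NULL (not false) when GHES is unusable, which is the point raised about the doc comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for AcpiGhesState; the fields mirror the ones in the patch. */
typedef struct GhesState {
    uint64_t hest_addr_le;
    uint64_t hw_error_le;
    bool use_hest_addr;
} GhesState;

/*
 * Hypothetical getter: return the state only when GHES is usable,
 * i.e. when at least one of the two stored addresses is non-zero.
 * Returns NULL otherwise.
 */
static GhesState *ghes_get_state(GhesState *candidate)
{
    if (!candidate) {
        return NULL;                /* no GED device found */
    }
    if (!candidate->hw_error_le && !candidate->hest_addr_le) {
        return NULL;                /* firmware never wrote an address */
    }
    return candidate;
}

/* Hypothetical consumer: it receives the state instead of looking it up. */
static int record_memory_error(GhesState *ags, uint64_t paddr)
{
    assert(ags != NULL);
    (void)paddr;                    /* a real version would build a CPER */
    return 0;
}

/* Caller pattern: a single lookup, then the pointer is passed down. */
static int handle_sigbus(GhesState *candidate, uint64_t paddr)
{
    GhesState *ags = ghes_get_state(candidate);

    if (!ags) {
        return -1;
    }
    return record_memory_error(ags, paddr);
}
```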
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 04/14] acpi/ghes: don't hard-code the number of sources for HEST table
2025-02-21 14:35 ` [PATCH v4 04/14] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
@ 2025-02-26 15:48 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:48 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
linux-kernel
On Fri, 21 Feb 2025 15:35:13 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> The current code is actually dependent on having just one error
> structure with a single source, as any change there would cause
> migration issues.
>
> As the number of sources should be arch-dependent, as it will depend on
> what kind of notifications will exist, and how many errors can be
> reported at the same time, change the logic to be more flexible,
> allowing the number of sources to be defined when building the
> HEST table by the caller.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes.c | 38 +++++++++++++++++++++-----------------
> hw/arm/virt-acpi-build.c | 8 +++++++-
> include/hw/acpi/ghes.h | 17 ++++++++++++-----
> 3 files changed, 40 insertions(+), 23 deletions(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 7efea519f766..4a4ea8f4be90 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -238,17 +238,17 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
> * See docs/specs/acpi_hest_ghes.rst for blobs format.
> */
> static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
> - BIOSLinker *linker)
> + BIOSLinker *linker, int num_sources)
> {
> int i, error_status_block_offset;
>
> /* Build error_block_address */
> - for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> + for (i = 0; i < num_sources; i++) {
> build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
> }
>
> /* Build read_ack_register */
> - for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> + for (i = 0; i < num_sources; i++) {
> /*
> * Initialize the value of read_ack_register to 1, so GHES can be
> * writable after (re)boot.
> @@ -263,13 +263,13 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
>
> /* Reserve space for Error Status Data Block */
> acpi_data_push(hardware_errors,
> - ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
> + ACPI_GHES_MAX_RAW_DATA_LENGTH * num_sources);
>
> /* Tell guest firmware to place hardware_errors blob into RAM */
> bios_linker_loader_alloc(linker, ACPI_HW_ERROR_FW_CFG_FILE,
> hardware_errors, sizeof(uint64_t), false);
>
> - for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> + for (i = 0; i < num_sources; i++) {
> /*
> * Tell firmware to patch error_block_address entries to point to
> * corresponding "Generic Error Status Block"
> @@ -295,12 +295,14 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
> }
>
> /* Build Generic Hardware Error Source version 2 (GHESv2) */
> -static void build_ghes_v2(GArray *table_data,
> - BIOSLinker *linker,
> - enum AcpiGhesNotifyType notify,
> - uint16_t source_id)
> +static void build_ghes_v2_entry(GArray *table_data,
> + BIOSLinker *linker,
> + const AcpiNotificationSourceId *notif_src,
> + uint16_t index, int num_sources)
> {
> uint64_t address_offset;
> + const uint16_t notify = notif_src->notify;
> + const uint16_t source_id = notif_src->source_id;
>
> /*
> * Type:
> @@ -331,7 +333,7 @@ static void build_ghes_v2(GArray *table_data,
> address_offset + GAS_ADDR_OFFSET,
> sizeof(uint64_t),
> ACPI_HW_ERROR_FW_CFG_FILE,
> - source_id * sizeof(uint64_t));
> + index * sizeof(uint64_t));
>
> /* Notification Structure */
> build_ghes_hw_error_notification(table_data, notify);
> @@ -351,8 +353,7 @@ static void build_ghes_v2(GArray *table_data,
> address_offset + GAS_ADDR_OFFSET,
> sizeof(uint64_t),
> ACPI_HW_ERROR_FW_CFG_FILE,
> - (ACPI_GHES_ERROR_SOURCE_COUNT + source_id)
> - * sizeof(uint64_t));
> + (num_sources + index) * sizeof(uint64_t));
>
> /*
> * Read Ack Preserve field
> @@ -368,22 +369,26 @@ static void build_ghes_v2(GArray *table_data,
> void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> GArray *hardware_errors,
> BIOSLinker *linker,
> + const AcpiNotificationSourceId *notif_source,
> + int num_sources,
> const char *oem_id, const char *oem_table_id)
> {
> AcpiTable table = { .sig = "HEST", .rev = 1,
> .oem_id = oem_id, .oem_table_id = oem_table_id };
> uint32_t hest_offset;
> + int i;
>
> hest_offset = table_data->len;
>
> - build_ghes_error_table(ags, hardware_errors, linker);
> + build_ghes_error_table(ags, hardware_errors, linker, num_sources);
>
> acpi_table_begin(&table, table_data);
>
> /* Error Source Count */
> - build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
> - build_ghes_v2(table_data, linker,
> - ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
> + build_append_int_noprefix(table_data, num_sources, 4);
> + for (i = 0; i < num_sources; i++) {
> + build_ghes_v2_entry(table_data, linker, &notif_source[i], i, num_sources);
> + }
>
> acpi_table_end(linker, &table);
>
> @@ -529,7 +534,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> }
> ags = &acpi_ged_state->ghes_state;
>
> - assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
I'd also remove one blank line here
>
> if (!ags->use_hest_addr) {
> get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 8ab8d11b6536..4439252e1a75 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -893,6 +893,10 @@ static void acpi_align_size(GArray *blob, unsigned align)
> g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
> }
>
> +static const AcpiNotificationSourceId hest_ghes_notify[] = {
> + { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
> +};
> +
> static
> void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> {
> @@ -956,7 +960,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>
> acpi_add_table(table_offsets, tables_blob);
> acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> - tables->linker, vms->oem_id, vms->oem_table_id);
> + tables->linker, hest_ghes_notify,
> + ARRAY_SIZE(hest_ghes_notify),
> + vms->oem_id, vms->oem_table_id);
> }
> }
>
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 2f06e433ce04..51c6b6b33327 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -57,13 +57,18 @@ enum AcpiGhesNotifyType {
> ACPI_GHES_NOTIFY_RESERVED = 12
> };
>
> -enum {
> - ACPI_HEST_SRC_ID_SEA = 0,
> - /* future ids go here */
> -
> - ACPI_GHES_ERROR_SOURCE_COUNT
> +/*
> + * ID numbers used to fill HEST source ID field
> + */
> +enum AcpiGhesSourceID {
> + ACPI_HEST_SRC_ID_SYNC,
> };
>
> +typedef struct AcpiNotificationSourceId {
> + enum AcpiGhesSourceID source_id;
> + enum AcpiGhesNotifyType notify;
> +} AcpiNotificationSourceId;
> +
> /*
> * AcpiGhesState stores an offset that will be used to fill HEST entries.
> *
> @@ -86,6 +91,8 @@ typedef struct AcpiGhesState {
> void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> GArray *hardware_errors,
> BIOSLinker *linker,
> + const AcpiNotificationSourceId * const notif_source,
> + int num_sources,
> const char *oem_id, const char *oem_table_id);
> void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
> GArray *hardware_errors);
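For reference, the layout of the hardware_errors blob once the source count becomes a parameter can be sketched as plain offset arithmetic. The first two formulas come directly from the patch (index * sizeof(uint64_t) and (num_sources + index) * sizeof(uint64_t)). The third assumes that the error status data blocks start right after the two address arrays, as described in docs/specs/acpi_hest_ghes.rst; that placement is not visible in the hunks quoted here. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 KiB per error block, matching ACPI_GHES_MAX_RAW_DATA_LENGTH. */
#define GHES_MAX_RAW_DATA_LENGTH 1024

/* Offset of the error_block_address entry for a given source index. */
static size_t error_block_addr_off(int index, int num_sources)
{
    assert(index >= 0 && index < num_sources);
    return (size_t)index * sizeof(uint64_t);
}

/* Offset of the read_ack_register entry for a given source index. */
static size_t read_ack_off(int index, int num_sources)
{
    assert(index >= 0 && index < num_sources);
    return (size_t)(num_sources + index) * sizeof(uint64_t);
}

/*
 * Offset of the error status data block for a given source index.
 * Assumption: the blocks follow the two arrays of 64-bit entries.
 */
static size_t error_status_block_off(int index, int num_sources)
{
    assert(index >= 0 && index < num_sources);
    return 2 * (size_t)num_sources * sizeof(uint64_t) +
           (size_t)index * GHES_MAX_RAW_DATA_LENGTH;
}
```

With two sources, for example, the address entries sit at offsets 0 and 8, the read ack registers at 16 and 24, and the data blocks at 32 and 1056. This is also why replacing source_id with index in the linker calls matters: the offsets depend on the position in the array, not on the value of the source ID.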
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-21 14:35 ` [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
@ 2025-02-26 15:52 ` Igor Mammedov
2025-02-27 7:19 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:52 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
On Fri, 21 Feb 2025 15:35:17 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Create a new property (x-has-hest-addr) and use it to detect if
> the GHES table offsets can be calculated from the HEST address
> (qemu 10.0 and upper) or via the legacy way via an offset obtained
> from the hardware_errors firmware file.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> hw/acpi/generic_event_device.c | 1 +
> hw/arm/virt-acpi-build.c | 18 ++++++++++++++++--
> hw/core/machine.c | 2 ++
> 3 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index 5346cae573b7..14d8513a5440 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>
> static const Property acpi_ged_properties[] = {
> DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
> + DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
Below you set it to false for 9.2, so
shouldn't it be set to true by default here?
> };
>
> static const VMStateDescription vmstate_memhp_state = {
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 4439252e1a75..9de51105a513 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -897,6 +897,10 @@ static const AcpiNotificationSourceId hest_ghes_notify[] = {
> { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
> };
>
> +static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
> + { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
> +};
> +
> static
> void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> {
> @@ -950,7 +954,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> build_dbg2(tables_blob, tables->linker, vms);
>
> if (vms->ras) {
> + static const AcpiNotificationSourceId *notify;
> AcpiGedState *acpi_ged_state;
> + unsigned int notify_sz;
> AcpiGhesState *ags;
>
> acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> @@ -959,9 +965,17 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> ags = &acpi_ged_state->ghes_state;
>
> acpi_add_table(table_offsets, tables_blob);
> +
> + if (!ags->use_hest_addr) {
> + notify = hest_ghes_notify_9_2;
> + notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
> + } else {
> + notify = hest_ghes_notify;
> + notify_sz = ARRAY_SIZE(hest_ghes_notify);
> + }
> +
> acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> - tables->linker, hest_ghes_notify,
> - ARRAY_SIZE(hest_ghes_notify),
> + tables->linker, notify, notify_sz,
> vms->oem_id, vms->oem_table_id);
> }
> }
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 02cff735b3fb..7a11e0f87b11 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -34,6 +34,7 @@
> #include "hw/virtio/virtio-pci.h"
> #include "hw/virtio/virtio-net.h"
> #include "hw/virtio/virtio-iommu.h"
> +#include "hw/acpi/generic_event_device.h"
> #include "audio/audio.h"
>
> GlobalProperty hw_compat_9_2[] = {
> @@ -43,6 +44,7 @@ GlobalProperty hw_compat_9_2[] = {
> { "virtio-balloon-pci-non-transitional", "vectors", "0" },
> { "virtio-mem-pci", "vectors", "0" },
> { "migration", "multifd-clean-tls-termination", "false" },
> + { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
> };
> const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
>
* Re: [PATCH v4 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST
2025-02-21 14:35 ` [PATCH v4 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST Mauro Carvalho Chehab
@ 2025-02-26 15:55 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:55 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, linux-kernel
On Fri, 21 Feb 2025 15:35:19 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> The DSDT table will also be affected by such change.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> tests/qtest/bios-tables-test-allowed-diff.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
> index dfb8523c8bf4..1a4c2277bd5a 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1 +1,2 @@
> /* List of comma-separated changed AML files to ignore */
> +"tests/data/acpi/aarch64/virt/DSDT",
this and the following update would also include the HEST table, once you enable 'ras' in tests
* Re: [PATCH v4 11/14] arm/virt: Wire up a GED error device for ACPI / GHES
2025-02-21 14:35 ` [PATCH v4 11/14] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
@ 2025-02-26 15:58 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:58 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Peter Maydell, Shannon Zhao, linux-kernel
On Fri, 21 Feb 2025 15:35:20 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Adds support to ARM virtualization to allow handling
> generic error ACPI Event via GED & error source device.
>
> It is aligned with Linux Kernel patch:
> https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/
>
> Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Acked-by: Igor Mammedov <imammedo@redhat.com>
>
> ---
>
> Changes from v8:
>
> - Added a call to the function that produces GHES generic
> records, as this is now added earlier in this series.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> hw/acpi/generic_event_device.c | 2 +-
> hw/arm/virt-acpi-build.c | 1 +
> hw/arm/virt.c | 12 +++++++++++-
> include/hw/arm/virt.h | 1 +
> 4 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index 180eebbce1cd..f5e899155d34 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -331,7 +331,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>
> static const Property acpi_ged_properties[] = {
> DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
> - DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
> + DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, true),
irrelevant to this patch, see comment in 8/14
> };
>
> static const VMStateDescription vmstate_memhp_state = {
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 9de51105a513..4f174795ed60 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -861,6 +861,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> }
>
> acpi_dsdt_add_power_button(scope);
> + aml_append(scope, aml_error_device());
> #ifdef CONFIG_TPM
> acpi_dsdt_add_tpm(scope, vms);
> #endif
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 4a5a9666e916..3faf32f900b5 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -678,7 +678,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
> DeviceState *dev;
> MachineState *ms = MACHINE(vms);
> int irq = vms->irqmap[VIRT_ACPI_GED];
> - uint32_t event = ACPI_GED_PWR_DOWN_EVT;
> + uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_ERROR_EVT;
>
> if (ms->ram_slots) {
> event |= ACPI_GED_MEM_HOTPLUG_EVT;
> @@ -1010,6 +1010,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
> }
> }
>
> +static void virt_generic_error_req(Notifier *n, void *opaque)
> +{
> + VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
> +
> + acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
> +}
> +
> static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
> uint32_t phandle)
> {
> @@ -2404,6 +2411,9 @@ static void machvirt_init(MachineState *machine)
>
> if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
> vms->acpi_dev = create_acpi_ged(vms);
> + vms->generic_error_notifier.notify = virt_generic_error_req;
> + notifier_list_add(&acpi_generic_error_notifiers,
> + &vms->generic_error_notifier);
> } else {
> create_gpio_devices(vms, VIRT_GPIO, sysmem);
> }
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index c8e94e6aedc9..f3cf28436770 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -176,6 +176,7 @@ struct VirtMachineState {
> DeviceState *gic;
> DeviceState *acpi_dev;
> Notifier powerdown_notifier;
> + Notifier generic_error_notifier;
> PCIBus *bus;
> char *oem_id;
> char *oem_table_id;
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-26 14:51 ` Igor Mammedov
@ 2025-02-26 16:00 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-26 16:00 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Wed, 26 Feb 2025 15:51:43 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Wed, 26 Feb 2025 15:39:13 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
[...]
>
> PS: do not respin until I've finish this review.
finished
>
> > >
> > > >
> > > > ---
> > > > v4:
> > > > - added an extra comment for AcpiGhesState structure;
> > > > - patches reordered;
> > > > - no functional changes, just code shift between the patches in this series.
> > > >
> > > > v3:
> > > > - addressed more nits;
> > > > - hest_add_le now points to the beginning of HEST table;
> > > > - removed HEST from tests/data/acpi;
> > > > - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> > > >
> > > > v2:
> > > > - address some nits;
> > > > - improved ags cleanup patch and removed ags.present field;
> > > > - added some missing le*_to_cpu() calls;
> > > > - update date at copyright for new files to 2024-2025;
> > > > - qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
> > > > - added HEST and DSDT tables after the changes to make check target happy.
> > > > (two patches: first one whitelisting such tables; second one removing from
> > > > whitelist and updating/adding such tables to tests/data/acpi)
> > > >
> > > >
> > > >
> > > > Mauro Carvalho Chehab (14):
> > > > acpi/ghes: prepare to change the way HEST offsets are calculated
> > > > acpi/ghes: add a firmware file with HEST address
> > > > acpi/ghes: Use HEST table offsets when preparing GHES records
> > > > acpi/ghes: don't hard-code the number of sources for HEST table
> > > > acpi/ghes: add a notifier to notify when error data is ready
> > > > acpi/ghes: create an ancillary acpi_ghes_get_state() function
> > > > acpi/generic_event_device: Update GHES migration to cover hest addr
> > > > acpi/generic_event_device: add logic to detect if HEST addr is
> > > > available
> > > > acpi/generic_event_device: add an APEI error device
> > > > tests/acpi: virt: allow acpi table changes for a new table: HEST
> > > > arm/virt: Wire up a GED error device for ACPI / GHES
> > > > tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> > > > qapi/acpi-hest: add an interface to do generic CPER error injection
> > > > scripts/ghes_inject: add a script to generate GHES error inject
> > > >
> > > > MAINTAINERS | 10 +
> > > > hw/acpi/Kconfig | 5 +
> > > > hw/acpi/aml-build.c | 10 +
> > > > hw/acpi/generic_event_device.c | 43 ++
> > > > hw/acpi/ghes-stub.c | 7 +-
> > > > hw/acpi/ghes.c | 231 ++++--
> > > > hw/acpi/ghes_cper.c | 38 +
> > > > hw/acpi/ghes_cper_stub.c | 19 +
> > > > hw/acpi/meson.build | 2 +
> > > > hw/arm/virt-acpi-build.c | 37 +-
> > > > hw/arm/virt.c | 19 +-
> > > > hw/core/machine.c | 2 +
> > > > include/hw/acpi/acpi_dev_interface.h | 1 +
> > > > include/hw/acpi/aml-build.h | 2 +
> > > > include/hw/acpi/generic_event_device.h | 1 +
> > > > include/hw/acpi/ghes.h | 54 +-
> > > > include/hw/arm/virt.h | 2 +
> > > > qapi/acpi-hest.json | 35 +
> > > > qapi/meson.build | 1 +
> > > > qapi/qapi-schema.json | 1 +
> > > > scripts/arm_processor_error.py | 476 ++++++++++++
> > > > scripts/ghes_inject.py | 51 ++
> > > > scripts/qmp_helper.py | 702 ++++++++++++++++++
> > > > target/arm/kvm.c | 7 +-
> > > > tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> > > > .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> > > > tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> > > > tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> > > > tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> > > > 29 files changed, 1677 insertions(+), 79 deletions(-)
> > > > create mode 100644 hw/acpi/ghes_cper.c
> > > > create mode 100644 hw/acpi/ghes_cper_stub.c
> > > > create mode 100644 qapi/acpi-hest.json
> > > > create mode 100644 scripts/arm_processor_error.py
> > > > create mode 100755 scripts/ghes_inject.py
> > > > create mode 100755 scripts/qmp_helper.py
> > > >
> > >
> >
>
* Re: [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-26 15:52 ` Igor Mammedov
@ 2025-02-27 7:19 ` Mauro Carvalho Chehab
2025-02-27 7:26 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 7:19 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
On Wed, 26 Feb 2025 16:52:26 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Fri, 21 Feb 2025 15:35:17 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > index 5346cae573b7..14d8513a5440 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> >
> > static const Property acpi_ged_properties[] = {
> > DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
> > + DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
>
> you below set it for 9.2 to false, so
> shouldn't it be set to true by default here?
Yes, but it is too early to do that here, as the DSDT table was not
updated to contain the GED device.
We're switching it to true later on, at patch 11::
d8c44ee13fbe ("arm/virt: Wire up a GED error device for ACPI / GHES")
Thanks,
Mauro
* Re: [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-27 7:19 ` Mauro Carvalho Chehab
@ 2025-02-27 7:26 ` Mauro Carvalho Chehab
2025-02-27 9:50 ` Igor Mammedov
0 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 7:26 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
On Thu, 27 Feb 2025 08:19:27 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> On Wed, 26 Feb 2025 16:52:26 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
>
> > On Fri, 21 Feb 2025 15:35:17 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
>
> > > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > > index 5346cae573b7..14d8513a5440 100644
> > > --- a/hw/acpi/generic_event_device.c
> > > +++ b/hw/acpi/generic_event_device.c
> > > @@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > >
> > > static const Property acpi_ged_properties[] = {
> > > DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
> > > + DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
> >
> > you below set it for 9.2 to false, so
> > shouldn't it be set to true by default here?
>
> Yes, but it is too early to do that here, as the DSDT table was not
> updated to contain the GED device.
>
> We're switching it to true later on, at patch 11::
>
> d8c44ee13fbe ("arm/virt: Wire up a GED error device for ACPI / GHES")
Hmm... after so many rebases, things are getting hazy in my head ;-)
Originally, this patch set it to true, but you asked to move that
to another patch during one of the patch reorder requests.
Anyway, after all those rebases, I guess it is now safe to set it
to true here without breaking bisectability. I'll move the hunk back
to this patch.
Thanks,
Mauro
* Re: [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-27 7:26 ` Mauro Carvalho Chehab
@ 2025-02-27 9:50 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-27 9:50 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
On Thu, 27 Feb 2025 08:26:38 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> On Thu, 27 Feb 2025 08:19:27 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > On Wed, 26 Feb 2025 16:52:26 +0100
> > Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > > On Fri, 21 Feb 2025 15:35:17 +0100
> > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > >
> >
> > > > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> > > > index 5346cae573b7..14d8513a5440 100644
> > > > --- a/hw/acpi/generic_event_device.c
> > > > +++ b/hw/acpi/generic_event_device.c
> > > > @@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> > > >
> > > > static const Property acpi_ged_properties[] = {
> > > > DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
> > > > + DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
> > >
> > > you below set it for 9.2 to false, so
> > > shouldn't it be set to true by default here?
> >
> > Yes, but it is too early to do that here, as the DSDT table was not
> > updated to contain the GED device.
> >
> > We're switching it to true later on, at patch 11::
> >
> > d8c44ee13fbe ("arm/virt: Wire up a GED error device for ACPI / GHES")
After sleeping on it,
what you did here is totally correct.
You are right, we can't really flip the switch to true here,
since without 11/14 APEI would stop working properly.
Perhaps add a note to the commit message explaining why it's false
in this patch and where it will be set to true.
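To make the default-vs-compat interplay concrete, here is a minimal sketch in plain C. It is only a simplified model and not QEMU's property machinery: CompatProp and effective_value() are hypothetical names, and the "acpi-ged" string is assumed to be what TYPE_ACPI_GED expands to. The table mirrors the hw_compat_9_2 entry quoted in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model: not QEMU's real property machinery. */
typedef struct {
    const char *driver;
    const char *property;
    bool value;
} CompatProp;

/* Mirrors the hw_compat_9_2 entry quoted in this thread (string assumed). */
static const CompatProp compat_9_2[] = {
    { "acpi-ged", "x-has-hest-addr", false },
};

/*
 * Return the effective value of a boolean property: start from the
 * device default, then let a matching compat entry (if any) override it.
 */
static bool effective_value(bool device_default,
                            const CompatProp *compat, size_t n,
                            const char *driver, const char *property)
{
    bool value = device_default;

    for (size_t i = 0; i < n; i++) {
        if (!strcmp(compat[i].driver, driver) &&
            !strcmp(compat[i].property, property)) {
            value = compat[i].value;
        }
    }
    return value;
}
```

With a device default of true, a machine that applies compat_9_2 still ends up with false, while a machine with no compat entries gets true. While the default is false, both machine types behave the same, which is why the flip can safely wait for the patch that wires up the error device.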
>
> Hmm... after so many rebases, things are getting hazy in my head ;-)
>
> Originally, this patch set it to true, but you asked to move that
> to another patch during one of the patch reorder requests.
>
> Anyway, after all those rebases, I guess it is now safe to set it
> to true here without breaking bisectability. I'll move the hunk back
> to this patch.
>
> Thanks,
> Mauro
>
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (14 preceding siblings ...)
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for " Igor Mammedov
@ 2025-02-27 9:54 ` Igor Mammedov
2025-02-27 11:05 ` Mauro Carvalho Chehab
15 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2025-02-27 9:54 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Fri, 21 Feb 2025 15:35:09 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Now that the ghes preparation patches were merged, let's add support
> for error injection.
>
> In this series, the first 6 patches change the math used to calculate offsets in the HEST
> table and hardware_error firmware file, together with its migration code. Migration tested
> with both latest QEMU released kernel and upstream, in both directions.
>
> The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> to inject ARM Processor Error records.
>
> ---
> v4:
> - added an extra comment for AcpiGhesState structure;
> - patches reordered;
> - no functional changes, just code shift between the patches in this series.
>
> v3:
> - addressed more nits;
> - hest_add_le now points to the beginning of HEST table;
> - removed HEST from tests/data/acpi;
> - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
>
> v2:
> - address some nits;
> - improved ags cleanup patch and removed ags.present field;
> - added some missing le*_to_cpu() calls;
> - update date at copyright for new files to 2024-2025;
> - qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
> - added HEST and DSDT tables after the changes to make check target happy.
> (two patches: first one whitelisting such tables; second one removing from
> whitelist and updating/adding such tables to tests/data/acpi)
>
>
>
> Mauro Carvalho Chehab (14):
> acpi/ghes: prepare to change the way HEST offsets are calculated
> acpi/ghes: add a firmware file with HEST address
> acpi/ghes: Use HEST table offsets when preparing GHES records
> acpi/ghes: don't hard-code the number of sources for HEST table
> acpi/ghes: add a notifier to notify when error data is ready
> acpi/ghes: create an ancillary acpi_ghes_get_state() function
> acpi/generic_event_device: Update GHES migration to cover hest addr
> acpi/generic_event_device: add logic to detect if HEST addr is
> available
> acpi/generic_event_device: add an APEI error device
> tests/acpi: virt: allow acpi table changes for a new table: HEST
> arm/virt: Wire up a GED error device for ACPI / GHES
> tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> qapi/acpi-hest: add an interface to do generic CPER error injection
> scripts/ghes_inject: add a script to generate GHES error inject
>
> MAINTAINERS | 10 +
> hw/acpi/Kconfig | 5 +
> hw/acpi/aml-build.c | 10 +
> hw/acpi/generic_event_device.c | 43 ++
> hw/acpi/ghes-stub.c | 7 +-
> hw/acpi/ghes.c | 231 ++++--
> hw/acpi/ghes_cper.c | 38 +
> hw/acpi/ghes_cper_stub.c | 19 +
> hw/acpi/meson.build | 2 +
> hw/arm/virt-acpi-build.c | 37 +-
> hw/arm/virt.c | 19 +-
> hw/core/machine.c | 2 +
> include/hw/acpi/acpi_dev_interface.h | 1 +
> include/hw/acpi/aml-build.h | 2 +
> include/hw/acpi/generic_event_device.h | 1 +
> include/hw/acpi/ghes.h | 54 +-
> include/hw/arm/virt.h | 2 +
> qapi/acpi-hest.json | 35 +
> qapi/meson.build | 1 +
> qapi/qapi-schema.json | 1 +
> scripts/arm_processor_error.py | 476 ++++++++++++
> scripts/ghes_inject.py | 51 ++
> scripts/qmp_helper.py | 702 ++++++++++++++++++
> target/arm/kvm.c | 7 +-
> tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> 29 files changed, 1677 insertions(+), 79 deletions(-)
> create mode 100644 hw/acpi/ghes_cper.c
> create mode 100644 hw/acpi/ghes_cper_stub.c
> create mode 100644 qapi/acpi-hest.json
> create mode 100644 scripts/arm_processor_error.py
> create mode 100755 scripts/ghes_inject.py
> create mode 100755 scripts/qmp_helper.py
>
once you enable ras in tests in the 1st patches and fix up the minor issues,
please try to do patch-by-patch compile/bios-tables-test testing, to avoid
an unnecessary respin in case a table change crept in somewhere unnoticed.
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-27 9:54 ` Igor Mammedov
@ 2025-02-27 11:05 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:05 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Thu, 27 Feb 2025 10:54:54 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Fri, 21 Feb 2025 15:35:09 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Now that the ghes preparation patches were merged, let's add support
> > for error injection.
> >
> > In this series, the first 6 patches change the math used to calculate offsets in the HEST
> > table and hardware_error firmware file, together with its migration code. Migration tested
> > with both latest QEMU released kernel and upstream, in both directions.
> >
> > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> > to inject ARM Processor Error records.
> >
> > ---
> > v4:
> > - added an extra comment for AcpiGhesState structure;
> > - patches reordered;
> > - no functional changes, just code shift between the patches in this series.
> >
> > v3:
> > - addressed more nits;
> > - hest_add_le now points to the beginning of HEST table;
> > - removed HEST from tests/data/acpi;
> > - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> >
> > v2:
> > - address some nits;
> > - improved ags cleanup patch and removed ags.present field;
> > - added some missing le*_to_cpu() calls;
> > - update date at copyright for new files to 2024-2025;
> > - qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
> > - added HEST and DSDT tables after the changes to make check target happy.
> > (two patches: first one whitelisting such tables; second one removing from
> > whitelist and updating/adding such tables to tests/data/acpi)
> >
> >
> >
> > Mauro Carvalho Chehab (14):
> > acpi/ghes: prepare to change the way HEST offsets are calculated
> > acpi/ghes: add a firmware file with HEST address
> > acpi/ghes: Use HEST table offsets when preparing GHES records
> > acpi/ghes: don't hard-code the number of sources for HEST table
> > acpi/ghes: add a notifier to notify when error data is ready
> > acpi/ghes: create an ancillary acpi_ghes_get_state() function
> > acpi/generic_event_device: Update GHES migration to cover hest addr
> > acpi/generic_event_device: add logic to detect if HEST addr is
> > available
> > acpi/generic_event_device: add an APEI error device
> > tests/acpi: virt: allow acpi table changes for a new table: HEST
> > arm/virt: Wire up a GED error device for ACPI / GHES
> > tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> > qapi/acpi-hest: add an interface to do generic CPER error injection
> > scripts/ghes_inject: add a script to generate GHES error inject
> >
> > MAINTAINERS | 10 +
> > hw/acpi/Kconfig | 5 +
> > hw/acpi/aml-build.c | 10 +
> > hw/acpi/generic_event_device.c | 43 ++
> > hw/acpi/ghes-stub.c | 7 +-
> > hw/acpi/ghes.c | 231 ++++--
> > hw/acpi/ghes_cper.c | 38 +
> > hw/acpi/ghes_cper_stub.c | 19 +
> > hw/acpi/meson.build | 2 +
> > hw/arm/virt-acpi-build.c | 37 +-
> > hw/arm/virt.c | 19 +-
> > hw/core/machine.c | 2 +
> > include/hw/acpi/acpi_dev_interface.h | 1 +
> > include/hw/acpi/aml-build.h | 2 +
> > include/hw/acpi/generic_event_device.h | 1 +
> > include/hw/acpi/ghes.h | 54 +-
> > include/hw/arm/virt.h | 2 +
> > qapi/acpi-hest.json | 35 +
> > qapi/meson.build | 1 +
> > qapi/qapi-schema.json | 1 +
> > scripts/arm_processor_error.py | 476 ++++++++++++
> > scripts/ghes_inject.py | 51 ++
> > scripts/qmp_helper.py | 702 ++++++++++++++++++
> > target/arm/kvm.c | 7 +-
> > tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> > .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> > tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> > tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> > tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> > 29 files changed, 1677 insertions(+), 79 deletions(-)
> > create mode 100644 hw/acpi/ghes_cper.c
> > create mode 100644 hw/acpi/ghes_cper_stub.c
> > create mode 100644 qapi/acpi-hest.json
> > create mode 100644 scripts/arm_processor_error.py
> > create mode 100755 scripts/ghes_inject.py
> > create mode 100755 scripts/qmp_helper.py
> >
>
> once you enable ras in tests in the 1st patches and fix up the minor issues,
> please try to do patch-by-patch compile/bios-tables-test testing, to avoid
> an unnecessary respin in case a table change crept in somewhere unnoticed.
Just submitted v5.
I took some extra care to avoid bisect issues. checkpatch still
had some warnings, but they seem to be false positives.
Thanks,
Mauro
* Re: [PATCH v4 01/14] acpi/ghes: prepare to change the way HEST offsets are calculated
2025-02-26 14:37 ` Igor Mammedov
@ 2025-02-27 11:45 ` Mauro Carvalho Chehab
2025-02-27 12:22 ` Igor Mammedov
0 siblings, 1 reply; 34+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:45 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
linux-kernel
On Wed, 26 Feb 2025 15:37:14 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Fri, 21 Feb 2025 15:35:10 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index 3ac8f8e17861..8ab8d11b6536 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -946,9 +946,18 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> > build_dbg2(tables_blob, tables->linker, vms);
> >
> > if (vms->ras) {
> > - acpi_add_table(table_offsets, tables_blob);
> > - acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
> > - vms->oem_id, vms->oem_table_id);
> > + AcpiGedState *acpi_ged_state;
> > + AcpiGhesState *ags;
> > +
> > + acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> ^^^ will explode if object_resolve_path_type() returns NULL
> > + NULL));
>
> it's also expensive load-wise.
> You have access to vms with ged pointer here, use that
> (search for 'acpi_ged_state = ACPI_GED' example)
Ok, but the state binding in ghes was designed to use ACPI_GED. I moved
the code that uses ACPI_GED() to the beginning of the v5 series,
just after the HEST table test addition.
With that, ACPI_GED() is now used in only two places inside ghes:
- in virt_acpi_build(), during VM initialization;
- in acpi_ghes_get_state().
If you want to replace it with some other solution, IMO we should do
it in a separate series, as this is related neither to error
injection nor to the offset calculation to get the read ack address.
> > + if (acpi_ged_state) {
>
> hence, this check is not really needed,
> we have to have GED at this point or abort
>
> earlier code that instantiates GED should take care of
> cleanly exiting if it failed to create GED so we would never get
> to missing GED here
I dropped this check in v5.
Thanks,
Mauro
* Re: [PATCH v4 01/14] acpi/ghes: prepare to change the way HEST offsets are calculated
2025-02-27 11:45 ` Mauro Carvalho Chehab
@ 2025-02-27 12:22 ` Igor Mammedov
0 siblings, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2025-02-27 12:22 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
linux-kernel
On Thu, 27 Feb 2025 12:45:38 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> On Wed, 26 Feb 2025 15:37:14 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
>
> > On Fri, 21 Feb 2025 15:35:10 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
>
> > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > index 3ac8f8e17861..8ab8d11b6536 100644
> > > --- a/hw/arm/virt-acpi-build.c
> > > +++ b/hw/arm/virt-acpi-build.c
> > > @@ -946,9 +946,18 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> > > build_dbg2(tables_blob, tables->linker, vms);
> > >
> > > if (vms->ras) {
> > > - acpi_add_table(table_offsets, tables_blob);
> > > - acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
> > > - vms->oem_id, vms->oem_table_id);
> > > + AcpiGedState *acpi_ged_state;
> > > + AcpiGhesState *ags;
> > > +
> > > + acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> > ^^^ will explode if object_resolve_path_type() returns NULL
> > > + NULL));
> >
> > it's also expensive load-wise.
> > You have access to vms with ged pointer here, use that
> > (search for 'acpi_ged_state = ACPI_GED' example)
>
> Ok, but the state binding on ghes were designed to use ACPI_GED. I moved
> the code that it is using ACPI_GED() to the beginning of v5 series,
> just after the HEST table test addition.
>
> With that, ACPI_GED() is now used only on two places inside ghes:
>
> - at virt_acpi_build(), during VM initialization;
ACPI_GED() is not expensive; what I'm referring to is
object_resolve_path_type().
Given it's new code and virt_acpi_build() has direct access
to the ged pointer, there is no excuse to use object_resolve_path_type().
All you have to do here is:
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index e6328af5d2..040d875d4e 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -949,8 +949,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
         AcpiGedState *acpi_ged_state;
         AcpiGhesState *ags;
 
-        acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
-                                                           NULL));
+        acpi_ged_state = ACPI_GED(vms->acpi_dev);
         ags = &acpi_ged_state->ghes_state;
         if (ags) {
             acpi_add_table(table_offsets, tables_blob);
> - at acpi_ghes_get_state().
This one is different: it doesn't have access to the GED, so it
has to look it up.
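
The distinction between the two call sites can be sketched in a small
standalone C model. This is only an illustration under stated assumptions:
Device, Machine, registry_add(), lookup_by_type(), state_from_machine() and
state_from_lookup() are hypothetical names invented for this sketch, not
QEMU types or APIs. The toy registry stands in for a search of the object
tree, and the stored pointer stands in for vms->acpi_dev.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT the QEMU types. */
typedef struct Device {
    const char *type_name;
    int ghes_state;
} Device;

typedef struct Machine {
    Device *acpi_dev;   /* set once, when the machine creates its GED */
} Machine;

/* Toy global registry, standing in for a search of the object tree. */
#define MAX_DEVICES 8
static Device *registry[MAX_DEVICES];
static size_t registry_len;

static void registry_add(Device *dev)
{
    assert(registry_len < MAX_DEVICES);
    registry[registry_len++] = dev;
}

/* Linear search by type name; returns NULL when nothing matches. */
static Device *lookup_by_type(const char *type_name)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i]->type_name, type_name) == 0) {
            return registry[i];
        }
    }
    return NULL;
}

/* Caller that holds a Machine: no search, and a missing device is a bug. */
static int *state_from_machine(Machine *m)
{
    assert(m->acpi_dev != NULL);
    return &m->acpi_dev->ghes_state;
}

/* Caller without a Machine: has to search, and has to handle NULL. */
static int *state_from_lookup(void)
{
    Device *dev = lookup_by_type("acpi-ged");
    return dev ? &dev->ghes_state : NULL;
}
```

In this model, state_from_machine() corresponds to the virt_acpi_build()
case, where the pointer is already at hand and a NULL check would only hide
an earlier setup failure. state_from_lookup() corresponds to the
acpi_ghes_get_state() case, where the search can fail and the caller has to
cope with that.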
>
> If you want to replace it with some other solution, IMO we should do
> it in a separate series, as this is related neither to error
> injection nor to the offset calculation used to get the read ack address.
>
> > > + if (acpi_ged_state) {
> >
> > hence, this check is not really needed,
> > we have to have GED at this point or abort
> >
> > earlier code that instantiates GED should take care of
> > cleanly exiting if it failed to create GED so we would never get
> > to missing GED here
>
> I dropped this check in v5.
>
> Thanks,
> Mauro
>
Thread overview: 34+ messages
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 01/14] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
2025-02-26 14:37 ` Igor Mammedov
2025-02-27 11:45 ` Mauro Carvalho Chehab
2025-02-27 12:22 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 02/14] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
2025-02-26 14:48 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 03/14] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
2025-02-26 15:16 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 04/14] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
2025-02-26 15:48 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 05/14] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
2025-02-26 15:27 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 07/14] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 08/14] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
2025-02-26 15:52 ` Igor Mammedov
2025-02-27 7:19 ` Mauro Carvalho Chehab
2025-02-27 7:26 ` Mauro Carvalho Chehab
2025-02-27 9:50 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 09/14] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST Mauro Carvalho Chehab
2025-02-26 15:55 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 11/14] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
2025-02-26 15:58 ` Igor Mammedov
2025-02-21 14:35 ` [PATCH v4 12/14] tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 13/14] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 14/14] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for " Igor Mammedov
2025-02-26 14:39 ` Mauro Carvalho Chehab
2025-02-26 14:51 ` Igor Mammedov
2025-02-26 16:00 ` Igor Mammedov
2025-02-27 9:54 ` Igor Mammedov
2025-02-27 11:05 ` Mauro Carvalho Chehab