* [PATCH v5 00/21] Change ghes to use HEST-based offsets and add support for error inject
@ 2025-02-27 11:03 Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 01/21] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
` (21 more replies)
0 siblings, 22 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Ani Sinha,
Cleber Rosa, Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
Now that the ghes preparation patches have been merged, let's add support
for error injection.
In this version, the HEST table was added to the ACPI table tests for aarch64 virt.
Some patches were also reordered to make the changes easier to review.
The code itself is almost identical to v4, with just a few minor nits addressed.
---
v5:
- make checkpatch happier;
- HEST table is now tested;
- some changes to the HEST spec documentation to align with code changes;
- extra care was taken with regard to git bisectability.
v4:
- added an extra comment for AcpiGhesState structure;
- patches reordered;
- no functional changes, just code shift between the patches in this series.
v3:
- addressed more nits;
- hest_add_le now points to the beginning of HEST table;
- removed HEST from tests/data/acpi;
- added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
v2:
- address some nits;
- improved ags cleanup patch and removed ags.present field;
- added some missing le*_to_cpu() calls;
- update date at copyright for new files to 2024-2025;
- qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
- added HEST and DSDT tables after the changes to make check target happy.
(two patches: first one whitelisting such tables; second one removing from
whitelist and updating/adding such tables to tests/data/acpi)
Mauro Carvalho Chehab (21):
tests/acpi: virt: add an empty HEST file
tests/qtest/bios-tables-test: extend to also check HEST table
tests/acpi: virt: update HEST file with its current data
acpi/ghes: Cleanup the code which gets ghes ged state
acpi/ghes: prepare to change the way HEST offsets are calculated
acpi/ghes: add a firmware file with HEST address
acpi/ghes: Use HEST table offsets when preparing GHES records
acpi/ghes: don't hard-code the number of sources for HEST table
acpi/ghes: add a notifier to notify when error data is ready
acpi/ghes: create an ancillary acpi_ghes_get_state() function
acpi/generic_event_device: Update GHES migration to cover hest addr
acpi/generic_event_device: add logic to detect if HEST addr is
available
acpi/generic_event_device: add an APEI error device
tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
arm/virt: Wire up a GED error device for ACPI / GHES
qapi/acpi-hest: add an interface to do generic CPER error injection
tests/acpi: virt: update HEST table to accept two sources
tests/acpi: virt: and update DSDT table to add the new GED device
docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
scripts/ghes_inject: add a script to generate GHES error inject
MAINTAINERS | 10 +
docs/specs/acpi_hest_ghes.rst | 28 +-
hw/acpi/Kconfig | 5 +
hw/acpi/aml-build.c | 10 +
hw/acpi/generic_event_device.c | 43 ++
hw/acpi/ghes-stub.c | 7 +-
hw/acpi/ghes.c | 231 ++++--
hw/acpi/ghes_cper.c | 38 +
hw/acpi/ghes_cper_stub.c | 19 +
hw/acpi/meson.build | 2 +
hw/arm/virt-acpi-build.c | 36 +-
hw/arm/virt.c | 19 +-
hw/core/machine.c | 2 +
include/hw/acpi/acpi_dev_interface.h | 1 +
include/hw/acpi/aml-build.h | 2 +
include/hw/acpi/generic_event_device.h | 1 +
include/hw/acpi/ghes.h | 52 +-
include/hw/arm/virt.h | 2 +
qapi/acpi-hest.json | 35 +
qapi/meson.build | 1 +
qapi/qapi-schema.json | 1 +
scripts/arm_processor_error.py | 476 ++++++++++++
scripts/ghes_inject.py | 51 ++
scripts/qmp_helper.py | 702 ++++++++++++++++++
target/arm/kvm.c | 7 +-
tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
.../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
tests/data/acpi/aarch64/virt/HEST | Bin 0 -> 224 bytes
tests/qtest/bios-tables-test.c | 2 +-
32 files changed, 1692 insertions(+), 91 deletions(-)
create mode 100644 hw/acpi/ghes_cper.c
create mode 100644 hw/acpi/ghes_cper_stub.c
create mode 100644 qapi/acpi-hest.json
create mode 100644 scripts/arm_processor_error.py
create mode 100755 scripts/ghes_inject.py
create mode 100755 scripts/qmp_helper.py
create mode 100644 tests/data/acpi/aarch64/virt/HEST
--
2.48.1
* [PATCH v5 01/21] tests/acpi: virt: add an empty HEST file
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
This file will be used to track HEST table changes.
For now, disallow the HEST table check until we update the file with the
current data.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
tests/data/acpi/aarch64/virt/HEST | 0
tests/qtest/bios-tables-test-allowed-diff.h | 1 +
2 files changed, 1 insertion(+)
create mode 100644 tests/data/acpi/aarch64/virt/HEST
diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8bf4..39901c58d647 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
/* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/aarch64/virt/HEST",
--
2.48.1
* [PATCH v5 02/21] tests/qtest/bios-tables-test: extend to also check HEST table
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
Currently, aarch64 can generate a HEST table when started with
-machine ras=on. Extend the test to check it.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
tests/qtest/bios-tables-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 0a333ec43536..8d41601cc9e9 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -2122,7 +2122,7 @@ static void test_acpi_aarch64_virt_tcg(void)
data.smbios_cpu_max_speed = 2900;
data.smbios_cpu_curr_speed = 2700;
- test_acpi_one("-cpu cortex-a57 "
+ test_acpi_one("-cpu cortex-a57 -machine ras=on "
"-smbios type=4,max-speed=2900,current-speed=2700", &data);
free_test_data(&data);
}
--
2.48.1
* [PATCH v5 03/21] tests/acpi: virt: update HEST file with its current data
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
Now that the HEST table is checked for aarch64, add the current
firmware file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
tests/data/acpi/aarch64/virt/HEST | Bin 0 -> 132 bytes
tests/qtest/bios-tables-test-allowed-diff.h | 1 -
2 files changed, 1 deletion(-)
diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..4c5d8c5b5da5b3241f93cd0839e94272bf6b1486 100644
GIT binary patch
literal 132
zcmeZp4Gw8xU|?W;<mB({5v<@85#X$#prF9Wz`y`vgJ=-uVqjqS|DS;o#%Ew*U|?_n
dk++-~7#J8hWI!Yi09DHYRr~Kh1c1x}0RY>66afGL
literal 0
HcmV?d00001
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 39901c58d647..dfb8523c8bf4 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
/* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/aarch64/virt/HEST",
--
2.48.1
* [PATCH v5 04/21] acpi/ghes: Cleanup the code which gets ghes ged state
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Paolo Bonzini,
Peter Maydell, kvm, linux-kernel
Move the check logic into a common function and simplify the
code which checks if GHES is enabled and was properly set up.
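
The calling pattern this cleanup moves to can be sketched in isolation. The snippet below is only an illustration, not the QEMU code: GhesStateSketch, ghes_get_state_sketch() and record_memory_error_sketch() are hypothetical stand-ins, and the "dev" argument replaces the QOM lookup of the GED device. It shows how a single getter that returns NULL can cover both "no device" and "address not yet set".

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for AcpiGhesState; only the field used here. */
typedef struct GhesStateSketch {
    uint64_t hw_error_le;   /* stays zero until firmware stores an address */
} GhesStateSketch;

/*
 * Hypothetical lookup helper: "dev" plays the role of the GED device state,
 * which may be absent (NULL) on machines without GHES.
 */
static GhesStateSketch *ghes_get_state_sketch(GhesStateSketch *dev)
{
    if (!dev || !dev->hw_error_le) {
        return NULL;
    }
    return dev;
}

/* Caller side: a single NULL check guards the error-recording path. */
static int record_memory_error_sketch(GhesStateSketch *dev, uint64_t paddr)
{
    GhesStateSketch *ags = ghes_get_state_sketch(dev);

    if (!ags) {
        return -1;          /* GHES unavailable: nothing is recorded */
    }
    (void)paddr;            /* a real caller would build a CPER record here */
    return 0;
}
```

With this shape, callers no longer need a separate "present" flag: the pointer itself carries that information and is then passed down to the functions that record the error.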
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes-stub.c | 7 ++++---
hw/acpi/ghes.c | 38 +++++++++++---------------------------
include/hw/acpi/ghes.h | 14 +++++++-------
target/arm/kvm.c | 7 +++++--
4 files changed, 27 insertions(+), 39 deletions(-)
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..40f660c246fe 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -11,12 +11,13 @@
#include "qemu/osdep.h"
#include "hw/acpi/ghes.h"
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address)
{
return -1;
}
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
{
- return false;
+ return NULL;
}
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index b709c177cdea..84b891fd3dcf 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -360,18 +360,12 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
/* Create a read-write fw_cfg file for Address */
fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
-
- ags->present = true;
}
static void get_hw_error_offsets(uint64_t ghes_addr,
uint64_t *cper_addr,
uint64_t *read_ack_register_addr)
{
- if (!ghes_addr) {
- return;
- }
-
/*
* non-HEST version supports only one source, so no need to change
* the start offset based on the source ID. Also, we can't validate
@@ -390,35 +384,20 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
*read_ack_register_addr = ghes_addr + sizeof(uint64_t);
}
-void ghes_record_cper_errors(const void *cper, size_t len,
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
- AcpiGedState *acpi_ged_state;
- AcpiGhesState *ags;
if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
error_setg(errp, "GHES CPER record is too big: %zd", len);
return;
}
- acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
- NULL));
- if (!acpi_ged_state) {
- error_setg(errp, "Can't find ACPI_GED object");
- return;
- }
- ags = &acpi_ged_state->ghes_state;
-
assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
- if (!cper_addr) {
- error_setg(errp, "can not find Generic Error Status Block");
- return;
- }
-
cpu_physical_memory_read(read_ack_register_addr,
&read_ack_register, sizeof(read_ack_register));
@@ -444,7 +423,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
return;
}
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address)
{
/* Memory Error Section Type */
const uint8_t guid[] =
@@ -470,7 +450,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
acpi_ghes_build_append_mem_cper(block, physical_address);
/* Report the error */
- ghes_record_cper_errors(block->data, block->len, source_id, &errp);
+ ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
g_array_free(block, true);
@@ -482,7 +462,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
return 0;
}
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
{
AcpiGedState *acpi_ged_state;
AcpiGhesState *ags;
@@ -491,8 +471,12 @@ bool acpi_ghes_present(void)
NULL));
if (!acpi_ged_state) {
- return false;
+ return NULL;
}
ags = &acpi_ged_state->ghes_state;
- return ags->present;
+
+ if (!ags->hw_error_le) {
+ return NULL;
+ }
+ return ags;
}
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 39619a2457cb..f96ac3e85ca2 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -66,7 +66,6 @@ enum {
typedef struct AcpiGhesState {
uint64_t hw_error_le;
- bool present; /* True if GHES is present at all on this board */
} AcpiGhesState;
void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
@@ -74,15 +73,16 @@ void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
GArray *hardware_errors);
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
-void ghes_record_cper_errors(const void *cper, size_t len,
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t error_physical_addr);
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp);
/**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer for ACPI ghes state
*
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to ghes state if the system has an ACPI GHES table,
+ * NULL, otherwise.
*/
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
#endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index da30bdbb2349..80ca7779797b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
{
ram_addr_t ram_addr;
hwaddr paddr;
+ AcpiGhesState *ags;
assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
- if (acpi_ghes_present() && addr) {
+ ags = acpi_ghes_get_state();
+ if (ags && addr) {
ram_addr = qemu_ram_addr_from_host(addr);
if (ram_addr != RAM_ADDR_INVALID &&
kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
@@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
*/
if (code == BUS_MCEERR_AR) {
kvm_cpu_synchronize_state(c);
- if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+ if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
+ paddr)) {
kvm_inject_arm_sea(c);
} else {
error_report("failed to record the error");
--
2.48.1
* [PATCH v5 05/21] acpi/ghes: prepare to change the way HEST offsets are calculated
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Peter Maydell,
Shannon Zhao, linux-kernel
Add a new ags flag to change the way HEST offsets are calculated.
Currently, the offsets used to store the ACPI HEST error data and the
read ack are calculated from prior knowledge of the logic that
creates the HEST table.
Such logic is not generic: it does not allow adding more HEST entries
easily, nor does it replicate what OSPM does.
As the next patches will add more generic logic, add a new
use_hest_addr flag, set to false, in preparation for such changes.
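
To make the "prior knowledge" concrete, here is a small sketch of the legacy calculation over a simulated guest memory buffer. It is an assumption-laden model, not the QEMU implementation: it assumes a single error source whose etc/hardware_errors blob begins with an 8-byte error block address followed by an 8-byte read ack register, which is how the `ghes_addr + sizeof(uint64_t)` line in the diff reads. The names legacy_offsets_sketch(), read_le64() and write_le64() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit little-endian value from a byte buffer. */
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Store a 64-bit value in little-endian byte order. */
static void write_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Legacy-style lookup (assumed single-source layout):
 *   ghes_addr + 0: address of the error status block (where CPER goes)
 *   ghes_addr + 8: read ack register
 * Both positions are hard-wired, so a second source cannot be described.
 */
static void legacy_offsets_sketch(const uint8_t *guest_mem,
                                  uint64_t ghes_addr,
                                  uint64_t *cper_addr,
                                  uint64_t *read_ack_addr)
{
    *cper_addr = read_le64(guest_mem + ghes_addr);
    *read_ack_addr = ghes_addr + sizeof(uint64_t);
}
```

Because both positions are fixed relative to ghes_addr, nothing in this calculation depends on the source ID, which is the limitation the use_hest_addr flag prepares to lift.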
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/ghes.c | 39 ++++++++++++++++++++++++---------------
hw/arm/virt-acpi-build.c | 14 +++++++++++---
include/hw/acpi/ghes.h | 12 +++++++++++-
3 files changed, 46 insertions(+), 19 deletions(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 84b891fd3dcf..9243b5ad4acb 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -206,7 +206,8 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
* Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
* See docs/specs/acpi_hest_ghes.rst for blobs format.
*/
-static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
+static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
+ BIOSLinker *linker)
{
int i, error_status_block_offset;
@@ -251,13 +252,15 @@ static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
}
- /*
- * tell firmware to write hardware_errors GPA into
- * hardware_errors_addr fw_cfg, once the former has been initialized.
- */
- bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, 0,
- sizeof(uint64_t),
- ACPI_HW_ERROR_FW_CFG_FILE, 0);
+ if (!ags->use_hest_addr) {
+ /*
+ * Tell firmware to write hardware_errors GPA into
+ * hardware_errors_addr fw_cfg, once the former has been initialized.
+ */
+ bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE,
+ 0, sizeof(uint64_t),
+ ACPI_HW_ERROR_FW_CFG_FILE, 0);
+ }
}
/* Build Generic Hardware Error Source version 2 (GHESv2) */
@@ -331,14 +334,15 @@ static void build_ghes_v2(GArray *table_data,
}
/* Build Hardware Error Source Table */
-void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
+void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+ GArray *hardware_errors,
BIOSLinker *linker,
const char *oem_id, const char *oem_table_id)
{
AcpiTable table = { .sig = "HEST", .rev = 1,
.oem_id = oem_id, .oem_table_id = oem_table_id };
- build_ghes_error_table(hardware_errors, linker);
+ build_ghes_error_table(ags, hardware_errors, linker);
acpi_table_begin(&table, table_data);
@@ -357,9 +361,11 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
hardware_error->len);
- /* Create a read-write fw_cfg file for Address */
- fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
- NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
+ if (!ags->use_hest_addr) {
+ /* Create a read-write fw_cfg file for Address */
+ fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
+ NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
+ }
}
static void get_hw_error_offsets(uint64_t ghes_addr,
@@ -395,8 +401,11 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
}
assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
- get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
- &cper_addr, &read_ack_register_addr);
+
+ if (!ags->use_hest_addr) {
+ get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
+ &cper_addr, &read_ack_register_addr);
+ }
cpu_physical_memory_read(read_ack_register_addr,
&read_ack_register, sizeof(read_ack_register));
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3ac8f8e17861..e6328af5d238 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -946,9 +946,17 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
build_dbg2(tables_blob, tables->linker, vms);
if (vms->ras) {
- acpi_add_table(table_offsets, tables_blob);
- acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
- vms->oem_id, vms->oem_table_id);
+ AcpiGedState *acpi_ged_state;
+ AcpiGhesState *ags;
+
+ acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
+ NULL));
+ ags = &acpi_ged_state->ghes_state;
+ if (ags) {
+ acpi_add_table(table_offsets, tables_blob);
+ acpi_build_hest(ags, tables_blob, tables->hardware_errors,
+ tables->linker, vms->oem_id, vms->oem_table_id);
+ }
}
if (ms->numa_state->num_nodes > 0) {
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index f96ac3e85ca2..5000891f163f 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -64,11 +64,21 @@ enum {
ACPI_GHES_ERROR_SOURCE_COUNT
};
+/*
+ * AcpiGhesState stores an offset that will be used to fill HEST entries.
+ *
+ * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
+ * is stored at hw_error_le. This is the default on QEMU 9.x.
+ *
+ * An GPA value equal to zero means that GHES is not present.
+ */
typedef struct AcpiGhesState {
uint64_t hw_error_le;
+ bool use_hest_addr; /* Currently, always false */
} AcpiGhesState;
-void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
+void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+ GArray *hardware_errors,
BIOSLinker *linker,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
--
2.48.1
* [PATCH v5 06/21] acpi/ghes: add a firmware file with HEST address
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
Store the GPA of the HEST table, placing the address of the start of the
table in the hest_addr_le variable.
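
As a rough model of what the firmware is asked to do here: a write-pointer command makes it take the guest address where the source file was loaded, add an offset, and store the result in the destination fw_cfg file. The sketch below simulates only that arithmetic; write_pointer_sketch() and the helper names are hypothetical, and the addresses in the usage note are made up.

```c
#include <stdint.h>

/* Store a 64-bit value in little-endian byte order. */
static void write_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Read a 64-bit little-endian value from a byte buffer. */
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Simulated write-pointer step: the destination file receives the guest
 * physical address of (source file base + offset inside the source file).
 */
static void write_pointer_sketch(uint8_t *dst_file, uint32_t dst_offset,
                                 uint64_t src_file_gpa, uint32_t src_offset)
{
    write_le64(dst_file + dst_offset, src_file_gpa + src_offset);
}
```

In the patch, the source offset is hest_offset, captured from table_data->len before the HEST is appended, so the stored value should point at the start of the HEST inside the tables blob. For example, write_pointer_sketch(buf, 0, 0x40000000, 0x1234) leaves 0x40001234 in buf, in little-endian order.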
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes.c | 20 +++++++++++++++++++-
include/hw/acpi/ghes.h | 7 ++++++-
2 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 9243b5ad4acb..8ec423726b3f 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -30,6 +30,7 @@
#define ACPI_HW_ERROR_FW_CFG_FILE "etc/hardware_errors"
#define ACPI_HW_ERROR_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
+#define ACPI_HEST_ADDR_FW_CFG_FILE "etc/acpi_table_hest_addr"
/* The max size in bytes for one error block */
#define ACPI_GHES_MAX_RAW_DATA_LENGTH (1 * KiB)
@@ -341,6 +342,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
{
AcpiTable table = { .sig = "HEST", .rev = 1,
.oem_id = oem_id, .oem_table_id = oem_table_id };
+ uint32_t hest_offset;
+
+ hest_offset = table_data->len;
build_ghes_error_table(ags, hardware_errors, linker);
@@ -352,6 +356,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
acpi_table_end(linker, &table);
+
+ if (ags->use_hest_addr) {
+ /*
+ * Tell firmware to write into GPA the address of HEST via fw_cfg,
+ * once initialized.
+ */
+ bios_linker_loader_write_pointer(linker,
+ ACPI_HEST_ADDR_FW_CFG_FILE, 0,
+ sizeof(uint64_t),
+ ACPI_BUILD_TABLE_FILE, hest_offset);
+ }
}
void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
@@ -361,7 +376,10 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
hardware_error->len);
- if (!ags->use_hest_addr) {
+ if (ags->use_hest_addr) {
+ fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
+ NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
+ } else {
/* Create a read-write fw_cfg file for Address */
fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 5000891f163f..38abe6e3db52 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -70,9 +70,14 @@ enum {
* When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
* is stored at hw_error_le. This is the default on QEMU 9.x.
*
- * An GPA value equal to zero means that GHES is not present.
+ * When use_hest_addr is true, the stored offset is placed at hest_addr_le,
+ * meaning an offset from the HEST table address from etc/acpi/tables firmware.
+ * This is the default for QEMU 10.x and above.
+ *
+ * Whe both GPA values are equal to zero means that GHES is not present.
*/
typedef struct AcpiGhesState {
+ uint64_t hest_addr_le;
uint64_t hw_error_le;
bool use_hest_addr; /* Currently, always false */
} AcpiGhesState;
--
2.48.1
* [PATCH v5 07/21] acpi/ghes: Use HEST table offsets when preparing GHES records
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
There are two pointers that are needed during error injection:
1. The start address of the CPER block to be stored;
2. The address of the read ack.
It is preferable to calculate them from the HEST table. This allows
checking the source ID, the size of the table and the type of the
HEST error block structures.
Still, keep the old code, as it is needed for migration
from older QEMU versions.
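
The table walk described above can be tried out on a plain byte buffer standing in for guest memory. This is a simplified sketch, not the patch itself: find_ghes_source_sketch() and read_le() are hypothetical names, guest memory is a local array instead of cpu_physical_memory_read(), and errors are reported through a return code rather than an Error object. The numeric offsets are the ones defined in the diff; the value 10 for the GHESv2 source type is my reading of the ACPI specification.

```c
#include <stddef.h>
#include <stdint.h>

#define ACPI_DESC_HEADER_OFFSET   36   /* size of the ACPI table header */
#define HEST_GHES_V2_ENTRY_SIZE   92   /* size of one GHESv2 entry */
#define GHES_ERR_STATUS_ADDR_OFF  20   /* Error Status Address (GAS) */
#define GHES_READ_ACK_ADDR_OFF    64   /* Read Ack Register (GAS) */
#define GAS_ADDR_OFFSET           4    /* address field inside a GAS */
#define SOURCE_GENERIC_ERROR_V2   10   /* assumed GHESv2 type value */

/* Read an nbytes-wide little-endian integer from a byte buffer. */
static uint64_t read_le(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;

    for (int i = nbytes - 1; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Returns 0 and fills both addresses on success, -1 otherwise. */
static int find_ghes_source_sketch(const uint8_t *mem, uint64_t hest_addr,
                                   uint16_t source_id,
                                   uint64_t *cper_addr,
                                   uint64_t *read_ack_addr)
{
    uint64_t pos = hest_addr + ACPI_DESC_HEADER_OFFSET;
    uint32_t num_sources = (uint32_t)read_le(mem + pos, 4);
    uint64_t entry = pos + 4;
    uint64_t error_block_addr;
    uint32_t i;

    for (i = 0; i < num_sources; i++) {
        uint16_t type = (uint16_t)read_le(mem + entry, 2);

        if (type != SOURCE_GENERIC_ERROR_V2) {
            return -1;      /* only the GHESv2 entry size is known */
        }
        if ((uint16_t)read_le(mem + entry + 2, 2) == source_id) {
            break;
        }
        entry += HEST_GHES_V2_ENTRY_SIZE;
    }
    if (i == num_sources) {
        return -1;          /* source ID not present in the table */
    }

    /* The GAS holds a pointer to the slot that stores the CPER address. */
    error_block_addr = read_le(mem + entry + GHES_ERR_STATUS_ADDR_OFF +
                               GAS_ADDR_OFFSET, 8);
    *cper_addr = read_le(mem + error_block_addr, 8);
    *read_ack_addr = read_le(mem + entry + GHES_READ_ACK_ADDR_OFF +
                             GAS_ADDR_OFFSET, 8);
    return 0;
}
```

As a cross-check on the constants, a table with one source takes 36 + 4 + 92 = 132 bytes and one with two sources takes 36 + 4 + 2 * 92 = 224 bytes, which matches the HEST file sizes reported in this series.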
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
hw/acpi/ghes.c | 100 +++++++++++++++++++++++++++++++++++++++++
include/hw/acpi/ghes.h | 2 +-
2 files changed, 101 insertions(+), 1 deletion(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 8ec423726b3f..5158418f93cb 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -41,6 +41,12 @@
/* Address offset in Generic Address Structure(GAS) */
#define GAS_ADDR_OFFSET 4
+/*
+ * ACPI spec 1.0b
+ * 5.2.3 System Description Table Header
+ */
+#define ACPI_DESC_HEADER_OFFSET 36
+
/*
* The total size of Generic Error Data Entry
* ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
@@ -61,6 +67,30 @@
*/
#define ACPI_GHES_GESB_SIZE 20
+/*
+ * See the memory layout map at docs/specs/acpi_hest_ghes.rst.
+ */
+
+/*
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
+ * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
+ */
+#define HEST_GHES_V2_ENTRY_SIZE 92
+
+/*
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
+ * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
+ * Read Ack Register
+ */
+#define GHES_READ_ACK_ADDR_OFF 64
+
+/*
+ * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
+ * Table 18-341 Generic Hardware Error Source Structure
+ * Error Status Address
+ */
+#define GHES_ERR_STATUS_ADDR_OFF 20
+
/*
* Values for error_severity field
*/
@@ -408,6 +438,73 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
*read_ack_register_addr = ghes_addr + sizeof(uint64_t);
}
+static void get_ghes_source_offsets(uint16_t source_id,
+ uint64_t hest_addr,
+ uint64_t *cper_addr,
+ uint64_t *read_ack_start_addr,
+ Error **errp)
+{
+ uint64_t hest_err_block_addr, hest_read_ack_addr;
+ uint64_t err_source_entry, error_block_addr;
+ uint32_t num_sources, i;
+
+ hest_addr += ACPI_DESC_HEADER_OFFSET;
+
+ cpu_physical_memory_read(hest_addr, &num_sources,
+ sizeof(num_sources));
+ num_sources = le32_to_cpu(num_sources);
+
+ err_source_entry = hest_addr + sizeof(num_sources);
+
+ /*
+ * Currently, HEST Error source navigates only for GHESv2 tables
+ */
+ for (i = 0; i < num_sources; i++) {
+ uint64_t addr = err_source_entry;
+ uint16_t type, src_id;
+
+ cpu_physical_memory_read(addr, &type, sizeof(type));
+ type = le16_to_cpu(type);
+
+ /* For now, we only know the size of GHESv2 table */
+ if (type != ACPI_GHES_SOURCE_GENERIC_ERROR_V2) {
+ error_setg(errp, "HEST: type %d not supported.", type);
+ return;
+ }
+
+ /* Compare CPER source ID at the GHESv2 structure */
+ addr += sizeof(type);
+ cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
+ if (le16_to_cpu(src_id) == source_id) {
+ break;
+ }
+
+ err_source_entry += HEST_GHES_V2_ENTRY_SIZE;
+ }
+ if (i == num_sources) {
+ error_setg(errp, "HEST: Source %d not found.", source_id);
+ return;
+ }
+
+ /* Navigate through table address pointers */
+ hest_err_block_addr = err_source_entry + GHES_ERR_STATUS_ADDR_OFF +
+ GAS_ADDR_OFFSET;
+
+ cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
+ sizeof(error_block_addr));
+ error_block_addr = le64_to_cpu(error_block_addr);
+
+ cpu_physical_memory_read(error_block_addr, cper_addr,
+ sizeof(*cper_addr));
+ *cper_addr = le64_to_cpu(*cper_addr);
+
+ hest_read_ack_addr = err_source_entry + GHES_READ_ACK_ADDR_OFF +
+ GAS_ADDR_OFFSET;
+ cpu_physical_memory_read(hest_read_ack_addr, read_ack_start_addr,
+ sizeof(*read_ack_start_addr));
+ *read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
+}
+
void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
@@ -423,6 +520,9 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
+ } else {
+ get_ghes_source_offsets(source_id, le64_to_cpu(ags->hest_addr_le),
+ &cper_addr, &read_ack_register_addr, errp);
}
cpu_physical_memory_read(read_ack_register_addr,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 38abe6e3db52..dcc7288ffba5 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -79,7 +79,7 @@ enum {
typedef struct AcpiGhesState {
uint64_t hest_addr_le;
uint64_t hw_error_le;
- bool use_hest_addr; /* Currently, always false */
+ bool use_hest_addr; /* True if HEST address is present */
} AcpiGhesState;
void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
--
2.48.1
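Illustrative note on patch 07 (not part of the series): the lookup done by
get_ghes_source_offsets() can be sketched on a plain byte buffer, as below.
This is a minimal sketch under stated assumptions, not the QEMU
implementation. The names prefixed with sk_/SK_/sketch_ are hypothetical;
the 36-byte header length, the GHESv2 type value 10 and the 92-byte entry
size are assumed from the ACPI specification and do not appear in the hunk
above. The sketch stops at locating the entry, so the guest-memory
dereferences of the error status and read ack addresses are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ACPI_HEADER_SIZE   36  /* assumed: standard ACPI table header length */
#define SK_GHES_V2_TYPE       10  /* assumed: HEST source type for GHESv2 */
#define SK_GHES_V2_ENTRY_SIZE 92  /* assumed: size of one GHESv2 entry */

/* Read little-endian values byte by byte, independent of host endianness. */
static uint16_t sk_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t sk_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the byte offset of the GHESv2 entry whose source ID matches,
 * or -1 if the table is too short, a non-GHESv2 entry is met, or the
 * source ID is not present.
 */
long sketch_find_ghes_v2_entry(const uint8_t *hest, size_t len,
                               uint16_t source_id)
{
    size_t off = SK_ACPI_HEADER_SIZE;
    uint32_t num_sources, i;

    assert(hest != NULL);

    /* Error Source Count follows the ACPI header */
    if (len < off + sizeof(uint32_t)) {
        return -1;
    }
    num_sources = sk_le32(hest + off);
    off += sizeof(uint32_t);

    for (i = 0; i < num_sources; i++) {
        if (len < off + SK_GHES_V2_ENTRY_SIZE) {
            return -1;
        }
        /* Only the GHESv2 entry size is known, so stop on anything else */
        if (sk_le16(hest + off) != SK_GHES_V2_TYPE) {
            return -1;
        }
        /* Source ID is the 16-bit field right after the type */
        if (sk_le16(hest + off + 2) == source_id) {
            return (long)off;
        }
        off += SK_GHES_V2_ENTRY_SIZE;
    }
    return -1;
}
```

From the returned entry offset, the patch adds GHES_ERR_STATUS_ADDR_OFF (20)
or GHES_READ_ACK_ADDR_OFF (64), plus GAS_ADDR_OFFSET, to reach the two
address fields.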
* [PATCH v5 08/21] acpi/ghes: don't hard-code the number of sources for HEST table
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (6 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 07/21] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 09/21] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
` (13 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Peter Maydell,
Shannon Zhao, linux-kernel
The current code depends on having just one error structure with a
single source, as any change there would cause migration issues.
The number of sources should be arch-dependent, since it depends on
what kind of notifications exist and on how many errors can be
reported at the same time. Change the logic to be more flexible,
allowing the caller to define the number of sources when building
the HEST table.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
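Illustrative note (not part of the patch): with this change the slots in the
hardware_errors blob are addressed by the entry index rather than by the
source ID. A minimal sketch of the two offsets passed to the linker is shown
below; the function names are hypothetical and only mirror the expressions
`index * sizeof(uint64_t)` and `(num_sources + index) * sizeof(uint64_t)`
used in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the error_block_address slot for the entry at `index`. */
size_t sketch_err_block_addr_off(unsigned index, unsigned num_sources)
{
    assert(index < num_sources);
    return index * sizeof(uint64_t);
}

/*
 * Offset of the read_ack_register slot for the entry at `index`.
 * All error_block_address slots come first, so the read ack slots
 * start after num_sources 64-bit entries.
 */
size_t sketch_read_ack_off(unsigned index, unsigned num_sources)
{
    assert(index < num_sources);
    return (num_sources + index) * sizeof(uint64_t);
}
```

With a single source, the read ack slot lands at offset 8, which matches the
legacy get_hw_error_offsets() computation of ghes_addr + sizeof(uint64_t).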
hw/acpi/ghes.c | 39 +++++++++++++++++++++------------------
hw/arm/virt-acpi-build.c | 8 +++++++-
include/hw/acpi/ghes.h | 17 ++++++++++++-----
3 files changed, 40 insertions(+), 24 deletions(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 5158418f93cb..d1da16b3da2b 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -238,17 +238,17 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
* See docs/specs/acpi_hest_ghes.rst for blobs format.
*/
static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
- BIOSLinker *linker)
+ BIOSLinker *linker, int num_sources)
{
int i, error_status_block_offset;
/* Build error_block_address */
- for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+ for (i = 0; i < num_sources; i++) {
build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
}
/* Build read_ack_register */
- for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+ for (i = 0; i < num_sources; i++) {
/*
* Initialize the value of read_ack_register to 1, so GHES can be
* writable after (re)boot.
@@ -263,13 +263,13 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
/* Reserve space for Error Status Data Block */
acpi_data_push(hardware_errors,
- ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
+ ACPI_GHES_MAX_RAW_DATA_LENGTH * num_sources);
/* Tell guest firmware to place hardware_errors blob into RAM */
bios_linker_loader_alloc(linker, ACPI_HW_ERROR_FW_CFG_FILE,
hardware_errors, sizeof(uint64_t), false);
- for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+ for (i = 0; i < num_sources; i++) {
/*
* Tell firmware to patch error_block_address entries to point to
* corresponding "Generic Error Status Block"
@@ -295,12 +295,14 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
}
/* Build Generic Hardware Error Source version 2 (GHESv2) */
-static void build_ghes_v2(GArray *table_data,
- BIOSLinker *linker,
- enum AcpiGhesNotifyType notify,
- uint16_t source_id)
+static void build_ghes_v2_entry(GArray *table_data,
+ BIOSLinker *linker,
+ const AcpiNotificationSourceId *notif_src,
+ uint16_t index, int num_sources)
{
uint64_t address_offset;
+ const uint16_t notify = notif_src->notify;
+ const uint16_t source_id = notif_src->source_id;
/*
* Type:
@@ -331,7 +333,7 @@ static void build_ghes_v2(GArray *table_data,
address_offset + GAS_ADDR_OFFSET,
sizeof(uint64_t),
ACPI_HW_ERROR_FW_CFG_FILE,
- source_id * sizeof(uint64_t));
+ index * sizeof(uint64_t));
/* Notification Structure */
build_ghes_hw_error_notification(table_data, notify);
@@ -351,8 +353,7 @@ static void build_ghes_v2(GArray *table_data,
address_offset + GAS_ADDR_OFFSET,
sizeof(uint64_t),
ACPI_HW_ERROR_FW_CFG_FILE,
- (ACPI_GHES_ERROR_SOURCE_COUNT + source_id)
- * sizeof(uint64_t));
+ (num_sources + index) * sizeof(uint64_t));
/*
* Read Ack Preserve field
@@ -368,22 +369,26 @@ static void build_ghes_v2(GArray *table_data,
void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
GArray *hardware_errors,
BIOSLinker *linker,
+ const AcpiNotificationSourceId *notif_source,
+ int num_sources,
const char *oem_id, const char *oem_table_id)
{
AcpiTable table = { .sig = "HEST", .rev = 1,
.oem_id = oem_id, .oem_table_id = oem_table_id };
uint32_t hest_offset;
+ int i;
hest_offset = table_data->len;
- build_ghes_error_table(ags, hardware_errors, linker);
+ build_ghes_error_table(ags, hardware_errors, linker, num_sources);
acpi_table_begin(&table, table_data);
/* Error Source Count */
- build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
- build_ghes_v2(table_data, linker,
- ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
+ build_append_int_noprefix(table_data, num_sources, 4);
+ for (i = 0; i < num_sources; i++) {
+ build_ghes_v2_entry(table_data, linker, &notif_source[i], i, num_sources);
+ }
acpi_table_end(linker, &table);
@@ -515,8 +520,6 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
return;
}
- assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
-
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index e6328af5d238..af5056201c22 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -893,6 +893,10 @@ static void acpi_align_size(GArray *blob, unsigned align)
g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
}
+static const AcpiNotificationSourceId hest_ghes_notify[] = {
+ { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+};
+
static
void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
{
@@ -955,7 +959,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
if (ags) {
acpi_add_table(table_offsets, tables_blob);
acpi_build_hest(ags, tables_blob, tables->hardware_errors,
- tables->linker, vms->oem_id, vms->oem_table_id);
+ tables->linker, hest_ghes_notify,
+ ARRAY_SIZE(hest_ghes_notify),
+ vms->oem_id, vms->oem_table_id);
}
}
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index dcc7288ffba5..2f0c3288a860 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -57,13 +57,18 @@ enum AcpiGhesNotifyType {
ACPI_GHES_NOTIFY_RESERVED = 12
};
-enum {
- ACPI_HEST_SRC_ID_SEA = 0,
- /* future ids go here */
-
- ACPI_GHES_ERROR_SOURCE_COUNT
+/*
+ * ID numbers used to fill HEST source ID field
+ */
+enum AcpiGhesSourceID {
+ ACPI_HEST_SRC_ID_SYNC,
};
+typedef struct AcpiNotificationSourceId {
+ enum AcpiGhesSourceID source_id;
+ enum AcpiGhesNotifyType notify;
+} AcpiNotificationSourceId;
+
/*
* AcpiGhesState stores an offset that will be used to fill HEST entries.
*
@@ -85,6 +90,8 @@ typedef struct AcpiGhesState {
void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
GArray *hardware_errors,
BIOSLinker *linker,
+ const AcpiNotificationSourceId * const notif_source,
+ int num_sources,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
GArray *hardware_errors);
--
2.48.1
* [PATCH v5 09/21] acpi/ghes: add a notifier to notify when error data is ready
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (7 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 08/21] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 10/21] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
` (12 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
Some error injection notify methods are async, like GPIO
notify. Add a notifier to be used when the error record is
ready to be sent to the guest OS.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
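Illustrative note (not part of the patch): the notifier pattern used here can
be sketched in standalone C as below. This is a simplified stand-in, assuming
a singly linked list; it is not QEMU's NotifierList from qemu/notify.h, and
every Sketch*/sketch_* name is hypothetical. The subscriber recovers its
enclosing state from the embedded notifier, in the spirit of the
container_of() call that patch 15 uses in virt_generic_error_req().

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *n, void *data);
    SketchNotifier *next;
};

typedef struct SketchNotifierList {
    SketchNotifier *head;
} SketchNotifierList;

/* Register a subscriber; its callback must already be set. */
void sketch_list_add(SketchNotifierList *list, SketchNotifier *n)
{
    assert(n != NULL && n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

/* Call every registered subscriber, e.g. once the CPER record is written. */
void sketch_list_notify(SketchNotifierList *list, void *data)
{
    for (SketchNotifier *n = list->head; n; n = n->next) {
        n->notify(n, data);
    }
}

/* Example subscriber state: counts how many error events were forwarded. */
typedef struct SketchMachine {
    SketchNotifier generic_error_notifier;
    int events_sent;
} SketchMachine;

void sketch_generic_error_req(SketchNotifier *n, void *data)
{
    /* Recover the enclosing machine from the embedded notifier */
    SketchMachine *m = (SketchMachine *)
        ((char *)n - offsetof(SketchMachine, generic_error_notifier));

    (void)data;
    m->events_sent++;
}
```

Keeping the list global and letting the machine code subscribe means the
GHES code does not need to know which notification method (for example GPIO
via GED) a given board uses.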
hw/acpi/ghes.c | 5 ++++-
include/hw/acpi/ghes.h | 3 +++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index d1da16b3da2b..c3a64adfe5ed 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -510,6 +510,9 @@ static void get_ghes_source_offsets(uint16_t source_id,
*read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
}
+NotifierList acpi_generic_error_notifiers =
+ NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
+
void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
@@ -550,7 +553,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
/* Write the generic error data entry into guest memory */
cpu_physical_memory_write(cper_addr, cper, len);
- return;
+ notifier_list_notify(&acpi_generic_error_notifiers, NULL);
}
int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 2f0c3288a860..bf9f3de27122 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -24,6 +24,9 @@
#include "hw/acpi/bios-linker-loader.h"
#include "qapi/error.h"
+#include "qemu/notify.h"
+
+extern NotifierList acpi_generic_error_notifiers;
/*
* Values for Hardware Error Notification Type field
--
2.48.1
* [PATCH v5 10/21] acpi/ghes: create an ancillary acpi_ghes_get_state() function
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (8 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 09/21] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:31 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 11/21] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
` (11 subsequent siblings)
21 siblings, 1 reply; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, linux-kernel
Instead of having a function to check if ACPI is enabled
(acpi_ghes_present), change its logic to be more generic,
returning a pointer to AcpiGhesState.
Such a change allows cleaning up the ghes GED state code, avoiding
reading it multiple times and simplifying the code.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index c3a64adfe5ed..0135ac844bcf 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -608,7 +608,7 @@ AcpiGhesState *acpi_ghes_get_state(void)
}
ags = &acpi_ged_state->ghes_state;
- if (!ags->hw_error_le) {
+ if (!ags->hw_error_le && !ags->hest_addr_le) {
return NULL;
}
return ags;
--
2.48.1
* [PATCH v5 11/21] acpi/generic_event_device: Update GHES migration to cover hest addr
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (9 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 10/21] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 12/21] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
` (10 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
The GHES migration logic should now support HEST table location too.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/generic_event_device.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index c85d97ca3776..5346cae573b7 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -386,6 +386,34 @@ static const VMStateDescription vmstate_ghes_state = {
}
};
+static const VMStateDescription vmstate_hest = {
+ .name = "acpi-hest",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .fields = (const VMStateField[]) {
+ VMSTATE_UINT64(hest_addr_le, AcpiGhesState),
+ VMSTATE_END_OF_LIST()
+ },
+};
+
+static bool hest_needed(void *opaque)
+{
+ AcpiGedState *s = opaque;
+ return s->ghes_state.hest_addr_le;
+}
+
+static const VMStateDescription vmstate_hest_state = {
+ .name = "acpi-ged/hest",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .needed = hest_needed,
+ .fields = (const VMStateField[]) {
+ VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
+ vmstate_hest, AcpiGhesState),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
static const VMStateDescription vmstate_acpi_ged = {
.name = "acpi-ged",
.version_id = 1,
@@ -398,6 +426,7 @@ static const VMStateDescription vmstate_acpi_ged = {
&vmstate_memhp_state,
&vmstate_cpuhp_state,
&vmstate_ghes_state,
+ &vmstate_hest_state,
NULL
}
};
--
2.48.1
* [PATCH v5 12/21] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (10 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 11/21] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 13:33 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 13/21] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
` (9 subsequent siblings)
21 siblings, 1 reply; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
Create a new property (x-has-hest-addr) and use it to detect if
the GHES table offsets can be calculated from the HEST address
(QEMU 10.0 and later) or in the legacy way, via an offset obtained
from the hardware_errors firmware file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
hw/acpi/generic_event_device.c | 1 +
hw/arm/virt-acpi-build.c | 18 ++++++++++++++++--
hw/core/machine.c | 2 ++
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 5346cae573b7..14d8513a5440 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
static const Property acpi_ged_properties[] = {
DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
+ DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
};
static const VMStateDescription vmstate_memhp_state = {
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index af5056201c22..03ee30b3b3f0 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -897,6 +897,10 @@ static const AcpiNotificationSourceId hest_ghes_notify[] = {
{ ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
};
+static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
+ { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+};
+
static
void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
{
@@ -951,6 +955,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
if (vms->ras) {
AcpiGedState *acpi_ged_state;
+ static const AcpiNotificationSourceId *notify;
+ unsigned int notify_sz;
AcpiGhesState *ags;
acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
@@ -958,9 +964,17 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
ags = &acpi_ged_state->ghes_state;
if (ags) {
acpi_add_table(table_offsets, tables_blob);
+
+ if (!ags->use_hest_addr) {
+ notify = hest_ghes_notify_9_2;
+ notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
+ } else {
+ notify = hest_ghes_notify;
+ notify_sz = ARRAY_SIZE(hest_ghes_notify);
+ }
+
acpi_build_hest(ags, tables_blob, tables->hardware_errors,
- tables->linker, hest_ghes_notify,
- ARRAY_SIZE(hest_ghes_notify),
+ tables->linker, notify, notify_sz,
vms->oem_id, vms->oem_table_id);
}
}
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 02cff735b3fb..7a11e0f87b11 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -34,6 +34,7 @@
#include "hw/virtio/virtio-pci.h"
#include "hw/virtio/virtio-net.h"
#include "hw/virtio/virtio-iommu.h"
+#include "hw/acpi/generic_event_device.h"
#include "audio/audio.h"
GlobalProperty hw_compat_9_2[] = {
@@ -43,6 +44,7 @@ GlobalProperty hw_compat_9_2[] = {
{ "virtio-balloon-pci-non-transitional", "vectors", "0" },
{ "virtio-mem-pci", "vectors", "0" },
{ "migration", "multifd-clean-tls-termination", "false" },
+ { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
};
const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
--
2.48.1
* [PATCH v5 13/21] acpi/generic_event_device: add an APEI error device
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (11 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 12/21] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 14/21] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
` (8 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
Add a generic error device to handle generic hardware error
events, as specified in section 18.3.2.7.2 of the ACPI 6.5 specification:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
using HID PNP0C33.
The PNP0C33 device is used to report hardware errors to
the guest via ACPI APEI Generic Hardware Error Source (GHES).
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
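Illustrative note (not part of the patch): the way acpi_ged_send_event()
picks a GED selector after this change can be sketched as below. Only the
values visible in this patch are reproduced; the SK_ names are hypothetical,
the memory and CPU hotplug branches are omitted, and returning 0 for an
unhandled event is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Status bits, values as in include/hw/acpi/acpi_dev_interface.h above */
enum {
    SK_NVDIMM_HOTPLUG_STATUS = 16,
    SK_POWER_DOWN_STATUS     = 64,
    SK_GENERIC_ERROR         = 128,
};

/* GED event selectors, values as in generic_event_device.h above */
#define SK_GED_PWR_DOWN_EVT       0x2
#define SK_GED_NVDIMM_HOTPLUG_EVT 0x4
#define SK_GED_ERROR_EVT          0x10

/*
 * Map a status bitmask to a single GED selector. The chain is ordered,
 * so the first matching branch wins when several bits are set.
 */
uint32_t sketch_select_ged_event(uint32_t ev)
{
    assert((ev >> 8) == 0); /* the sketch only models the low status bits */

    if (ev & SK_POWER_DOWN_STATUS) {
        return SK_GED_PWR_DOWN_EVT;
    } else if (ev & SK_GENERIC_ERROR) {
        return SK_GED_ERROR_EVT;
    } else if (ev & SK_NVDIMM_HOTPLUG_STATUS) {
        return SK_GED_NVDIMM_HOTPLUG_EVT;
    }
    return 0; /* assumption: unhandled events select nothing */
}
```

On the guest side, the AML built by build_ged_aml() turns the 0x10 selector
into a Notify(0x80) on the PNP0C33 error device.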
hw/acpi/aml-build.c | 10 ++++++++++
hw/acpi/generic_event_device.c | 13 +++++++++++++
include/hw/acpi/acpi_dev_interface.h | 1 +
include/hw/acpi/aml-build.h | 2 ++
include/hw/acpi/generic_event_device.h | 1 +
5 files changed, 27 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index f8f93a9f66c8..e4bd7b611372 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2614,3 +2614,13 @@ Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source)
return var;
}
+
+/* ACPI 5.0b: 18.3.2.6.2 Event Notification For Generic Error Sources */
+Aml *aml_error_device(void)
+{
+ Aml *dev = aml_device(ACPI_APEI_ERROR_DEVICE);
+ aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C33")));
+ aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+
+ return dev;
+}
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 14d8513a5440..180eebbce1cd 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -26,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
ACPI_GED_PWR_DOWN_EVT,
ACPI_GED_NVDIMM_HOTPLUG_EVT,
ACPI_GED_CPU_HOTPLUG_EVT,
+ ACPI_GED_ERROR_EVT,
};
/*
@@ -116,6 +117,16 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
aml_int(0x80)));
break;
+ case ACPI_GED_ERROR_EVT:
+ /*
+ * ACPI 5.0b: 5.6.6 Device Object Notifications
+ * Table 5-135 Error Device Notification Values
+ * Defines 0x80 as the value to be used on notifications
+ */
+ aml_append(if_ctx,
+ aml_notify(aml_name(ACPI_APEI_ERROR_DEVICE),
+ aml_int(0x80)));
+ break;
case ACPI_GED_NVDIMM_HOTPLUG_EVT:
aml_append(if_ctx,
aml_notify(aml_name("\\_SB.NVDR"),
@@ -295,6 +306,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
sel = ACPI_GED_MEM_HOTPLUG_EVT;
} else if (ev & ACPI_POWER_DOWN_STATUS) {
sel = ACPI_GED_PWR_DOWN_EVT;
+ } else if (ev & ACPI_GENERIC_ERROR) {
+ sel = ACPI_GED_ERROR_EVT;
} else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
} else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
index 68d9d15f50aa..8294f8f0ccca 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -13,6 +13,7 @@ typedef enum {
ACPI_NVDIMM_HOTPLUG_STATUS = 16,
ACPI_VMGENID_CHANGE_STATUS = 32,
ACPI_POWER_DOWN_STATUS = 64,
+ ACPI_GENERIC_ERROR = 128,
} AcpiEventStatusBits;
#define TYPE_ACPI_DEVICE_IF "acpi-device-interface"
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index c18f68134246..f38e12971932 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -252,6 +252,7 @@ struct CrsRangeSet {
/* Consumer/Producer */
#define AML_SERIAL_BUS_FLAG_CONSUME_ONLY (1 << 1)
+#define ACPI_APEI_ERROR_DEVICE "GEDD"
/**
* init_aml_allocator:
*
@@ -382,6 +383,7 @@ Aml *aml_dma(AmlDmaType typ, AmlDmaBusMaster bm, AmlTransferSize sz,
uint8_t channel);
Aml *aml_sleep(uint64_t msec);
Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source);
+Aml *aml_error_device(void);
/* Block AML object primitives */
Aml *aml_scope(const char *name_format, ...) G_GNUC_PRINTF(1, 2);
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index d2dac87b4a9f..1c18ac296fcb 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -101,6 +101,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
#define ACPI_GED_PWR_DOWN_EVT 0x2
#define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
#define ACPI_GED_CPU_HOTPLUG_EVT 0x8
+#define ACPI_GED_ERROR_EVT 0x10
typedef struct GEDState {
MemoryRegion evt;
--
2.48.1
* [PATCH v5 14/21] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (12 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 13/21] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 13:34 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 15/21] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
` (7 subsequent siblings)
21 siblings, 1 reply; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
We'll be adding a new GED device for HEST GPIO notification and
increasing the number of entries in the HEST table.
Add the HEST and DSDT tables to the allowed-diff list, so that
changes to them are ignored by the tests until such changes are
completed.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
tests/qtest/bios-tables-test-allowed-diff.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8bf4..0a1a26543ba2 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,7 @@
/* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/aarch64/virt/HEST",
+"tests/data/acpi/aarch64/virt/DSDT",
+"tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt",
+"tests/data/acpi/aarch64/virt/DSDT.memhp",
+"tests/data/acpi/aarch64/virt/DSDT.pxb",
+"tests/data/acpi/aarch64/virt/DSDT.topology",
--
2.48.1
* [PATCH v5 15/21] arm/virt: Wire up a GED error device for ACPI / GHES
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (13 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 14/21] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 16/21] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
` (6 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Peter Maydell, Shannon Zhao,
linux-kernel
Add support to the ARM virt machine for handling the generic
error ACPI event via GED and the error source device.
It is aligned with Linux Kernel patch:
https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
---
hw/arm/virt-acpi-build.c | 1 +
hw/arm/virt.c | 12 +++++++++++-
include/hw/arm/virt.h | 1 +
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 03ee30b3b3f0..841078f65880 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -861,6 +861,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
}
acpi_dsdt_add_power_button(scope);
+ aml_append(scope, aml_error_device());
#ifdef CONFIG_TPM
acpi_dsdt_add_tpm(scope, vms);
#endif
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4a5a9666e916..3faf32f900b5 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -678,7 +678,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
DeviceState *dev;
MachineState *ms = MACHINE(vms);
int irq = vms->irqmap[VIRT_ACPI_GED];
- uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+ uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_ERROR_EVT;
if (ms->ram_slots) {
event |= ACPI_GED_MEM_HOTPLUG_EVT;
@@ -1010,6 +1010,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
}
}
+static void virt_generic_error_req(Notifier *n, void *opaque)
+{
+ VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
+
+ acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
+}
+
static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
uint32_t phandle)
{
@@ -2404,6 +2411,9 @@ static void machvirt_init(MachineState *machine)
if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
vms->acpi_dev = create_acpi_ged(vms);
+ vms->generic_error_notifier.notify = virt_generic_error_req;
+ notifier_list_add(&acpi_generic_error_notifiers,
+ &vms->generic_error_notifier);
} else {
create_gpio_devices(vms, VIRT_GPIO, sysmem);
}
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index c8e94e6aedc9..f3cf28436770 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -176,6 +176,7 @@ struct VirtMachineState {
DeviceState *gic;
DeviceState *acpi_dev;
Notifier powerdown_notifier;
+ Notifier generic_error_notifier;
PCIBus *bus;
char *oem_id;
char *oem_table_id;
--
2.48.1
* [PATCH v5 16/21] qapi/acpi-hest: add an interface to do generic CPER error injection
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (14 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 15/21] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources Mauro Carvalho Chehab
` (5 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Eric Blake,
Markus Armbruster, Michael Roth, Paolo Bonzini, Peter Maydell,
Shannon Zhao, linux-kernel
Create a QMP command to be used for generic ACPI APEI hardware error
injection (HEST) via GHESv2, and add support for it on ARM guests.
Error injection uses the ACPI_HEST_SRC_ID_QMP source ID to be
platform-independent. This is mapped in the arch virt bindings, depending
on the types supported by QEMU and by the BIOS. So, on ARM, this is
supported via the ACPI_GHES_NOTIFY_GPIO notification type.
This patch is co-authored:
- original ghes logic to inject a simple ARM record by Shiju Jose;
- generic logic to handle block addresses by Jonathan Cameron;
- generic GHESv2 error inject by Mauro Carvalho Chehab.
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-authored-by: Shiju Jose <shiju.jose@huawei.com>
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
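For illustration only (not part of the patch): a minimal sketch of how a
client might build the request for the new command. The command name and the
'cper' argument are taken from qapi/acpi-hest.json in this patch; the helper
name build_inject_command and the four placeholder bytes are hypothetical, and
the placeholder is not a valid CPER record.

```python
import base64
import json


def build_inject_command(raw_cper: bytes) -> str:
    """Return the JSON text of an inject-ghes-v2-error QMP request.

    raw_cper holds the raw CPER data; the command expects it
    base64-encoded in the 'cper' argument.
    """
    request = {
        "execute": "inject-ghes-v2-error",
        "arguments": {"cper": base64.b64encode(raw_cper).decode("ascii")},
    }
    return json.dumps(request)


# Placeholder bytes only; a real request needs a well-formed CPER record.
print(build_inject_command(b"\x00\x01\x02\x03"))
```

The resulting string would be sent over an already negotiated QMP connection;
connection handling is left out on purpose.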
MAINTAINERS | 7 +++++++
hw/acpi/Kconfig | 5 +++++
hw/acpi/ghes.c | 2 +-
hw/acpi/ghes_cper.c | 38 ++++++++++++++++++++++++++++++++++++++
hw/acpi/ghes_cper_stub.c | 19 +++++++++++++++++++
hw/acpi/meson.build | 2 ++
hw/arm/virt-acpi-build.c | 1 +
hw/arm/virt.c | 7 +++++++
include/hw/acpi/ghes.h | 1 +
include/hw/arm/virt.h | 1 +
qapi/acpi-hest.json | 35 +++++++++++++++++++++++++++++++++++
qapi/meson.build | 1 +
qapi/qapi-schema.json | 1 +
13 files changed, 119 insertions(+), 1 deletion(-)
create mode 100644 hw/acpi/ghes_cper.c
create mode 100644 hw/acpi/ghes_cper_stub.c
create mode 100644 qapi/acpi-hest.json
diff --git a/MAINTAINERS b/MAINTAINERS
index 1911949526ce..7358735007c8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2081,6 +2081,13 @@ F: hw/acpi/ghes.c
F: include/hw/acpi/ghes.h
F: docs/specs/acpi_hest_ghes.rst
+ACPI/HEST/GHES/ARM processor CPER
+R: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+S: Maintained
+F: hw/acpi/ghes_cper.c
+F: hw/acpi/ghes_cper_stub.c
+F: qapi/acpi-hest.json
+
ppc4xx
L: qemu-ppc@nongnu.org
S: Orphan
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1d4e9f0845c0..daabbe6cd11e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -51,6 +51,11 @@ config ACPI_APEI
bool
depends on ACPI
+config GHES_CPER
+ bool
+ depends on ACPI_APEI
+ default y
+
config ACPI_PCI
bool
depends on ACPI && PCI
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 0135ac844bcf..1d02ef6dcb70 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -553,7 +553,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
/* Write the generic error data entry into guest memory */
cpu_physical_memory_write(cper_addr, cper, len);
- notifier_list_notify(&acpi_generic_error_notifiers, NULL);
+ notifier_list_notify(&acpi_generic_error_notifiers, &source_id);
}
int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
new file mode 100644
index 000000000000..0a2d95dd8b27
--- /dev/null
+++ b/hw/acpi/ghes_cper.c
@@ -0,0 +1,38 @@
+/*
+ * CPER payload parser for error injection
+ *
+ * Copyright(C) 2024-2025 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/base64.h"
+#include "qemu/error-report.h"
+#include "qemu/uuid.h"
+#include "qapi/qapi-commands-acpi-hest.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_inject_ghes_v2_error(const char *qmp_cper, Error **errp)
+{
+ AcpiGhesState *ags;
+
+ ags = acpi_ghes_get_state();
+ if (!ags) {
+ return;
+ }
+
+ uint8_t *cper;
+ size_t len;
+
+ cper = qbase64_decode(qmp_cper, -1, &len, errp);
+ if (!cper) {
+ /* qbase64_decode() already set errp on failure */
+ return;
+ }
+
+ ghes_record_cper_errors(ags, cper, len, ACPI_HEST_SRC_ID_QMP, errp);
+}
diff --git a/hw/acpi/ghes_cper_stub.c b/hw/acpi/ghes_cper_stub.c
new file mode 100644
index 000000000000..5ebc61970a78
--- /dev/null
+++ b/hw/acpi/ghes_cper_stub.c
@@ -0,0 +1,19 @@
+/*
+ * Stub interface for CPER payload parser for error injection
+ *
+ * Copyright(C) 2024-2025 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-acpi-hest.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_inject_ghes_v2_error(const char *cper, Error **errp)
+{
+ error_setg(errp, "GHES QMP error inject is not compiled in");
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index 73f02b96912b..56b5d1ec9691 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -34,4 +34,6 @@ endif
system_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c', 'ghes-stub.c', 'acpi_interface.c'))
system_ss.add(when: 'CONFIG_ACPI_PCI_BRIDGE', if_false: files('pci-bridge-stub.c'))
system_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
+system_ss.add(when: 'CONFIG_GHES_CPER', if_true: files('ghes_cper.c'))
+system_ss.add(when: 'CONFIG_GHES_CPER', if_false: files('ghes_cper_stub.c'))
system_ss.add(files('acpi-qmp-cmds.c'))
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 841078f65880..eb580459f271 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -896,6 +896,7 @@ static void acpi_align_size(GArray *blob, unsigned align)
static const AcpiNotificationSourceId hest_ghes_notify[] = {
{ ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+ { ACPI_HEST_SRC_ID_QMP, ACPI_GHES_NOTIFY_GPIO },
};
static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3faf32f900b5..116428ab582e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1012,6 +1012,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
static void virt_generic_error_req(Notifier *n, void *opaque)
{
+ uint16_t *source_id = opaque;
+
+ /* Currently, only QMP source ID is async */
+ if (*source_id != ACPI_HEST_SRC_ID_QMP) {
+ return;
+ }
+
VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index bf9f3de27122..2a5ac3d20c76 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -65,6 +65,7 @@ enum AcpiGhesNotifyType {
*/
enum AcpiGhesSourceID {
ACPI_HEST_SRC_ID_SYNC,
+ ACPI_HEST_SRC_ID_QMP, /* Use it only for QMP injected errors */
};
typedef struct AcpiNotificationSourceId {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f3cf28436770..56f270f61cf5 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -33,6 +33,7 @@
#include "exec/hwaddr.h"
#include "qemu/notify.h"
#include "hw/boards.h"
+#include "hw/acpi/ghes.h"
#include "hw/arm/boot.h"
#include "hw/arm/bsa.h"
#include "hw/block/flash.h"
diff --git a/qapi/acpi-hest.json b/qapi/acpi-hest.json
new file mode 100644
index 000000000000..fff5018c7ec1
--- /dev/null
+++ b/qapi/acpi-hest.json
@@ -0,0 +1,35 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+
+##
+# == GHESv2 CPER Error Injection
+#
+# Defined since ACPI Specification 6.1,
+# section 18.3.2.8 Generic Hardware Error Source version 2. See:
+#
+# https://uefi.org/sites/default/files/resources/ACPI_6_1.pdf
+##
+
+
+##
+# @inject-ghes-v2-error:
+#
+# Inject an error with additional ACPI 6.1 GHESv2 error information
+#
+# @cper: contains a base64 encoded string with raw data for a single
+# CPER record with Generic Error Status Block, Generic Error Data
+# Entry and generic error data payload, as described at
+# https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#format
+#
+# Features:
+#
+# @unstable: This command is experimental.
+#
+# Since: 10.0
+##
+{ 'command': 'inject-ghes-v2-error',
+ 'data': {
+ 'cper': 'str'
+ },
+ 'features': [ 'unstable' ]
+}
diff --git a/qapi/meson.build b/qapi/meson.build
index e7bc54e5d047..35cea6147262 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -59,6 +59,7 @@ qapi_all_modules = [
if have_system
qapi_all_modules += [
'acpi',
+ 'acpi-hest',
'audio',
'cryptodev',
'qdev',
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index b1581988e4eb..baf19ab73afe 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -75,6 +75,7 @@
{ 'include': 'misc-target.json' }
{ 'include': 'audio.json' }
{ 'include': 'acpi.json' }
+{ 'include': 'acpi-hest.json' }
{ 'include': 'pci.json' }
{ 'include': 'stats.json' }
{ 'include': 'virtio.json' }
--
2.48.1
* [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (15 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 16/21] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 13:10 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 18/21] tests/acpi: virt: and update DSDT table to add the new GED device Mauro Carvalho Chehab
` (4 subsequent siblings)
21 siblings, 1 reply; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, linux-kernel
--- /tmp/asl-38PE22.dsl 2025-02-26 16:25:32.362148388 +0100
+++ /tmp/asl-HSPE22.dsl 2025-02-26 16:25:32.361148402 +0100
@@ -1,39 +1,39 @@
/*
* Intel ACPI Component Architecture
* AML/ASL+ Disassembler version 20240322 (64-bit version)
* Copyright (c) 2000 - 2023 Intel Corporation
*
- * Disassembly of tests/data/acpi/aarch64/virt/HEST
+ * Disassembly of /tmp/aml-DMPE22
*
* ACPI Data Table [HEST]
*
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue (in hex)
*/
[000h 0000 004h] Signature : "HEST" [Hardware Error Source Table]
-[004h 0004 004h] Table Length : 00000084
+[004h 0004 004h] Table Length : 000000E0
[008h 0008 001h] Revision : 01
-[009h 0009 001h] Checksum : E2
+[009h 0009 001h] Checksum : 6C
[00Ah 0010 006h] Oem ID : "BOCHS "
[010h 0016 008h] Oem Table ID : "BXPC "
[018h 0024 004h] Oem Revision : 00000001
[01Ch 0028 004h] Asl Compiler ID : "BXPC"
[020h 0032 004h] Asl Compiler Revision : 00000001
-[024h 0036 004h] Error Source Count : 00000001
+[024h 0036 004h] Error Source Count : 00000002
[028h 0040 002h] Subtable Type : 000A [Generic Hardware Error Source V2]
[02Ah 0042 002h] Source Id : 0000
[02Ch 0044 002h] Related Source Id : FFFF
[02Eh 0046 001h] Reserved : 00
[02Fh 0047 001h] Enabled : 01
[030h 0048 004h] Records To Preallocate : 00000001
[034h 0052 004h] Max Sections Per Record : 00000001
[038h 0056 004h] Max Raw Data Length : 00000400
[03Ch 0060 00Ch] Error Status Address : [Generic Address Structure]
[03Ch 0060 001h] Space ID : 00 [SystemMemory]
[03Dh 0061 001h] Bit Width : 40
[03Eh 0062 001h] Bit Offset : 00
[03Fh 0063 001h] Encoded Access Width : 04 [QWord Access:64]
[040h 0064 008h] Address : 0000000043DA0000
@@ -42,32 +42,75 @@
[048h 0072 001h] Notify Type : 08 [SEA]
[049h 0073 001h] Notify Length : 1C
[04Ah 0074 002h] Configuration Write Enable : 0000
[04Ch 0076 004h] PollInterval : 00000000
[050h 0080 004h] Vector : 00000000
[054h 0084 004h] Polling Threshold Value : 00000000
[058h 0088 004h] Polling Threshold Window : 00000000
[05Ch 0092 004h] Error Threshold Value : 00000000
[060h 0096 004h] Error Threshold Window : 00000000
[064h 0100 004h] Error Status Block Length : 00000400
[068h 0104 00Ch] Read Ack Register : [Generic Address Structure]
[068h 0104 001h] Space ID : 00 [SystemMemory]
[069h 0105 001h] Bit Width : 40
[06Ah 0106 001h] Bit Offset : 00
[06Bh 0107 001h] Encoded Access Width : 04 [QWord Access:64]
-[06Ch 0108 008h] Address : 0000000043DA0008
+[06Ch 0108 008h] Address : 0000000043DA0010
[074h 0116 008h] Read Ack Preserve : FFFFFFFFFFFFFFFE
[07Ch 0124 008h] Read Ack Write : 0000000000000001
-Raw Table Data: Length 132 (0x84)
+[084h 0132 002h] Subtable Type : 000A [Generic Hardware Error Source V2]
+[086h 0134 002h] Source Id : 0001
+[088h 0136 002h] Related Source Id : FFFF
+[08Ah 0138 001h] Reserved : 00
+[08Bh 0139 001h] Enabled : 01
+[08Ch 0140 004h] Records To Preallocate : 00000001
+[090h 0144 004h] Max Sections Per Record : 00000001
+[094h 0148 004h] Max Raw Data Length : 00000400
+
+[098h 0152 00Ch] Error Status Address : [Generic Address Structure]
+[098h 0152 001h] Space ID : 00 [SystemMemory]
+[099h 0153 001h] Bit Width : 40
+[09Ah 0154 001h] Bit Offset : 00
+[09Bh 0155 001h] Encoded Access Width : 04 [QWord Access:64]
+[09Ch 0156 008h] Address : 0000000043DA0008
+
+[0A4h 0164 01Ch] Notify : [Hardware Error Notification Structure]
+[0A4h 0164 001h] Notify Type : 07 [GPIO]
+[0A5h 0165 001h] Notify Length : 1C
+[0A6h 0166 002h] Configuration Write Enable : 0000
+[0A8h 0168 004h] PollInterval : 00000000
+[0ACh 0172 004h] Vector : 00000000
+[0B0h 0176 004h] Polling Threshold Value : 00000000
+[0B4h 0180 004h] Polling Threshold Window : 00000000
+[0B8h 0184 004h] Error Threshold Value : 00000000
+[0BCh 0188 004h] Error Threshold Window : 00000000
+
+[0C0h 0192 004h] Error Status Block Length : 00000400
+[0C4h 0196 00Ch] Read Ack Register : [Generic Address Structure]
+[0C4h 0196 001h] Space ID : 00 [SystemMemory]
+[0C5h 0197 001h] Bit Width : 40
+[0C6h 0198 001h] Bit Offset : 00
+[0C7h 0199 001h] Encoded Access Width : 04 [QWord Access:64]
+[0C8h 0200 008h] Address : 0000000043DA0018
- 0000: 48 45 53 54 84 00 00 00 01 E2 42 4F 43 48 53 20 // HEST......BOCHS
+[0D0h 0208 008h] Read Ack Preserve : FFFFFFFFFFFFFFFE
+[0D8h 0216 008h] Read Ack Write : 0000000000000001
+
+Raw Table Data: Length 224 (0xE0)
+
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
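For reviewers checking the numbers in the disassembly above, a small sketch of
the arithmetic behind them. The field sizes are read off the dump (sources
start at 0x28, each GHESv2 entry spans 0x5C bytes). The address layout (one
8-byte error status slot per source, followed by one 8-byte read ack slot per
source) is inferred from the addresses in this dump, and the base address is
simply the value seen in this test run. All names are illustrative.

```python
ACPI_HEADER_LEN = 36         # standard ACPI table header
SOURCE_COUNT_LEN = 4         # "Error Source Count" field at offset 0x24
GHESV2_ENTRY_LEN = 0x5C      # one GHESv2 entry, e.g. 0x28..0x83 in the dump
HW_ERRORS_BASE = 0x43DA0000  # guest address seen in this particular dump


def hest_length(num_sources):
    """Total HEST length for a table holding only GHESv2 sources."""
    return ACPI_HEADER_LEN + SOURCE_COUNT_LEN + num_sources * GHESV2_ENTRY_LEN


def source_addresses(num_sources, source_id):
    """Return (error status address, read ack register address)."""
    error_status = HW_ERRORS_BASE + 8 * source_id
    read_ack = HW_ERRORS_BASE + 8 * num_sources + 8 * source_id
    return error_status, read_ack


for n in (1, 2):
    print(f"{n} source(s): table length {hest_length(n):#x}")
    for sid in range(n):
        esa, ack = source_addresses(n, sid)
        print(f"  source {sid}: error status {esa:#x}, read ack {ack:#x}")
```

With one source this gives 0x84 and a read ack register at 0x43DA0008; with
two sources it gives 0xE0 and moves the first read ack register to
0x43DA0010, which is the change visible in the diff.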
tests/data/acpi/aarch64/virt/HEST | Bin 132 -> 224 bytes
1 file changed, 0 insertions(+), 0 deletions(-)
diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
index 4c5d8c5b5da5b3241f93cd0839e94272bf6b1486..674272922db7d48f7821aa7c83ec76bb3b556d2a 100644
GIT binary patch
delta 68
zcmZo+e89-%;TjzBfPsO5F=rx|6eH6_Rd+^#iMisuTnvm1|Nk>EGJ@nLCJHmL%S;Ru
WnV7)J#lXPAz`)?Zz#=g*R~!HcF%5eF
delta 29
lcmaFB*uu!=;Tjy$!oa}5_-G=R6eHtARriT=I3|_|004Ge2nqlI
--
2.48.1
* [PATCH v5 18/21] tests/acpi: virt: and update DSDT table to add the new GED device
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (16 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 19/21] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
` (3 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
--- /tmp/asl-L7J912.dsl 2025-02-26 16:22:26.539657960 +0100
+++ /tmp/asl-T9N912.dsl 2025-02-26 16:22:26.536658001 +0100
@@ -1,30 +1,30 @@
/*
* Intel ACPI Component Architecture
* AML/ASL+ Disassembler version 20240322 (64-bit version)
* Copyright (c) 2000 - 2023 Intel Corporation
*
* Disassembling to symbolic ASL+ operators
*
- * Disassembly of tests/data/acpi/aarch64/virt/DSDT
+ * Disassembly of /tmp/aml-TNQ912
*
* Original Table Header:
* Signature "DSDT"
- * Length 0x0000144C (5196)
+ * Length 0x00001478 (5240)
* Revision 0x02
- * Checksum 0x1B
+ * Checksum 0x04
* OEM ID "BOCHS "
* OEM Table ID "BXPC "
* OEM Revision 0x00000001 (1)
* Compiler ID "BXPC"
* Compiler Version 0x00000001 (1)
*/
DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPC ", 0x00000001)
{
Scope (\_SB)
{
Device (C000)
{
Name (_HID, "ACPI0007" /* Processor Device */) // _HID: Hardware ID
Name (_UID, Zero) // _UID: Unique ID
}
@@ -1876,27 +1876,38 @@
0x00000029,
}
})
OperationRegion (EREG, SystemMemory, 0x09080000, 0x04)
Field (EREG, DWordAcc, NoLock, WriteAsZeros)
{
ESEL, 32
}
Method (_EVT, 1, Serialized) // _EVT: Event
{
Local0 = ESEL /* \_SB_.GED_.ESEL */
If (((Local0 & 0x02) == 0x02))
{
Notify (PWRB, 0x80) // Status Change
}
+
+ If (((Local0 & 0x10) == 0x10))
+ {
+ Notify (GEDD, 0x80) // Status Change
+ }
}
}
Device (PWRB)
{
Name (_HID, "PNP0C0C" /* Power Button Device */) // _HID: Hardware ID
Name (_UID, Zero) // _UID: Unique ID
}
+
+ Device (GEDD)
+ {
+ Name (_HID, "PNP0C33" /* Error Device */) // _HID: Hardware ID
+ Name (_UID, Zero) // _UID: Unique ID
+ }
}
}
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
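As a reading aid for the _EVT change above, a sketch that mirrors its bit
tests in Python. Only the two masks visible in this hunk are modelled; other
DSDT variants may test further bits. The function and dictionary names are
hypothetical.

```python
# Bit masks tested by _EVT in the disassembled DSDT above.
GED_EVENT_TARGETS = {
    0x02: "PWRB",  # power button device (PNP0C0C)
    0x10: "GEDD",  # generic error device (PNP0C33)
}


def notify_targets(esel):
    """Return the devices _EVT would notify for a given ESEL value."""
    return [dev for mask, dev in GED_EVENT_TARGETS.items()
            if (esel & mask) == mask]


print(notify_targets(0x10))  # generic error event only
print(notify_targets(0x12))  # power button and generic error together
```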
tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
.../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
tests/qtest/bios-tables-test-allowed-diff.h | 6 ------
6 files changed, 6 deletions(-)
diff --git a/tests/data/acpi/aarch64/virt/DSDT b/tests/data/acpi/aarch64/virt/DSDT
index 36d3e5d5a5e47359b6dcb3706f98b4f225677591..a182bd9d7182dccdf63c650d048c58f18505d001 100644
GIT binary patch
delta 109
zcmX@3@k4{lCD<jTLWF^ViDe>}G*h$dM)euOOwJsW4+;nC=*7E+g>V+Q2D|zsED)Gn
zoxsJ!z{S)S5FX^j)c_F?VBivHb9Z%dnXE4&D;?b=31V}^dw9C=2KWUSI2#)?aKwjt
Hx-b9$X;vI^
delta 64
zcmeyNaYlp7CD<jzM}&caNqQoeG*i3NM)euOOit{R4+;lM%f`Egg>V+Q2D|zsED)Gn
UoxsJ!z{S)S5FX?-*+E1W06%jPR{#J2
diff --git a/tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt b/tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt
index e6154d0355f84fdcc51387b4db8f9ee63acae4e9..af1f2b0eb0b77a80c5bd74f201d24f71e486627f 100644
GIT binary patch
delta 110
zcmZ3ac}|ndCD<k8oCpI0)4_>c(oCIR8`a+lGdXii78eO-)SH|wBICY5U~+W=mjDBo
yK%2X(iwjpnbdzL2c#soEyoaX?Z-8HbfwO@#14n$Qrwc=LlO#wDl9aJAR0;r(tsHj%
delta 66
zcmX@7xk!`CCD<iokq83=(~XH-(oDVX8`a+lGdZzO78eO-l%1R{A|oB$BpDDM<irv0
W;pxH~;1^)vY~akm5g+R5!T<noi4jWx
diff --git a/tests/data/acpi/aarch64/virt/DSDT.memhp b/tests/data/acpi/aarch64/virt/DSDT.memhp
index 33f011d6b635035a04c0b39ce9b4e219f7ae74b7..10436ec87c4859fb84b3ecb7bba5788f38112e59 100644
GIT binary patch
delta 88
zcmbPheA1Z9CD<k8q$C3algUIbX{MH08`WnBGdXcjJ}4Z_<jXo)OvH<SfxzVI1TFyv
qE`c_8R~MJfaU%At($P(lAPz^oho=i~fM0-tv#~J)M|`NK3j+W#;TF9B
delta 44
zcmX?UJlB}ZCD<iot|S8klg&gfX{L_p8`WnBGdXfiJ}4Z_<ij#qOvGz*p@=Oj039?8
AE&u=k
diff --git a/tests/data/acpi/aarch64/virt/DSDT.pxb b/tests/data/acpi/aarch64/virt/DSDT.pxb
index c0fdc6e9c1396cc2259dc4bc665ba023adcf4c9b..0524b3cbe00bfe552de824dd1090bd00a208c527 100644
GIT binary patch
delta 110
zcmexwz1oJ$CD<iITaJN&sbC_PG*jDyjq2XAOwJsWOJsu?^(LQ?m2qDnFu6K`OMrn(
ypv~RY#f7UOx=Au1JjjV7-ow*{H^48zz}di=fg?WD(}f|rNfM+6Ny^w5Dg^+WYaFrw
delta 66
zcmZ2&^WU1wCD<k8zbpd-Q^!OuX{N5b8`ZsKnVi@sm&gV)%1%BZD<d7<BpDDM<irv0
W;pxH~;1^)vY~akm5g+R5!T<oNArgiF
diff --git a/tests/data/acpi/aarch64/virt/DSDT.topology b/tests/data/acpi/aarch64/virt/DSDT.topology
index 029d03eecc4efddc001e5377e85ac8e831294362..8c0423fe62d6950f9098983d86bfee256d7d003a 100644
GIT binary patch
delta 86
zcmbQHbx4cLCD<jzNtA(s>E%Q&X{O%5jp|7vOwJsWyG4Q-^(NmJk>Ot;Fu6K`OMrn(
opv~RY#bxqO5n1WzCP@&RBi_T)g*U)2z`)tqn1Lfc)YF9l01l28<p2Nx
delta 42
ycmX@4HBF1lCD<iIOq79viGL!OG*hGhM)f2SCMWjE-6Fw^vXk$N$V}!Dl?DLb(h64q
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 0a1a26543ba2..dfb8523c8bf4 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,7 +1 @@
/* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/aarch64/virt/HEST",
-"tests/data/acpi/aarch64/virt/DSDT",
-"tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt",
-"tests/data/acpi/aarch64/virt/DSDT.memhp",
-"tests/data/acpi/aarch64/virt/DSDT.pxb",
-"tests/data/acpi/aarch64/virt/DSDT.topology",
--
2.48.1
* [PATCH v5 19/21] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (17 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 18/21] tests/acpi: virt: and update DSDT table to add the new GED device Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 13:21 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 20/21] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x Mauro Carvalho Chehab
` (2 subsequent siblings)
21 siblings, 1 reply; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Dongjiu Geng, linux-kernel
While the HEST layout didn't change, there are some internal
changes related to how offsets are calculated and how memory error
events are triggered.
Update specs to reflect such changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
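To picture why a pointer to the beginning of HEST is enough to reach the
per-source addresses, here is a rough sketch that walks a HEST image and
returns the addresses stored for a given source ID. It is not QEMU's code:
the offsets are taken from the HEST disassembly in patch 17, the table built
at the end is synthetic, and every name is hypothetical. It only handles
GHESv2 entries.

```python
import struct

HEST_SOURCE_COUNT_OFF = 0x24  # "Error Source Count"
HEST_SOURCES_OFF = 0x28       # first error source entry
GHESV2_TYPE = 0x000A
GHESV2_ENTRY_LEN = 0x5C
ERROR_STATUS_ADDR_OFF = 0x18  # GAS at +0x14, its 64-bit address at +0x18
READ_ACK_ADDR_OFF = 0x44      # GAS at +0x40, its 64-bit address at +0x44


def find_source(hest, source_id):
    """Return (error status address, read ack address) for source_id."""
    (count,) = struct.unpack_from("<I", hest, HEST_SOURCE_COUNT_OFF)
    offset = HEST_SOURCES_OFF
    for _ in range(count):
        entry_type, entry_id = struct.unpack_from("<HH", hest, offset)
        if entry_type != GHESV2_TYPE:
            raise ValueError(f"unsupported source type {entry_type:#x}")
        if entry_id == source_id:
            (err,) = struct.unpack_from("<Q", hest, offset + ERROR_STATUS_ADDR_OFF)
            (ack,) = struct.unpack_from("<Q", hest, offset + READ_ACK_ADDR_OFF)
            return err, ack
        offset += GHESV2_ENTRY_LEN
    raise KeyError(source_id)


def make_entry(source_id, err_addr, ack_addr):
    """Build a synthetic GHESv2 entry with only the fields used above."""
    entry = bytearray(GHESV2_ENTRY_LEN)
    struct.pack_into("<HH", entry, 0, GHESV2_TYPE, source_id)
    struct.pack_into("<Q", entry, ERROR_STATUS_ADDR_OFF, err_addr)
    struct.pack_into("<Q", entry, READ_ACK_ADDR_OFF, ack_addr)
    return bytes(entry)


header = bytearray(HEST_SOURCES_OFF)
header[0:4] = b"HEST"
struct.pack_into("<I", header, HEST_SOURCE_COUNT_OFF, 2)
table = (bytes(header)
         + make_entry(0, 0x43DA0000, 0x43DA0010)
         + make_entry(1, 0x43DA0008, 0x43DA0018))

print([hex(addr) for addr in find_source(table, 1)])
```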
docs/specs/acpi_hest_ghes.rst | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
index c3e9f8d9a702..f3cb3074b082 100644
--- a/docs/specs/acpi_hest_ghes.rst
+++ b/docs/specs/acpi_hest_ghes.rst
@@ -89,12 +89,21 @@ Design Details
addresses in the "error_block_address" fields with a pointer to the
respective "Error Status Data Block" in the "etc/hardware_errors" blob.
-(8) QEMU defines a third and write-only fw_cfg blob which is called
- "etc/hardware_errors_addr". Through that blob, the firmware can send back
- the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
- blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
- for the firmware. The firmware will write back the start address of
- "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
+(8) QEMU defines a third and write-only fw_cfg blob to store the location
+ where the error block offsets, read ack registers and CPER records are
+ stored.
+
+ Up to QEMU 9.2, the location was at "etc/hardware_errors_addr", and
+ contained the address of the beginning of "etc/hardware_errors".
+
+ Newer versions place the location at "etc/acpi_table_hest_addr",
+ pointing to the beginning of the HEST table.
+
+ Through such a blob, the firmware can send back the guest-side
+ allocation addresses to QEMU. Each blob contains an 8-byte entry. QEMU
+ generates a single WRITE_POINTER command for the firmware. The firmware
+ will write back the start address of either "etc/hardware_errors" or the
+ HEST table to the corresponding fw_cfg file.
(9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
"Error Status Data Block", guest memory, and then injects platform specific
@@ -105,8 +114,5 @@ Design Details
kernel, on receiving notification, guest APEI driver could read the CPER error
and take appropriate action.
-(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
- find out "Error Status Data Block" entry corresponding to error source. So supported
- source_id values should be assigned here and not be changed afterwards to make sure
- that guest will write error into expected "Error Status Data Block" even if guest was
- migrated to a newer QEMU.
+(11) kvm_arch_on_sigbus_vcpu() reports RAS errors via SEA notifications
+ when a SIGBUS event is triggered.
--
2.48.1
* [PATCH v5 20/21] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (18 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 19/21] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 21/21] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
2025-02-27 13:30 ` [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for " Igor Mammedov
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, linux-kernel
Now that we have everything in place, enable using HEST GPA
instead of etc/hardware_errors GPA.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/generic_event_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 180eebbce1cd..f5e899155d34 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -331,7 +331,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
static const Property acpi_ged_properties[] = {
DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
- DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
+ DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, true),
};
static const VMStateDescription vmstate_memhp_state = {
--
2.48.1
* [PATCH v5 21/21] scripts/ghes_inject: add a script to generate GHES error inject
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (19 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 20/21] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x Mauro Carvalho Chehab
@ 2025-02-27 11:03 ` Mauro Carvalho Chehab
2025-02-27 13:30 ` [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for " Igor Mammedov
21 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:03 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Cleber Rosa, John Snow, linux-kernel
Using the QMP GHESv2 API requires preparing a raw data array
containing a CPER record.
Add a helper script with subcommands to prepare such data.
Currently, only the ARM Processor error CPER record is supported, by
using:
$ ghes_inject.py arm
which produces the following messages on a Linux guest:
[ 705.032426] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
[ 774.866308] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 774.866583] {4}[Hardware Error]: event severity: recoverable
[ 774.866738] {4}[Hardware Error]: Error 0, type: recoverable
[ 774.866889] {4}[Hardware Error]: section_type: ARM processor error
[ 774.867048] {4}[Hardware Error]: MIDR: 0x00000000000f0510
[ 774.867189] {4}[Hardware Error]: running state: 0x0
[ 774.867321] {4}[Hardware Error]: Power State Coordination Interface state: 0
[ 774.867511] {4}[Hardware Error]: Error info structure 0:
[ 774.867679] {4}[Hardware Error]: num errors: 2
[ 774.867801] {4}[Hardware Error]: error_type: 0x02: cache error
[ 774.867962] {4}[Hardware Error]: error_info: 0x000000000091000f
[ 774.868124] {4}[Hardware Error]: transaction type: Data Access
[ 774.868280] {4}[Hardware Error]: cache error, operation type: Data write
[ 774.868465] {4}[Hardware Error]: cache level: 2
[ 774.868592] {4}[Hardware Error]: processor context not corrupted
[ 774.868774] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
The script allows customizing the error data, so that every field in
the record can be changed. Please use:
$ ghes_inject.py arm -h
for more details about its usage.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
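The error_type values in the guest log lines quoted in this patch are bit
masks (0x02 cache, 0x04 TLB, 0x08 bus, 0x10 micro-architectural; 0x14 is
reported as "TLB error|micro-architectural error"). A small sketch of that
decoding, using only the values that appear in those logs; the names are
hypothetical and this is not code from the scripts added here.

```python
# Error type bits as they appear in the quoted guest logs.
ARM_ERROR_TYPES = {
    0x02: "cache error",
    0x04: "TLB error",
    0x08: "bus error",
    0x10: "micro-architectural error",
}


def describe_error_type(value):
    """Join the names of all known bits set in value with '|'."""
    names = [name for bit, name in ARM_ERROR_TYPES.items() if value & bit]
    return "|".join(names) if names else "unknown"


for error_type in (0x02, 0x08, 0x14):
    print(f"{error_type:#04x}: {describe_error_type(error_type)}")
```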
MAINTAINERS | 3 +
scripts/arm_processor_error.py | 476 ++++++++++++++++++++++
scripts/ghes_inject.py | 51 +++
scripts/qmp_helper.py | 702 +++++++++++++++++++++++++++++++++
4 files changed, 1232 insertions(+)
create mode 100644 scripts/arm_processor_error.py
create mode 100755 scripts/ghes_inject.py
create mode 100755 scripts/qmp_helper.py
diff --git a/MAINTAINERS b/MAINTAINERS
index 7358735007c8..f2e911e34120 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2087,6 +2087,9 @@ S: Maintained
F: hw/acpi/ghes_cper.c
F: hw/acpi/ghes_cper_stub.c
F: qapi/acpi-hest.json
+F: scripts/ghes_inject.py
+F: scripts/arm_processor_error.py
+F: scripts/qmp_helper.py
ppc4xx
L: qemu-ppc@nongnu.org
diff --git a/scripts/arm_processor_error.py b/scripts/arm_processor_error.py
new file mode 100644
index 000000000000..1dd42e42a877
--- /dev/null
+++ b/scripts/arm_processor_error.py
@@ -0,0 +1,476 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0301,C0114,R0903,R0912,R0913,R0914,R0915,W0511
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+# TODO: current implementation has dummy defaults.
+#
+# For a better implementation, a QMP addition/call is needed to
+# retrieve some data for ARM Processor Error injection:
+#
+# - ARM registers: power_state, mpidr.
+
+"""
+Generates an ARM processor error CPER, compatible with
+UEFI 2.9A Errata.
+
+Injecting such errors can be done using:
+
+ $ ./scripts/ghes_inject.py arm
+ Error injected.
+
+Produces a simple CPER record, as detected on a Linux guest:
+
+[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
+[Hardware Error]: event severity: recoverable
+[Hardware Error]: Error 0, type: recoverable
+[Hardware Error]: section_type: ARM processor error
+[Hardware Error]: MIDR: 0x0000000000000000
+[Hardware Error]: running state: 0x0
+[Hardware Error]: Power State Coordination Interface state: 0
+[Hardware Error]: Error info structure 0:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x02: cache error
+[Hardware Error]: error_info: 0x000000000091000f
+[Hardware Error]: transaction type: Data Access
+[Hardware Error]: cache error, operation type: Data write
+[Hardware Error]: cache level: 2
+[Hardware Error]: processor context not corrupted
+[Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
+
+The ARM Processor Error message can be customized via command line
+parameters. For instance:
+
+ $ ./scripts/ghes_inject.py arm --mpidr 0x444 --running --affinity 1 \
+ --error-info 12345678 --vendor 0x13,123,4,5,1 --ctx-array 0,1,2,3,4,5 \
+ -t cache tlb bus micro-arch tlb,micro-arch
+ Error injected.
+
+Injects this error, as detected on a Linux guest:
+
+[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
+[Hardware Error]: event severity: recoverable
+[Hardware Error]: Error 0, type: recoverable
+[Hardware Error]: section_type: ARM processor error
+[Hardware Error]: MIDR: 0x0000000000000000
+[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000000000000
+[Hardware Error]: error affinity level: 0
+[Hardware Error]: running state: 0x1
+[Hardware Error]: Power State Coordination Interface state: 0
+[Hardware Error]: Error info structure 0:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x02: cache error
+[Hardware Error]: error_info: 0x0000000000bc614e
+[Hardware Error]: cache level: 2
+[Hardware Error]: processor context not corrupted
+[Hardware Error]: Error info structure 1:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x04: TLB error
+[Hardware Error]: error_info: 0x000000000054007f
+[Hardware Error]: transaction type: Instruction
+[Hardware Error]: TLB error, operation type: Instruction fetch
+[Hardware Error]: TLB level: 1
+[Hardware Error]: processor context not corrupted
+[Hardware Error]: the error has not been corrected
+[Hardware Error]: PC is imprecise
+[Hardware Error]: Error info structure 2:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x08: bus error
+[Hardware Error]: error_info: 0x00000080d6460fff
+[Hardware Error]: transaction type: Generic
+[Hardware Error]: bus error, operation type: Generic read (type of instruction or data request cannot be determined)
+[Hardware Error]: affinity level at which the bus error occurred: 1
+[Hardware Error]: processor context corrupted
+[Hardware Error]: the error has been corrected
+[Hardware Error]: PC is imprecise
+[Hardware Error]: Program execution can be restarted reliably at the PC associated with the error.
+[Hardware Error]: participation type: Local processor observed
+[Hardware Error]: request timed out
+[Hardware Error]: address space: External Memory Access
+[Hardware Error]: memory access attributes:0x20
+[Hardware Error]: access mode: secure
+[Hardware Error]: Error info structure 3:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x10: micro-architectural error
+[Hardware Error]: error_info: 0x0000000078da03ff
+[Hardware Error]: Error info structure 4:
+[Hardware Error]: num errors: 2
+[Hardware Error]: error_type: 0x14: TLB error|micro-architectural error
+[Hardware Error]: Context info structure 0:
+[Hardware Error]: register context type: AArch64 EL1 context registers
+[Hardware Error]: 00000000: 00000000 00000000
+[Hardware Error]: Vendor specific error info has 5 bytes:
+[Hardware Error]: 00000000: 13 7b 04 05 01 .{...
+[Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
+[Firmware Warn]: GHES: Unhandled processor error type 0x04: TLB error
+[Firmware Warn]: GHES: Unhandled processor error type 0x08: bus error
+[Firmware Warn]: GHES: Unhandled processor error type 0x10: micro-architectural error
+[Firmware Warn]: GHES: Unhandled processor error type 0x14: TLB error|micro-architectural error
+"""
+
+import argparse
+import re
+
+from qmp_helper import qmp, util, cper_guid
+
+
+class ArmProcessorEinj:
+ """
+ Implements ARM Processor Error injection via GHES
+ """
+
+ DESC = """
+ Generates an ARM processor error CPER, compatible with
+ UEFI 2.9A Errata.
+ """
+
+ ACPI_GHES_ARM_CPER_LENGTH = 40
+ ACPI_GHES_ARM_CPER_PEI_LENGTH = 32
+
+ # Context types
+ CONTEXT_AARCH32_EL1 = 1
+ CONTEXT_AARCH64_EL1 = 5
+ CONTEXT_MISC_REG = 8
+
+ def __init__(self, subparsers):
+ """Initialize the error injection class and add subparser"""
+
+ # Valid choice values
+ self.arm_valid_bits = {
+ "mpidr": util.bit(0),
+ "affinity": util.bit(1),
+ "running": util.bit(2),
+ "vendor": util.bit(3),
+ }
+
+ self.pei_flags = {
+ "first": util.bit(0),
+ "last": util.bit(1),
+ "propagated": util.bit(2),
+ "overflow": util.bit(3),
+ }
+
+ self.pei_error_types = {
+ "cache": util.bit(1),
+ "tlb": util.bit(2),
+ "bus": util.bit(3),
+ "micro-arch": util.bit(4),
+ }
+
+ self.pei_valid_bits = {
+ "multiple-error": util.bit(0),
+ "flags": util.bit(1),
+ "error-info": util.bit(2),
+ "virt-addr": util.bit(3),
+ "phy-addr": util.bit(4),
+ }
+
+ self.data = bytearray()
+
+ parser = subparsers.add_parser("arm", description=self.DESC)
+
+ arm_valid_bits = ",".join(self.arm_valid_bits.keys())
+ flags = ",".join(self.pei_flags.keys())
+ error_types = ",".join(self.pei_error_types.keys())
+ pei_valid_bits = ",".join(self.pei_valid_bits.keys())
+
+ # UEFI N.16 ARM Validation bits
+ g_arm = parser.add_argument_group("ARM processor")
+ g_arm.add_argument("--arm", "--arm-valid",
+ help=f"ARM valid bits: {arm_valid_bits}")
+ g_arm.add_argument("-a", "--affinity", "--level", "--affinity-level",
+ type=lambda x: int(x, 0),
+ help="Affinity level (when multiple levels apply)")
+ g_arm.add_argument("-l", "--mpidr", type=lambda x: int(x, 0),
+ help="Multiprocessor Affinity Register")
+ g_arm.add_argument("-i", "--midr", type=lambda x: int(x, 0),
+ help="Main ID Register")
+ g_arm.add_argument("-r", "--running",
+ action=argparse.BooleanOptionalAction,
+ default=None,
+ help="Indicates if the processor is running or not")
+ g_arm.add_argument("--psci", "--psci-state",
+ type=lambda x: int(x, 0),
+ help="Power State Coordination Interface - PSCI state")
+
+ # TODO: Add vendor-specific support
+
+ # UEFI N.17 bitmaps (type and flags)
+ g_pei = parser.add_argument_group("ARM Processor Error Info (PEI)")
+ g_pei.add_argument("-t", "--type", nargs="+",
+ help=f"one or more error types: {error_types}")
+ g_pei.add_argument("-f", "--flags", nargs="*",
+ help=f"zero or more error flags: {flags}")
+ g_pei.add_argument("-V", "--pei-valid", "--error-valid", nargs="*",
+ help=f"zero or more PEI valid bits: {pei_valid_bits}")
+
+ # UEFI N.17 Integer values
+ g_pei.add_argument("-m", "--multiple-error", nargs="+",
+ help="Number of errors: 0: Single error, 1: Multiple errors, 2-65535: Error count if known")
+ g_pei.add_argument("-e", "--error-info", nargs="+",
+ help="Error information (UEFI 2.10 tables N.18 to N.20)")
+ g_pei.add_argument("-p", "--physical-address", nargs="+",
+ help="Physical address")
+ g_pei.add_argument("-v", "--virtual-address", nargs="+",
+ help="Virtual address")
+
+ # UEFI N.21 Context
+ g_ctx = parser.add_argument_group("Processor Context")
+ g_ctx.add_argument("--ctx-type", "--context-type", nargs="*",
+ help="Type of the context (0=ARM32 GPR, 5=ARM64 EL1, other values supported)")
+ g_ctx.add_argument("--ctx-size", "--context-size", nargs="*",
+ help="Minimal size of the context")
+ g_ctx.add_argument("--ctx-array", "--context-array", nargs="*",
+ help="Comma-separated arrays for each context")
+
+ # Vendor-specific data
+ g_vendor = parser.add_argument_group("Vendor-specific data")
+ g_vendor.add_argument("--vendor", "--vendor-specific", nargs="+",
+ help="Vendor-specific byte arrays of data")
+
+ # Add arguments for Generic Error Data
+ qmp.argparse(parser)
+
+ parser.set_defaults(func=self.send_cper)
+
+ def send_cper(self, args):
+ """Parse subcommand arguments and send a CPER via QMP"""
+
+ qmp_cmd = qmp(args.host, args.port, args.debug)
+
+ # Handle Generic Error Data arguments if any
+ qmp_cmd.set_args(args)
+
+ is_cpu_type = re.compile(r"^([\w+]+\-)?arm\-cpu$")
+ cpus = qmp_cmd.search_qom("/machine/unattached/device",
+ "type", is_cpu_type)
+
+ cper = {}
+ pei = {}
+ ctx = {}
+ vendor = {}
+
+ arg = vars(args)
+
+ # Handle global parameters
+ if args.arm:
+ arm_valid_init = False
+ cper["valid"] = util.get_choice(name="valid",
+ value=args.arm,
+ choices=self.arm_valid_bits,
+ suffixes=["-error", "-err"])
+ else:
+ cper["valid"] = 0
+ arm_valid_init = True
+
+ if "running" in arg:
+ if args.running:
+ cper["running-state"] = util.bit(0)
+ else:
+ cper["running-state"] = 0
+ else:
+ cper["running-state"] = 0
+
+ if arm_valid_init:
+ if args.affinity:
+ cper["valid"] |= self.arm_valid_bits["affinity"]
+
+ if args.mpidr:
+ cper["valid"] |= self.arm_valid_bits["mpidr"]
+
+ if "running-state" in cper:
+ cper["valid"] |= self.arm_valid_bits["running"]
+
+ if args.psci:
+ cper["valid"] |= self.arm_valid_bits["running"]
+
+ # Handle PEI
+ if not args.type:
+ args.type = ["cache-error"]
+
+ util.get_mult_choices(
+ pei,
+ name="valid",
+ values=args.pei_valid,
+ choices=self.pei_valid_bits,
+ suffixes=["-valid", "-addr"],
+ )
+ util.get_mult_choices(
+ pei,
+ name="type",
+ values=args.type,
+ choices=self.pei_error_types,
+ suffixes=["-error", "-err"],
+ )
+ util.get_mult_choices(
+ pei,
+ name="flags",
+ values=args.flags,
+ choices=self.pei_flags,
+ suffixes=["-error", "-cap"],
+ )
+ util.get_mult_int(pei, "error-info", args.error_info)
+ util.get_mult_int(pei, "multiple-error", args.multiple_error)
+ util.get_mult_int(pei, "phy-addr", args.physical_address)
+ util.get_mult_int(pei, "virt-addr", args.virtual_address)
+
+ # Handle context
+ util.get_mult_int(ctx, "type", args.ctx_type, allow_zero=True)
+ util.get_mult_int(ctx, "minimal-size", args.ctx_size, allow_zero=True)
+ util.get_mult_array(ctx, "register", args.ctx_array, allow_zero=True)
+
+ util.get_mult_array(vendor, "bytes", args.vendor, max_val=255)
+
+ # Store PEI
+ pei_data = bytearray()
+ default_flags = self.pei_flags["first"]
+ default_flags |= self.pei_flags["last"]
+
+ error_info_num = 0
+
+ for i, p in pei.items(): # pylint: disable=W0612
+ error_info_num += 1
+
+ # UEFI 2.10 doesn't define how to encode error information
+ # when multiple types are raised. So, provide a default only
+ # if a single type is there
+ if "error-info" not in p:
+ if p["type"] == util.bit(1):
+ p["error-info"] = 0x0091000F
+ if p["type"] == util.bit(2):
+ p["error-info"] = 0x0054007F
+ if p["type"] == util.bit(3):
+ p["error-info"] = 0x80D6460FFF
+ if p["type"] == util.bit(4):
+ p["error-info"] = 0x78DA03FF
+
+ if "valid" not in p:
+ p["valid"] = 0
+ if "multiple-error" in p:
+ p["valid"] |= self.pei_valid_bits["multiple-error"]
+
+ if "flags" in p:
+ p["valid"] |= self.pei_valid_bits["flags"]
+
+ if "error-info" in p:
+ p["valid"] |= self.pei_valid_bits["error-info"]
+
+ if "phy-addr" in p:
+ p["valid"] |= self.pei_valid_bits["phy-addr"]
+
+ if "virt-addr" in p:
+ p["valid"] |= self.pei_valid_bits["virt-addr"]
+
+ # Version
+ util.data_add(pei_data, 0, 1)
+
+ util.data_add(pei_data,
+ self.ACPI_GHES_ARM_CPER_PEI_LENGTH, 1)
+
+ util.data_add(pei_data, p["valid"], 2)
+ util.data_add(pei_data, p["type"], 1)
+ util.data_add(pei_data, p.get("multiple-error", 1), 2)
+ util.data_add(pei_data, p.get("flags", default_flags), 1)
+ util.data_add(pei_data, p.get("error-info", 0), 8)
+ util.data_add(pei_data, p.get("virt-addr", 0xDEADBEEF), 8)
+ util.data_add(pei_data, p.get("phy-addr", 0xABBA0BAD), 8)
+
+ # Store Context
+ ctx_data = bytearray()
+ context_info_num = 0
+
+ if ctx:
+ ret = qmp_cmd.send_cmd("query-target", may_open=True)
+
+ default_ctx = self.CONTEXT_MISC_REG
+
+ if "arch" in ret:
+ if ret["arch"] == "aarch64":
+ default_ctx = self.CONTEXT_AARCH64_EL1
+ elif ret["arch"] == "arm":
+ default_ctx = self.CONTEXT_AARCH32_EL1
+
+ for k in sorted(ctx.keys()):
+ context_info_num += 1
+
+ if "type" not in ctx[k]:
+ ctx[k]["type"] = default_ctx
+
+ if "register" not in ctx[k]:
+ ctx[k]["register"] = []
+
+ reg_size = len(ctx[k]["register"])
+ size = 0
+
+ if "minimal-size" in ctx[k]:
+ size = ctx[k]["minimal-size"]
+
+ size = max(size, reg_size)
+
+ size = (size + 1) % 0xFFFE
+
+ # Version
+ util.data_add(ctx_data, 0, 2)
+
+ util.data_add(ctx_data, ctx[k]["type"], 2)
+
+ util.data_add(ctx_data, 8 * size, 4)
+
+ for r in ctx[k]["register"]:
+ util.data_add(ctx_data, r, 8)
+
+ for i in range(reg_size, size): # pylint: disable=W0612
+ util.data_add(ctx_data, 0, 8)
+
+ # Vendor-specific bytes are not grouped
+ vendor_data = bytearray()
+ if vendor:
+ for k in sorted(vendor.keys()):
+ for b in vendor[k]["bytes"]:
+ util.data_add(vendor_data, b, 1)
+
+ # Encode ARM Processor Error
+ data = bytearray()
+
+ util.data_add(data, cper["valid"], 4)
+
+ util.data_add(data, error_info_num, 2)
+ util.data_add(data, context_info_num, 2)
+
+ # Calculate the length of the CPER data
+ cper_length = self.ACPI_GHES_ARM_CPER_LENGTH
+ cper_length += len(pei_data)
+ cper_length += len(vendor_data)
+ cper_length += len(ctx_data)
+ util.data_add(data, cper_length, 4)
+
+ util.data_add(data, arg.get("affinity") or 0, 1)
+
+ # Reserved
+ util.data_add(data, 0, 3)
+
+ if arg.get("midr") is None:
+ if cpus:
+ cmd_arg = {
+ 'path': cpus[0],
+ 'property': "midr"
+ }
+ ret = qmp_cmd.send_cmd("qom-get", cmd_arg, may_open=True)
+ if isinstance(ret, int):
+ arg["midr"] = ret
+
+ util.data_add(data, arg.get("mpidr") or 0, 8)
+ util.data_add(data, arg.get("midr") or 0, 8)
+ util.data_add(data, cper["running-state"], 4)
+ util.data_add(data, arg.get("psci") or 0, 4)
+
+ # Add PEI
+ data.extend(pei_data)
+ data.extend(ctx_data)
+ data.extend(vendor_data)
+
+ self.data = data
+
+ qmp_cmd.send_cper(cper_guid.CPER_PROC_ARM, self.data)
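The Processor Error Information entry built in send_cper() above can be
cross-checked with Python's struct module. This is a minimal sketch, not
part of the posted patch: pack_pei() and PEI_FORMAT are hypothetical names,
and the layout simply mirrors the field order and widths passed to
util.data_add(), assuming the same little-endian encoding with no padding.

```python
import struct

# Field layout mirrored from the util.data_add() calls in send_cper():
# version(1) length(1) valid(2) type(1) multiple-error(2) flags(1)
# error-info(8) virt-addr(8) phy-addr(8), little-endian, no padding.
PEI_FORMAT = "<BBHBHBQQQ"
PEI_LENGTH = struct.calcsize(PEI_FORMAT)


def pack_pei(valid, err_type, multiple_error=1, flags=0x3,
             error_info=0, virt_addr=0xDEADBEEF, phy_addr=0xABBA0BAD):
    """Hypothetical helper: pack one PEI entry into bytes."""
    return struct.pack(PEI_FORMAT, 0, PEI_LENGTH, valid, err_type,
                       multiple_error, flags, error_info,
                       virt_addr, phy_addr)


if __name__ == "__main__":
    # Cache error (bit 1) with only the error-info field marked valid (bit 2)
    entry = pack_pei(valid=0x4, err_type=0x02, error_info=0x0091000F)
    print(PEI_LENGTH, entry.hex())
```

The computed size should agree with ACPI_GHES_ARM_CPER_PEI_LENGTH; if the
two ever diverge, one of the field widths has changed.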
diff --git a/scripts/ghes_inject.py b/scripts/ghes_inject.py
new file mode 100755
index 000000000000..9a235201418b
--- /dev/null
+++ b/scripts/ghes_inject.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python3
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+"""
+Handle ACPI GHESv2 error injection logic via the QEMU QMP interface.
+"""
+
+import argparse
+import sys
+
+from arm_processor_error import ArmProcessorEinj
+
+EINJ_DESC = """
+Handle ACPI GHESv2 error injection logic via the QEMU QMP interface.
+
+It allows using UEFI BIOS EINJ features to generate GHES records.
+
+It helps test the CPER and GHES drivers in the guest OS and how
+userspace applications in the guest handle them.
+"""
+
+def main():
+ """Main program"""
+
+ # Main parser - handle generic args like QEMU QMP TCP socket options
+ parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter,
+ usage="%(prog)s [options]",
+ description=EINJ_DESC)
+
+ g_options = parser.add_argument_group("QEMU QMP socket options")
+ g_options.add_argument("-H", "--host", default="localhost", type=str,
+ help="host name")
+ g_options.add_argument("-P", "--port", default=4445, type=int,
+ help="TCP port number")
+ g_options.add_argument('-d', '--debug', action='store_true')
+
+ subparsers = parser.add_subparsers()
+
+ ArmProcessorEinj(subparsers)
+
+ args = parser.parse_args()
+ if "func" in args:
+ args.func(args)
+ else:
+ sys.exit(f"Please specify a valid command for {sys.argv[0]}")
+
+if __name__ == "__main__":
+ main()
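The dispatch in main() relies on each injector class registering its own
subparser and storing its handler with set_defaults(func=...). A minimal
sketch of that pattern follows, not part of the posted patch: DummyEinj and
the "dummy" subcommand are hypothetical, and the handler only returns a
string instead of talking to QMP.

```python
import argparse


class DummyEinj:
    """Hypothetical injector, shaped like ArmProcessorEinj."""

    def __init__(self, subparsers):
        parser = subparsers.add_parser("dummy", description="Dummy injector")
        parser.add_argument("--value", type=lambda x: int(x, 0), default=0)
        # The main program calls args.func(args) when a subcommand is given
        parser.set_defaults(func=self.send_cper)

    def send_cper(self, args):
        """Stand-in for the code that would build and send a CPER."""
        return f"would inject value {args.value:#x}"


def build_parser():
    """Create the top-level parser and register the subcommand."""
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers()
    DummyEinj(subparsers)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["dummy", "--value", "0x10"])
    print(args.func(args))
```

When no subcommand is given, the namespace has no "func" attribute, which is
why main() checks for it before dispatching.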
diff --git a/scripts/qmp_helper.py b/scripts/qmp_helper.py
new file mode 100755
index 000000000000..d7e6aabce8fe
--- /dev/null
+++ b/scripts/qmp_helper.py
@@ -0,0 +1,702 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0103,E0213,E1135,E1136,E1137,R0902,R0903,R0912,R0913,R0917
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+"""
+Helper classes to be used by ghes_inject command classes.
+"""
+
+import json
+import sys
+
+from datetime import datetime
+from os import path as os_path
+
+try:
+ qemu_dir = os_path.abspath(os_path.dirname(os_path.dirname(__file__)))
+ sys.path.append(os_path.join(qemu_dir, 'python'))
+
+ from qemu.qmp.legacy import QEMUMonitorProtocol
+
+except ModuleNotFoundError as exc:
+ print(f"Module '{exc.name}' not found.")
+ print("Try export PYTHONPATH=top-qemu-dir/python or run from top-qemu-dir")
+ sys.exit(1)
+
+from base64 import b64encode
+
+class util:
+ """
+ Ancillary functions to deal with bitmaps, parse arguments,
+ generate GUID and encode data on a bytearray buffer.
+ """
+
+ #
+ # Helper routines to handle multiple choice arguments
+ #
+ def get_choice(name, value, choices, suffixes=None, bitmask=True):
+ """Produce a bitmask (or a single value) from a multiple choice argument"""
+
+ new_values = 0
+
+ if not value:
+ return new_values
+
+ for val in value.split(","):
+ val = val.lower()
+
+ if suffixes:
+ for suffix in suffixes:
+ val = val.removesuffix(suffix)
+
+ if val not in choices.keys():
+ if suffixes:
+ for suffix in suffixes:
+ if val + suffix in choices.keys():
+ val += suffix
+ break
+
+ if val not in choices.keys():
+ sys.exit(f"Error on '{name}': choice '{val}' is invalid.")
+
+ val = choices[val]
+
+ if bitmask:
+ new_values |= val
+ else:
+ if new_values:
+ sys.exit(f"Error on '{name}': only one value is accepted.")
+
+ new_values = val
+
+ return new_values
+
+ def get_array(name, values, max_val=None):
+ """Add numbered hashes from integer lists into an array"""
+
+ array = []
+
+ for value in values:
+ for val in value.split(","):
+ try:
+ val = int(val, 0)
+ except ValueError:
+ sys.exit(f"Error on '{name}': {val} is not an integer")
+
+ if val < 0:
+ sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+ if max_val and val > max_val:
+ sys.exit(f"Error on '{name}': {val} is too big")
+
+ array.append(val)
+
+ return array
+
+ def get_mult_array(mult, name, values, allow_zero=False, max_val=None):
+ """Add numbered hashes from integer lists"""
+
+ if not allow_zero:
+ if not values:
+ return
+ else:
+ if values is None:
+ return
+
+ if not values:
+ i = 0
+ if i not in mult:
+ mult[i] = {}
+
+ mult[i][name] = []
+ return
+
+ i = 0
+ for value in values:
+ for val in value.split(","):
+ try:
+ val = int(val, 0)
+ except ValueError:
+ sys.exit(f"Error on '{name}': {val} is not an integer")
+
+ if val < 0:
+ sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+ if max_val and val > max_val:
+ sys.exit(f"Error on '{name}': {val} is too big")
+
+ if i not in mult:
+ mult[i] = {}
+
+ if name not in mult[i]:
+ mult[i][name] = []
+
+ mult[i][name].append(val)
+
+ i += 1
+
+
+ def get_mult_choices(mult, name, values, choices,
+ suffixes=None, allow_zero=False):
+ """Add numbered hashes from multiple choice arguments"""
+
+ if not allow_zero:
+ if not values:
+ return
+ else:
+ if values is None:
+ return
+
+ i = 0
+ for val in values:
+ new_values = util.get_choice(name, val, choices, suffixes)
+
+ if i not in mult:
+ mult[i] = {}
+
+ mult[i][name] = new_values
+ i += 1
+
+
+ def get_mult_int(mult, name, values, allow_zero=False):
+ """Add numbered hashes from integer arguments"""
+ if not allow_zero:
+ if not values:
+ return
+ else:
+ if values is None:
+ return
+
+ i = 0
+ for val in values:
+ try:
+ val = int(val, 0)
+ except ValueError:
+ sys.exit(f"Error on '{name}': {val} is not an integer")
+
+ if val < 0:
+ sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+ if i not in mult:
+ mult[i] = {}
+
+ mult[i][name] = val
+ i += 1
+
+
+ #
+ # Data encode helper functions
+ #
+ def bit(b):
+ """Simple macro to define a bit on a bitmask"""
+ return 1 << b
+
+
+ def data_add(data, value, num_bytes):
+ """Adds bytes from value inside a bytearray"""
+
+ data.extend(value.to_bytes(num_bytes, byteorder="little")) # pylint: disable=E1101
+
+ def dump_bytearray(name, data):
+ """Does a hexdump of a byte array, grouping in bytes"""
+
+ print(f"{name} ({len(data)} bytes):")
+
+ for ln_start in range(0, len(data), 16):
+ ln_end = min(ln_start + 16, len(data))
+ print(f" {ln_start:08x} ", end="")
+ for i in range(ln_start, ln_end):
+ print(f"{data[i]:02x} ", end="")
+ for i in range(ln_end, ln_start + 16):
+ print(" ", end="")
+ print(" ", end="")
+ for i in range(ln_start, ln_end):
+ if data[i] >= 32 and data[i] < 127:
+ print(chr(data[i]), end="")
+ else:
+ print(".", end="")
+
+ print()
+ print()
+
+ def time(string):
+ """Handle BCD timestamps used on Generic Error Data Block"""
+
+ time = None
+
+ # Formats to be used when parsing time stamps
+ formats = [
+ "%Y-%m-%d %H:%M:%S",
+ ]
+
+ if string == "now":
+ time = datetime.now()
+
+ if time is None:
+ for fmt in formats:
+ try:
+ time = datetime.strptime(string, fmt)
+ break
+ except ValueError:
+ pass
+
+ if time is None:
+ raise ValueError("Invalid time format")
+
+ return time
+
+class guid:
+ """
+ Simple class to handle GUID fields.
+ """
+
+ def __init__(self, time_low, time_mid, time_high, nodes):
+ """Initialize a GUID value"""
+
+ assert len(nodes) == 8
+
+ self.time_low = time_low
+ self.time_mid = time_mid
+ self.time_high = time_high
+ self.nodes = nodes
+
+ @classmethod
+ def UUID(cls, guid_str):
+ """Initialize a GUID using a string in its standard format"""
+
+ if len(guid_str) != 36:
+ print("Size not 36")
+ raise ValueError('Invalid GUID size')
+
+ # It is easier to parse without separators. So, drop them
+ guid_str = guid_str.replace('-', '')
+
+ if len(guid_str) != 32:
+ print("Size not 32", guid_str, len(guid_str))
+ raise ValueError('Invalid GUID hex size')
+
+ time_low = 0
+ time_mid = 0
+ time_high = 0
+ nodes = []
+
+ for i in reversed(range(16, 32, 2)):
+ h = guid_str[i:i + 2]
+ value = int(h, 16)
+ nodes.insert(0, value)
+
+ time_high = int(guid_str[12:16], 16)
+ time_mid = int(guid_str[8:12], 16)
+ time_low = int(guid_str[0:8], 16)
+
+ return cls(time_low, time_mid, time_high, nodes)
+
+ def __str__(self):
+ """Output a GUID value on its default string representation"""
+
+ clock = self.nodes[0] << 8 | self.nodes[1]
+
+ node = 0
+ for i in range(2, len(self.nodes)):
+ node = node << 8 | self.nodes[i]
+
+ s = f"{self.time_low:08x}-{self.time_mid:04x}-"
+ s += f"{self.time_high:04x}-{clock:04x}-{node:012x}"
+ return s
+
+ def to_bytes(self):
+ """Output a GUID value in bytes"""
+
+ data = bytearray()
+
+ util.data_add(data, self.time_low, 4)
+ util.data_add(data, self.time_mid, 2)
+ util.data_add(data, self.time_high, 2)
+ data.extend(bytearray(self.nodes))
+
+ return data
+
+class qmp:
+ """
+ Opens a connection and sends/receives QMP commands.
+ """
+
+ def send_cmd(self, command, args=None, may_open=False, return_error=True):
+ """Send a command to QMP, optionally opening a connection"""
+
+ if may_open:
+ self._connect()
+ elif not self.connected:
+ return False
+
+ msg = { 'execute': command }
+ if args:
+ msg['arguments'] = args
+
+ try:
+ obj = self.qmp_monitor.cmd_obj(msg)
+ # Can we use some other exception class here?
+ except Exception as e: # pylint: disable=W0718
+ print(f"Command: {command}")
+ print(f"Failed to inject error: {e}.")
+ return None
+
+ if "return" in obj:
+ if isinstance(obj.get("return"), dict):
+ if obj["return"]:
+ return obj["return"]
+ return "OK"
+
+ return obj["return"]
+
+ if isinstance(obj.get("error"), dict):
+ error = obj["error"]
+ if return_error:
+ print(f"Command: {msg}")
+ print(f'{error["class"]}: {error["desc"]}')
+ else:
+ print(json.dumps(obj))
+
+ return None
+
+ def _close(self):
+ """Shutdown and close the socket, if opened"""
+ if not self.connected:
+ return
+
+ self.qmp_monitor.close()
+ self.connected = False
+
+ def _connect(self):
+ """Connect to a QMP TCP/IP port, if not connected yet"""
+
+ if self.connected:
+ return True
+
+ try:
+ self.qmp_monitor.connect(negotiate=True)
+ except ConnectionError:
+ sys.exit(f"Can't connect to QMP host {self.host}:{self.port}")
+
+ self.connected = True
+
+ return True
+
+ BLOCK_STATUS_BITS = {
+ "uncorrectable": util.bit(0),
+ "correctable": util.bit(1),
+ "multi-uncorrectable": util.bit(2),
+ "multi-correctable": util.bit(3),
+ }
+
+ ERROR_SEVERITY = {
+ "recoverable": 0,
+ "fatal": 1,
+ "corrected": 2,
+ "none": 3,
+ }
+
+ VALIDATION_BITS = {
+ "fru-id": util.bit(0),
+ "fru-text": util.bit(1),
+ "timestamp": util.bit(2),
+ }
+
+ GEDB_FLAGS_BITS = {
+ "recovered": util.bit(0),
+ "prev-error": util.bit(1),
+ "simulated": util.bit(2),
+ }
+
+ GENERIC_DATA_SIZE = 72
+
+ def argparse(parser):
+ """Prepare a parser group to query generic error data"""
+
+ block_status_bits = ",".join(qmp.BLOCK_STATUS_BITS.keys())
+ error_severity_enum = ",".join(qmp.ERROR_SEVERITY.keys())
+ validation_bits = ",".join(qmp.VALIDATION_BITS.keys())
+ gedb_flags_bits = ",".join(qmp.GEDB_FLAGS_BITS.keys())
+
+ g_gen = parser.add_argument_group("Generic Error Data") # pylint: disable=E1101
+ g_gen.add_argument("--block-status",
+ help=f"block status bits: {block_status_bits}")
+ g_gen.add_argument("--raw-data", nargs="+",
+ help="Raw data inside the Error Status Block")
+ g_gen.add_argument("--error-severity", "--severity",
+ help=f"error severity: {error_severity_enum}")
+ g_gen.add_argument("--gen-err-valid-bits",
+ "--generic-error-validation-bits",
+ help=f"validation bits: {validation_bits}")
+ g_gen.add_argument("--fru-id", type=guid.UUID,
+ help="GUID representing a physical device")
+ g_gen.add_argument("--fru-text",
+ help="ASCII string identifying the FRU hardware")
+ g_gen.add_argument("--timestamp", type=util.time,
+ help="Time when the error info was collected")
+ g_gen.add_argument("--precise", "--precise-timestamp",
+ action='store_true',
+ help="Marks the timestamp as precise if --timestamp is used")
+ g_gen.add_argument("--gedb-flags",
+ help=f"General Error Data Block flags: {gedb_flags_bits}")
+
+ def set_args(self, args):
+ """Set the arguments optionally defined via self.argparse()"""
+
+ if args.block_status:
+ self.block_status = util.get_choice(name="block-status",
+ value=args.block_status,
+ choices=self.BLOCK_STATUS_BITS,
+ bitmask=False)
+ if args.raw_data:
+ self.raw_data = util.get_array("raw-data", args.raw_data,
+ max_val=255)
+ print(self.raw_data)
+
+ if args.error_severity:
+ self.error_severity = util.get_choice(name="error-severity",
+ value=args.error_severity,
+ choices=self.ERROR_SEVERITY,
+ bitmask=False)
+
+ if args.fru_id:
+ self.fru_id = args.fru_id.to_bytes()
+ if not args.gen_err_valid_bits:
+ self.validation_bits |= self.VALIDATION_BITS["fru-id"]
+
+ if args.fru_text:
+ text = bytearray(args.fru_text.encode('ascii'))
+ if len(text) > 20:
+ sys.exit("FRU text is too big to fit")
+
+ self.fru_text = text
+ if not args.gen_err_valid_bits:
+ self.validation_bits |= self.VALIDATION_BITS["fru-text"]
+
+ if args.timestamp:
+ time = args.timestamp
+ century = int(time.year / 100)
+
+ bcd = bytearray()
+ util.data_add(bcd, (time.second // 10) << 4 | (time.second % 10), 1)
+ util.data_add(bcd, (time.minute // 10) << 4 | (time.minute % 10), 1)
+ util.data_add(bcd, (time.hour // 10) << 4 | (time.hour % 10), 1)
+
+ if args.precise:
+ util.data_add(bcd, 1, 1)
+ else:
+ util.data_add(bcd, 0, 1)
+
+ util.data_add(bcd, (time.day // 10) << 4 | (time.day % 10), 1)
+ util.data_add(bcd, (time.month // 10) << 4 | (time.month % 10), 1)
+ util.data_add(bcd,
+ ((time.year % 100) // 10) << 4 | (time.year % 10), 1)
+ util.data_add(bcd, ((century % 100) // 10) << 4 | (century % 10), 1)
+
+ self.timestamp = bcd
+ if not args.gen_err_valid_bits:
+ self.validation_bits |= self.VALIDATION_BITS["timestamp"]
+
+ if args.gen_err_valid_bits:
+ self.validation_bits = util.get_choice(name="validation",
+ value=args.gen_err_valid_bits,
+ choices=self.VALIDATION_BITS)
+
+ def __init__(self, host, port, debug=False):
+ """Initialize variables used by the QMP send logic"""
+
+ self.connected = False
+ self.host = host
+ self.port = port
+ self.debug = debug
+
+ # ACPI 6.1: 18.3.2.7.1 Generic Error Data: Generic Error Status Block
+ self.block_status = self.BLOCK_STATUS_BITS["uncorrectable"]
+ self.raw_data = []
+ self.error_severity = self.ERROR_SEVERITY["recoverable"]
+
+ # ACPI 6.1: 18.3.2.7.1 Generic Error Data: Generic Error Data Entry
+ self.validation_bits = 0
+ self.flags = 0
+ self.fru_id = bytearray(16)
+ self.fru_text = bytearray(20)
+ self.timestamp = bytearray(8)
+
+ self.qmp_monitor = QEMUMonitorProtocol(address=(self.host, self.port))
+
+ #
+ # Socket QMP send command
+ #
+ def send_cper_raw(self, cper_data):
+ """Send raw CPER data to QEMU through the QMP TCP socket"""
+
+ data = b64encode(bytes(cper_data)).decode('ascii')
+
+ cmd_arg = {
+ 'cper': data
+ }
+
+ self._connect()
+
+ if self.send_cmd("inject-ghes-v2-error", cmd_arg):
+ print("Error injected.")
+
+ def send_cper(self, notif_type, payload):
+ """Send commands to QEMU through the QMP TCP socket"""
+
+ # Fill CPER record header
+
+ # NOTE: bits 4 to 13 of block status contain the number of
+ # data entries in the data section. This is currently unsupported.
+
+ cper_length = len(payload)
+ data_length = cper_length + len(self.raw_data) + self.GENERIC_DATA_SIZE
+
+ # Generic Error Data Entry
+ gede = bytearray()
+
+ gede.extend(notif_type.to_bytes())
+ util.data_add(gede, self.error_severity, 4)
+ util.data_add(gede, 0x300, 2)
+ util.data_add(gede, self.validation_bits, 1)
+ util.data_add(gede, self.flags, 1)
+ util.data_add(gede, cper_length, 4)
+ gede.extend(self.fru_id)
+ gede.extend(self.fru_text)
+ gede.extend(self.timestamp)
+
+ # Generic Error Status Block
+ gebs = bytearray()
+
+ if self.raw_data:
+ raw_data_offset = len(gebs)
+ else:
+ raw_data_offset = 0
+
+ util.data_add(gebs, self.block_status, 4)
+ util.data_add(gebs, raw_data_offset, 4)
+ util.data_add(gebs, len(self.raw_data), 4)
+ util.data_add(gebs, data_length, 4)
+ util.data_add(gebs, self.error_severity, 4)
+
+ cper_data = bytearray()
+ cper_data.extend(gebs)
+ cper_data.extend(gede)
+ cper_data.extend(bytearray(self.raw_data))
+ cper_data.extend(bytearray(payload))
+
+ if self.debug:
+ print(f"GUID: {notif_type}")
+
+ util.dump_bytearray("Generic Error Status Block", gebs)
+ util.dump_bytearray("Generic Error Data Entry", gede)
+
+ if self.raw_data:
+ util.dump_bytearray("Raw data", bytearray(self.raw_data))
+
+ util.dump_bytearray("Payload", payload)
+
+ self.send_cper_raw(cper_data)
+
+
+ def search_qom(self, path, prop, regex):
+ """
+ Return a list of devices that match path array like:
+
+ /machine/unattached/device
+ /machine/peripheral-anon/device
+ ...
+ """
+
+ found = []
+
+ i = 0
+ while 1:
+ dev = f"{path}[{i}]"
+ args = {
+ 'path': dev,
+ 'property': prop
+ }
+ ret = self.send_cmd("qom-get", args, may_open=True, return_error=False)
+ if not ret:
+ break
+
+ if isinstance(ret, str):
+ if regex.search(ret):
+ found.append(dev)
+
+ i += 1
+ if i > 10000:
+ print("Too many objects returned by qom-get!")
+ break
+
+ return found
+
+class cper_guid:
+ """
+ Contains CPER GUID, as per:
+ https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html
+ """
+
+ CPER_PROC_GENERIC = guid(0x9876CCAD, 0x47B4, 0x4bdb,
+ [0xB6, 0x5E, 0x16, 0xF1,
+ 0x93, 0xC4, 0xF3, 0xDB])
+
+ CPER_PROC_X86 = guid(0xDC3EA0B0, 0xA144, 0x4797,
+ [0xB9, 0x5B, 0x53, 0xFA,
+ 0x24, 0x2B, 0x6E, 0x1D])
+
+ CPER_PROC_ITANIUM = guid(0xe429faf1, 0x3cb7, 0x11d4,
+ [0xbc, 0xa7, 0x00, 0x80,
+ 0xc7, 0x3c, 0x88, 0x81])
+
+ CPER_PROC_ARM = guid(0xE19E3D16, 0xBC11, 0x11E4,
+ [0x9C, 0xAA, 0xC2, 0x05,
+ 0x1D, 0x5D, 0x46, 0xB0])
+
+ CPER_PLATFORM_MEM = guid(0xA5BC1114, 0x6F64, 0x4EDE,
+ [0xB8, 0x63, 0x3E, 0x83,
+ 0xED, 0x7C, 0x83, 0xB1])
+
+ CPER_PLATFORM_MEM2 = guid(0x61EC04FC, 0x48E6, 0xD813,
+ [0x25, 0xC9, 0x8D, 0xAA,
+ 0x44, 0x75, 0x0B, 0x12])
+
+ CPER_PCIE = guid(0xD995E954, 0xBBC1, 0x430F,
+ [0xAD, 0x91, 0xB4, 0x4D,
+ 0xCB, 0x3C, 0x6F, 0x35])
+
+ CPER_PCI_BUS = guid(0xC5753963, 0x3B84, 0x4095,
+ [0xBF, 0x78, 0xED, 0xDA,
+ 0xD3, 0xF9, 0xC9, 0xDD])
+
+ CPER_PCI_DEV = guid(0xEB5E4685, 0xCA66, 0x4769,
+ [0xB6, 0xA2, 0x26, 0x06,
+ 0x8B, 0x00, 0x13, 0x26])
+
+ CPER_FW_ERROR = guid(0x81212A96, 0x09ED, 0x4996,
+ [0x94, 0x71, 0x8D, 0x72,
+ 0x9C, 0x8E, 0x69, 0xED])
+
+ CPER_DMA_GENERIC = guid(0x5B51FEF7, 0xC79D, 0x4434,
+ [0x8F, 0x1B, 0xAA, 0x62,
+ 0xDE, 0x3E, 0x2C, 0x64])
+
+ CPER_DMA_VT = guid(0x71761D37, 0x32B2, 0x45cd,
+ [0xA7, 0xD0, 0xB0, 0xFE,
+ 0xDD, 0x93, 0xE8, 0xCF])
+
+ CPER_DMA_IOMMU = guid(0x036F84E1, 0x7F37, 0x428c,
+ [0xA7, 0x9E, 0x57, 0x5F,
+ 0xDF, 0xAA, 0x84, 0xEC])
+
+ CPER_CCIX_PER = guid(0x91335EF6, 0xEBFB, 0x4478,
+ [0xA6, 0xA6, 0x88, 0xB7,
+ 0x28, 0xCF, 0x75, 0xD7])
+
+ CPER_CXL_PROT_ERR = guid(0x80B9EFB4, 0x52B5, 0x4DE3,
+ [0xA7, 0x77, 0x68, 0x78,
+ 0x4B, 0x77, 0x10, 0x48])
--
2.48.1
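The byte layout produced by guid.to_bytes() in qmp_helper.py (first three
fields little-endian, remaining eight bytes verbatim) appears to match what
the standard uuid module calls bytes_le. A minimal sketch, not part of the
posted patch, using the CPER_PROC_ARM value from cper_guid; guid_to_bytes()
is a hypothetical stand-alone copy of the encoding logic.

```python
import uuid


def guid_to_bytes(time_low, time_mid, time_high, nodes):
    """Hypothetical stand-alone version of guid.to_bytes()."""
    data = bytearray()
    data.extend(time_low.to_bytes(4, byteorder="little"))
    data.extend(time_mid.to_bytes(2, byteorder="little"))
    data.extend(time_high.to_bytes(2, byteorder="little"))
    # The last eight bytes are stored in the order they are written
    data.extend(bytes(nodes))
    return bytes(data)


# CPER_PROC_ARM, as listed in cper_guid
ARM_GUID = guid_to_bytes(0xE19E3D16, 0xBC11, 0x11E4,
                         [0x9C, 0xAA, 0xC2, 0x05, 0x1D, 0x5D, 0x46, 0xB0])

if __name__ == "__main__":
    ref = uuid.UUID("e19e3d16-bc11-11e4-9caa-c2051d5d46b0")
    print(ARM_GUID.hex(), ARM_GUID == ref.bytes_le)
```

Comparing against uuid.UUID(...).bytes_le gives an independent check of any
new entry added to cper_guid.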
^ permalink raw reply related [flat|nested] 38+ messages in thread
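The --timestamp handling in qmp.set_args() encodes the time as eight bytes,
one BCD value per byte, with the "precise" marker in the fourth byte. A
minimal sketch of that encoding, not part of the posted patch: to_bcd() and
bcd_timestamp() are hypothetical helpers that follow the byte order used in
set_args().

```python
from datetime import datetime


def to_bcd(value):
    """Encode a two-digit decimal value (0-99) as one BCD byte."""
    return (value // 10) << 4 | (value % 10)


def bcd_timestamp(when, precise=False):
    """Hypothetical helper mirroring the byte order used in set_args()."""
    return bytes([
        to_bcd(when.second),
        to_bcd(when.minute),
        to_bcd(when.hour),
        1 if precise else 0,
        to_bcd(when.day),
        to_bcd(when.month),
        to_bcd(when.year % 100),   # year within the century
        to_bcd(when.year // 100),  # century
    ])


if __name__ == "__main__":
    stamp = bcd_timestamp(datetime(2025, 2, 27, 11, 3, 40))
    print(stamp.hex())
```

Reading the hex dump byte by byte gives seconds, minutes, hours, the precise
flag, day, month, year and century, each as two decimal digits.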
* Re: [PATCH v5 10/21] acpi/ghes: create an ancillary acpi_ghes_get_state() function
2025-02-27 11:03 ` [PATCH v5 10/21] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-27 11:31 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:31 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel, Ani Sinha,
Dongjiu Geng, linux-kernel
On Thu, 27 Feb 2025 12:03:40 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Instead of having a function to check if ACPI is enabled
> (acpi_ghes_present), change its logic to be more generic,
> returning a pointer to AcpiGhesState.
>
> This change allows cleaning up the ghes GED state code, avoiding
> reading it multiple times and simplifying the code.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index c3a64adfe5ed..0135ac844bcf 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -608,7 +608,7 @@ AcpiGhesState *acpi_ghes_get_state(void)
> }
> ags = &acpi_ged_state->ghes_state;
>
> - if (!ags->hw_error_le) {
> + if (!ags->hw_error_le && !ags->hest_addr_le) {
> return NULL;
> }
> return ags;
Sorry, I moved most of the stuff in this patch to
[PATCH 04/21] acpi/ghes: Cleanup the code which gets ghes ged state
This hunk was a leftover from it. I meant to place this hunk
elsewhere, but I ended up forgetting while waiting for the rebase bisect
tests to pass.
I'll move this hunk to
[PATCH 06/21] acpi/ghes: add a firmware file with HEST address
for the next respin (and hopefully the final one).
Regards,
Mauro
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 01/21] tests/acpi: virt: add an empty HEST file
2025-02-27 11:03 ` [PATCH v5 01/21] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
@ 2025-02-27 12:02 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 12:02 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, linux-kernel
On Thu, 27 Feb 2025 12:03:31 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Such file will be used to track HEST table changes.
>
> For now, disallow HEST table check until we update it to the
> current data.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
> ---
> tests/data/acpi/aarch64/virt/HEST | 0
> tests/qtest/bios-tables-test-allowed-diff.h | 1 +
> 2 files changed, 1 insertion(+)
> create mode 100644 tests/data/acpi/aarch64/virt/HEST
>
> diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
> new file mode 100644
> index 000000000000..e69de29bb2d1
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
> index dfb8523c8bf4..39901c58d647 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1 +1,2 @@
> /* List of comma-separated changed AML files to ignore */
> +"tests/data/acpi/aarch64/virt/HEST",
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 02/21] tests/qtest/bios-tables-test: extend to also check HEST table
2025-02-27 11:03 ` [PATCH v5 02/21] tests/qtest/bios-tables-test: extend to also check HEST table Mauro Carvalho Chehab
@ 2025-02-27 12:03 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 12:03 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, linux-kernel
On Thu, 27 Feb 2025 12:03:32 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Currently, aarch64 can generate a HEST table when loaded with
> -machine ras=on. Add support for it.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> tests/qtest/bios-tables-test.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
> index 0a333ec43536..8d41601cc9e9 100644
> --- a/tests/qtest/bios-tables-test.c
> +++ b/tests/qtest/bios-tables-test.c
> @@ -2122,7 +2122,7 @@ static void test_acpi_aarch64_virt_tcg(void)
>
> data.smbios_cpu_max_speed = 2900;
> data.smbios_cpu_curr_speed = 2700;
> - test_acpi_one("-cpu cortex-a57 "
> + test_acpi_one("-cpu cortex-a57 -machine ras=on "
> "-smbios type=4,max-speed=2900,current-speed=2700", &data);
> free_test_data(&data);
> }
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 03/21] tests/acpi: virt: update HEST file with its current data
2025-02-27 11:03 ` [PATCH v5 03/21] tests/acpi: virt: update HEST file with its current data Mauro Carvalho Chehab
@ 2025-02-27 12:03 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 12:03 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, linux-kernel
On Thu, 27 Feb 2025 12:03:33 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Now that HEST table is checked for aarch64, add the current
> firmware file.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
> ---
> tests/data/acpi/aarch64/virt/HEST | Bin 0 -> 132 bytes
> tests/qtest/bios-tables-test-allowed-diff.h | 1 -
> 2 files changed, 1 deletion(-)
>
> diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..4c5d8c5b5da5b3241f93cd0839e94272bf6b1486 100644
> GIT binary patch
> literal 132
> zcmeZp4Gw8xU|?W;<mB({5v<@85#X$#prF9Wz`y`vgJ=-uVqjqS|DS;o#%Ew*U|?_n
> dk++-~7#J8hWI!Yi09DHYRr~Kh1c1x}0RY>66afGL
>
> literal 0
> HcmV?d00001
>
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
> index 39901c58d647..dfb8523c8bf4 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1,2 +1 @@
> /* List of comma-separated changed AML files to ignore */
> -"tests/data/acpi/aarch64/virt/HEST",
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources
2025-02-27 11:03 ` [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources Mauro Carvalho Chehab
@ 2025-02-27 13:10 ` Igor Mammedov
2025-02-27 13:16 ` Igor Mammedov
0 siblings, 1 reply; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:10 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, linux-kernel
On Thu, 27 Feb 2025 12:03:47 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
squash this patch into the next one
Also, at this point there are no visible HEST changes yet, so as soon as you remove
the white-list without enabling the new HEST, the tests should start failing.
I suggest moving 20/21 before this patch;
as a result, one would see the DSDT and HEST diffs when running the tests,
and then you can use rebuild-expected-aml.sh to generate the updated
tables and update them in one patch (that's what we typically do,
we don't split updates into increments).
> --- /tmp/asl-38PE22.dsl 2025-02-26 16:25:32.362148388 +0100
> +++ /tmp/asl-HSPE22.dsl 2025-02-26 16:25:32.361148402 +0100
> @@ -1,39 +1,39 @@
> /*
> * Intel ACPI Component Architecture
> * AML/ASL+ Disassembler version 20240322 (64-bit version)
> * Copyright (c) 2000 - 2023 Intel Corporation
> *
> - * Disassembly of tests/data/acpi/aarch64/virt/HEST
> + * Disassembly of /tmp/aml-DMPE22
> *
> * ACPI Data Table [HEST]
> *
> * Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue (in hex)
> */
>
> [000h 0000 004h] Signature : "HEST" [Hardware Error Source Table]
> -[004h 0004 004h] Table Length : 00000084
> +[004h 0004 004h] Table Length : 000000E0
> [008h 0008 001h] Revision : 01
> -[009h 0009 001h] Checksum : E2
> +[009h 0009 001h] Checksum : 6C
> [00Ah 0010 006h] Oem ID : "BOCHS "
> [010h 0016 008h] Oem Table ID : "BXPC "
> [018h 0024 004h] Oem Revision : 00000001
> [01Ch 0028 004h] Asl Compiler ID : "BXPC"
> [020h 0032 004h] Asl Compiler Revision : 00000001
>
> -[024h 0036 004h] Error Source Count : 00000001
> +[024h 0036 004h] Error Source Count : 00000002
>
> [028h 0040 002h] Subtable Type : 000A [Generic Hardware Error Source V2]
> [02Ah 0042 002h] Source Id : 0000
> [02Ch 0044 002h] Related Source Id : FFFF
> [02Eh 0046 001h] Reserved : 00
> [02Fh 0047 001h] Enabled : 01
> [030h 0048 004h] Records To Preallocate : 00000001
> [034h 0052 004h] Max Sections Per Record : 00000001
> [038h 0056 004h] Max Raw Data Length : 00000400
>
> [03Ch 0060 00Ch] Error Status Address : [Generic Address Structure]
> [03Ch 0060 001h] Space ID : 00 [SystemMemory]
> [03Dh 0061 001h] Bit Width : 40
> [03Eh 0062 001h] Bit Offset : 00
> [03Fh 0063 001h] Encoded Access Width : 04 [QWord Access:64]
> [040h 0064 008h] Address : 0000000043DA0000
> @@ -42,32 +42,75 @@
> [048h 0072 001h] Notify Type : 08 [SEA]
> [049h 0073 001h] Notify Length : 1C
> [04Ah 0074 002h] Configuration Write Enable : 0000
> [04Ch 0076 004h] PollInterval : 00000000
> [050h 0080 004h] Vector : 00000000
> [054h 0084 004h] Polling Threshold Value : 00000000
> [058h 0088 004h] Polling Threshold Window : 00000000
> [05Ch 0092 004h] Error Threshold Value : 00000000
> [060h 0096 004h] Error Threshold Window : 00000000
>
> [064h 0100 004h] Error Status Block Length : 00000400
> [068h 0104 00Ch] Read Ack Register : [Generic Address Structure]
> [068h 0104 001h] Space ID : 00 [SystemMemory]
> [069h 0105 001h] Bit Width : 40
> [06Ah 0106 001h] Bit Offset : 00
> [06Bh 0107 001h] Encoded Access Width : 04 [QWord Access:64]
> -[06Ch 0108 008h] Address : 0000000043DA0008
> +[06Ch 0108 008h] Address : 0000000043DA0010
>
> [074h 0116 008h] Read Ack Preserve : FFFFFFFFFFFFFFFE
> [07Ch 0124 008h] Read Ack Write : 0000000000000001
>
> -Raw Table Data: Length 132 (0x84)
> +[084h 0132 002h] Subtable Type : 000A [Generic Hardware Error Source V2]
> +[086h 0134 002h] Source Id : 0001
> +[088h 0136 002h] Related Source Id : FFFF
> +[08Ah 0138 001h] Reserved : 00
> +[08Bh 0139 001h] Enabled : 01
> +[08Ch 0140 004h] Records To Preallocate : 00000001
> +[090h 0144 004h] Max Sections Per Record : 00000001
> +[094h 0148 004h] Max Raw Data Length : 00000400
> +
> +[098h 0152 00Ch] Error Status Address : [Generic Address Structure]
> +[098h 0152 001h] Space ID : 00 [SystemMemory]
> +[099h 0153 001h] Bit Width : 40
> +[09Ah 0154 001h] Bit Offset : 00
> +[09Bh 0155 001h] Encoded Access Width : 04 [QWord Access:64]
> +[09Ch 0156 008h] Address : 0000000043DA0008
> +
> +[0A4h 0164 01Ch] Notify : [Hardware Error Notification Structure]
> +[0A4h 0164 001h] Notify Type : 07 [GPIO]
> +[0A5h 0165 001h] Notify Length : 1C
> +[0A6h 0166 002h] Configuration Write Enable : 0000
> +[0A8h 0168 004h] PollInterval : 00000000
> +[0ACh 0172 004h] Vector : 00000000
> +[0B0h 0176 004h] Polling Threshold Value : 00000000
> +[0B4h 0180 004h] Polling Threshold Window : 00000000
> +[0B8h 0184 004h] Error Threshold Value : 00000000
> +[0BCh 0188 004h] Error Threshold Window : 00000000
> +
> +[0C0h 0192 004h] Error Status Block Length : 00000400
> +[0C4h 0196 00Ch] Read Ack Register : [Generic Address Structure]
> +[0C4h 0196 001h] Space ID : 00 [SystemMemory]
> +[0C5h 0197 001h] Bit Width : 40
> +[0C6h 0198 001h] Bit Offset : 00
> +[0C7h 0199 001h] Encoded Access Width : 04 [QWord Access:64]
> +[0C8h 0200 008h] Address : 0000000043DA0018
>
> - 0000: 48 45 53 54 84 00 00 00 01 E2 42 4F 43 48 53 20 // HEST......BOCHS
> +[0D0h 0208 008h] Read Ack Preserve : FFFFFFFFFFFFFFFE
> +[0D8h 0216 008h] Read Ack Write : 0000000000000001
> +
> +Raw Table Data: Length 224 (0xE0)
> +
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> tests/data/acpi/aarch64/virt/HEST | Bin 132 -> 224 bytes
> 1 file changed, 0 insertions(+), 0 deletions(-)
>
> diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
> index 4c5d8c5b5da5b3241f93cd0839e94272bf6b1486..674272922db7d48f7821aa7c83ec76bb3b556d2a 100644
> GIT binary patch
> delta 68
> zcmZo+e89-%;TjzBfPsO5F=rx|6eH6_Rd+^#iMisuTnvm1|Nk>EGJ@nLCJHmL%S;Ru
> WnV7)J#lXPAz`)?Zz#=g*R~!HcF%5eF
>
> delta 29
> lcmaFB*uu!=;Tjy$!oa}5_-G=R6eHtARriT=I3|_|004Ge2nqlI
>
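As a sanity check on the disassembly quoted above, the two table lengths can be reproduced by hand. The sketch below is not QEMU code; the macro and function names are made up for illustration, and the sizes are read off the dump itself (the Error Source Count sits at offset 0x24, the first entry starts at 0x28 and the second at 0x84):

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the quoted disassembly; the names are hypothetical. */
#define SDT_HEADER_SIZE     36u  /* Signature .. Asl Compiler Revision */
#define HEST_COUNT_SIZE      4u  /* Error Source Count field */
#define GHES_V2_ENTRY_SIZE  92u  /* one GHESv2 source: 0x84 - 0x28 */

/* Total HEST length when it holds n GHESv2 sources and nothing else. */
static uint32_t hest_table_length(uint32_t n)
{
    assert(n <= 1000u);  /* keep the multiplication far away from overflow */
    return SDT_HEADER_SIZE + HEST_COUNT_SIZE + n * GHES_V2_ENTRY_SIZE;
}
```

One source gives 36 + 4 + 92 = 132 (0x84) and two sources give 36 + 4 + 184 = 224 (0xE0), matching the "Table Length" and "Bin 132 -> 224 bytes" lines in the quoted patch.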
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources
2025-02-27 13:10 ` Igor Mammedov
@ 2025-02-27 13:16 ` Igor Mammedov
2025-02-27 15:51 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:16 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, linux-kernel
On Thu, 27 Feb 2025 14:10:38 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Thu, 27 Feb 2025 12:03:47 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> squash this patch into the next one
>
> Also at this point there is no visible HEST changes yet, so a soon as you remove
> white-list without enabling new HEST, the tests should start failing.
>
> I suggest to move 20/21 before this patch,
> as result one would see dsdt and hest diffs when running tests
> and then you can use rebuild-expected-aml.sh to generate updated
> tables and update them in one patch (that's what we typically do,
> we don't split updates in increments).
on top of that,
it seems the patch doesn't apply for some reason.
>
>
> > --- /tmp/asl-38PE22.dsl 2025-02-26 16:25:32.362148388 +0100
> > +++ /tmp/asl-HSPE22.dsl 2025-02-26 16:25:32.361148402 +0100
> > @@ -1,39 +1,39 @@
> > /*
> > * Intel ACPI Component Architecture
> > * AML/ASL+ Disassembler version 20240322 (64-bit version)
> > * Copyright (c) 2000 - 2023 Intel Corporation
> > *
> > - * Disassembly of tests/data/acpi/aarch64/virt/HEST
> > + * Disassembly of /tmp/aml-DMPE22
> > *
> > * ACPI Data Table [HEST]
> > *
> > * Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue (in hex)
> > */
> >
> > [000h 0000 004h] Signature : "HEST" [Hardware Error Source Table]
> > -[004h 0004 004h] Table Length : 00000084
> > +[004h 0004 004h] Table Length : 000000E0
> > [008h 0008 001h] Revision : 01
> > -[009h 0009 001h] Checksum : E2
> > +[009h 0009 001h] Checksum : 6C
> > [00Ah 0010 006h] Oem ID : "BOCHS "
> > [010h 0016 008h] Oem Table ID : "BXPC "
> > [018h 0024 004h] Oem Revision : 00000001
> > [01Ch 0028 004h] Asl Compiler ID : "BXPC"
> > [020h 0032 004h] Asl Compiler Revision : 00000001
> >
> > -[024h 0036 004h] Error Source Count : 00000001
> > +[024h 0036 004h] Error Source Count : 00000002
> >
> > [028h 0040 002h] Subtable Type : 000A [Generic Hardware Error Source V2]
> > [02Ah 0042 002h] Source Id : 0000
> > [02Ch 0044 002h] Related Source Id : FFFF
> > [02Eh 0046 001h] Reserved : 00
> > [02Fh 0047 001h] Enabled : 01
> > [030h 0048 004h] Records To Preallocate : 00000001
> > [034h 0052 004h] Max Sections Per Record : 00000001
> > [038h 0056 004h] Max Raw Data Length : 00000400
> >
> > [03Ch 0060 00Ch] Error Status Address : [Generic Address Structure]
> > [03Ch 0060 001h] Space ID : 00 [SystemMemory]
> > [03Dh 0061 001h] Bit Width : 40
> > [03Eh 0062 001h] Bit Offset : 00
> > [03Fh 0063 001h] Encoded Access Width : 04 [QWord Access:64]
> > [040h 0064 008h] Address : 0000000043DA0000
> > @@ -42,32 +42,75 @@
> > [048h 0072 001h] Notify Type : 08 [SEA]
> > [049h 0073 001h] Notify Length : 1C
> > [04Ah 0074 002h] Configuration Write Enable : 0000
> > [04Ch 0076 004h] PollInterval : 00000000
> > [050h 0080 004h] Vector : 00000000
> > [054h 0084 004h] Polling Threshold Value : 00000000
> > [058h 0088 004h] Polling Threshold Window : 00000000
> > [05Ch 0092 004h] Error Threshold Value : 00000000
> > [060h 0096 004h] Error Threshold Window : 00000000
> >
> > [064h 0100 004h] Error Status Block Length : 00000400
> > [068h 0104 00Ch] Read Ack Register : [Generic Address Structure]
> > [068h 0104 001h] Space ID : 00 [SystemMemory]
> > [069h 0105 001h] Bit Width : 40
> > [06Ah 0106 001h] Bit Offset : 00
> > [06Bh 0107 001h] Encoded Access Width : 04 [QWord Access:64]
> > -[06Ch 0108 008h] Address : 0000000043DA0008
> > +[06Ch 0108 008h] Address : 0000000043DA0010
> >
> > [074h 0116 008h] Read Ack Preserve : FFFFFFFFFFFFFFFE
> > [07Ch 0124 008h] Read Ack Write : 0000000000000001
> >
> > -Raw Table Data: Length 132 (0x84)
> > +[084h 0132 002h] Subtable Type : 000A [Generic Hardware Error Source V2]
> > +[086h 0134 002h] Source Id : 0001
> > +[088h 0136 002h] Related Source Id : FFFF
> > +[08Ah 0138 001h] Reserved : 00
> > +[08Bh 0139 001h] Enabled : 01
> > +[08Ch 0140 004h] Records To Preallocate : 00000001
> > +[090h 0144 004h] Max Sections Per Record : 00000001
> > +[094h 0148 004h] Max Raw Data Length : 00000400
> > +
> > +[098h 0152 00Ch] Error Status Address : [Generic Address Structure]
> > +[098h 0152 001h] Space ID : 00 [SystemMemory]
> > +[099h 0153 001h] Bit Width : 40
> > +[09Ah 0154 001h] Bit Offset : 00
> > +[09Bh 0155 001h] Encoded Access Width : 04 [QWord Access:64]
> > +[09Ch 0156 008h] Address : 0000000043DA0008
> > +
> > +[0A4h 0164 01Ch] Notify : [Hardware Error Notification Structure]
> > +[0A4h 0164 001h] Notify Type : 07 [GPIO]
> > +[0A5h 0165 001h] Notify Length : 1C
> > +[0A6h 0166 002h] Configuration Write Enable : 0000
> > +[0A8h 0168 004h] PollInterval : 00000000
> > +[0ACh 0172 004h] Vector : 00000000
> > +[0B0h 0176 004h] Polling Threshold Value : 00000000
> > +[0B4h 0180 004h] Polling Threshold Window : 00000000
> > +[0B8h 0184 004h] Error Threshold Value : 00000000
> > +[0BCh 0188 004h] Error Threshold Window : 00000000
> > +
> > +[0C0h 0192 004h] Error Status Block Length : 00000400
> > +[0C4h 0196 00Ch] Read Ack Register : [Generic Address Structure]
> > +[0C4h 0196 001h] Space ID : 00 [SystemMemory]
> > +[0C5h 0197 001h] Bit Width : 40
> > +[0C6h 0198 001h] Bit Offset : 00
> > +[0C7h 0199 001h] Encoded Access Width : 04 [QWord Access:64]
> > +[0C8h 0200 008h] Address : 0000000043DA0018
> >
> > - 0000: 48 45 53 54 84 00 00 00 01 E2 42 4F 43 48 53 20 // HEST......BOCHS
> > +[0D0h 0208 008h] Read Ack Preserve : FFFFFFFFFFFFFFFE
> > +[0D8h 0216 008h] Read Ack Write : 0000000000000001
> > +
> > +Raw Table Data: Length 224 (0xE0)
> > +
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > ---
> > tests/data/acpi/aarch64/virt/HEST | Bin 132 -> 224 bytes
> > 1 file changed, 0 insertions(+), 0 deletions(-)
> >
> > diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
> > index 4c5d8c5b5da5b3241f93cd0839e94272bf6b1486..674272922db7d48f7821aa7c83ec76bb3b556d2a 100644
> > GIT binary patch
> > delta 68
> > zcmZo+e89-%;TjzBfPsO5F=rx|6eH6_Rd+^#iMisuTnvm1|Nk>EGJ@nLCJHmL%S;Ru
> > WnV7)J#lXPAz`)?Zz#=g*R~!HcF%5eF
> >
> > delta 29
> > lcmaFB*uu!=;Tjy$!oa}5_-G=R6eHtARriT=I3|_|004Ge2nqlI
> >
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 19/21] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
2025-02-27 11:03 ` [PATCH v5 19/21] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
@ 2025-02-27 13:21 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:21 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Dongjiu Geng, linux-kernel
On Thu, 27 Feb 2025 12:03:49 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> While the HEST layout didn't change, there are some internal
> changes related to how offsets are calculated and how memory error
> events are triggered.
>
> Update specs to reflect such changes.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> docs/specs/acpi_hest_ghes.rst | 28 +++++++++++++++++-----------
> 1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
> index c3e9f8d9a702..f3cb3074b082 100644
> --- a/docs/specs/acpi_hest_ghes.rst
> +++ b/docs/specs/acpi_hest_ghes.rst
> @@ -89,12 +89,21 @@ Design Details
> addresses in the "error_block_address" fields with a pointer to the
> respective "Error Status Data Block" in the "etc/hardware_errors" blob.
>
> -(8) QEMU defines a third and write-only fw_cfg blob which is called
> - "etc/hardware_errors_addr". Through that blob, the firmware can send back
> - the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
> - blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
> - for the firmware. The firmware will write back the start address of
> - "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
> +(8) QEMU defines a third and write-only fw_cfg blob to store the location
> + where the error block offsets, read ack registers and CPER records are
> + stored.
> +
> + Up to QEMU 9.2, the location was at "etc/hardware_errors_addr", and
> + contains an offset for the beginning of "etc/hardware_errors".
s/offset/GPA/
> +
> + Newer versions place the location at "etc/acpi_table_hest_addr",
s/^^^^^^^^^^^/GPA or address/
> + pointing to the beginning of the HEST table.
> +
> + Through that such offsets, the firmware can send back the guest-side
^^^ see my previous s/offset/GPA/ comment on that
> + allocation addresses to QEMU. They contain a 8-byte entry. QEMU generates
> + a single WRITE_POINTER command for the firmware. The firmware will write
> + back the start address of either "etc/hardware_errors" or HEST table at
> + the correspoinding address firmware.
^^^^^^^^^^^^^^^^ what is "address firmware"?
perhaps it should be "fw_cfg file"?
>
> (9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
> "Error Status Data Block", guest memory, and then injects platform specific
> @@ -105,8 +114,5 @@ Design Details
> kernel, on receiving notification, guest APEI driver could read the CPER error
> and take appropriate action.
>
> -(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
> - find out "Error Status Data Block" entry corresponding to error source. So supported
> - source_id values should be assigned here and not be changed afterwards to make sure
> - that guest will write error into expected "Error Status Data Block" even if guest was
> - migrated to a newer QEMU.
> +(11) kvm_arch_on_sigbus_vcpu() report RAS errors via a SEA notifications,
> + when a SIGBUS event is triggered.
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 06/21] acpi/ghes: add a firmware file with HEST address
2025-02-27 11:03 ` [PATCH v5 06/21] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
@ 2025-02-27 13:23 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:23 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, linux-kernel
On Thu, 27 Feb 2025 12:03:36 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Store the HEST table address at a GPA, placing the start of the table in the
> hest_addr_le variable.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes.c | 20 +++++++++++++++++++-
> include/hw/acpi/ghes.h | 7 ++++++-
> 2 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 9243b5ad4acb..8ec423726b3f 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -30,6 +30,7 @@
>
> #define ACPI_HW_ERROR_FW_CFG_FILE "etc/hardware_errors"
> #define ACPI_HW_ERROR_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
> +#define ACPI_HEST_ADDR_FW_CFG_FILE "etc/acpi_table_hest_addr"
>
> /* The max size in bytes for one error block */
> #define ACPI_GHES_MAX_RAW_DATA_LENGTH (1 * KiB)
> @@ -341,6 +342,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> {
> AcpiTable table = { .sig = "HEST", .rev = 1,
> .oem_id = oem_id, .oem_table_id = oem_table_id };
> + uint32_t hest_offset;
> +
> + hest_offset = table_data->len;
>
> build_ghes_error_table(ags, hardware_errors, linker);
>
> @@ -352,6 +356,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
>
> acpi_table_end(linker, &table);
> +
> + if (ags->use_hest_addr) {
> + /*
> + * Tell firmware to write into GPA the address of HEST via fw_cfg,
> + * once initialized.
> + */
> + bios_linker_loader_write_pointer(linker,
> + ACPI_HEST_ADDR_FW_CFG_FILE, 0,
> + sizeof(uint64_t),
> + ACPI_BUILD_TABLE_FILE, hest_offset);
> + }
> }
>
> void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> @@ -361,7 +376,10 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
> hardware_error->len);
>
> - if (!ags->use_hest_addr) {
> + if (ags->use_hest_addr) {
> + fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
> + NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
> + } else {
> /* Create a read-write fw_cfg file for Address */
> fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 5000891f163f..38abe6e3db52 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -70,9 +70,14 @@ enum {
> * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
> * is stored at hw_error_le. This is the default on QEMU 9.x.
> *
> - * An GPA value equal to zero means that GHES is not present.
> + * When use_hest_addr is true, the stored offset is placed at hest_addr_le,
^^^^^ it's not an offset, it's a GPA;
please get rid of the offset language in this comment.
> + * meaning an offset from the HEST table address from etc/acpi/tables firmware.
> + * This is the default for QEMU 10.x and above.
> + *
> + * Whe both GPA values are equal to zero means that GHES is not present.
> */
> typedef struct AcpiGhesState {
> + uint64_t hest_addr_le;
> uint64_t hw_error_le;
> bool use_hest_addr; /* Currently, always false */
> } AcpiGhesState;
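To make the GPA-versus-offset point concrete, here is a minimal sketch of how the two fields could be consumed. It assumes the firmware writes each 8-byte value in little-endian order, as the _le suffix suggests; GhesStateSketch, le64_decode() and ghes_active_gpa() are hypothetical names, with le64_decode() playing the role that le64_to_cpu() plays in the patches:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for AcpiGhesState (not the QEMU type). */
typedef struct GhesStateSketch {
    uint64_t hest_addr_le;   /* GPA of the HEST table, little-endian */
    uint64_t hw_error_le;    /* GPA of etc/hardware_errors, little-endian */
    bool use_hest_addr;      /* selects which of the two fields is in use */
} GhesStateSketch;

/* Portable little-endian decode of a raw 8-byte value. */
static uint64_t le64_decode(uint64_t raw_le)
{
    uint8_t b[8];
    uint64_t v = 0;

    memcpy(b, &raw_le, sizeof(b));
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | b[i];
    }
    return v;
}

/* Returns the GPA the firmware wrote back for the active mode, or 0 if none. */
static uint64_t ghes_active_gpa(const GhesStateSketch *ags)
{
    assert(ags != NULL);
    return le64_decode(ags->use_hest_addr ? ags->hest_addr_le
                                          : ags->hw_error_le);
}
```

Either way, the stored value is an absolute guest physical address. Offsets only come into play later, when something is added to that address.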
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 05/21] acpi/ghes: prepare to change the way HEST offsets are calculated
2025-02-27 11:03 ` [PATCH v5 05/21] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
@ 2025-02-27 13:25 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:25 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
linux-kernel
On Thu, 27 Feb 2025 12:03:35 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Add a new ags flag to change the way HEST offsets are calculated.
> Currently, the offsets needed to store ACPI HEST offsets and read ack
> are calculated based on prior knowledge of the logic
> which creates the HEST table.
>
> Such logic is not generic: it neither allows adding more HEST entries
> easily nor replicates what OSPM does.
>
> As the next patches will be adding a more generic logic, add a
> new use_hest_addr, set to false, in preparation for such changes.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> hw/acpi/ghes.c | 39 ++++++++++++++++++++++++---------------
> hw/arm/virt-acpi-build.c | 14 +++++++++++---
> include/hw/acpi/ghes.h | 12 +++++++++++-
> 3 files changed, 46 insertions(+), 19 deletions(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 84b891fd3dcf..9243b5ad4acb 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -206,7 +206,8 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
> * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
> * See docs/specs/acpi_hest_ghes.rst for blobs format.
> */
> -static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
> +static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
> + BIOSLinker *linker)
> {
> int i, error_status_block_offset;
>
> @@ -251,13 +252,15 @@ static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
> i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
> }
>
> - /*
> - * tell firmware to write hardware_errors GPA into
> - * hardware_errors_addr fw_cfg, once the former has been initialized.
> - */
> - bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, 0,
> - sizeof(uint64_t),
> - ACPI_HW_ERROR_FW_CFG_FILE, 0);
> + if (!ags->use_hest_addr) {
> + /*
> + * Tell firmware to write hardware_errors GPA into
> + * hardware_errors_addr fw_cfg, once the former has been initialized.
> + */
> + bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE,
> + 0, sizeof(uint64_t),
> + ACPI_HW_ERROR_FW_CFG_FILE, 0);
> + }
> }
>
> /* Build Generic Hardware Error Source version 2 (GHESv2) */
> @@ -331,14 +334,15 @@ static void build_ghes_v2(GArray *table_data,
> }
>
> /* Build Hardware Error Source Table */
> -void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
> +void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> + GArray *hardware_errors,
> BIOSLinker *linker,
> const char *oem_id, const char *oem_table_id)
> {
> AcpiTable table = { .sig = "HEST", .rev = 1,
> .oem_id = oem_id, .oem_table_id = oem_table_id };
>
> - build_ghes_error_table(hardware_errors, linker);
> + build_ghes_error_table(ags, hardware_errors, linker);
>
> acpi_table_begin(&table, table_data);
>
> @@ -357,9 +361,11 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
> hardware_error->len);
>
> - /* Create a read-write fw_cfg file for Address */
> - fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> - NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
> + if (!ags->use_hest_addr) {
> + /* Create a read-write fw_cfg file for Address */
> + fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> + NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
> + }
> }
>
> static void get_hw_error_offsets(uint64_t ghes_addr,
> @@ -395,8 +401,11 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> }
>
> assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
> - get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> - &cper_addr, &read_ack_register_addr);
> +
> + if (!ags->use_hest_addr) {
> + get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> + &cper_addr, &read_ack_register_addr);
> + }
>
> cpu_physical_memory_read(read_ack_register_addr,
> &read_ack_register, sizeof(read_ack_register));
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 3ac8f8e17861..e6328af5d238 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -946,9 +946,17 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> build_dbg2(tables_blob, tables->linker, vms);
>
> if (vms->ras) {
> - acpi_add_table(table_offsets, tables_blob);
> - acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
> - vms->oem_id, vms->oem_table_id);
> + AcpiGedState *acpi_ged_state;
> + AcpiGhesState *ags;
> +
> + acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> + NULL));
> + ags = &acpi_ged_state->ghes_state;
> + if (ags) {
> + acpi_add_table(table_offsets, tables_blob);
> + acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> + tables->linker, vms->oem_id, vms->oem_table_id);
> + }
> }
>
> if (ms->numa_state->num_nodes > 0) {
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index f96ac3e85ca2..5000891f163f 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -64,11 +64,21 @@ enum {
> ACPI_GHES_ERROR_SOURCE_COUNT
> };
>
> +/*
> + * AcpiGhesState stores an offset that will be used to fill HEST entries.
s/offset/GPA/
> + *
> + * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
> + * is stored at hw_error_le. This is the default on QEMU 9.x.
> + *
> + * An GPA value equal to zero means that GHES is not present.
> + */
> typedef struct AcpiGhesState {
> uint64_t hw_error_le;
> + bool use_hest_addr; /* Currently, always false */
> } AcpiGhesState;
>
> -void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
> +void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> + GArray *hardware_errors,
> BIOSLinker *linker,
> const char *oem_id, const char *oem_table_id);
> void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v5 07/21] acpi/ghes: Use HEST table offsets when preparing GHES records
2025-02-27 11:03 ` [PATCH v5 07/21] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
@ 2025-02-27 13:27 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:27 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, linux-kernel
On Thu, 27 Feb 2025 12:03:37 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> There are two pointers that are needed during error injection:
>
> 1. The start address of the CPER block to be stored;
> 2. The address of the read ack.
>
> It is preferable to calculate them from the HEST table. This allows
> checking the source ID, the size of the table and the type of the
> HEST error block structures.
>
> Yet, keep the old code, as this is needed for migration purposes
> from older QEMU versions.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes.c | 100 +++++++++++++++++++++++++++++++++++++++++
> include/hw/acpi/ghes.h | 2 +-
> 2 files changed, 101 insertions(+), 1 deletion(-)
>
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 8ec423726b3f..5158418f93cb 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -41,6 +41,12 @@
> /* Address offset in Generic Address Structure(GAS) */
> #define GAS_ADDR_OFFSET 4
>
> +/*
> + * ACPI spec 1.0b
> + * 5.2.3 System Description Table Header
> + */
> +#define ACPI_DESC_HEADER_OFFSET 36
> +
> /*
> * The total size of Generic Error Data Entry
> * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> @@ -61,6 +67,30 @@
> */
> #define ACPI_GHES_GESB_SIZE 20
>
> +/*
> + * See the memory layout map at docs/specs/acpi_hest_ghes.rst.
> + */
> +
> +/*
> + * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
> + * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
> + */
> +#define HEST_GHES_V2_ENTRY_SIZE 92
> +
> +/*
> + * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
> + * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
> + * Read Ack Register
> + */
> +#define GHES_READ_ACK_ADDR_OFF 64
> +
> +/*
> + * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
> + * Table 18-341 Generic Hardware Error Source Structure
> + * Error Status Address
> + */
> +#define GHES_ERR_STATUS_ADDR_OFF 20
> +
> /*
> * Values for error_severity field
> */
> @@ -408,6 +438,73 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
> *read_ack_register_addr = ghes_addr + sizeof(uint64_t);
> }
>
> +static void get_ghes_source_offsets(uint16_t source_id,
> + uint64_t hest_addr,
> + uint64_t *cper_addr,
> + uint64_t *read_ack_start_addr,
> + Error **errp)
> +{
> + uint64_t hest_err_block_addr, hest_read_ack_addr;
> + uint64_t err_source_entry, error_block_addr;
> + uint32_t num_sources, i;
> +
> + hest_addr += ACPI_DESC_HEADER_OFFSET;
> +
> + cpu_physical_memory_read(hest_addr, &num_sources,
> + sizeof(num_sources));
> + num_sources = le32_to_cpu(num_sources);
> +
> + err_source_entry = hest_addr + sizeof(num_sources);
> +
> + /*
> + * Currently, HEST Error source navigates only for GHESv2 tables
> + */
> + for (i = 0; i < num_sources; i++) {
> + uint64_t addr = err_source_entry;
> + uint16_t type, src_id;
> +
> + cpu_physical_memory_read(addr, &type, sizeof(type));
> + type = le16_to_cpu(type);
> +
> + /* For now, we only know the size of GHESv2 table */
> + if (type != ACPI_GHES_SOURCE_GENERIC_ERROR_V2) {
> + error_setg(errp, "HEST: type %d not supported.", type);
> + return;
> + }
> +
> + /* Compare CPER source ID at the GHESv2 structure */
> + addr += sizeof(type);
> + cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
> + if (le16_to_cpu(src_id) == source_id) {
> + break;
> + }
> +
> + err_source_entry += HEST_GHES_V2_ENTRY_SIZE;
> + }
> + if (i == num_sources) {
> + error_setg(errp, "HEST: Source %d not found.", source_id);
> + return;
> + }
> +
> + /* Navigate through table address pointers */
> + hest_err_block_addr = err_source_entry + GHES_ERR_STATUS_ADDR_OFF +
> + GAS_ADDR_OFFSET;
> +
> + cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
> + sizeof(error_block_addr));
> + error_block_addr = le64_to_cpu(error_block_addr);
> +
> + cpu_physical_memory_read(error_block_addr, cper_addr,
> + sizeof(*cper_addr));
> + *cper_addr = le64_to_cpu(*cper_addr);
> +
> + hest_read_ack_addr = err_source_entry + GHES_READ_ACK_ADDR_OFF +
> + GAS_ADDR_OFFSET;
> + cpu_physical_memory_read(hest_read_ack_addr, read_ack_start_addr,
> + sizeof(*read_ack_start_addr));
> + *read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
> +}
> +
> void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> uint16_t source_id, Error **errp)
> {
> @@ -423,6 +520,9 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> if (!ags->use_hest_addr) {
> get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> &cper_addr, &read_ack_register_addr);
> + } else {
> + get_ghes_source_offsets(source_id, le64_to_cpu(ags->hest_addr_le),
> + &cper_addr, &read_ack_register_addr, errp);
> }
>
> cpu_physical_memory_read(read_ack_register_addr,
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 38abe6e3db52..dcc7288ffba5 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -79,7 +79,7 @@ enum {
> typedef struct AcpiGhesState {
> uint64_t hest_addr_le;
> uint64_t hw_error_le;
> - bool use_hest_addr; /* Currently, always false */
> + bool use_hest_addr; /* True if HEST address is present */
> } AcpiGhesState;
>
> void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
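For readers following the offsets, the table walk done by get_ghes_source_offsets() can be sketched in Python against a byte buffer that stands in for guest memory. This is only an illustration under stated assumptions, not QEMU code: the size and offset constants are copied from the patch above, the value 10 for the GHESv2 source type is an assumption based on the ACPI HEST source type numbering, and find_ghes_source() and build_fake_memory() are hypothetical helper names.

```python
import struct

# Sizes and offsets copied from the patch quoted above.
ACPI_DESC_HEADER_OFFSET = 36
HEST_GHES_V2_ENTRY_SIZE = 92
GHES_ERR_STATUS_ADDR_OFF = 20
GHES_READ_ACK_ADDR_OFF = 64
GAS_ADDR_OFFSET = 4
# Assumption: HEST source type 10 identifies a GHESv2 entry.
GHES_V2_TYPE = 10


def find_ghes_source(mem, hest_addr, source_id):
    """Return (cper_addr, read_ack_addr) for source_id in a HEST image."""
    pos = hest_addr + ACPI_DESC_HEADER_OFFSET
    (num_sources,) = struct.unpack_from("<I", mem, pos)
    entry = pos + 4
    for _ in range(num_sources):
        src_type, src_id = struct.unpack_from("<HH", mem, entry)
        # Only the GHESv2 entry size is known, so stop on anything else.
        if src_type != GHES_V2_TYPE:
            raise ValueError(f"HEST: type {src_type} not supported")
        if src_id == source_id:
            break
        entry += HEST_GHES_V2_ENTRY_SIZE
    else:
        raise LookupError(f"HEST: source {source_id} not found")

    # Error Status Address GAS -> error block address -> CPER address.
    gas = entry + GHES_ERR_STATUS_ADDR_OFF + GAS_ADDR_OFFSET
    (err_block_addr,) = struct.unpack_from("<Q", mem, gas)
    (cper_addr,) = struct.unpack_from("<Q", mem, err_block_addr)

    # Read Ack Register GAS -> read ack address.
    gas = entry + GHES_READ_ACK_ADDR_OFF + GAS_ADDR_OFFSET
    (read_ack_addr,) = struct.unpack_from("<Q", mem, gas)
    return cper_addr, read_ack_addr


def build_fake_memory(source_ids):
    """Build a little-endian fake memory image with one entry per source."""
    mem = bytearray(4096)
    hest_addr = 0x100
    pos = hest_addr + ACPI_DESC_HEADER_OFFSET
    struct.pack_into("<I", mem, pos, len(source_ids))
    entry = pos + 4
    for n, sid in enumerate(source_ids):
        struct.pack_into("<HH", mem, entry, GHES_V2_TYPE, sid)
        err_block = 0x800 + 8 * n
        gas = entry + GHES_ERR_STATUS_ADDR_OFF + GAS_ADDR_OFFSET
        struct.pack_into("<Q", mem, gas, err_block)
        struct.pack_into("<Q", mem, err_block, 0xC00 + 0x100 * n)
        gas = entry + GHES_READ_ACK_ADDR_OFF + GAS_ADDR_OFFSET
        struct.pack_into("<Q", mem, gas, 0xA00 + 8 * n)
        entry += HEST_GHES_V2_ENTRY_SIZE
    return mem, hest_addr
```

With an image built by build_fake_memory([0, 1]), looking up source 1 skips one 92-byte entry and returns the addresses stored for the second entry; an unknown source ID raises LookupError, mirroring the "Source %d not found" error path in the patch.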
* Re: [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
` (20 preceding siblings ...)
2025-02-27 11:03 ` [PATCH v5 21/21] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
@ 2025-02-27 13:30 ` Igor Mammedov
2025-02-27 15:13 ` Mauro Carvalho Chehab
21 siblings, 1 reply; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Thu, 27 Feb 2025 12:03:30 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Now that the ghes preparation patches were merged, let's add support
> for error injection.
>
> On this version, HEST table got added to ACPI tables testing for aarch64 virt.
>
> There are also some patch reorder to help reviewers to check the changes.
>
> The code itself is almost identical to v4, with just a few minor nits addressed.
The series still has checkpatch 'line over 80' errors that are not false positives;
they need to be fixed.
>
> ---
> v5:
> - make checkpatch happier;
> - HEST table is now tested;
> - some changes at HEST spec documentation to align with code changes;
> - extra care was taken with regards to git bisectability.
>
> v4:
> - added an extra comment for AcpiGhesState structure;
> - patches reordered;
> - no functional changes, just code shift between the patches in this series.
>
> v3:
> - addressed more nits;
> - hest_add_le now points to the beginning of HEST table;
> - removed HEST from tests/data/acpi;
> - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
>
> v2:
> - address some nits;
> - improved ags cleanup patch and removed ags.present field;
> - added some missing le*_to_cpu() calls;
> - update date at copyright for new files to 2024-2025;
> - qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
> - added HEST and DSDT tables after the changes to make check target happy.
> (two patches: first one whitelisting such tables; second one removing from
> whitelist and updating/adding such tables to tests/data/acpi)
>
>
> Mauro Carvalho Chehab (21):
> tests/acpi: virt: add an empty HEST file
> tests/qtest/bios-tables-test: extend to also check HEST table
> tests/acpi: virt: update HEST file with its current data
> acpi/ghes: Cleanup the code which gets ghes ged state
> acpi/ghes: prepare to change the way HEST offsets are calculated
> acpi/ghes: add a firmware file with HEST address
> acpi/ghes: Use HEST table offsets when preparing GHES records
> acpi/ghes: don't hard-code the number of sources for HEST table
> acpi/ghes: add a notifier to notify when error data is ready
> acpi/ghes: create an ancillary acpi_ghes_get_state() function
> acpi/generic_event_device: Update GHES migration to cover hest addr
> acpi/generic_event_device: add logic to detect if HEST addr is
> available
> acpi/generic_event_device: add an APEI error device
> tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
> arm/virt: Wire up a GED error device for ACPI / GHES
> qapi/acpi-hest: add an interface to do generic CPER error injection
> tests/acpi: virt: update HEST table to accept two sources
> tests/acpi: virt: and update DSDT table to add the new GED device
> docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
> acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
> scripts/ghes_inject: add a script to generate GHES error inject
>
> MAINTAINERS | 10 +
> docs/specs/acpi_hest_ghes.rst | 28 +-
> hw/acpi/Kconfig | 5 +
> hw/acpi/aml-build.c | 10 +
> hw/acpi/generic_event_device.c | 43 ++
> hw/acpi/ghes-stub.c | 7 +-
> hw/acpi/ghes.c | 231 ++++--
> hw/acpi/ghes_cper.c | 38 +
> hw/acpi/ghes_cper_stub.c | 19 +
> hw/acpi/meson.build | 2 +
> hw/arm/virt-acpi-build.c | 36 +-
> hw/arm/virt.c | 19 +-
> hw/core/machine.c | 2 +
> include/hw/acpi/acpi_dev_interface.h | 1 +
> include/hw/acpi/aml-build.h | 2 +
> include/hw/acpi/generic_event_device.h | 1 +
> include/hw/acpi/ghes.h | 52 +-
> include/hw/arm/virt.h | 2 +
> qapi/acpi-hest.json | 35 +
> qapi/meson.build | 1 +
> qapi/qapi-schema.json | 1 +
> scripts/arm_processor_error.py | 476 ++++++++++++
> scripts/ghes_inject.py | 51 ++
> scripts/qmp_helper.py | 702 ++++++++++++++++++
> target/arm/kvm.c | 7 +-
> tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> tests/data/acpi/aarch64/virt/HEST | Bin 0 -> 224 bytes
> tests/qtest/bios-tables-test.c | 2 +-
> 32 files changed, 1692 insertions(+), 91 deletions(-)
> create mode 100644 hw/acpi/ghes_cper.c
> create mode 100644 hw/acpi/ghes_cper_stub.c
> create mode 100644 qapi/acpi-hest.json
> create mode 100644 scripts/arm_processor_error.py
> create mode 100755 scripts/ghes_inject.py
> create mode 100755 scripts/qmp_helper.py
> create mode 100644 tests/data/acpi/aarch64/virt/HEST
>
* Re: [PATCH v5 12/21] acpi/generic_event_device: add logic to detect if HEST addr is available
2025-02-27 11:03 ` [PATCH v5 12/21] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
@ 2025-02-27 13:33 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:33 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha,
Eduardo Habkost, Marcel Apfelbaum, Peter Maydell, Shannon Zhao,
Yanan Wang, Zhao Liu, linux-kernel
On Thu, 27 Feb 2025 12:03:42 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Create a new property (x-has-hest-addr) and use it to detect if
> the GHES table offsets can be calculated from the HEST address
> (qemu 10.0 and upper) or via the legacy way via an offset obtained
> from the hardware_errors firmware file.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
with checkpatch issues fixed
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/generic_event_device.c | 1 +
> hw/arm/virt-acpi-build.c | 18 ++++++++++++++++--
> hw/core/machine.c | 2 ++
> 3 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index 5346cae573b7..14d8513a5440 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -318,6 +318,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>
> static const Property acpi_ged_properties[] = {
> DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
> + DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState, ghes_state.use_hest_addr, false),
> };
>
> static const VMStateDescription vmstate_memhp_state = {
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index af5056201c22..03ee30b3b3f0 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -897,6 +897,10 @@ static const AcpiNotificationSourceId hest_ghes_notify[] = {
> { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
> };
>
> +static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
> + { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
> +};
> +
> static
> void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> {
> @@ -951,6 +955,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>
> if (vms->ras) {
> AcpiGedState *acpi_ged_state;
> + static const AcpiNotificationSourceId *notify;
> + unsigned int notify_sz;
> AcpiGhesState *ags;
>
> acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> @@ -958,9 +964,17 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> ags = &acpi_ged_state->ghes_state;
> if (ags) {
> acpi_add_table(table_offsets, tables_blob);
> +
> + if (!ags->use_hest_addr) {
> + notify = hest_ghes_notify_9_2;
> + notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
> + } else {
> + notify = hest_ghes_notify;
> + notify_sz = ARRAY_SIZE(hest_ghes_notify);
> + }
> +
> acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> - tables->linker, hest_ghes_notify,
> - ARRAY_SIZE(hest_ghes_notify),
> + tables->linker, notify, notify_sz,
> vms->oem_id, vms->oem_table_id);
> }
> }
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 02cff735b3fb..7a11e0f87b11 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -34,6 +34,7 @@
> #include "hw/virtio/virtio-pci.h"
> #include "hw/virtio/virtio-net.h"
> #include "hw/virtio/virtio-iommu.h"
> +#include "hw/acpi/generic_event_device.h"
> #include "audio/audio.h"
>
> GlobalProperty hw_compat_9_2[] = {
> @@ -43,6 +44,7 @@ GlobalProperty hw_compat_9_2[] = {
> { "virtio-balloon-pci-non-transitional", "vectors", "0" },
> { "virtio-mem-pci", "vectors", "0" },
> { "migration", "multifd-clean-tls-termination", "false" },
> + { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
> };
> const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
>
* Re: [PATCH v5 14/21] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
2025-02-27 11:03 ` [PATCH v5 14/21] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
@ 2025-02-27 13:34 ` Igor Mammedov
0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2025-02-27 13:34 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, linux-kernel
On Thu, 27 Feb 2025 12:03:44 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> We'll be adding a new GED device for HEST GPIO notification and
> increasing the number of entries at the HEST table.
>
> Blocklist testing HEST and DSDT tables until such changes
> are completed.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
> ---
> tests/qtest/bios-tables-test-allowed-diff.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
> index dfb8523c8bf4..0a1a26543ba2 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1 +1,7 @@
> /* List of comma-separated changed AML files to ignore */
> +"tests/data/acpi/aarch64/virt/HEST",
> +"tests/data/acpi/aarch64/virt/DSDT",
> +"tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt",
> +"tests/data/acpi/aarch64/virt/DSDT.memhp",
> +"tests/data/acpi/aarch64/virt/DSDT.pxb",
> +"tests/data/acpi/aarch64/virt/DSDT.topology",
* Re: [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject
2025-02-27 13:30 ` [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for " Igor Mammedov
@ 2025-02-27 15:13 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 15:13 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
Em Thu, 27 Feb 2025 14:30:28 +0100
Igor Mammedov <imammedo@redhat.com> escreveu:
> On Thu, 27 Feb 2025 12:03:30 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Now that the ghes preparation patches were merged, let's add support
> > for error injection.
> >
> > On this version, HEST table got added to ACPI tables testing for aarch64 virt.
> >
> > There are also some patch reorder to help reviewers to check the changes.
> >
> > The code itself is almost identical to v4, with just a few minor nits addressed.
>
> The series still has checkpatch 'line over 80' errors that are not false positives;
> they need to be fixed.
The long-line warnings are in the patch adding the Python script. IMO,
all but one are false positives:
1. Long lines in the patch description, caused by the tool output example
added inside the commit description:
ERROR: line over 90 characters
#148: FILE: scripts/arm_processor_error.py:83:
+[Hardware Error]: bus error, operation type: Generic read (type of instruction or data request cannot be determined)
ERROR: line over 90 characters
#153: FILE: scripts/arm_processor_error.py:88:
+[Hardware Error]: Program execution can be restarted reliably at the PC associated with the error.
WARNING: line over 80 characters
#170: FILE: scripts/arm_processor_error.py:105:
+[Hardware Error]: 00000000: 13 7b 04 05 01 .{...
WARNING: line over 80 characters
#174: FILE: scripts/arm_processor_error.py:109:
+[Firmware Warn]: GHES: Unhandled processor error type 0x10: micro-architectural error
ERROR: line over 90 characters
#175: FILE: scripts/arm_processor_error.py:110:
+[Firmware Warn]: GHES: Unhandled processor error type 0x14: TLB error|micro-architectural error
IMO, breaking command output in the description is bad practice.
2. Long strings in help messages:
WARNING: line over 80 characters
#261: FILE: scripts/arm_processor_error.py:196:
+ help="Power State Coordination Interface - PSCI state")
ERROR: line over 90 characters
#276: FILE: scripts/arm_processor_error.py:211:
+ help="Number of errors: 0: Single error, 1: Multiple errors, 2-65535: Error count if known")
WARNING: line over 80 characters
#278: FILE: scripts/arm_processor_error.py:213:
+ help="Error information (UEFI 2.10 tables N.18 to N.20)")
ERROR: line over 90 characters
#287: FILE: scripts/arm_processor_error.py:222:
+ help="Type of the context (0=ARM32 GPR, 5=ARM64 EL1, other values supported)")
WARNING: line over 80 characters
#1046: FILE: scripts/qmp_helper.py:442:
+ help="Marks the timestamp as precise if --timestamp is used")
WARNING: line over 80 characters
#1048: FILE: scripts/qmp_helper.py:444:
+ help=f"General Error Data Block flags: {gedb_flags_bits}")
Those could be changed by adding one variable per string to hold the
help lines, at the expense of some code obfuscation.
I don't think that is a good idea.
3. Long class names that are part of Python's standard library:
ERROR: line over 90 characters
#576: FILE: scripts/ghes_inject.py:29:
+ parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter,
We can't change the long name of the argparse formatter. The only
possible fix would be to obfuscate it by doing:
format = argparse.ArgumentDefaultsHelpFormatter
parser = argparse.ArgumentParser(formatter_class=format,
IMO this is bad practice.
4. Comments disabling false-positive warnings from the pylint coding style tool:
ERROR: line over 90 characters
#805: FILE: scripts/qmp_helper.py:201:
+ data.extend(value.to_bytes(num_bytes, byteorder="little")) # pylint: disable=E1101
WARNING: line over 80 characters
#1028: FILE: scripts/qmp_helper.py:424:
+ g_gen = parser.add_argument_group("Generic Error Data") # pylint: disable=E1101
AFAICT, those need to be on the same line for pylint to process them
properly.
5. A long name inside an indented block:
WARNING: line over 80 characters
#1109: FILE: scripts/qmp_helper.py:505:
+ value=args.gen_err_valid_bits,
Again, the only solution would be to obfuscate the argument, like:
a = args.gen_err_valid_bits
value=a,
Not nice, IMHO.
Now, there is one warning that is not a false positive, which I ended up
missing:
WARNING: line over 80 characters
#1227: FILE: scripts/qmp_helper.py:623:
+ ret = self.send_cmd("qom-get", args, may_open=True, return_error=False)
I'll fix it in the next respin.
Regards,
Mauro
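As a side note on items 2 and 3 above, one option that avoids throwaway variables is to break after the opening parenthesis and rely on Python's implicit concatenation of adjacent string literals. The sketch below only illustrates that technique; build_parser() and the --multiple-error option name are hypothetical and are not taken from the scripts in this series, and whether the result reads better than a single long line remains a judgment call.

```python
import argparse


def build_parser():
    """Build a small parser whose source lines stay under 80 columns."""
    parser = argparse.ArgumentParser(
        # Breaking after the parenthesis keeps the long class name intact.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description="Hypothetical error injection front end",
    )
    parser.add_argument(
        "--multiple-error",  # hypothetical option name
        type=int,
        default=0,
        # Adjacent string literals are joined into one help string.
        help="Number of errors: 0: Single error, 1: Multiple errors, "
             "2-65535: Error count if known",
    )
    return parser
```

Calling build_parser().parse_args(["--multiple-error", "3"]) yields a namespace whose multiple_error attribute is 3, and the help text is rendered from the single joined string.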
* Re: [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources
2025-02-27 13:16 ` Igor Mammedov
@ 2025-02-27 15:51 ` Mauro Carvalho Chehab
2025-02-27 15:56 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 15:51 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, linux-kernel
Em Thu, 27 Feb 2025 14:16:03 +0100
Igor Mammedov <imammedo@redhat.com> escreveu:
> On Thu, 27 Feb 2025 14:10:38 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
>
> > On Thu, 27 Feb 2025 12:03:47 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
> > squash this patch into the next one
> >
> > Also, at this point there are no visible HEST changes yet, so as soon as you remove
> > the white-list without enabling the new HEST, the tests should start failing.
> >
> > I suggest to move 20/21 before this patch,
> > as result one would see dsdt and hest diffs when running tests
> > and then you can use rebuild-expected-aml.sh to generate updated
> > tables and update them in one patch (that's what we typically do,
> > we don't split updates in increments).
>
> on top of that,
> it seems the patch doesn't apply for some reason.
Hmm... perhaps the diffstat that I place here (produced by bios-tables-test
output) is causing some confusion when you're trying to apply the patch.
Any suggestions to avoid that?
Thanks,
Mauro
* Re: [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources
2025-02-27 15:51 ` Mauro Carvalho Chehab
@ 2025-02-27 15:56 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 38+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 15:56 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, linux-kernel
Em Thu, 27 Feb 2025 16:51:24 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:
> Em Thu, 27 Feb 2025 14:16:03 +0100
> Igor Mammedov <imammedo@redhat.com> escreveu:
>
> > On Thu, 27 Feb 2025 14:10:38 +0100
> > Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > > On Thu, 27 Feb 2025 12:03:47 +0100
> > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > >
> > > squash this patch into the next one
> > >
> > > Also, at this point there are no visible HEST changes yet, so as soon as you remove
> > > the white-list without enabling the new HEST, the tests should start failing.
> > >
> > > I suggest to move 20/21 before this patch,
> > > as result one would see dsdt and hest diffs when running tests
> > > and then you can use rebuild-expected-aml.sh to generate updated
> > > tables and update them in one patch (that's what we typically do,
> > > we don't split updates in increments).
> >
> > on top of that,
> > it seems the patch doesn't apply for some reason.
>
> Hmm... perhaps the diffstat that I place here (produced by bios-tables-test
> output) is causing some confusion when you're trying to apply the patch.
>
> Any suggestions to avoid that?
Never mind. I fixed it by removing the file name before the diff, i.e.
the description is now:
tests/acpi: virt: update HEST and DSDT tables
- The HEST table now accepts two sources;
- The DSDT tables now have a GED error device.
@@ -1,39 +1,39 @@
/*
* Intel ACPI Component Architecture
* AML/ASL+ Disassembler version 20240322 (64-bit version)
* Copyright (c) 2000
...
Regards,
Mauro
Thread overview: 38+ messages
2025-02-27 11:03 [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 01/21] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
2025-02-27 12:02 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 02/21] tests/qtest/bios-tables-test: extend to also check HEST table Mauro Carvalho Chehab
2025-02-27 12:03 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 03/21] tests/acpi: virt: update HEST file with its current data Mauro Carvalho Chehab
2025-02-27 12:03 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 04/21] acpi/ghes: Cleanup the code which gets ghes ged state Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 05/21] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
2025-02-27 13:25 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 06/21] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
2025-02-27 13:23 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 07/21] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
2025-02-27 13:27 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 08/21] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 09/21] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 10/21] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
2025-02-27 11:31 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 11/21] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 12/21] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
2025-02-27 13:33 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 13/21] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 14/21] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
2025-02-27 13:34 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 15/21] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 16/21] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 17/21] tests/acpi: virt: update HEST table to accept two sources Mauro Carvalho Chehab
2025-02-27 13:10 ` Igor Mammedov
2025-02-27 13:16 ` Igor Mammedov
2025-02-27 15:51 ` Mauro Carvalho Chehab
2025-02-27 15:56 ` Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 18/21] tests/acpi: virt: and update DSDT table to add the new GED device Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 19/21] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
2025-02-27 13:21 ` Igor Mammedov
2025-02-27 11:03 ` [PATCH v5 20/21] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x Mauro Carvalho Chehab
2025-02-27 11:03 ` [PATCH v5 21/21] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
2025-02-27 13:30 ` [PATCH v5 00/21]Change ghes to use HEST-based offsets and add support for " Igor Mammedov
2025-02-27 15:13 ` Mauro Carvalho Chehab