linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject
@ 2025-03-07 19:14 Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 01/20] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
                   ` (21 more replies)
  0 siblings, 22 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Gavin Shan,
	Ani Sinha, Cleber Rosa, Dongjiu Geng, Eduardo Habkost, Eric Blake,
	John Snow, Marcel Apfelbaum, Markus Armbruster, Michael Roth,
	Paolo Bonzini, Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu,
	kvm, linux-kernel

Hi Michael,

I'm sending v8 to avoid a merge conflict with v7 due to this
changeset:

   611f3bdb20f7 ("hw/acpi/ghes: Make  static")

As ghes_record_cper_errors() was written since the beginning
to be public and used by ghes-cper.c. It ended being meged
earlier because the error-injection series become too big,
so it was decided last year to split in two to make easier for
reviewers and maintainers to discuss.

Anyway, as mentioned on v7, I guess we're ready to merge this
series, as patches here have been thoughfully reviewed mainly 
by Igor and Jonathan over the last 5-6 months.

The only change from v7 is a minor editorial change at HEST doc
spec, and the addition of Igor and Jonathan's A-B/R-B.

This series change the way HEST table offsets are calculated,
making them identical to what an OSPM would do and allowing
multiple HEST entries without causing migration issues. It open
space to add HEST support for non-arm architectures, as now
the number and type of HEST notification entries are not
hardcoded at ghes.c. Instead, they're passed as a parameter
from the arch-dependent init code.

With such issue addressed, it adds a new notification type and
add support to inject errors via a Python script. The script
itself is at the final patch.

---
v8:
  - added a patch to revert recently-added changeset causing a
    conflict with these. All remaining patches are identical.

v7:
  - minor editorial change at the patch updating HEST doc spec
   with the new workflow

v6:
- some minor nits addressed:
   - use GPA instead of offset;
   - merged two patches;
   - fixed a couple of long line coding style issues;
   - the HEST/DSDT diff inside a patch was changed to avoid troubles
     applying it.

v5:
- make checkpatch happier;
- HEST table is now tested;
- some changes at HEST spec documentation to align with code changes;
- extra care was taken with regards to git bisectability.

v4:
- added an extra comment for AcpiGhesState structure;
- patches reordered;
- no functional changes, just code shift between the patches in this series.

v3:
- addressed more nits;
- hest_add_le now points to the beginning of HEST table;
- removed HEST from tests/data/acpi;
- added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le

v2: 
- address some nits;
- improved ags cleanup patch and removed ags.present field;
- added some missing le*_to_cpu() calls;
- update date at copyright for new files to 2024-2025;
- qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
- added HEST and DSDT tables after the changes to make check target happy.
  (two patches: first one whitelisting such tables; second one removing from
   whitelist and updating/adding such tables to tests/data/acpi)

Mauro Carvalho Chehab (20):
  tests/acpi: virt: add an empty HEST file
  tests/qtest/bios-tables-test: extend to also check HEST table
  tests/acpi: virt: update HEST file with its current data
  Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
  acpi/ghes: Cleanup the code which gets ghes ged state
  acpi/ghes: prepare to change the way HEST offsets are calculated
  acpi/ghes: add a firmware file with HEST address
  acpi/ghes: Use HEST table offsets when preparing GHES records
  acpi/ghes: don't hard-code the number of sources for HEST table
  acpi/ghes: add a notifier to notify when error data is ready
  acpi/generic_event_device: Update GHES migration to cover hest addr
  acpi/generic_event_device: add logic to detect if HEST addr is
    available
  acpi/generic_event_device: add an APEI error device
  tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
  arm/virt: Wire up a GED error device for ACPI / GHES
  qapi/acpi-hest: add an interface to do generic CPER error injection
  acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
  tests/acpi: virt: update HEST and DSDT tables
  docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
  scripts/ghes_inject: add a script to generate GHES error inject

 MAINTAINERS                                   |  10 +
 docs/specs/acpi_hest_ghes.rst                 |  28 +-
 hw/acpi/Kconfig                               |   5 +
 hw/acpi/aml-build.c                           |  10 +
 hw/acpi/generic_event_device.c                |  44 ++
 hw/acpi/ghes-stub.c                           |   7 +-
 hw/acpi/ghes.c                                | 233 ++++--
 hw/acpi/ghes_cper.c                           |  38 +
 hw/acpi/ghes_cper_stub.c                      |  19 +
 hw/acpi/meson.build                           |   2 +
 hw/arm/virt-acpi-build.c                      |  35 +-
 hw/arm/virt.c                                 |  19 +-
 hw/core/machine.c                             |   2 +
 include/hw/acpi/acpi_dev_interface.h          |   1 +
 include/hw/acpi/aml-build.h                   |   2 +
 include/hw/acpi/generic_event_device.h        |   1 +
 include/hw/acpi/ghes.h                        |  51 +-
 include/hw/arm/virt.h                         |   2 +
 qapi/acpi-hest.json                           |  35 +
 qapi/meson.build                              |   1 +
 qapi/qapi-schema.json                         |   1 +
 scripts/arm_processor_error.py                | 476 ++++++++++++
 scripts/ghes_inject.py                        |  51 ++
 scripts/qmp_helper.py                         | 703 ++++++++++++++++++
 target/arm/kvm.c                              |   7 +-
 tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
 .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
 tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
 tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
 tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
 tests/data/acpi/aarch64/virt/HEST             | Bin 0 -> 224 bytes
 tests/qtest/bios-tables-test.c                |   2 +-
 32 files changed, 1695 insertions(+), 90 deletions(-)
 create mode 100644 hw/acpi/ghes_cper.c
 create mode 100644 hw/acpi/ghes_cper_stub.c
 create mode 100644 qapi/acpi-hest.json
 create mode 100644 scripts/arm_processor_error.py
 create mode 100755 scripts/ghes_inject.py
 create mode 100755 scripts/qmp_helper.py
 create mode 100644 tests/data/acpi/aarch64/virt/HEST

-- 
2.48.1



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v8 01/20] tests/acpi: virt: add an empty HEST file
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 02/20] tests/qtest/bios-tables-test: extend to also check HEST table Mauro Carvalho Chehab
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

Such file will be used to track HEST table changes.

For now, disallow HEST table check until we update it to the
current data.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 tests/data/acpi/aarch64/virt/HEST           | 0
 tests/qtest/bios-tables-test-allowed-diff.h | 1 +
 2 files changed, 1 insertion(+)
 create mode 100644 tests/data/acpi/aarch64/virt/HEST

diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8bf4..39901c58d647 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/aarch64/virt/HEST",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 02/20] tests/qtest/bios-tables-test: extend to also check HEST table
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 01/20] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 03/20] tests/acpi: virt: update HEST file with its current data Mauro Carvalho Chehab
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

Currently, aarch64 can generate a HEST table when loaded with
-machine ras=on. Add support for it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 tests/qtest/bios-tables-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 0a333ec43536..8d41601cc9e9 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -2122,7 +2122,7 @@ static void test_acpi_aarch64_virt_tcg(void)
 
     data.smbios_cpu_max_speed = 2900;
     data.smbios_cpu_curr_speed = 2700;
-    test_acpi_one("-cpu cortex-a57 "
+    test_acpi_one("-cpu cortex-a57 -machine ras=on "
                   "-smbios type=4,max-speed=2900,current-speed=2700", &data);
     free_test_data(&data);
 }
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 03/20] tests/acpi: virt: update HEST file with its current data
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 01/20] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 02/20] tests/qtest/bios-tables-test: extend to also check HEST table Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 04/20] Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static" Mauro Carvalho Chehab
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

Now that HEST table is checked for aarch64, add the current
firmware file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 tests/data/acpi/aarch64/virt/HEST           | Bin 0 -> 132 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |   1 -
 2 files changed, 1 deletion(-)

diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..4c5d8c5b5da5b3241f93cd0839e94272bf6b1486 100644
GIT binary patch
literal 132
zcmeZp4Gw8xU|?W;<mB({5v<@85#X$#prF9Wz`y`vgJ=-uVqjqS|DS;o#%Ew*U|?_n
dk++-~7#J8hWI!Yi09DHYRr~Kh1c1x}0RY>66afGL

literal 0
HcmV?d00001

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 39901c58d647..dfb8523c8bf4 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/aarch64/virt/HEST",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 04/20] Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 03/20] tests/acpi: virt: update HEST file with its current data Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 21:02   ` Philippe Mathieu-Daudé
  2025-03-07 19:14 ` [PATCH v8 05/20] acpi/ghes: Cleanup the code which gets ghes ged state Mauro Carvalho Chehab
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, linux-kernel

The ghes_record_cper_errors() function was introduced to be used
by other types of errors, as part of the error injection
patch series. That's why it is not static.

Make it non-static again to allow its usage outside ghes.c

This reverts commit 611f3bdb20f7828b0813aa90d47d1275ef18329b.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/acpi/ghes.c         | 6 ++++--
 include/hw/acpi/ghes.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index b85bb48195a0..b709c177cdea 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -390,8 +390,8 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
     *read_ack_register_addr = ghes_addr + sizeof(uint64_t);
 }
 
-static void ghes_record_cper_errors(const void *cper, size_t len,
-                                    uint16_t source_id, Error **errp)
+void ghes_record_cper_errors(const void *cper, size_t len,
+                             uint16_t source_id, Error **errp)
 {
     uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
     AcpiGedState *acpi_ged_state;
@@ -440,6 +440,8 @@ static void ghes_record_cper_errors(const void *cper, size_t len,
 
     /* Write the generic error data entry into guest memory */
     cpu_physical_memory_write(cper_addr, cper, len);
+
+    return;
 }
 
 int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 578a582203ce..39619a2457cb 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -75,6 +75,8 @@ void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
 int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
+void ghes_record_cper_errors(const void *cper, size_t len,
+                             uint16_t source_id, Error **errp);
 
 /**
  * acpi_ghes_present: Report whether ACPI GHES table is present
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 05/20] acpi/ghes: Cleanup the code which gets ghes ged state
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (3 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 04/20] Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static" Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 06/20] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, Paolo Bonzini, Peter Maydell, kvm,
	linux-kernel

Move the check logic into a common function and simplify the
code which checks if GHES is enabled and was properly setup.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/ghes-stub.c    |  7 ++++---
 hw/acpi/ghes.c         | 38 +++++++++++---------------------------
 include/hw/acpi/ghes.h | 14 +++++++-------
 target/arm/kvm.c       |  7 +++++--
 4 files changed, 27 insertions(+), 39 deletions(-)

diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..40f660c246fe 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -11,12 +11,13 @@
 #include "qemu/osdep.h"
 #include "hw/acpi/ghes.h"
 
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+                            uint64_t physical_address)
 {
     return -1;
 }
 
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
-    return false;
+    return NULL;
 }
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index b709c177cdea..84b891fd3dcf 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -360,18 +360,12 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
     /* Create a read-write fw_cfg file for Address */
     fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
         NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
-
-    ags->present = true;
 }
 
 static void get_hw_error_offsets(uint64_t ghes_addr,
                                  uint64_t *cper_addr,
                                  uint64_t *read_ack_register_addr)
 {
-    if (!ghes_addr) {
-        return;
-    }
-
     /*
      * non-HEST version supports only one source, so no need to change
      * the start offset based on the source ID. Also, we can't validate
@@ -390,35 +384,20 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
     *read_ack_register_addr = ghes_addr + sizeof(uint64_t);
 }
 
-void ghes_record_cper_errors(const void *cper, size_t len,
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
                              uint16_t source_id, Error **errp)
 {
     uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
-    AcpiGedState *acpi_ged_state;
-    AcpiGhesState *ags;
 
     if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
         error_setg(errp, "GHES CPER record is too big: %zd", len);
         return;
     }
 
-    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
-                                                       NULL));
-    if (!acpi_ged_state) {
-        error_setg(errp, "Can't find ACPI_GED object");
-        return;
-    }
-    ags = &acpi_ged_state->ghes_state;
-
     assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
     get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
                          &cper_addr, &read_ack_register_addr);
 
-    if (!cper_addr) {
-        error_setg(errp, "can not find Generic Error Status Block");
-        return;
-    }
-
     cpu_physical_memory_read(read_ack_register_addr,
                              &read_ack_register, sizeof(read_ack_register));
 
@@ -444,7 +423,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
     return;
 }
 
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+                            uint64_t physical_address)
 {
     /* Memory Error Section Type */
     const uint8_t guid[] =
@@ -470,7 +450,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     acpi_ghes_build_append_mem_cper(block, physical_address);
 
     /* Report the error */
-    ghes_record_cper_errors(block->data, block->len, source_id, &errp);
+    ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
 
     g_array_free(block, true);
 
@@ -482,7 +462,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     return 0;
 }
 
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
     AcpiGedState *acpi_ged_state;
     AcpiGhesState *ags;
@@ -491,8 +471,12 @@ bool acpi_ghes_present(void)
                                                        NULL));
 
     if (!acpi_ged_state) {
-        return false;
+        return NULL;
     }
     ags = &acpi_ged_state->ghes_state;
-    return ags->present;
+
+    if (!ags->hw_error_le) {
+        return NULL;
+    }
+    return ags;
 }
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 39619a2457cb..f96ac3e85ca2 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -66,7 +66,6 @@ enum {
 
 typedef struct AcpiGhesState {
     uint64_t hw_error_le;
-    bool present; /* True if GHES is present at all on this board */
 } AcpiGhesState;
 
 void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
@@ -74,15 +73,16 @@ void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
                      const char *oem_id, const char *oem_table_id);
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
-void ghes_record_cper_errors(const void *cper, size_t len,
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+                            uint64_t error_physical_addr);
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
                              uint16_t source_id, Error **errp);
 
 /**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer for ACPI ghes state
  *
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to ghes state if the system has an ACPI GHES table,
+ * NULL, otherwise.
  */
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
 #endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index da30bdbb2349..80ca7779797b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 {
     ram_addr_t ram_addr;
     hwaddr paddr;
+    AcpiGhesState *ags;
 
     assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 
-    if (acpi_ghes_present() && addr) {
+    ags = acpi_ghes_get_state();
+    if (ags && addr) {
         ram_addr = qemu_ram_addr_from_host(addr);
         if (ram_addr != RAM_ADDR_INVALID &&
             kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
@@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
              */
             if (code == BUS_MCEERR_AR) {
                 kvm_cpu_synchronize_state(c);
-                if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+                if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
+                                             paddr)) {
                     kvm_inject_arm_sea(c);
                 } else {
                     error_report("failed to record the error");
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 06/20] acpi/ghes: prepare to change the way HEST offsets are calculated
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (4 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 05/20] acpi/ghes: Cleanup the code which gets ghes ged state Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 07/20] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
	linux-kernel

Add a new ags flag to change the way HEST offsets are calculated.
Currently, offsets needed to store ACPI HEST offsets and read ack
are calculated based on a previous knowledge from the logic
which creates the HEST table.

Such logic is not generic, not allowing to easily add more HEST
entries nor replicates what OSPM does.

As the next patches will be adding a more generic logic, add a
new use_hest_addr, set to false, in preparation for such changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/ghes.c           | 39 ++++++++++++++++++++++++---------------
 hw/arm/virt-acpi-build.c | 13 ++++++++++---
 include/hw/acpi/ghes.h   | 12 +++++++++++-
 3 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 84b891fd3dcf..9243b5ad4acb 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -206,7 +206,8 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
  * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
  * See docs/specs/acpi_hest_ghes.rst for blobs format.
  */
-static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
+static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
+                                   BIOSLinker *linker)
 {
     int i, error_status_block_offset;
 
@@ -251,13 +252,15 @@ static void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
                                        i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
     }
 
-    /*
-     * tell firmware to write hardware_errors GPA into
-     * hardware_errors_addr fw_cfg, once the former has been initialized.
-     */
-    bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, 0,
-                                     sizeof(uint64_t),
-                                     ACPI_HW_ERROR_FW_CFG_FILE, 0);
+    if (!ags->use_hest_addr) {
+        /*
+         * Tell firmware to write hardware_errors GPA into
+         * hardware_errors_addr fw_cfg, once the former has been initialized.
+         */
+        bios_linker_loader_write_pointer(linker, ACPI_HW_ERROR_ADDR_FW_CFG_FILE,
+                                         0, sizeof(uint64_t),
+                                         ACPI_HW_ERROR_FW_CFG_FILE, 0);
+    }
 }
 
 /* Build Generic Hardware Error Source version 2 (GHESv2) */
@@ -331,14 +334,15 @@ static void build_ghes_v2(GArray *table_data,
 }
 
 /* Build Hardware Error Source Table */
-void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
+void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+                     GArray *hardware_errors,
                      BIOSLinker *linker,
                      const char *oem_id, const char *oem_table_id)
 {
     AcpiTable table = { .sig = "HEST", .rev = 1,
                         .oem_id = oem_id, .oem_table_id = oem_table_id };
 
-    build_ghes_error_table(hardware_errors, linker);
+    build_ghes_error_table(ags, hardware_errors, linker);
 
     acpi_table_begin(&table, table_data);
 
@@ -357,9 +361,11 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
     fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
                     hardware_error->len);
 
-    /* Create a read-write fw_cfg file for Address */
-    fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
-        NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
+    if (!ags->use_hest_addr) {
+        /* Create a read-write fw_cfg file for Address */
+        fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
+            NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
+    }
 }
 
 static void get_hw_error_offsets(uint64_t ghes_addr,
@@ -395,8 +401,11 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
     }
 
     assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
-    get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
-                         &cper_addr, &read_ack_register_addr);
+
+    if (!ags->use_hest_addr) {
+        get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
+                             &cper_addr, &read_ack_register_addr);
+    }
 
     cpu_physical_memory_read(read_ack_register_addr,
                              &read_ack_register, sizeof(read_ack_register));
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3ac8f8e17861..040d875d4e83 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -946,9 +946,16 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     build_dbg2(tables_blob, tables->linker, vms);
 
     if (vms->ras) {
-        acpi_add_table(table_offsets, tables_blob);
-        acpi_build_hest(tables_blob, tables->hardware_errors, tables->linker,
-                        vms->oem_id, vms->oem_table_id);
+        AcpiGedState *acpi_ged_state;
+        AcpiGhesState *ags;
+
+        acpi_ged_state = ACPI_GED(vms->acpi_dev);
+        ags = &acpi_ged_state->ghes_state;
+        if (ags) {
+            acpi_add_table(table_offsets, tables_blob);
+            acpi_build_hest(ags, tables_blob, tables->hardware_errors,
+                            tables->linker, vms->oem_id, vms->oem_table_id);
+        }
     }
 
     if (ms->numa_state->num_nodes > 0) {
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index f96ac3e85ca2..411f592662af 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -64,11 +64,21 @@ enum {
     ACPI_GHES_ERROR_SOURCE_COUNT
 };
 
+/*
+ * AcpiGhesState stores GPA values that will be used to fill HEST entries.
+ *
+ * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
+ * is stored at hw_error_le. This is the default on QEMU 9.x.
+ *
+ * An GPA value equal to zero means that GHES is not present.
+ */
 typedef struct AcpiGhesState {
     uint64_t hw_error_le;
+    bool use_hest_addr;         /* Currently, always false */
 } AcpiGhesState;
 
-void acpi_build_hest(GArray *table_data, GArray *hardware_errors,
+void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+                     GArray *hardware_errors,
                      BIOSLinker *linker,
                      const char *oem_id, const char *oem_table_id);
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 07/20] acpi/ghes: add a firmware file with HEST address
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (5 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 06/20] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 08/20] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, linux-kernel

Store HEST table address at GPA, placing its the start of the table at
hest_addr_le variable.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/ghes.c         | 22 ++++++++++++++++++++--
 include/hw/acpi/ghes.h |  6 +++++-
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 9243b5ad4acb..cbc96c1909f0 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -30,6 +30,7 @@
 
 #define ACPI_HW_ERROR_FW_CFG_FILE           "etc/hardware_errors"
 #define ACPI_HW_ERROR_ADDR_FW_CFG_FILE      "etc/hardware_errors_addr"
+#define ACPI_HEST_ADDR_FW_CFG_FILE          "etc/acpi_table_hest_addr"
 
 /* The max size in bytes for one error block */
 #define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
@@ -341,6 +342,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
 {
     AcpiTable table = { .sig = "HEST", .rev = 1,
                         .oem_id = oem_id, .oem_table_id = oem_table_id };
+    uint32_t hest_offset;
+
+    hest_offset = table_data->len;
 
     build_ghes_error_table(ags, hardware_errors, linker);
 
@@ -352,6 +356,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
                   ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
 
     acpi_table_end(linker, &table);
+
+    if (ags->use_hest_addr) {
+        /*
+         * Tell firmware to write into GPA the address of HEST via fw_cfg,
+         * once initialized.
+         */
+        bios_linker_loader_write_pointer(linker,
+                                         ACPI_HEST_ADDR_FW_CFG_FILE, 0,
+                                         sizeof(uint64_t),
+                                         ACPI_BUILD_TABLE_FILE, hest_offset);
+    }
 }
 
 void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
@@ -361,7 +376,10 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
     fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
                     hardware_error->len);
 
-    if (!ags->use_hest_addr) {
+    if (ags->use_hest_addr) {
+        fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
+            NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
+    } else {
         /* Create a read-write fw_cfg file for Address */
         fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
             NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
@@ -484,7 +502,7 @@ AcpiGhesState *acpi_ghes_get_state(void)
     }
     ags = &acpi_ged_state->ghes_state;
 
-    if (!ags->hw_error_le) {
+    if (!ags->hw_error_le && !ags->hest_addr_le) {
         return NULL;
     }
     return ags;
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 411f592662af..ea9baab764e2 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -70,9 +70,13 @@ enum {
  * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
  * is stored at hw_error_le. This is the default on QEMU 9.x.
  *
- * An GPA value equal to zero means that GHES is not present.
+ * When use_hest_addr is true, the GPA of the HEST table is stored at
+ * hest_addr_le. This is the default for QEMU 10.x and above.
+ *
+ * Whe both GPA values are equal to zero means that GHES is not present.
  */
 typedef struct AcpiGhesState {
+    uint64_t hest_addr_le;
     uint64_t hw_error_le;
     bool use_hest_addr;         /* Currently, always false */
 } AcpiGhesState;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 08/20] acpi/ghes: Use HEST table offsets when preparing GHES records
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (6 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 07/20] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, linux-kernel

There are two pointers that are needed during error injection:

1. The start address of the CPER block to be stored;
2. The address of the read ack.

It is preferable to calculate them from the HEST table.  This allows
checking the source ID, the size of the table and the type of the
HEST error block structures.

Yet, keep the old code, as this is needed for migration purposes
from older QEMU versions.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/ghes.c         | 100 +++++++++++++++++++++++++++++++++++++++++
 include/hw/acpi/ghes.h |   2 +-
 2 files changed, 101 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index cbc96c1909f0..668ca72587c7 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -41,6 +41,12 @@
 /* Address offset in Generic Address Structure(GAS) */
 #define GAS_ADDR_OFFSET 4
 
+/*
+ * ACPI spec 1.0b
+ * 5.2.3 System Description Table Header
+ */
+#define ACPI_DESC_HEADER_OFFSET     36
+
 /*
  * The total size of Generic Error Data Entry
  * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
@@ -61,6 +67,30 @@
  */
 #define ACPI_GHES_GESB_SIZE                 20
 
+/*
+ * See the memory layout map at docs/specs/acpi_hest_ghes.rst.
+ */
+
+/*
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
+ * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
+ */
+#define HEST_GHES_V2_ENTRY_SIZE  92
+
+/*
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source version 2
+ * Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure
+ * Read Ack Register
+ */
+#define GHES_READ_ACK_ADDR_OFF          64
+
+/*
+ * ACPI 6.1: 18.3.2.7: Generic Hardware Error Source
+ * Table 18-341 Generic Hardware Error Source Structure
+ * Error Status Address
+ */
+#define GHES_ERR_STATUS_ADDR_OFF  20
+
 /*
  * Values for error_severity field
  */
@@ -408,6 +438,73 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
     *read_ack_register_addr = ghes_addr + sizeof(uint64_t);
 }
 
+static void get_ghes_source_offsets(uint16_t source_id,
+                                    uint64_t hest_addr,
+                                    uint64_t *cper_addr,
+                                    uint64_t *read_ack_start_addr,
+                                    Error **errp)
+{
+    uint64_t hest_err_block_addr, hest_read_ack_addr;
+    uint64_t err_source_entry, error_block_addr;
+    uint32_t num_sources, i;
+
+    hest_addr += ACPI_DESC_HEADER_OFFSET;
+
+    cpu_physical_memory_read(hest_addr, &num_sources,
+                             sizeof(num_sources));
+    num_sources = le32_to_cpu(num_sources);
+
+    err_source_entry = hest_addr + sizeof(num_sources);
+
+    /*
+     * Currently, HEST Error source navigates only for GHESv2 tables
+     */
+    for (i = 0; i < num_sources; i++) {
+        uint64_t addr = err_source_entry;
+        uint16_t type, src_id;
+
+        cpu_physical_memory_read(addr, &type, sizeof(type));
+        type = le16_to_cpu(type);
+
+        /* For now, we only know the size of GHESv2 table */
+        if (type != ACPI_GHES_SOURCE_GENERIC_ERROR_V2) {
+            error_setg(errp, "HEST: type %d not supported.", type);
+            return;
+        }
+
+        /* Compare CPER source ID at the GHESv2 structure */
+        addr += sizeof(type);
+        cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
+        if (le16_to_cpu(src_id) == source_id) {
+            break;
+        }
+
+        err_source_entry += HEST_GHES_V2_ENTRY_SIZE;
+    }
+    if (i == num_sources) {
+        error_setg(errp, "HEST: Source %d not found.", source_id);
+        return;
+    }
+
+    /* Navigate through table address pointers */
+    hest_err_block_addr = err_source_entry + GHES_ERR_STATUS_ADDR_OFF +
+                          GAS_ADDR_OFFSET;
+
+    cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
+                             sizeof(error_block_addr));
+    error_block_addr = le64_to_cpu(error_block_addr);
+
+    cpu_physical_memory_read(error_block_addr, cper_addr,
+                             sizeof(*cper_addr));
+    *cper_addr = le64_to_cpu(*cper_addr);
+
+    hest_read_ack_addr = err_source_entry + GHES_READ_ACK_ADDR_OFF +
+                         GAS_ADDR_OFFSET;
+    cpu_physical_memory_read(hest_read_ack_addr, read_ack_start_addr,
+                             sizeof(*read_ack_start_addr));
+    *read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
+}
+
 void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
                              uint16_t source_id, Error **errp)
 {
@@ -423,6 +520,9 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
     if (!ags->use_hest_addr) {
         get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
                              &cper_addr, &read_ack_register_addr);
+    } else {
+        get_ghes_source_offsets(source_id, le64_to_cpu(ags->hest_addr_le),
+                                &cper_addr, &read_ack_register_addr, errp);
     }
 
     cpu_physical_memory_read(read_ack_register_addr,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index ea9baab764e2..5265102ba51f 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -78,7 +78,7 @@ enum {
 typedef struct AcpiGhesState {
     uint64_t hest_addr_le;
     uint64_t hw_error_le;
-    bool use_hest_addr;         /* Currently, always false */
+    bool use_hest_addr; /* True if HEST address is present */
 } AcpiGhesState;
 
 void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (7 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 08/20] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-05-12 19:42   ` Michael S. Tsirkin
  2025-03-07 19:14 ` [PATCH v8 10/20] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
	linux-kernel

The current code is actually dependent on having just one error
structure with a single source, as any change there would cause
migration issues.

As the number of sources should be arch-dependent, as it will depend on
what kind of notifications will exist, and how many errors can be
reported at the same time, change the logic to be more flexible,
allowing the number of sources to be defined when building the
HEST table by the caller.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/ghes.c           | 39 +++++++++++++++++++++------------------
 hw/arm/virt-acpi-build.c |  8 +++++++-
 include/hw/acpi/ghes.h   | 17 ++++++++++++-----
 3 files changed, 40 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 668ca72587c7..f49d0d628fc4 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -238,17 +238,17 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
  * See docs/specs/acpi_hest_ghes.rst for blobs format.
  */
 static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
-                                   BIOSLinker *linker)
+                                   BIOSLinker *linker, int num_sources)
 {
     int i, error_status_block_offset;
 
     /* Build error_block_address */
-    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+    for (i = 0; i < num_sources; i++) {
         build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
     }
 
     /* Build read_ack_register */
-    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+    for (i = 0; i < num_sources; i++) {
         /*
          * Initialize the value of read_ack_register to 1, so GHES can be
          * writable after (re)boot.
@@ -263,13 +263,13 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
 
     /* Reserve space for Error Status Data Block */
     acpi_data_push(hardware_errors,
-        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
+        ACPI_GHES_MAX_RAW_DATA_LENGTH * num_sources);
 
     /* Tell guest firmware to place hardware_errors blob into RAM */
     bios_linker_loader_alloc(linker, ACPI_HW_ERROR_FW_CFG_FILE,
                              hardware_errors, sizeof(uint64_t), false);
 
-    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+    for (i = 0; i < num_sources; i++) {
         /*
          * Tell firmware to patch error_block_address entries to point to
          * corresponding "Generic Error Status Block"
@@ -295,12 +295,14 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
 }
 
 /* Build Generic Hardware Error Source version 2 (GHESv2) */
-static void build_ghes_v2(GArray *table_data,
-                          BIOSLinker *linker,
-                          enum AcpiGhesNotifyType notify,
-                          uint16_t source_id)
+static void build_ghes_v2_entry(GArray *table_data,
+                                BIOSLinker *linker,
+                                const AcpiNotificationSourceId *notif_src,
+                                uint16_t index, int num_sources)
 {
     uint64_t address_offset;
+    const uint16_t notify = notif_src->notify;
+    const uint16_t source_id = notif_src->source_id;
 
     /*
      * Type:
@@ -331,7 +333,7 @@ static void build_ghes_v2(GArray *table_data,
                                    address_offset + GAS_ADDR_OFFSET,
                                    sizeof(uint64_t),
                                    ACPI_HW_ERROR_FW_CFG_FILE,
-                                   source_id * sizeof(uint64_t));
+                                   index * sizeof(uint64_t));
 
     /* Notification Structure */
     build_ghes_hw_error_notification(table_data, notify);
@@ -351,8 +353,7 @@ static void build_ghes_v2(GArray *table_data,
                                    address_offset + GAS_ADDR_OFFSET,
                                    sizeof(uint64_t),
                                    ACPI_HW_ERROR_FW_CFG_FILE,
-                                   (ACPI_GHES_ERROR_SOURCE_COUNT + source_id)
-                                   * sizeof(uint64_t));
+                                   (num_sources + index) * sizeof(uint64_t));
 
     /*
      * Read Ack Preserve field
@@ -368,22 +369,26 @@ static void build_ghes_v2(GArray *table_data,
 void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
                      GArray *hardware_errors,
                      BIOSLinker *linker,
+                     const AcpiNotificationSourceId *notif_source,
+                     int num_sources,
                      const char *oem_id, const char *oem_table_id)
 {
     AcpiTable table = { .sig = "HEST", .rev = 1,
                         .oem_id = oem_id, .oem_table_id = oem_table_id };
     uint32_t hest_offset;
+    int i;
 
     hest_offset = table_data->len;
 
-    build_ghes_error_table(ags, hardware_errors, linker);
+    build_ghes_error_table(ags, hardware_errors, linker, num_sources);
 
     acpi_table_begin(&table, table_data);
 
     /* Error Source Count */
-    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
-    build_ghes_v2(table_data, linker,
-                  ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
+    build_append_int_noprefix(table_data, num_sources, 4);
+    for (i = 0; i < num_sources; i++) {
+        build_ghes_v2_entry(table_data, linker, &notif_source[i], i, num_sources);
+    }
 
     acpi_table_end(linker, &table);
 
@@ -515,8 +520,6 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
         return;
     }
 
-    assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
-
     if (!ags->use_hest_addr) {
         get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
                              &cper_addr, &read_ack_register_addr);
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 040d875d4e83..ea9682ee2662 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -893,6 +893,10 @@ static void acpi_align_size(GArray *blob, unsigned align)
     g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
 }
 
+static const AcpiNotificationSourceId hest_ghes_notify[] = {
+    { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+};
+
 static
 void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 {
@@ -954,7 +958,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
         if (ags) {
             acpi_add_table(table_offsets, tables_blob);
             acpi_build_hest(ags, tables_blob, tables->hardware_errors,
-                            tables->linker, vms->oem_id, vms->oem_table_id);
+                            tables->linker, hest_ghes_notify,
+                            ARRAY_SIZE(hest_ghes_notify),
+                            vms->oem_id, vms->oem_table_id);
         }
     }
 
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 5265102ba51f..8c4b08433760 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -57,13 +57,18 @@ enum AcpiGhesNotifyType {
     ACPI_GHES_NOTIFY_RESERVED = 12
 };
 
-enum {
-    ACPI_HEST_SRC_ID_SEA = 0,
-    /* future ids go here */
-
-    ACPI_GHES_ERROR_SOURCE_COUNT
+/*
+ * ID numbers used to fill HEST source ID field
+ */
+enum AcpiGhesSourceID {
+    ACPI_HEST_SRC_ID_SYNC,
 };
 
+typedef struct AcpiNotificationSourceId {
+    enum AcpiGhesSourceID source_id;
+    enum AcpiGhesNotifyType notify;
+} AcpiNotificationSourceId;
+
 /*
  * AcpiGhesState stores GPA values that will be used to fill HEST entries.
  *
@@ -84,6 +89,8 @@ typedef struct AcpiGhesState {
 void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
                      GArray *hardware_errors,
                      BIOSLinker *linker,
+                     const AcpiNotificationSourceId * const notif_source,
+                     int num_sources,
                      const char *oem_id, const char *oem_table_id);
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 10/20] acpi/ghes: add a notifier to notify when error data is ready
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (8 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 11/20] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, linux-kernel

Some error injection notify methods are async, like GPIO
notify. Add a notifier to be used when the error record is
ready to be sent to the guest OS.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/ghes.c         | 5 ++++-
 include/hw/acpi/ghes.h | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index f49d0d628fc4..0135ac844bcf 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -510,6 +510,9 @@ static void get_ghes_source_offsets(uint16_t source_id,
     *read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
 }
 
+NotifierList acpi_generic_error_notifiers =
+    NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
+
 void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
                              uint16_t source_id, Error **errp)
 {
@@ -550,7 +553,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
     /* Write the generic error data entry into guest memory */
     cpu_physical_memory_write(cper_addr, cper, len);
 
-    return;
+    notifier_list_notify(&acpi_generic_error_notifiers, NULL);
 }
 
 int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 8c4b08433760..390943e46d99 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -24,6 +24,9 @@
 
 #include "hw/acpi/bios-linker-loader.h"
 #include "qapi/error.h"
+#include "qemu/notify.h"
+
+extern NotifierList acpi_generic_error_notifiers;
 
 /*
  * Values for Hardware Error Notification Type field
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 11/20] acpi/generic_event_device: Update GHES migration to cover hest addr
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (9 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 10/20] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 12/20] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

The GHES migration logic should now support HEST table location too.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/generic_event_device.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index c85d97ca3776..5346cae573b7 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -386,6 +386,34 @@ static const VMStateDescription vmstate_ghes_state = {
     }
 };
 
+static const VMStateDescription vmstate_hest = {
+    .name = "acpi-hest",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (const VMStateField[]) {
+        VMSTATE_UINT64(hest_addr_le, AcpiGhesState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool hest_needed(void *opaque)
+{
+    AcpiGedState *s = opaque;
+    return s->ghes_state.hest_addr_le;
+}
+
+static const VMStateDescription vmstate_hest_state = {
+    .name = "acpi-ged/hest",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = hest_needed,
+    .fields = (const VMStateField[]) {
+        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
+                       vmstate_hest, AcpiGhesState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_acpi_ged = {
     .name = "acpi-ged",
     .version_id = 1,
@@ -398,6 +426,7 @@ static const VMStateDescription vmstate_acpi_ged = {
         &vmstate_memhp_state,
         &vmstate_cpuhp_state,
         &vmstate_ghes_state,
+        &vmstate_hest_state,
         NULL
     }
 };
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 12/20] acpi/generic_event_device: add logic to detect if HEST addr is available
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (10 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 11/20] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 13/20] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Eduardo Habkost, Marcel Apfelbaum, Peter Maydell,
	Shannon Zhao, Yanan Wang, Zhao Liu, linux-kernel

Create a new property (x-has-hest-addr) and use it to detect if
the GHES table offsets can be calculated from the HEST address
(qemu 10.0 and upper) or via the legacy way via an offset obtained
from the hardware_errors firmware file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/generic_event_device.c |  2 ++
 hw/arm/virt-acpi-build.c       | 18 ++++++++++++++++--
 hw/core/machine.c              |  2 ++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 5346cae573b7..471d1a7afc76 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -318,6 +318,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
 
 static const Property acpi_ged_properties[] = {
     DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
+    DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState,
+                     ghes_state.use_hest_addr, false),
 };
 
 static const VMStateDescription vmstate_memhp_state = {
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index ea9682ee2662..5443615d976d 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -897,6 +897,10 @@ static const AcpiNotificationSourceId hest_ghes_notify[] = {
     { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
 };
 
+static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
+    { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+};
+
 static
 void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 {
@@ -951,15 +955,25 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (vms->ras) {
         AcpiGedState *acpi_ged_state;
+        static const AcpiNotificationSourceId *notify;
+        unsigned int notify_sz;
         AcpiGhesState *ags;
 
         acpi_ged_state = ACPI_GED(vms->acpi_dev);
         ags = &acpi_ged_state->ghes_state;
         if (ags) {
             acpi_add_table(table_offsets, tables_blob);
+
+            if (!ags->use_hest_addr) {
+                notify = hest_ghes_notify_9_2;
+                notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
+            } else {
+                notify = hest_ghes_notify;
+                notify_sz = ARRAY_SIZE(hest_ghes_notify);
+            }
+
             acpi_build_hest(ags, tables_blob, tables->hardware_errors,
-                            tables->linker, hest_ghes_notify,
-                            ARRAY_SIZE(hest_ghes_notify),
+                            tables->linker, notify, notify_sz,
                             vms->oem_id, vms->oem_table_id);
         }
     }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index b68b8b94a3c4..f5f0731e7062 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -35,6 +35,7 @@
 #include "hw/virtio/virtio-pci.h"
 #include "hw/virtio/virtio-net.h"
 #include "hw/virtio/virtio-iommu.h"
+#include "hw/acpi/generic_event_device.h"
 #include "audio/audio.h"
 
 GlobalProperty hw_compat_9_2[] = {
@@ -44,6 +45,7 @@ GlobalProperty hw_compat_9_2[] = {
     { "virtio-balloon-pci-non-transitional", "vectors", "0" },
     { "virtio-mem-pci", "vectors", "0" },
     { "migration", "multifd-clean-tls-termination", "false" },
+    { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
 };
 const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 13/20] acpi/generic_event_device: add an APEI error device
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (11 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 12/20] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 14/20] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

Adds a generic error device to handle generic hardware error
events as specified at ACPI 6.5 specification at 18.3.2.7.2:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
using HID PNP0C33.

The PNP0C33 device is used to report hardware errors to
the guest via ACPI APEI Generic Hardware Error Source (GHES).

Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/aml-build.c                    | 10 ++++++++++
 hw/acpi/generic_event_device.c         | 13 +++++++++++++
 include/hw/acpi/acpi_dev_interface.h   |  1 +
 include/hw/acpi/aml-build.h            |  2 ++
 include/hw/acpi/generic_event_device.h |  1 +
 5 files changed, 27 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index f8f93a9f66c8..e4bd7b611372 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2614,3 +2614,13 @@ Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source)
 
     return var;
 }
+
+/* ACPI 5.0b: 18.3.2.6.2 Event Notification For Generic Error Sources */
+Aml *aml_error_device(void)
+{
+    Aml *dev = aml_device(ACPI_APEI_ERROR_DEVICE);
+    aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C33")));
+    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+
+    return dev;
+}
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 471d1a7afc76..f029753ab709 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -26,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
     ACPI_GED_PWR_DOWN_EVT,
     ACPI_GED_NVDIMM_HOTPLUG_EVT,
     ACPI_GED_CPU_HOTPLUG_EVT,
+    ACPI_GED_ERROR_EVT,
 };
 
 /*
@@ -116,6 +117,16 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
                            aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
                                       aml_int(0x80)));
                 break;
+            case ACPI_GED_ERROR_EVT:
+                /*
+                 * ACPI 5.0b: 5.6.6 Device Object Notifications
+                 * Table 5-135 Error Device Notification Values
+                 * Defines 0x80 as the value to be used on notifications
+                 */
+                aml_append(if_ctx,
+                           aml_notify(aml_name(ACPI_APEI_ERROR_DEVICE),
+                                      aml_int(0x80)));
+                break;
             case ACPI_GED_NVDIMM_HOTPLUG_EVT:
                 aml_append(if_ctx,
                            aml_notify(aml_name("\\_SB.NVDR"),
@@ -295,6 +306,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
         sel = ACPI_GED_MEM_HOTPLUG_EVT;
     } else if (ev & ACPI_POWER_DOWN_STATUS) {
         sel = ACPI_GED_PWR_DOWN_EVT;
+    } else if (ev & ACPI_GENERIC_ERROR) {
+        sel = ACPI_GED_ERROR_EVT;
     } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
         sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
     } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
index 68d9d15f50aa..8294f8f0ccca 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -13,6 +13,7 @@ typedef enum {
     ACPI_NVDIMM_HOTPLUG_STATUS = 16,
     ACPI_VMGENID_CHANGE_STATUS = 32,
     ACPI_POWER_DOWN_STATUS = 64,
+    ACPI_GENERIC_ERROR = 128,
 } AcpiEventStatusBits;
 
 #define TYPE_ACPI_DEVICE_IF "acpi-device-interface"
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index c18f68134246..f38e12971932 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -252,6 +252,7 @@ struct CrsRangeSet {
 /* Consumer/Producer */
 #define AML_SERIAL_BUS_FLAG_CONSUME_ONLY        (1 << 1)
 
+#define ACPI_APEI_ERROR_DEVICE   "GEDD"
 /**
  * init_aml_allocator:
  *
@@ -382,6 +383,7 @@ Aml *aml_dma(AmlDmaType typ, AmlDmaBusMaster bm, AmlTransferSize sz,
              uint8_t channel);
 Aml *aml_sleep(uint64_t msec);
 Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source);
+Aml *aml_error_device(void);
 
 /* Block AML object primitives */
 Aml *aml_scope(const char *name_format, ...) G_GNUC_PRINTF(1, 2);
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index d2dac87b4a9f..1c18ac296fcb 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -101,6 +101,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
 #define ACPI_GED_PWR_DOWN_EVT      0x2
 #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
 #define ACPI_GED_CPU_HOTPLUG_EVT    0x8
+#define ACPI_GED_ERROR_EVT          0x10
 
 typedef struct GEDState {
     MemoryRegion evt;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 14/20] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (12 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 13/20] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 15/20] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

We'll be adding a new GED device for HEST GPIO notification and
increasing the number of entries at the HEST table.

Blocklist testing HEST and DSDT tables until such changes
are completed.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8bf4..0a1a26543ba2 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,7 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/aarch64/virt/HEST",
+"tests/data/acpi/aarch64/virt/DSDT",
+"tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt",
+"tests/data/acpi/aarch64/virt/DSDT.memhp",
+"tests/data/acpi/aarch64/virt/DSDT.pxb",
+"tests/data/acpi/aarch64/virt/DSDT.topology",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 15/20] arm/virt: Wire up a GED error device for ACPI / GHES
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (13 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 14/20] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 16/20] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Peter Maydell, Shannon Zhao, linux-kernel

Adds support to ARM virtualization to allow handling
generic error ACPI Event via GED & error source device.

It is aligned with Linux Kernel patch:
https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/

Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/arm/virt-acpi-build.c |  1 +
 hw/arm/virt.c            | 12 +++++++++++-
 include/hw/arm/virt.h    |  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 5443615d976d..2bf9118fda55 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -861,6 +861,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     }
 
     acpi_dsdt_add_power_button(scope);
+    aml_append(scope, aml_error_device());
 #ifdef CONFIG_TPM
     acpi_dsdt_add_tpm(scope, vms);
 #endif
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 904c698b1406..6fe2c04aaabd 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -687,7 +687,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
     DeviceState *dev;
     MachineState *ms = MACHINE(vms);
     int irq = vms->irqmap[VIRT_ACPI_GED];
-    uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+    uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_ERROR_EVT;
 
     if (ms->ram_slots) {
         event |= ACPI_GED_MEM_HOTPLUG_EVT;
@@ -1019,6 +1019,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
     }
 }
 
+static void virt_generic_error_req(Notifier *n, void *opaque)
+{
+    VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
+
+    acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
+}
+
 static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
                              uint32_t phandle)
 {
@@ -2413,6 +2420,9 @@ static void machvirt_init(MachineState *machine)
 
     if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
         vms->acpi_dev = create_acpi_ged(vms);
+        vms->generic_error_notifier.notify = virt_generic_error_req;
+        notifier_list_add(&acpi_generic_error_notifiers,
+                          &vms->generic_error_notifier);
     } else {
         create_gpio_devices(vms, VIRT_GPIO, sysmem);
     }
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index c8e94e6aedc9..f3cf28436770 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -176,6 +176,7 @@ struct VirtMachineState {
     DeviceState *gic;
     DeviceState *acpi_dev;
     Notifier powerdown_notifier;
+    Notifier generic_error_notifier;
     PCIBus *bus;
     char *oem_id;
     char *oem_table_id;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 16/20] qapi/acpi-hest: add an interface to do generic CPER error injection
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (14 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 15/20] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 17/20] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x Mauro Carvalho Chehab
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, Dongjiu Geng, Eric Blake, Markus Armbruster,
	Michael Roth, Paolo Bonzini, Peter Maydell, Shannon Zhao,
	linux-kernel

Creates a QMP command to be used for generic ACPI APEI hardware error
injection (HEST) via GHESv2, and add support for it for ARM guests.

Error injection uses ACPI_HEST_SRC_ID_QMP source ID to be platform
independent. This is mapped at arch virt bindings, depending on the
types supported by QEMU and by the BIOS. So, on ARM, this is supported
via ACPI_GHES_NOTIFY_GPIO notification type.

This patch is co-authored:
    - original ghes logic to inject a simple ARM record by Shiju Jose;
    - generic logic to handle block addresses by Jonathan Cameron;
    - generic GHESv2 error inject by Mauro Carvalho Chehab;

Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-authored-by: Shiju Jose <shiju.jose@huawei.com>
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
 MAINTAINERS              |  7 +++++++
 hw/acpi/Kconfig          |  5 +++++
 hw/acpi/ghes.c           |  2 +-
 hw/acpi/ghes_cper.c      | 38 ++++++++++++++++++++++++++++++++++++++
 hw/acpi/ghes_cper_stub.c | 19 +++++++++++++++++++
 hw/acpi/meson.build      |  2 ++
 hw/arm/virt-acpi-build.c |  1 +
 hw/arm/virt.c            |  7 +++++++
 include/hw/acpi/ghes.h   |  1 +
 include/hw/arm/virt.h    |  1 +
 qapi/acpi-hest.json      | 35 +++++++++++++++++++++++++++++++++++
 qapi/meson.build         |  1 +
 qapi/qapi-schema.json    |  1 +
 13 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100644 hw/acpi/ghes_cper.c
 create mode 100644 hw/acpi/ghes_cper_stub.c
 create mode 100644 qapi/acpi-hest.json

diff --git a/MAINTAINERS b/MAINTAINERS
index 692628cd7801..479664a4f6e8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2091,6 +2091,13 @@ F: hw/acpi/ghes.c
 F: include/hw/acpi/ghes.h
 F: docs/specs/acpi_hest_ghes.rst
 
+ACPI/HEST/GHES/ARM processor CPER
+R: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+S: Maintained
+F: hw/arm/ghes_cper.c
+F: hw/acpi/ghes_cper_stub.c
+F: qapi/acpi-hest.json
+
 ppc4xx
 L: qemu-ppc@nongnu.org
 S: Orphan
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1d4e9f0845c0..daabbe6cd11e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -51,6 +51,11 @@ config ACPI_APEI
     bool
     depends on ACPI
 
+config GHES_CPER
+    bool
+    depends on ACPI_APEI
+    default y
+
 config ACPI_PCI
     bool
     depends on ACPI && PCI
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 0135ac844bcf..1d02ef6dcb70 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -553,7 +553,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
     /* Write the generic error data entry into guest memory */
     cpu_physical_memory_write(cper_addr, cper, len);
 
-    notifier_list_notify(&acpi_generic_error_notifiers, NULL);
+    notifier_list_notify(&acpi_generic_error_notifiers, &source_id);
 }
 
 int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
new file mode 100644
index 000000000000..0a2d95dd8b27
--- /dev/null
+++ b/hw/acpi/ghes_cper.c
@@ -0,0 +1,38 @@
+/*
+ * CPER payload parser for error injection
+ *
+ * Copyright(C) 2024-2025 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/base64.h"
+#include "qemu/error-report.h"
+#include "qemu/uuid.h"
+#include "qapi/qapi-commands-acpi-hest.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_inject_ghes_v2_error(const char *qmp_cper, Error **errp)
+{
+    AcpiGhesState *ags;
+
+    ags = acpi_ghes_get_state();
+    if (!ags) {
+        return;
+    }
+
+    uint8_t *cper;
+    size_t  len;
+
+    cper = qbase64_decode(qmp_cper, -1, &len, errp);
+    if (!cper) {
+        error_setg(errp, "missing GHES CPER payload");
+        return;
+    }
+
+    ghes_record_cper_errors(ags, cper, len, ACPI_HEST_SRC_ID_QMP, errp);
+}
diff --git a/hw/acpi/ghes_cper_stub.c b/hw/acpi/ghes_cper_stub.c
new file mode 100644
index 000000000000..5ebc61970a78
--- /dev/null
+++ b/hw/acpi/ghes_cper_stub.c
@@ -0,0 +1,19 @@
+/*
+ * Stub interface for CPER payload parser for error injection
+ *
+ * Copyright(C) 2024-2025 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-acpi-hest.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_inject_ghes_v2_error(const char *cper, Error **errp)
+{
+    error_setg(errp, "GHES QMP error inject is not compiled in");
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index 73f02b96912b..56b5d1ec9691 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -34,4 +34,6 @@ endif
 system_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c', 'ghes-stub.c', 'acpi_interface.c'))
 system_ss.add(when: 'CONFIG_ACPI_PCI_BRIDGE', if_false: files('pci-bridge-stub.c'))
 system_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
+system_ss.add(when: 'CONFIG_GHES_CPER', if_true: files('ghes_cper.c'))
+system_ss.add(when: 'CONFIG_GHES_CPER', if_false: files('ghes_cper_stub.c'))
 system_ss.add(files('acpi-qmp-cmds.c'))
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 2bf9118fda55..154337e1a77b 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -896,6 +896,7 @@ static void acpi_align_size(GArray *blob, unsigned align)
 
 static const AcpiNotificationSourceId hest_ghes_notify[] = {
     { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
+    { ACPI_HEST_SRC_ID_QMP, ACPI_GHES_NOTIFY_GPIO },
 };
 
 static const AcpiNotificationSourceId hest_ghes_notify_9_2[] = {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6fe2c04aaabd..8f5cd0959e5f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1021,6 +1021,13 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
 
 static void virt_generic_error_req(Notifier *n, void *opaque)
 {
+    uint16_t *source_id = opaque;
+
+    /* Currently, only QMP source ID is async */
+    if (*source_id != ACPI_HEST_SRC_ID_QMP) {
+        return;
+    }
+
     VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
 
     acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 390943e46d99..df2ecbf6e4a9 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -65,6 +65,7 @@ enum AcpiGhesNotifyType {
  */
 enum AcpiGhesSourceID {
     ACPI_HEST_SRC_ID_SYNC,
+    ACPI_HEST_SRC_ID_QMP,       /* Use it only for QMP injected errors */
 };
 
 typedef struct AcpiNotificationSourceId {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f3cf28436770..56f270f61cf5 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -33,6 +33,7 @@
 #include "exec/hwaddr.h"
 #include "qemu/notify.h"
 #include "hw/boards.h"
+#include "hw/acpi/ghes.h"
 #include "hw/arm/boot.h"
 #include "hw/arm/bsa.h"
 #include "hw/block/flash.h"
diff --git a/qapi/acpi-hest.json b/qapi/acpi-hest.json
new file mode 100644
index 000000000000..fff5018c7ec1
--- /dev/null
+++ b/qapi/acpi-hest.json
@@ -0,0 +1,35 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+
+##
+# == GHESv2 CPER Error Injection
+#
+# Defined since ACPI Specification 6.1,
+# section 18.3.2.8 Generic Hardware Error Source version 2. See:
+#
+# https://uefi.org/sites/default/files/resources/ACPI_6_1.pdf
+##
+
+
+##
+# @inject-ghes-v2-error:
+#
+# Inject an error with additional ACPI 6.1 GHESv2 error information
+#
+# @cper: contains a base64 encoded string with raw data for a single
+#     CPER record with Generic Error Status Block, Generic Error Data
+#     Entry and generic error data payload, as described at
+#     https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#format
+#
+# Features:
+#
+# @unstable: This command is experimental.
+#
+# Since: 10.0
+##
+{ 'command': 'inject-ghes-v2-error',
+  'data': {
+    'cper': 'str'
+  },
+  'features': [ 'unstable' ]
+}
diff --git a/qapi/meson.build b/qapi/meson.build
index eadde4db307f..8017346a6f6e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -59,6 +59,7 @@ qapi_all_modules = [
 if have_system
   qapi_all_modules += [
     'acpi',
+    'acpi-hest',
     'audio',
     'cryptodev',
     'qdev',
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 2877aff73d0c..a86044e4cd32 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -75,6 +75,7 @@
 { 'include': 'misc-target.json' }
 { 'include': 'audio.json' }
 { 'include': 'acpi.json' }
+{ 'include': 'acpi-hest.json' }
 { 'include': 'pci.json' }
 { 'include': 'stats.json' }
 { 'include': 'virtio.json' }
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 17/20] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (15 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 16/20] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 18/20] tests/acpi: virt: update HEST and DSDT tables Mauro Carvalho Chehab
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

Now that we have everything in place, enable using HEST GPA
instead of etc/hardware_errors GPA.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/acpi/generic_event_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index f029753ab709..9fe70b74bd42 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -332,7 +332,7 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
 static const Property acpi_ged_properties[] = {
     DEFINE_PROP_UINT32("ged-event", AcpiGedState, ged_event_bitmap, 0),
     DEFINE_PROP_BOOL("x-has-hest-addr", AcpiGedState,
-                     ghes_state.use_hest_addr, false),
+                     ghes_state.use_hest_addr, true),
 };
 
 static const VMStateDescription vmstate_memhp_state = {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 18/20] tests/acpi: virt: update HEST and DSDT tables
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (16 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 17/20] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 21:04   ` Philippe Mathieu-Daudé
  2025-03-07 19:14 ` [PATCH v8 19/20] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
                   ` (3 subsequent siblings)
  21 siblings, 1 reply; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Ani Sinha, linux-kernel

@@ -1,39 +1,39 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20240322 (64-bit version)
  * Copyright (c) 2000 - 2023 Intel Corporation
  *
- * Disassembly of tests/data/acpi/aarch64/virt/HEST
+ * Disassembly of /tmp/aml-DMPE22
  *
  * ACPI Data Table [HEST]
  *
  * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in hex)
  */

 [000h 0000 004h]                   Signature : "HEST"    [Hardware Error Source Table]
-[004h 0004 004h]                Table Length : 00000084
+[004h 0004 004h]                Table Length : 000000E0
 [008h 0008 001h]                    Revision : 01
-[009h 0009 001h]                    Checksum : E2
+[009h 0009 001h]                    Checksum : 6C
 [00Ah 0010 006h]                      Oem ID : "BOCHS "
 [010h 0016 008h]                Oem Table ID : "BXPC    "
 [018h 0024 004h]                Oem Revision : 00000001
 [01Ch 0028 004h]             Asl Compiler ID : "BXPC"
 [020h 0032 004h]       Asl Compiler Revision : 00000001

-[024h 0036 004h]          Error Source Count : 00000001
+[024h 0036 004h]          Error Source Count : 00000002

 [028h 0040 002h]               Subtable Type : 000A [Generic Hardware Error Source V2]
 [02Ah 0042 002h]                   Source Id : 0000
 [02Ch 0044 002h]           Related Source Id : FFFF
 [02Eh 0046 001h]                    Reserved : 00
 [02Fh 0047 001h]                     Enabled : 01
 [030h 0048 004h]      Records To Preallocate : 00000001
 [034h 0052 004h]     Max Sections Per Record : 00000001
 [038h 0056 004h]         Max Raw Data Length : 00000400

 [03Ch 0060 00Ch]        Error Status Address : [Generic Address Structure]
 [03Ch 0060 001h]                    Space ID : 00 [SystemMemory]
 [03Dh 0061 001h]                   Bit Width : 40
 [03Eh 0062 001h]                  Bit Offset : 00
 [03Fh 0063 001h]        Encoded Access Width : 04 [QWord Access:64]
 [040h 0064 008h]                     Address : 0000000043DA0000
@@ -42,32 +42,75 @@
 [048h 0072 001h]                 Notify Type : 08 [SEA]
 [049h 0073 001h]               Notify Length : 1C
 [04Ah 0074 002h]  Configuration Write Enable : 0000
 [04Ch 0076 004h]                PollInterval : 00000000
 [050h 0080 004h]                      Vector : 00000000
 [054h 0084 004h]     Polling Threshold Value : 00000000
 [058h 0088 004h]    Polling Threshold Window : 00000000
 [05Ch 0092 004h]       Error Threshold Value : 00000000
 [060h 0096 004h]      Error Threshold Window : 00000000

 [064h 0100 004h]   Error Status Block Length : 00000400
 [068h 0104 00Ch]           Read Ack Register : [Generic Address Structure]
 [068h 0104 001h]                    Space ID : 00 [SystemMemory]
 [069h 0105 001h]                   Bit Width : 40
 [06Ah 0106 001h]                  Bit Offset : 00
 [06Bh 0107 001h]        Encoded Access Width : 04 [QWord Access:64]
-[06Ch 0108 008h]                     Address : 0000000043DA0008
+[06Ch 0108 008h]                     Address : 0000000043DA0010

 [074h 0116 008h]           Read Ack Preserve : FFFFFFFFFFFFFFFE
 [07Ch 0124 008h]              Read Ack Write : 0000000000000001

-Raw Table Data: Length 132 (0x84)
+[084h 0132 002h]               Subtable Type : 000A [Generic Hardware Error Source V2]
+[086h 0134 002h]                   Source Id : 0001
+[088h 0136 002h]           Related Source Id : FFFF
+[08Ah 0138 001h]                    Reserved : 00
+[08Bh 0139 001h]                     Enabled : 01
+[08Ch 0140 004h]      Records To Preallocate : 00000001
+[090h 0144 004h]     Max Sections Per Record : 00000001
+[094h 0148 004h]         Max Raw Data Length : 00000400
+
+[098h 0152 00Ch]        Error Status Address : [Generic Address Structure]
+[098h 0152 001h]                    Space ID : 00 [SystemMemory]
+[099h 0153 001h]                   Bit Width : 40
+[09Ah 0154 001h]                  Bit Offset : 00
+[09Bh 0155 001h]        Encoded Access Width : 04 [QWord Access:64]
+[09Ch 0156 008h]                     Address : 0000000043DA0008
+
+[0A4h 0164 01Ch]                      Notify : [Hardware Error Notification Structure]
+[0A4h 0164 001h]                 Notify Type : 07 [GPIO]
+[0A5h 0165 001h]               Notify Length : 1C
+[0A6h 0166 002h]  Configuration Write Enable : 0000
+[0A8h 0168 004h]                PollInterval : 00000000
+[0ACh 0172 004h]                      Vector : 00000000
+[0B0h 0176 004h]     Polling Threshold Value : 00000000
+[0B4h 0180 004h]    Polling Threshold Window : 00000000
+[0B8h 0184 004h]       Error Threshold Value : 00000000
+[0BCh 0188 004h]      Error Threshold Window : 00000000
+
+[0C0h 0192 004h]   Error Status Block Length : 00000400
+[0C4h 0196 00Ch]           Read Ack Register : [Generic Address Structure]
+[0C4h 0196 001h]                    Space ID : 00 [SystemMemory]
+[0C5h 0197 001h]                   Bit Width : 40
+[0C6h 0198 001h]                  Bit Offset : 00
+[0C7h 0199 001h]        Encoded Access Width : 04 [QWord Access:64]
+[0C8h 0200 008h]                     Address : 0000000043DA0018

-    0000: 48 45 53 54 84 00 00 00 01 E2 42 4F 43 48 53 20  // HEST......BOCHS
+[0D0h 0208 008h]           Read Ack Preserve : FFFFFFFFFFFFFFFE
+[0D8h 0216 008h]              Read Ack Write : 0000000000000001
+
+Raw Table Data: Length 224 (0xE0)
+

@@ -1,30 +1,30 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20240322 (64-bit version)
  * Copyright (c) 2000 - 2023 Intel Corporation
  *
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of tests/data/acpi/aarch64/virt/DSDT
+ * Disassembly of /tmp/aml-TNQ912
  *
  * Original Table Header:
  *     Signature        "DSDT"
- *     Length           0x0000144C (5196)
+ *     Length           0x00001478 (5240)
  *     Revision         0x02
- *     Checksum         0x1B
+ *     Checksum         0x04
  *     OEM ID           "BOCHS "
  *     OEM Table ID     "BXPC    "
  *     OEM Revision     0x00000001 (1)
  *     Compiler ID      "BXPC"
  *     Compiler Version 0x00000001 (1)
  */
 DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPC    ", 0x00000001)
 {
     Scope (\_SB)
     {
         Device (C000)
         {
             Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
             Name (_UID, Zero)  // _UID: Unique ID
         }

@@ -1876,27 +1876,38 @@
                     0x00000029,
                 }
             })
             OperationRegion (EREG, SystemMemory, 0x09080000, 0x04)
             Field (EREG, DWordAcc, NoLock, WriteAsZeros)
             {
                 ESEL,   32
             }

             Method (_EVT, 1, Serialized)  // _EVT: Event
             {
                 Local0 = ESEL /* \_SB_.GED_.ESEL */
                 If (((Local0 & 0x02) == 0x02))
                 {
                     Notify (PWRB, 0x80) // Status Change
                 }
+
+                If (((Local0 & 0x10) == 0x10))
+                {
+                    Notify (GEDD, 0x80) // Status Change
+                }
             }
         }

         Device (PWRB)
         {
             Name (_HID, "PNP0C0C" /* Power Button Device */)  // _HID: Hardware ID
             Name (_UID, Zero)  // _UID: Unique ID
         }
+
+        Device (GEDD)
+        {
+            Name (_HID, "PNP0C33" /* Error Device */)  // _HID: Hardware ID
+            Name (_UID, Zero)  // _UID: Unique ID
+        }
     }
 }

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
---
 tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
 .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
 tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
 tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
 tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
 tests/data/acpi/aarch64/virt/HEST             | Bin 132 -> 224 bytes
 tests/qtest/bios-tables-test-allowed-diff.h   |   6 ------
 7 files changed, 6 deletions(-)

diff --git a/tests/data/acpi/aarch64/virt/DSDT b/tests/data/acpi/aarch64/virt/DSDT
index 36d3e5d5a5e47359b6dcb3706f98b4f225677591..a182bd9d7182dccdf63c650d048c58f18505d001 100644
GIT binary patch
delta 109
zcmX@3@k4{lCD<jTLWF^ViDe>}G*h$dM)euOOwJsW4+;nC=*7E+g>V+Q2D|zsED)Gn
zoxsJ!z{S)S5FX^j)c_F?VBivHb9Z%dnXE4&D;?b=31V}^dw9C=2KWUSI2#)?aKwjt
Hx-b9$X;vI^

delta 64
zcmeyNaYlp7CD<jzM}&caNqQoeG*i3NM)euOOit{R4+;lM%f`Egg>V+Q2D|zsED)Gn
UoxsJ!z{S)S5FX?-*+E1W06%jPR{#J2

diff --git a/tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt b/tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt
index e6154d0355f84fdcc51387b4db8f9ee63acae4e9..af1f2b0eb0b77a80c5bd74f201d24f71e486627f 100644
GIT binary patch
delta 110
zcmZ3ac}|ndCD<k8oCpI0)4_>c(oCIR8`a+lGdXii78eO-)SH|wBICY5U~+W=mjDBo
yK%2X(iwjpnbdzL2c#soEyoaX?Z-8HbfwO@#14n$Qrwc=LlO#wDl9aJAR0;r(tsHj%

delta 66
zcmX@7xk!`CCD<iokq83=(~XH-(oDVX8`a+lGdZzO78eO-l%1R{A|oB$BpDDM<irv0
W;pxH~;1^)vY~akm5g+R5!T<noi4jWx

diff --git a/tests/data/acpi/aarch64/virt/DSDT.memhp b/tests/data/acpi/aarch64/virt/DSDT.memhp
index 33f011d6b635035a04c0b39ce9b4e219f7ae74b7..10436ec87c4859fb84b3ecb7bba5788f38112e59 100644
GIT binary patch
delta 88
zcmbPheA1Z9CD<k8q$C3algUIbX{MH08`WnBGdXcjJ}4Z_<jXo)OvH<SfxzVI1TFyv
qE`c_8R~MJfaU%At($P(lAPz^oho=i~fM0-tv#~J)M|`NK3j+W#;TF9B

delta 44
zcmX?UJlB}ZCD<iot|S8klg&gfX{L_p8`WnBGdXfiJ}4Z_<ij#qOvGz*p@=Oj039?8
AE&u=k

diff --git a/tests/data/acpi/aarch64/virt/DSDT.pxb b/tests/data/acpi/aarch64/virt/DSDT.pxb
index c0fdc6e9c1396cc2259dc4bc665ba023adcf4c9b..0524b3cbe00bfe552de824dd1090bd00a208c527 100644
GIT binary patch
delta 110
zcmexwz1oJ$CD<iITaJN&sbC_PG*jDyjq2XAOwJsWOJsu?^(LQ?m2qDnFu6K`OMrn(
ypv~RY#f7UOx=Au1JjjV7-ow*{H^48zz}di=fg?WD(}f|rNfM+6Ny^w5Dg^+WYaFrw

delta 66
zcmZ2&^WU1wCD<k8zbpd-Q^!OuX{N5b8`ZsKnVi@sm&gV)%1%BZD<d7<BpDDM<irv0
W;pxH~;1^)vY~akm5g+R5!T<oNArgiF

diff --git a/tests/data/acpi/aarch64/virt/DSDT.topology b/tests/data/acpi/aarch64/virt/DSDT.topology
index 029d03eecc4efddc001e5377e85ac8e831294362..8c0423fe62d6950f9098983d86bfee256d7d003a 100644
GIT binary patch
delta 86
zcmbQHbx4cLCD<jzNtA(s>E%Q&X{O%5jp|7vOwJsWyG4Q-^(NmJk>Ot;Fu6K`OMrn(
opv~RY#bxqO5n1WzCP@&RBi_T)g*U)2z`)tqn1Lfc)YF9l01l28<p2Nx

delta 42
ycmX@4HBF1lCD<iIOq79viGL!OG*hGhM)f2SCMWjE-6Fw^vXk$N$V}!Dl?DLb(h64q

diff --git a/tests/data/acpi/aarch64/virt/HEST b/tests/data/acpi/aarch64/virt/HEST
index 4c5d8c5b5da5b3241f93cd0839e94272bf6b1486..674272922db7d48f7821aa7c83ec76bb3b556d2a 100644
GIT binary patch
delta 68
zcmZo+e89-%;TjzBfPsO5F=rx|6eH6_Rd+^#iMisuTnvm1|Nk>EGJ@nLCJHmL%S;Ru
WnV7)J#lXPAz`)?Zz#=g*R~!HcF%5eF

delta 29
lcmaFB*uu!=;Tjy$!oa}5_-G=R6eHtARriT=I3|_|004Ge2nqlI

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 0a1a26543ba2..dfb8523c8bf4 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,7 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/aarch64/virt/HEST",
-"tests/data/acpi/aarch64/virt/DSDT",
-"tests/data/acpi/aarch64/virt/DSDT.acpihmatvirt",
-"tests/data/acpi/aarch64/virt/DSDT.memhp",
-"tests/data/acpi/aarch64/virt/DSDT.pxb",
-"tests/data/acpi/aarch64/virt/DSDT.topology",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 19/20] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (17 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 18/20] tests/acpi: virt: update HEST and DSDT tables Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 19:14 ` [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Dongjiu Geng, linux-kernel

While the HEST layout didn't change, there are some internal
changes related to how offsets are calculated and how memory error
events are triggered.

Update specs to reflect such changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 docs/specs/acpi_hest_ghes.rst | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
index c3e9f8d9a702..3d1b85d74b70 100644
--- a/docs/specs/acpi_hest_ghes.rst
+++ b/docs/specs/acpi_hest_ghes.rst
@@ -89,12 +89,21 @@ Design Details
     addresses in the "error_block_address" fields with a pointer to the
     respective "Error Status Data Block" in the "etc/hardware_errors" blob.
 
-(8) QEMU defines a third and write-only fw_cfg blob which is called
-    "etc/hardware_errors_addr". Through that blob, the firmware can send back
-    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
-    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
-    for the firmware. The firmware will write back the start address of
-    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
+(8) QEMU defines a third and write-only fw_cfg blob to store the location
+    where the error block offsets, read ack registers and CPER records are
+    stored.
+
+    Up to QEMU 9.2, the location was at "etc/hardware_errors_addr", and
+    contains a GPA for the beginning of "etc/hardware_errors".
+
+    Newer versions place the location at "etc/acpi_table_hest_addr",
+    pointing to the GPA of the HEST table.
+
+    Using above mentioned 'fw_cfg' files, the firmware can send back the
+    guest-side allocation addresses to QEMU. They contain a 8-byte entry.
+    QEMU generates a single WRITE_POINTER command for the firmware. The
+    firmware will write back the start address of either "etc/hardware_errors"
+    or HEST table at the corresponding fw_cfg file.
 
 (9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
     "Error Status Data Block", guest memory, and then injects platform specific
@@ -105,8 +114,5 @@ Design Details
      kernel, on receiving notification, guest APEI driver could read the CPER error
      and take appropriate action.
 
-(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
-     find out "Error Status Data Block" entry corresponding to error source. So supported
-     source_id values should be assigned here and not be changed afterwards to make sure
-     that guest will write error into expected "Error Status Data Block" even if guest was
-     migrated to a newer QEMU.
+(11) kvm_arch_on_sigbus_vcpu() report RAS errors via a SEA notifications,
+     when a SIGBUS event is triggered.
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (18 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 19/20] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
@ 2025-03-07 19:14 ` Mauro Carvalho Chehab
  2025-03-07 21:05   ` Philippe Mathieu-Daudé
  2025-03-25  1:55 ` [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for " Mauro Carvalho Chehab
  2025-05-11 13:44 ` Michael S. Tsirkin
  21 siblings, 1 reply; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-07 19:14 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Gavin Shan, Philippe Mathieu-Daudé,
	Cleber Rosa, John Snow, linux-kernel

Using the QMP GHESv2 API requires preparing a raw data array
containing a CPER record.

Add a helper script with subcommands to prepare such data.

Currently, only ARM Processor error CPER record is supported, by
using:
	$ ghes_inject.py arm

which produces those warnings on Linux:

[  705.032426] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
[  774.866308] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[  774.866583] {4}[Hardware Error]: event severity: recoverable
[  774.866738] {4}[Hardware Error]:  Error 0, type: recoverable
[  774.866889] {4}[Hardware Error]:   section_type: ARM processor error
[  774.867048] {4}[Hardware Error]:   MIDR: 0x00000000000f0510
[  774.867189] {4}[Hardware Error]:   running state: 0x0
[  774.867321] {4}[Hardware Error]:   Power State Coordination Interface state: 0
[  774.867511] {4}[Hardware Error]:   Error info structure 0:
[  774.867679] {4}[Hardware Error]:   num errors: 2
[  774.867801] {4}[Hardware Error]:    error_type: 0x02: cache error
[  774.867962] {4}[Hardware Error]:    error_info: 0x000000000091000f
[  774.868124] {4}[Hardware Error]:     transaction type: Data Access
[  774.868280] {4}[Hardware Error]:     cache error, operation type: Data write
[  774.868465] {4}[Hardware Error]:     cache level: 2
[  774.868592] {4}[Hardware Error]:     processor context not corrupted
[  774.868774] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error

Such script allows customizing the error data, allowing to change
all fields at the record. Please use:

	$ ghes_inject.py arm -h

For more details about its usage.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 MAINTAINERS                    |   3 +
 scripts/arm_processor_error.py | 476 ++++++++++++++++++++++
 scripts/ghes_inject.py         |  51 +++
 scripts/qmp_helper.py          | 703 +++++++++++++++++++++++++++++++++
 4 files changed, 1233 insertions(+)
 create mode 100644 scripts/arm_processor_error.py
 create mode 100755 scripts/ghes_inject.py
 create mode 100755 scripts/qmp_helper.py

diff --git a/MAINTAINERS b/MAINTAINERS
index 479664a4f6e8..d1d2b3fd5d44 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2097,6 +2097,9 @@ S: Maintained
 F: hw/arm/ghes_cper.c
 F: hw/acpi/ghes_cper_stub.c
 F: qapi/acpi-hest.json
+F: scripts/ghes_inject.py
+F: scripts/arm_processor_error.py
+F: scripts/qmp_helper.py
 
 ppc4xx
 L: qemu-ppc@nongnu.org
diff --git a/scripts/arm_processor_error.py b/scripts/arm_processor_error.py
new file mode 100644
index 000000000000..1dd42e42a877
--- /dev/null
+++ b/scripts/arm_processor_error.py
@@ -0,0 +1,476 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0301,C0114,R0903,R0912,R0913,R0914,R0915,W0511
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+# TODO: current implementation has dummy defaults.
+#
+# For a better implementation, a QMP addition/call is needed to
+# retrieve some data for ARM Processor Error injection:
+#
+#   - ARM registers: power_state, mpidr.
+
+"""
+Generates an ARM processor error CPER, compatible with
+UEFI 2.9A Errata.
+
+Injecting such errors can be done using:
+
+    $ ./scripts/ghes_inject.py arm
+    Error injected.
+
+Produces a simple CPER register, as detected on a Linux guest:
+
+[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
+[Hardware Error]: event severity: recoverable
+[Hardware Error]:  Error 0, type: recoverable
+[Hardware Error]:   section_type: ARM processor error
+[Hardware Error]:   MIDR: 0x0000000000000000
+[Hardware Error]:   running state: 0x0
+[Hardware Error]:   Power State Coordination Interface state: 0
+[Hardware Error]:   Error info structure 0:
+[Hardware Error]:   num errors: 2
+[Hardware Error]:    error_type: 0x02: cache error
+[Hardware Error]:    error_info: 0x000000000091000f
+[Hardware Error]:     transaction type: Data Access
+[Hardware Error]:     cache error, operation type: Data write
+[Hardware Error]:     cache level: 2
+[Hardware Error]:     processor context not corrupted
+[Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
+
+The ARM Processor Error message can be customized via command line
+parameters. For instance:
+
+    $ ./scripts/ghes_inject.py arm --mpidr 0x444 --running --affinity 1 \
+        --error-info 12345678 --vendor 0x13,123,4,5,1 --ctx-array 0,1,2,3,4,5 \
+        -t cache tlb bus micro-arch tlb,micro-arch
+    Error injected.
+
+Injects this error, as detected on a Linux guest:
+
+[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
+[Hardware Error]: event severity: recoverable
+[Hardware Error]:  Error 0, type: recoverable
+[Hardware Error]:   section_type: ARM processor error
+[Hardware Error]:   MIDR: 0x0000000000000000
+[Hardware Error]:   Multiprocessor Affinity Register (MPIDR): 0x0000000000000000
+[Hardware Error]:   error affinity level: 0
+[Hardware Error]:   running state: 0x1
+[Hardware Error]:   Power State Coordination Interface state: 0
+[Hardware Error]:   Error info structure 0:
+[Hardware Error]:   num errors: 2
+[Hardware Error]:    error_type: 0x02: cache error
+[Hardware Error]:    error_info: 0x0000000000bc614e
+[Hardware Error]:     cache level: 2
+[Hardware Error]:     processor context not corrupted
+[Hardware Error]:   Error info structure 1:
+[Hardware Error]:   num errors: 2
+[Hardware Error]:    error_type: 0x04: TLB error
+[Hardware Error]:    error_info: 0x000000000054007f
+[Hardware Error]:     transaction type: Instruction
+[Hardware Error]:     TLB error, operation type: Instruction fetch
+[Hardware Error]:     TLB level: 1
+[Hardware Error]:     processor context not corrupted
+[Hardware Error]:     the error has not been corrected
+[Hardware Error]:     PC is imprecise
+[Hardware Error]:   Error info structure 2:
+[Hardware Error]:   num errors: 2
+[Hardware Error]:    error_type: 0x08: bus error
+[Hardware Error]:    error_info: 0x00000080d6460fff
+[Hardware Error]:     transaction type: Generic
+[Hardware Error]:     bus error, operation type: Generic read (type of instruction or data request cannot be determined)
+[Hardware Error]:     affinity level at which the bus error occurred: 1
+[Hardware Error]:     processor context corrupted
+[Hardware Error]:     the error has been corrected
+[Hardware Error]:     PC is imprecise
+[Hardware Error]:     Program execution can be restarted reliably at the PC associated with the error.
+[Hardware Error]:     participation type: Local processor observed
+[Hardware Error]:     request timed out
+[Hardware Error]:     address space: External Memory Access
+[Hardware Error]:     memory access attributes:0x20
+[Hardware Error]:     access mode: secure
+[Hardware Error]:   Error info structure 3:
+[Hardware Error]:   num errors: 2
+[Hardware Error]:    error_type: 0x10: micro-architectural error
+[Hardware Error]:    error_info: 0x0000000078da03ff
+[Hardware Error]:   Error info structure 4:
+[Hardware Error]:   num errors: 2
+[Hardware Error]:    error_type: 0x14: TLB error|micro-architectural error
+[Hardware Error]:   Context info structure 0:
+[Hardware Error]:    register context type: AArch64 EL1 context registers
+[Hardware Error]:    00000000: 00000000 00000000
+[Hardware Error]:   Vendor specific error info has 5 bytes:
+[Hardware Error]:    00000000: 13 7b 04 05 01                                   .{...
+[Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
+[Firmware Warn]: GHES: Unhandled processor error type 0x04: TLB error
+[Firmware Warn]: GHES: Unhandled processor error type 0x08: bus error
+[Firmware Warn]: GHES: Unhandled processor error type 0x10: micro-architectural error
+[Firmware Warn]: GHES: Unhandled processor error type 0x14: TLB error|micro-architectural error
+"""
+
+import argparse
+import re
+
+from qmp_helper import qmp, util, cper_guid
+
+
+class ArmProcessorEinj:
+    """
+    Implements ARM Processor Error injection via GHES
+    """
+
+    DESC = """
+    Generates an ARM processor error CPER, compatible with
+    UEFI 2.9A Errata.
+    """
+
+    ACPI_GHES_ARM_CPER_LENGTH = 40
+    ACPI_GHES_ARM_CPER_PEI_LENGTH = 32
+
+    # Context types
+    CONTEXT_AARCH32_EL1 = 1
+    CONTEXT_AARCH64_EL1 = 5
+    CONTEXT_MISC_REG = 8
+
+    def __init__(self, subparsers):
+        """Initialize the error injection class and add subparser"""
+
+        # Valid choice values
+        self.arm_valid_bits = {
+            "mpidr":    util.bit(0),
+            "affinity": util.bit(1),
+            "running":  util.bit(2),
+            "vendor":   util.bit(3),
+        }
+
+        self.pei_flags = {
+            "first":        util.bit(0),
+            "last":         util.bit(1),
+            "propagated":   util.bit(2),
+            "overflow":     util.bit(3),
+        }
+
+        self.pei_error_types = {
+            "cache":        util.bit(1),
+            "tlb":          util.bit(2),
+            "bus":          util.bit(3),
+            "micro-arch":   util.bit(4),
+        }
+
+        self.pei_valid_bits = {
+            "multiple-error":   util.bit(0),
+            "flags":            util.bit(1),
+            "error-info":       util.bit(2),
+            "virt-addr":        util.bit(3),
+            "phy-addr":         util.bit(4),
+        }
+
+        self.data = bytearray()
+
+        parser = subparsers.add_parser("arm", description=self.DESC)
+
+        arm_valid_bits = ",".join(self.arm_valid_bits.keys())
+        flags = ",".join(self.pei_flags.keys())
+        error_types = ",".join(self.pei_error_types.keys())
+        pei_valid_bits = ",".join(self.pei_valid_bits.keys())
+
+        # UEFI N.16 ARM Validation bits
+        g_arm = parser.add_argument_group("ARM processor")
+        g_arm.add_argument("--arm", "--arm-valid",
+                           help=f"ARM valid bits: {arm_valid_bits}")
+        g_arm.add_argument("-a", "--affinity",  "--level", "--affinity-level",
+                           type=lambda x: int(x, 0),
+                           help="Affinity level (when multiple levels apply)")
+        g_arm.add_argument("-l", "--mpidr", type=lambda x: int(x, 0),
+                           help="Multiprocessor Affinity Register")
+        g_arm.add_argument("-i", "--midr", type=lambda x: int(x, 0),
+                           help="Main ID Register")
+        g_arm.add_argument("-r", "--running",
+                           action=argparse.BooleanOptionalAction,
+                           default=None,
+                           help="Indicates if the processor is running or not")
+        g_arm.add_argument("--psci", "--psci-state",
+                           type=lambda x: int(x, 0),
+                           help="Power State Coordination Interface - PSCI state")
+
+        # TODO: Add vendor-specific support
+
+        # UEFI N.17 bitmaps (type and flags)
+        g_pei = parser.add_argument_group("ARM Processor Error Info (PEI)")
+        g_pei.add_argument("-t", "--type", nargs="+",
+                        help=f"one or more error types: {error_types}")
+        g_pei.add_argument("-f", "--flags", nargs="*",
+                        help=f"zero or more error flags: {flags}")
+        g_pei.add_argument("-V", "--pei-valid", "--error-valid", nargs="*",
+                        help=f"zero or more PEI valid bits: {pei_valid_bits}")
+
+        # UEFI N.17 Integer values
+        g_pei.add_argument("-m", "--multiple-error", nargs="+",
+                        help="Number of errors: 0: Single error, 1: Multiple errors, 2-65535: Error count if known")
+        g_pei.add_argument("-e", "--error-info", nargs="+",
+                        help="Error information (UEFI 2.10 tables N.18 to N.20)")
+        g_pei.add_argument("-p", "--physical-address",  nargs="+",
+                        help="Physical address")
+        g_pei.add_argument("-v", "--virtual-address",  nargs="+",
+                        help="Virtual address")
+
+        # UEFI N.21 Context
+        g_ctx = parser.add_argument_group("Processor Context")
+        g_ctx.add_argument("--ctx-type", "--context-type", nargs="*",
+                        help="Type of the context (0=ARM32 GPR, 5=ARM64 EL1, other values supported)")
+        g_ctx.add_argument("--ctx-size", "--context-size", nargs="*",
+                        help="Minimal size of the context")
+        g_ctx.add_argument("--ctx-array", "--context-array", nargs="*",
+                        help="Comma-separated arrays for each context")
+
+        # Vendor-specific data
+        g_vendor = parser.add_argument_group("Vendor-specific data")
+        g_vendor.add_argument("--vendor", "--vendor-specific", nargs="+",
+                        help="Vendor-specific byte arrays of data")
+
+        # Add arguments for Generic Error Data
+        qmp.argparse(parser)
+
+        parser.set_defaults(func=self.send_cper)
+
+    def send_cper(self, args):
+        """Parse subcommand arguments and send a CPER via QMP"""
+
+        qmp_cmd = qmp(args.host, args.port, args.debug)
+
+        # Handle Generic Error Data arguments if any
+        qmp_cmd.set_args(args)
+
+        is_cpu_type = re.compile(r"^([\w+]+\-)?arm\-cpu$")
+        cpus = qmp_cmd.search_qom("/machine/unattached/device",
+                                  "type", is_cpu_type)
+
+        cper = {}
+        pei = {}
+        ctx = {}
+        vendor = {}
+
+        arg = vars(args)
+
+        # Handle global parameters
+        if args.arm:
+            arm_valid_init = False
+            cper["valid"] = util.get_choice(name="valid",
+                                       value=args.arm,
+                                       choices=self.arm_valid_bits,
+                                       suffixes=["-error", "-err"])
+        else:
+            cper["valid"] = 0
+            arm_valid_init = True
+
+        if "running" in arg:
+            if args.running:
+                cper["running-state"] = util.bit(0)
+            else:
+                cper["running-state"] = 0
+        else:
+            cper["running-state"] = 0
+
+        if arm_valid_init:
+            if args.affinity:
+                cper["valid"] |= self.arm_valid_bits["affinity"]
+
+            if args.mpidr:
+                cper["valid"] |= self.arm_valid_bits["mpidr"]
+
+            if "running-state" in cper:
+                cper["valid"] |= self.arm_valid_bits["running"]
+
+            if args.psci:
+                cper["valid"] |= self.arm_valid_bits["running"]
+
+        # Handle PEI
+        if not args.type:
+            args.type = ["cache-error"]
+
+        util.get_mult_choices(
+            pei,
+            name="valid",
+            values=args.pei_valid,
+            choices=self.pei_valid_bits,
+            suffixes=["-valid", "--addr"],
+        )
+        util.get_mult_choices(
+            pei,
+            name="type",
+            values=args.type,
+            choices=self.pei_error_types,
+            suffixes=["-error", "-err"],
+        )
+        util.get_mult_choices(
+            pei,
+            name="flags",
+            values=args.flags,
+            choices=self.pei_flags,
+            suffixes=["-error", "-cap"],
+        )
+        util.get_mult_int(pei, "error-info", args.error_info)
+        util.get_mult_int(pei, "multiple-error", args.multiple_error)
+        util.get_mult_int(pei, "phy-addr", args.physical_address)
+        util.get_mult_int(pei, "virt-addr", args.virtual_address)
+
+        # Handle context
+        util.get_mult_int(ctx, "type", args.ctx_type, allow_zero=True)
+        util.get_mult_int(ctx, "minimal-size", args.ctx_size, allow_zero=True)
+        util.get_mult_array(ctx, "register", args.ctx_array, allow_zero=True)
+
+        util.get_mult_array(vendor, "bytes", args.vendor, max_val=255)
+
+        # Store PEI
+        pei_data = bytearray()
+        default_flags  = self.pei_flags["first"]
+        default_flags |= self.pei_flags["last"]
+
+        error_info_num = 0
+
+        for i, p in pei.items():        # pylint: disable=W0612
+            error_info_num += 1
+
+            # UEFI 2.10 doesn't define how to encode error information
+            # when multiple types are raised. So, provide a default only
+            # if a single type is there
+            if "error-info" not in p:
+                if p["type"] == util.bit(1):
+                    p["error-info"] = 0x0091000F
+                if p["type"] == util.bit(2):
+                    p["error-info"] = 0x0054007F
+                if p["type"] == util.bit(3):
+                    p["error-info"] = 0x80D6460FFF
+                if p["type"] == util.bit(4):
+                    p["error-info"] = 0x78DA03FF
+
+            if "valid" not in p:
+                p["valid"] = 0
+                if "multiple-error" in p:
+                    p["valid"] |= self.pei_valid_bits["multiple-error"]
+
+                if "flags" in p:
+                    p["valid"] |= self.pei_valid_bits["flags"]
+
+                if "error-info" in p:
+                    p["valid"] |= self.pei_valid_bits["error-info"]
+
+                if "phy-addr" in p:
+                    p["valid"] |= self.pei_valid_bits["phy-addr"]
+
+                if "virt-addr" in p:
+                    p["valid"] |= self.pei_valid_bits["virt-addr"]
+
+            # Version
+            util.data_add(pei_data, 0, 1)
+
+            util.data_add(pei_data,
+                         self.ACPI_GHES_ARM_CPER_PEI_LENGTH, 1)
+
+            util.data_add(pei_data, p["valid"], 2)
+            util.data_add(pei_data, p["type"], 1)
+            util.data_add(pei_data, p.get("multiple-error", 1), 2)
+            util.data_add(pei_data, p.get("flags", default_flags), 1)
+            util.data_add(pei_data, p.get("error-info", 0), 8)
+            util.data_add(pei_data, p.get("virt-addr", 0xDEADBEEF), 8)
+            util.data_add(pei_data, p.get("phy-addr", 0xABBA0BAD), 8)
+
+        # Store Context
+        ctx_data = bytearray()
+        context_info_num = 0
+
+        if ctx:
+            ret = qmp_cmd.send_cmd("query-target", may_open=True)
+
+            default_ctx = self.CONTEXT_MISC_REG
+
+            if "arch" in ret:
+                if ret["arch"] == "aarch64":
+                    default_ctx = self.CONTEXT_AARCH64_EL1
+                elif ret["arch"] == "arm":
+                    default_ctx = self.CONTEXT_AARCH32_EL1
+
+            for k in sorted(ctx.keys()):
+                context_info_num += 1
+
+                if "type" not in ctx[k]:
+                    ctx[k]["type"] = default_ctx
+
+                if "register" not in ctx[k]:
+                    ctx[k]["register"] = []
+
+                reg_size = len(ctx[k]["register"])
+                size = 0
+
+                if "minimal-size" in ctx:
+                    size = ctx[k]["minimal-size"]
+
+                size = max(size, reg_size)
+
+                size = (size + 1) % 0xFFFE
+
+                # Version
+                util.data_add(ctx_data, 0, 2)
+
+                util.data_add(ctx_data, ctx[k]["type"], 2)
+
+                util.data_add(ctx_data, 8 * size, 4)
+
+                for r in ctx[k]["register"]:
+                    util.data_add(ctx_data, r, 8)
+
+                for i in range(reg_size, size):   # pylint: disable=W0612
+                    util.data_add(ctx_data, 0, 8)
+
+        # Vendor-specific bytes are not grouped
+        vendor_data = bytearray()
+        if vendor:
+            for k in sorted(vendor.keys()):
+                for b in vendor[k]["bytes"]:
+                    util.data_add(vendor_data, b, 1)
+
+        # Encode ARM Processor Error
+        data = bytearray()
+
+        util.data_add(data, cper["valid"], 4)
+
+        util.data_add(data, error_info_num, 2)
+        util.data_add(data, context_info_num, 2)
+
+        # Calculate the length of the CPER data
+        cper_length = self.ACPI_GHES_ARM_CPER_LENGTH
+        cper_length += len(pei_data)
+        cper_length += len(vendor_data)
+        cper_length += len(ctx_data)
+        util.data_add(data, cper_length, 4)
+
+        util.data_add(data, arg.get("affinity-level", 0), 1)
+
+        # Reserved
+        util.data_add(data, 0, 3)
+
+        if "midr-el1" not in arg:
+            if cpus:
+                cmd_arg = {
+                    'path': cpus[0],
+                    'property': "midr"
+                }
+                ret = qmp_cmd.send_cmd("qom-get", cmd_arg, may_open=True)
+                if isinstance(ret, int):
+                    arg["midr-el1"] = ret
+
+        util.data_add(data, arg.get("mpidr-el1", 0), 8)
+        util.data_add(data, arg.get("midr-el1", 0), 8)
+        util.data_add(data, cper["running-state"], 4)
+        util.data_add(data, arg.get("psci-state", 0), 4)
+
+        # Add PEI
+        data.extend(pei_data)
+        data.extend(ctx_data)
+        data.extend(vendor_data)
+
+        self.data = data
+
+        qmp_cmd.send_cper(cper_guid.CPER_PROC_ARM, self.data)
diff --git a/scripts/ghes_inject.py b/scripts/ghes_inject.py
new file mode 100755
index 000000000000..9a235201418b
--- /dev/null
+++ b/scripts/ghes_inject.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python3
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+"""
+Handle ACPI GHESv2 error injection logic QEMU QMP interface.
+"""
+
+import argparse
+import sys
+
+from arm_processor_error import ArmProcessorEinj
+
+EINJ_DESC = """
+Handle ACPI GHESv2 error injection logic QEMU QMP interface.
+
+It allows using UEFI BIOS EINJ features to generate GHES records.
+
+It helps testing CPER and GHES drivers at the guest OS and how
+userspace applications at the guest handle them.
+"""
+
+def main():
+    """Main program"""
+
+    # Main parser - handle generic args like QEMU QMP TCP socket options
+    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter,
+                                     usage="%(prog)s [options]",
+                                     description=EINJ_DESC)
+
+    g_options = parser.add_argument_group("QEMU QMP socket options")
+    g_options.add_argument("-H", "--host", default="localhost", type=str,
+                           help="host name")
+    g_options.add_argument("-P", "--port", default=4445, type=int,
+                           help="TCP port number")
+    g_options.add_argument('-d', '--debug', action='store_true')
+
+    subparsers = parser.add_subparsers()
+
+    ArmProcessorEinj(subparsers)
+
+    args = parser.parse_args()
+    if "func" in args:
+        args.func(args)
+    else:
+        sys.exit(f"Please specify a valid command for {sys.argv[0]}")
+
+if __name__ == "__main__":
+    main()
diff --git a/scripts/qmp_helper.py b/scripts/qmp_helper.py
new file mode 100755
index 000000000000..c1e7e0fd80ce
--- /dev/null
+++ b/scripts/qmp_helper.py
@@ -0,0 +1,703 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0103,E0213,E1135,E1136,E1137,R0902,R0903,R0912,R0913,R0917
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (C) 2024-2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+"""
+Helper classes to be used by ghes_inject command classes.
+"""
+
+import json
+import sys
+
+from datetime import datetime
+from os import path as os_path
+
+try:
+    qemu_dir = os_path.abspath(os_path.dirname(os_path.dirname(__file__)))
+    sys.path.append(os_path.join(qemu_dir, 'python'))
+
+    from qemu.qmp.legacy import QEMUMonitorProtocol
+
+except ModuleNotFoundError as exc:
+    print(f"Module '{exc.name}' not found.")
+    print("Try export PYTHONPATH=top-qemu-dir/python or run from top-qemu-dir")
+    sys.exit(1)
+
+from base64 import b64encode
+
+class util:
+    """
+    Ancillary functions to deal with bitmaps, parse arguments,
+    generate GUID and encode data on a bytearray buffer.
+    """
+
+    #
+    # Helper routines to handle multiple choice arguments
+    #
+    def get_choice(name, value, choices, suffixes=None, bitmask=True):
+        """Produce a list from multiple choice argument"""
+
+        new_values = 0
+
+        if not value:
+            return new_values
+
+        for val in value.split(","):
+            val = val.lower()
+
+            if suffixes:
+                for suffix in suffixes:
+                    val = val.removesuffix(suffix)
+
+            if val not in choices.keys():
+                if suffixes:
+                    for suffix in suffixes:
+                        if val + suffix in choices.keys():
+                            val += suffix
+                            break
+
+            if val not in choices.keys():
+                sys.exit(f"Error on '{name}': choice '{val}' is invalid.")
+
+            val = choices[val]
+
+            if bitmask:
+                new_values |= val
+            else:
+                if new_values:
+                    sys.exit(f"Error on '{name}': only one value is accepted.")
+
+                new_values = val
+
+        return new_values
+
+    def get_array(name, values, max_val=None):
+        """Add numbered hashes from integer lists into an array"""
+
+        array = []
+
+        for value in values:
+            for val in value.split(","):
+                try:
+                    val = int(val, 0)
+                except ValueError:
+                    sys.exit(f"Error on '{name}': {val} is not an integer")
+
+                if val < 0:
+                    sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+                if max_val and val > max_val:
+                    sys.exit(f"Error on '{name}': {val} is too little")
+
+                array.append(val)
+
+        return array
+
+    def get_mult_array(mult, name, values, allow_zero=False, max_val=None):
+        """Add numbered hashes from integer lists"""
+
+        if not allow_zero:
+            if not values:
+                return
+        else:
+            if values is None:
+                return
+
+            if not values:
+                i = 0
+                if i not in mult:
+                    mult[i] = {}
+
+                mult[i][name] = []
+                return
+
+        i = 0
+        for value in values:
+            for val in value.split(","):
+                try:
+                    val = int(val, 0)
+                except ValueError:
+                    sys.exit(f"Error on '{name}': {val} is not an integer")
+
+                if val < 0:
+                    sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+                if max_val and val > max_val:
+                    sys.exit(f"Error on '{name}': {val} is too little")
+
+                if i not in mult:
+                    mult[i] = {}
+
+                if name not in mult[i]:
+                    mult[i][name] = []
+
+                mult[i][name].append(val)
+
+            i += 1
+
+
+    def get_mult_choices(mult, name, values, choices,
+                        suffixes=None, allow_zero=False):
+        """Add numbered hashes from multiple choice arguments"""
+
+        if not allow_zero:
+            if not values:
+                return
+        else:
+            if values is None:
+                return
+
+        i = 0
+        for val in values:
+            new_values = util.get_choice(name, val, choices, suffixes)
+
+            if i not in mult:
+                mult[i] = {}
+
+            mult[i][name] = new_values
+            i += 1
+
+
+    def get_mult_int(mult, name, values, allow_zero=False):
+        """Add numbered hashes from integer arguments"""
+        if not allow_zero:
+            if not values:
+                return
+        else:
+            if values is None:
+                return
+
+        i = 0
+        for val in values:
+            try:
+                val = int(val, 0)
+            except ValueError:
+                sys.exit(f"Error on '{name}': {val} is not an integer")
+
+            if val < 0:
+                sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+            if i not in mult:
+                mult[i] = {}
+
+            mult[i][name] = val
+            i += 1
+
+
+    #
+    # Data encode helper functions
+    #
+    def bit(b):
+        """Simple macro to define a bit on a bitmask"""
+        return 1 << b
+
+
+    def data_add(data, value, num_bytes):
+        """Adds bytes from value inside a bitarray"""
+
+        data.extend(value.to_bytes(num_bytes, byteorder="little"))  # pylint: disable=E1101
+
+    def dump_bytearray(name, data):
+        """Does an hexdump of a byte array, grouping in bytes"""
+
+        print(f"{name} ({len(data)} bytes):")
+
+        for ln_start in range(0, len(data), 16):
+            ln_end = min(ln_start + 16, len(data))
+            print(f"      {ln_start:08x}  ", end="")
+            for i in range(ln_start, ln_end):
+                print(f"{data[i]:02x} ", end="")
+            for i in range(ln_end, ln_start + 16):
+                print("   ", end="")
+            print("  ", end="")
+            for i in range(ln_start, ln_end):
+                if data[i] >= 32 and data[i] < 127:
+                    print(chr(data[i]), end="")
+                else:
+                    print(".", end="")
+
+            print()
+        print()
+
+    def time(string):
+        """Handle BCD timestamps used on Generic Error Data Block"""
+
+        time = None
+
+        # Formats to be used when parsing time stamps
+        formats = [
+            "%Y-%m-%d %H:%M:%S",
+        ]
+
+        if string == "now":
+            time = datetime.now()
+
+        if time is None:
+            for fmt in formats:
+                try:
+                    time = datetime.strptime(string, fmt)
+                    break
+                except ValueError:
+                    pass
+
+            if time is None:
+                raise ValueError("Invalid time format")
+
+        return time
+
+class guid:
+    """
+    Simple class to handle GUID fields.
+    """
+
+    def __init__(self, time_low, time_mid, time_high, nodes):
+        """Initialize a GUID value"""
+
+        assert len(nodes) == 8
+
+        self.time_low = time_low
+        self.time_mid = time_mid
+        self.time_high = time_high
+        self.nodes = nodes
+
+    @classmethod
+    def UUID(cls, guid_str):
+        """Initialize a GUID using a string on its standard format"""
+
+        if len(guid_str) != 36:
+            print("Size not 36")
+            raise ValueError('Invalid GUID size')
+
+        # It is easier to parse without separators. So, drop them
+        guid_str = guid_str.replace('-', '')
+
+        if len(guid_str) != 32:
+            print("Size not 32", guid_str, len(guid_str))
+            raise ValueError('Invalid GUID hex size')
+
+        time_low = 0
+        time_mid = 0
+        time_high = 0
+        nodes = []
+
+        for i in reversed(range(16, 32, 2)):
+            h = guid_str[i:i + 2]
+            value = int(h, 16)
+            nodes.insert(0, value)
+
+        time_high = int(guid_str[12:16], 16)
+        time_mid = int(guid_str[8:12], 16)
+        time_low = int(guid_str[0:8], 16)
+
+        return cls(time_low, time_mid, time_high, nodes)
+
+    def __str__(self):
+        """Output a GUID value on its default string representation"""
+
+        clock = self.nodes[0] << 8 | self.nodes[1]
+
+        node = 0
+        for i in range(2, len(self.nodes)):
+            node = node << 8 | self.nodes[i]
+
+        s = f"{self.time_low:08x}-{self.time_mid:04x}-"
+        s += f"{self.time_high:04x}-{clock:04x}-{node:012x}"
+        return s
+
+    def to_bytes(self):
+        """Output a GUID value in bytes"""
+
+        data = bytearray()
+
+        util.data_add(data, self.time_low, 4)
+        util.data_add(data, self.time_mid, 2)
+        util.data_add(data, self.time_high, 2)
+        data.extend(bytearray(self.nodes))
+
+        return data
+
+class qmp:
+    """
+    Opens a connection and send/receive QMP commands.
+    """
+
+    def send_cmd(self, command, args=None, may_open=False, return_error=True):
+        """Send a command to QMP, optinally opening a connection"""
+
+        if may_open:
+            self._connect()
+        elif not self.connected:
+            return False
+
+        msg = { 'execute': command }
+        if args:
+            msg['arguments'] = args
+
+        try:
+            obj = self.qmp_monitor.cmd_obj(msg)
+        # Can we use some other exception class here?
+        except Exception as e:                         # pylint: disable=W0718
+            print(f"Command: {command}")
+            print(f"Failed to inject error: {e}.")
+            return None
+
+        if "return" in obj:
+            if isinstance(obj.get("return"), dict):
+                if obj["return"]:
+                    return obj["return"]
+                return "OK"
+
+            return obj["return"]
+
+        if isinstance(obj.get("error"), dict):
+            error = obj["error"]
+            if return_error:
+                print(f"Command: {msg}")
+                print(f'{error["class"]}: {error["desc"]}')
+        else:
+            print(json.dumps(obj))
+
+        return None
+
+    def _close(self):
+        """Shutdown and close the socket, if opened"""
+        if not self.connected:
+            return
+
+        self.qmp_monitor.close()
+        self.connected = False
+
+    def _connect(self):
+        """Connect to a QMP TCP/IP port, if not connected yet"""
+
+        if self.connected:
+            return True
+
+        try:
+            self.qmp_monitor.connect(negotiate=True)
+        except ConnectionError:
+            sys.exit(f"Can't connect to QMP host {self.host}:{self.port}")
+
+        self.connected = True
+
+        return True
+
+    BLOCK_STATUS_BITS = {
+        "uncorrectable":            util.bit(0),
+        "correctable":              util.bit(1),
+        "multi-uncorrectable":      util.bit(2),
+        "multi-correctable":        util.bit(3),
+    }
+
+    ERROR_SEVERITY = {
+        "recoverable":  0,
+        "fatal":        1,
+        "corrected":    2,
+        "none":         3,
+    }
+
+    VALIDATION_BITS = {
+        "fru-id":       util.bit(0),
+        "fru-text":     util.bit(1),
+        "timestamp":    util.bit(2),
+    }
+
+    GEDB_FLAGS_BITS = {
+        "recovered":    util.bit(0),
+        "prev-error":   util.bit(1),
+        "simulated":    util.bit(2),
+    }
+
+    GENERIC_DATA_SIZE = 72
+
+    def argparse(parser):
+        """Prepare a parser group to query generic error data"""
+
+        block_status_bits = ",".join(qmp.BLOCK_STATUS_BITS.keys())
+        error_severity_enum = ",".join(qmp.ERROR_SEVERITY.keys())
+        validation_bits = ",".join(qmp.VALIDATION_BITS.keys())
+        gedb_flags_bits = ",".join(qmp.GEDB_FLAGS_BITS.keys())
+
+        g_gen = parser.add_argument_group("Generic Error Data")  # pylint: disable=E1101
+        g_gen.add_argument("--block-status",
+                           help=f"block status bits: {block_status_bits}")
+        g_gen.add_argument("--raw-data", nargs="+",
+                        help="Raw data inside the Error Status Block")
+        g_gen.add_argument("--error-severity", "--severity",
+                           help=f"error severity: {error_severity_enum}")
+        g_gen.add_argument("--gen-err-valid-bits",
+                           "--generic-error-validation-bits",
+                           help=f"validation bits: {validation_bits}")
+        g_gen.add_argument("--fru-id", type=guid.UUID,
+                           help="GUID representing a physical device")
+        g_gen.add_argument("--fru-text",
+                           help="ASCII string identifying the FRU hardware")
+        g_gen.add_argument("--timestamp", type=util.time,
+                           help="Time when the error info was collected")
+        g_gen.add_argument("--precise", "--precise-timestamp",
+                           action='store_true',
+                           help="Marks the timestamp as precise if --timestamp is used")
+        g_gen.add_argument("--gedb-flags",
+                           help=f"General Error Data Block flags: {gedb_flags_bits}")
+
+    def set_args(self, args):
+        """Set the arguments optionally defined via self.argparse()"""
+
+        if args.block_status:
+            self.block_status = util.get_choice(name="block-status",
+                                                value=args.block_status,
+                                                choices=self.BLOCK_STATUS_BITS,
+                                                bitmask=False)
+        if args.raw_data:
+            self.raw_data = util.get_array("raw-data", args.raw_data,
+                                           max_val=255)
+            print(self.raw_data)
+
+        if args.error_severity:
+            self.error_severity = util.get_choice(name="error-severity",
+                                                  value=args.error_severity,
+                                                  choices=self.ERROR_SEVERITY,
+                                                  bitmask=False)
+
+        if args.fru_id:
+            self.fru_id = args.fru_id.to_bytes()
+            if not args.gen_err_valid_bits:
+                self.validation_bits |= self.VALIDATION_BITS["fru-id"]
+
+        if args.fru_text:
+            text = bytearray(args.fru_text.encode('ascii'))
+            if len(text) > 20:
+                sys.exit("FRU text is too big to fit")
+
+            self.fru_text = text
+            if not args.gen_err_valid_bits:
+                self.validation_bits |= self.VALIDATION_BITS["fru-text"]
+
+        if args.timestamp:
+            time = args.timestamp
+            century = int(time.year / 100)
+
+            bcd = bytearray()
+            util.data_add(bcd, (time.second // 10) << 4 | (time.second % 10), 1)
+            util.data_add(bcd, (time.minute // 10) << 4 | (time.minute % 10), 1)
+            util.data_add(bcd, (time.hour // 10) << 4 | (time.hour % 10), 1)
+
+            if args.precise:
+                util.data_add(bcd, 1, 1)
+            else:
+                util.data_add(bcd, 0, 1)
+
+            util.data_add(bcd, (time.day // 10) << 4 | (time.day % 10), 1)
+            util.data_add(bcd, (time.month // 10) << 4 | (time.month % 10), 1)
+            util.data_add(bcd,
+                          ((time.year % 100) // 10) << 4 | (time.year % 10), 1)
+            util.data_add(bcd, ((century % 100) // 10) << 4 | (century % 10), 1)
+
+            self.timestamp = bcd
+            if not args.gen_err_valid_bits:
+                self.validation_bits |= self.VALIDATION_BITS["timestamp"]
+
+        if args.gen_err_valid_bits:
+            self.validation_bits = util.get_choice(name="validation",
+                                                   value=args.gen_err_valid_bits,
+                                                   choices=self.VALIDATION_BITS)
+
+    def __init__(self, host, port, debug=False):
+        """Initialize variables used by the QMP send logic"""
+
+        self.connected = False
+        self.host = host
+        self.port = port
+        self.debug = debug
+
+        # ACPI 6.1: 18.3.2.7.1 Generic Error Data: Generic Error Status Block
+        self.block_status = self.BLOCK_STATUS_BITS["uncorrectable"]
+        self.raw_data = []
+        self.error_severity = self.ERROR_SEVERITY["recoverable"]
+
+        # ACPI 6.1: 18.3.2.7.1 Generic Error Data: Generic Error Data Entry
+        self.validation_bits = 0
+        self.flags = 0
+        self.fru_id = bytearray(16)
+        self.fru_text = bytearray(20)
+        self.timestamp = bytearray(8)
+
+        self.qmp_monitor = QEMUMonitorProtocol(address=(self.host, self.port))
+
+    #
+    # Socket QMP send command
+    #
+    def send_cper_raw(self, cper_data):
+        """Send a raw CPER data to QEMU though QMP TCP socket"""
+
+        data = b64encode(bytes(cper_data)).decode('ascii')
+
+        cmd_arg = {
+            'cper': data
+        }
+
+        self._connect()
+
+        if self.send_cmd("inject-ghes-v2-error", cmd_arg):
+            print("Error injected.")
+
+    def send_cper(self, notif_type, payload):
+        """Send commands to QEMU though QMP TCP socket"""
+
+        # Fill CPER record header
+
+        # NOTE: bits 4 to 13 of block status contain the number of
+        # data entries in the data section. This is currently unsupported.
+
+        cper_length = len(payload)
+        data_length = cper_length + len(self.raw_data) + self.GENERIC_DATA_SIZE
+
+        #  Generic Error Data Entry
+        gede = bytearray()
+
+        gede.extend(notif_type.to_bytes())
+        util.data_add(gede, self.error_severity, 4)
+        util.data_add(gede, 0x300, 2)
+        util.data_add(gede, self.validation_bits, 1)
+        util.data_add(gede, self.flags, 1)
+        util.data_add(gede, cper_length, 4)
+        gede.extend(self.fru_id)
+        gede.extend(self.fru_text)
+        gede.extend(self.timestamp)
+
+        # Generic Error Status Block
+        gebs = bytearray()
+
+        if self.raw_data:
+            raw_data_offset = len(gebs)
+        else:
+            raw_data_offset = 0
+
+        util.data_add(gebs, self.block_status, 4)
+        util.data_add(gebs, raw_data_offset, 4)
+        util.data_add(gebs, len(self.raw_data), 4)
+        util.data_add(gebs, data_length, 4)
+        util.data_add(gebs, self.error_severity, 4)
+
+        cper_data = bytearray()
+        cper_data.extend(gebs)
+        cper_data.extend(gede)
+        cper_data.extend(bytearray(self.raw_data))
+        cper_data.extend(bytearray(payload))
+
+        if self.debug:
+            print(f"GUID: {notif_type}")
+
+            util.dump_bytearray("Generic Error Status Block", gebs)
+            util.dump_bytearray("Generic Error Data Entry", gede)
+
+            if self.raw_data:
+                util.dump_bytearray("Raw data", bytearray(self.raw_data))
+
+            util.dump_bytearray("Payload", payload)
+
+        self.send_cper_raw(cper_data)
+
+
+    def search_qom(self, path, prop, regex):
+        """
+        Return a list of devices that match path array like:
+
+            /machine/unattached/device
+            /machine/peripheral-anon/device
+            ...
+        """
+
+        found = []
+
+        i = 0
+        while 1:
+            dev = f"{path}[{i}]"
+            args = {
+                'path': dev,
+                'property': prop
+            }
+            ret = self.send_cmd("qom-get", args, may_open=True,
+                                return_error=False)
+            if not ret:
+                break
+
+            if isinstance(ret, str):
+                if regex.search(ret):
+                    found.append(dev)
+
+            i += 1
+            if i > 10000:
+                print("Too many objects returned by qom-get!")
+                break
+
+        return found
+
+class cper_guid:
+    """
+    Contains CPER GUID, as per:
+    https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html
+    """
+
+    CPER_PROC_GENERIC =  guid(0x9876CCAD, 0x47B4, 0x4bdb,
+                              [0xB6, 0x5E, 0x16, 0xF1,
+                               0x93, 0xC4, 0xF3, 0xDB])
+
+    CPER_PROC_X86 = guid(0xDC3EA0B0, 0xA144, 0x4797,
+                         [0xB9, 0x5B, 0x53, 0xFA,
+                          0x24, 0x2B, 0x6E, 0x1D])
+
+    CPER_PROC_ITANIUM = guid(0xe429faf1, 0x3cb7, 0x11d4,
+                             [0xbc, 0xa7, 0x00, 0x80,
+                              0xc7, 0x3c, 0x88, 0x81])
+
+    CPER_PROC_ARM = guid(0xE19E3D16, 0xBC11, 0x11E4,
+                         [0x9C, 0xAA, 0xC2, 0x05,
+                          0x1D, 0x5D, 0x46, 0xB0])
+
+    CPER_PLATFORM_MEM = guid(0xA5BC1114, 0x6F64, 0x4EDE,
+                             [0xB8, 0x63, 0x3E, 0x83,
+                              0xED, 0x7C, 0x83, 0xB1])
+
+    CPER_PLATFORM_MEM2 = guid(0x61EC04FC, 0x48E6, 0xD813,
+                              [0x25, 0xC9, 0x8D, 0xAA,
+                               0x44, 0x75, 0x0B, 0x12])
+
+    CPER_PCIE = guid(0xD995E954, 0xBBC1, 0x430F,
+                     [0xAD, 0x91, 0xB4, 0x4D,
+                      0xCB, 0x3C, 0x6F, 0x35])
+
+    CPER_PCI_BUS = guid(0xC5753963, 0x3B84, 0x4095,
+                        [0xBF, 0x78, 0xED, 0xDA,
+                         0xD3, 0xF9, 0xC9, 0xDD])
+
+    CPER_PCI_DEV = guid(0xEB5E4685, 0xCA66, 0x4769,
+                        [0xB6, 0xA2, 0x26, 0x06,
+                         0x8B, 0x00, 0x13, 0x26])
+
+    CPER_FW_ERROR = guid(0x81212A96, 0x09ED, 0x4996,
+                         [0x94, 0x71, 0x8D, 0x72,
+                          0x9C, 0x8E, 0x69, 0xED])
+
+    CPER_DMA_GENERIC = guid(0x5B51FEF7, 0xC79D, 0x4434,
+                            [0x8F, 0x1B, 0xAA, 0x62,
+                             0xDE, 0x3E, 0x2C, 0x64])
+
+    CPER_DMA_VT = guid(0x71761D37, 0x32B2, 0x45cd,
+                       [0xA7, 0xD0, 0xB0, 0xFE,
+                        0xDD, 0x93, 0xE8, 0xCF])
+
+    CPER_DMA_IOMMU = guid(0x036F84E1, 0x7F37, 0x428c,
+                         [0xA7, 0x9E, 0x57, 0x5F,
+                          0xDF, 0xAA, 0x84, 0xEC])
+
+    CPER_CCIX_PER = guid(0x91335EF6, 0xEBFB, 0x4478,
+                         [0xA6, 0xA6, 0x88, 0xB7,
+                          0x28, 0xCF, 0x75, 0xD7])
+
+    CPER_CXL_PROT_ERR = guid(0x80B9EFB4, 0x52B5, 0x4DE3,
+                             [0xA7, 0x77, 0x68, 0x78,
+                              0x4B, 0x77, 0x10, 0x48])
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 04/20] Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
  2025-03-07 19:14 ` [PATCH v8 04/20] Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static" Mauro Carvalho Chehab
@ 2025-03-07 21:02   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-03-07 21:02 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel, Gavin Shan,
	Ani Sinha, Dongjiu Geng, linux-kernel

On 7/3/25 20:14, Mauro Carvalho Chehab wrote:
> The ghes_record_cper_errors() function was introduced to be used
> by other types of errors, as part of the error injection
> patch series. That's why it is not static.
> 
> Make it non-static again to allow its usage outside ghes.c
> 
> This reverts commit 611f3bdb20f7828b0813aa90d47d1275ef18329b.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
>   hw/acpi/ghes.c         | 6 ++++--
>   include/hw/acpi/ghes.h | 2 ++
>   2 files changed, 6 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

And I apologize for missing to relate the 2 previous series.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 18/20] tests/acpi: virt: update HEST and DSDT tables
  2025-03-07 19:14 ` [PATCH v8 18/20] tests/acpi: virt: update HEST and DSDT tables Mauro Carvalho Chehab
@ 2025-03-07 21:04   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-03-07 21:04 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel, Gavin Shan,
	Ani Sinha, linux-kernel

On 7/3/25 20:14, Mauro Carvalho Chehab wrote:
> @@ -1,39 +1,39 @@
>   /*
>    * Intel ACPI Component Architecture
>    * AML/ASL+ Disassembler version 20240322 (64-bit version)
>    * Copyright (c) 2000 - 2023 Intel Corporation
>    *
> - * Disassembly of tests/data/acpi/aarch64/virt/HEST
> + * Disassembly of /tmp/aml-DMPE22

The other path is more meaningful IMHO.

>    *
>    * ACPI Data Table [HEST]
>    *
>    * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in hex)
>    */


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject
  2025-03-07 19:14 ` [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
@ 2025-03-07 21:05   ` Philippe Mathieu-Daudé
  2025-03-08  9:22     ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-03-07 21:05 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel, Gavin Shan,
	Cleber Rosa, John Snow, linux-kernel, Thomas Huth

Hi Mauro,

On 7/3/25 20:14, Mauro Carvalho Chehab wrote:
> Using the QMP GHESv2 API requires preparing a raw data array
> containing a CPER record.
> 
> Add a helper script with subcommands to prepare such data.
> 
> Currently, only ARM Processor error CPER record is supported, by
> using:
> 	$ ghes_inject.py arm
> 
> which produces those warnings on Linux:
> 
> [  705.032426] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
> [  774.866308] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> [  774.866583] {4}[Hardware Error]: event severity: recoverable
> [  774.866738] {4}[Hardware Error]:  Error 0, type: recoverable
> [  774.866889] {4}[Hardware Error]:   section_type: ARM processor error
> [  774.867048] {4}[Hardware Error]:   MIDR: 0x00000000000f0510
> [  774.867189] {4}[Hardware Error]:   running state: 0x0
> [  774.867321] {4}[Hardware Error]:   Power State Coordination Interface state: 0
> [  774.867511] {4}[Hardware Error]:   Error info structure 0:
> [  774.867679] {4}[Hardware Error]:   num errors: 2
> [  774.867801] {4}[Hardware Error]:    error_type: 0x02: cache error
> [  774.867962] {4}[Hardware Error]:    error_info: 0x000000000091000f
> [  774.868124] {4}[Hardware Error]:     transaction type: Data Access
> [  774.868280] {4}[Hardware Error]:     cache error, operation type: Data write
> [  774.868465] {4}[Hardware Error]:     cache level: 2
> [  774.868592] {4}[Hardware Error]:     processor context not corrupted
> [  774.868774] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
> 
> Such script allows customizing the error data, allowing to change
> all fields at the record. Please use:
> 
> 	$ ghes_inject.py arm -h

It should be easy enough to add a functional test covering this,
do you mind having a look?

> 
> For more details about its usage.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>   MAINTAINERS                    |   3 +
>   scripts/arm_processor_error.py | 476 ++++++++++++++++++++++
>   scripts/ghes_inject.py         |  51 +++
>   scripts/qmp_helper.py          | 703 +++++++++++++++++++++++++++++++++
>   4 files changed, 1233 insertions(+)
>   create mode 100644 scripts/arm_processor_error.py
>   create mode 100755 scripts/ghes_inject.py
>   create mode 100755 scripts/qmp_helper.py


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject
  2025-03-07 21:05   ` Philippe Mathieu-Daudé
@ 2025-03-08  9:22     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-08  9:22 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Igor Mammedov, Michael S . Tsirkin, Jonathan Cameron, Shiju Jose,
	qemu-arm, qemu-devel, Gavin Shan, Cleber Rosa, John Snow,
	linux-kernel, Thomas Huth

Hi Phillipe,

Em Fri, 7 Mar 2025 22:05:27 +0100
Philippe Mathieu-Daudé <philmd@linaro.org> escreveu:

> Hi Mauro,
> 
> On 7/3/25 20:14, Mauro Carvalho Chehab wrote:
> > Using the QMP GHESv2 API requires preparing a raw data array
> > containing a CPER record.
> > 
> > Add a helper script with subcommands to prepare such data.
> > 
> > Currently, only ARM Processor error CPER record is supported, by
> > using:
> > 	$ ghes_inject.py arm
> > 
> > which produces those warnings on Linux:
> > 
> > [  705.032426] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
> > [  774.866308] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> > [  774.866583] {4}[Hardware Error]: event severity: recoverable
> > [  774.866738] {4}[Hardware Error]:  Error 0, type: recoverable
> > [  774.866889] {4}[Hardware Error]:   section_type: ARM processor error
> > [  774.867048] {4}[Hardware Error]:   MIDR: 0x00000000000f0510
> > [  774.867189] {4}[Hardware Error]:   running state: 0x0
> > [  774.867321] {4}[Hardware Error]:   Power State Coordination Interface state: 0
> > [  774.867511] {4}[Hardware Error]:   Error info structure 0:
> > [  774.867679] {4}[Hardware Error]:   num errors: 2
> > [  774.867801] {4}[Hardware Error]:    error_type: 0x02: cache error
> > [  774.867962] {4}[Hardware Error]:    error_info: 0x000000000091000f
> > [  774.868124] {4}[Hardware Error]:     transaction type: Data Access
> > [  774.868280] {4}[Hardware Error]:     cache error, operation type: Data write
> > [  774.868465] {4}[Hardware Error]:     cache level: 2
> > [  774.868592] {4}[Hardware Error]:     processor context not corrupted
> > [  774.868774] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error

Thanks for your review!

> > Such script allows customizing the error data, allowing to change
> > all fields at the record. Please use:
> > 
> > 	$ ghes_inject.py arm -h  
> 
> It should be easy enough to add a functional test covering this,
> do you mind having a look?

It is on my TODO plan to add some tests to check it, but instead of
a functional test, I'm aiming to test the full stack.

See, I'm one of the reviewers of the RAS subsystem at the Linux Kernel, 
and the author/maintainer of the userspace tool used to report and take
actions in case of troubles [1]. So, I'm targeting a solution that
will have rasdaemon installed on a Linux VM, testing all three
components altogether.

This will require to implement something at rasdaemon that will have
an interface at the rasdaemon to report errors to the host OS. It
currently have ABRT support, but it will likely need something
different to output error report in a way that the same error will
report the same over newer versions of the components inside the
stack.

For such purpose, I'm planning to implement a new feature on rasdaemon 
to allow reading the errors eithe via a TCP/IP socket with some
simple text output interface, or maybe add a SQL interface[3].

[1] https://github.com/mchehab/rasdaemon
[2] https://docs.kernel.org/dev-tools/ktap.html
[3] internally, rasdaemon has already a SQL interface, used with
    SQLite. It shouldn't be hard to add PostgreSQL and/or
    mariaDB/Mysql support on it.

Before implementing it, we need to have this series merged.

So, in summary, my plan to add tests for firmware-first error
report is:

1. Have this patch series merged;
2. Add a new report mechanism on rasdaemon to report errors via
   a TCP/IP socket;
3. Setup a runner that would periodically test the full stack and
   report regressions. Such runner would need to fetch from 3 different
   sources (QEMU, Kernel, rasdaemon), so it would likely be triggered
   by some scheduler.

Btw, for the first version of the script, only ARM Processor Error is
there, but my long term plan is to be able to test other type of
GHESv2 errors, like this one [4]:

	https://gitlab.com/mchehab_kernel/qemu/-/commit/8a774121750def2723ea59ce2343a774a3f01ca6

[3] I implemented PCIe bus error without checking first if the Kernel
    supported it (when I tested, it didn't). I opted to add this one
    to ensure that adding new subcommands to the ghes_inject.py script
    would be trivial. It helped me to organize the code in a way that a
    new error injection code means just a two lines change at the main
    script. In this specific case, it is:

	+ from pcie_bus_error import PcieBusError
	...
	+    PcieBusError(subparsers)

    With the actual implementation handled on a separate .py module.

    This way, we can add multiple handlers there, each one with its
    own separate Python file.

    After having this series merged, my TODO plan for GHES type support is
    to add error injection code for the errors that are already implemented
    inside the Kernel and rasdaemon, after checking that the support for
    they are OK. Then, add support for it at the runner that will be
    checking for potential regressions at the full stack.

Regards,
Mauro

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (19 preceding siblings ...)
  2025-03-07 19:14 ` [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
@ 2025-03-25  1:55 ` Mauro Carvalho Chehab
  2025-05-11 13:44 ` Michael S. Tsirkin
  21 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-25  1:55 UTC (permalink / raw)
  To: Michael S . Tsirkin
  Cc: Igor Mammedov, Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Philippe Mathieu-Daudé, Gavin Shan, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

Hi Michael,

Gentile ping.

Regards,
Mauro

Em Fri,  7 Mar 2025 20:14:29 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:

> Hi Michael,
> 
> I'm sending v8 to avoid a merge conflict with v7 due to this
> changeset:
> 
>    611f3bdb20f7 ("hw/acpi/ghes: Make  static")
> 
> As ghes_record_cper_errors() was written since the beginning
> to be public and used by ghes-cper.c. It ended being meged
> earlier because the error-injection series become too big,
> so it was decided last year to split in two to make easier for
> reviewers and maintainers to discuss.
> 
> Anyway, as mentioned on v7, I guess we're ready to merge this
> series, as patches here have been thoughfully reviewed mainly 
> by Igor and Jonathan over the last 5-6 months.
> 
> The only change from v7 is a minor editorial change at HEST doc
> spec, and the addition of Igor and Jonathan's A-B/R-B.
> 
> This series change the way HEST table offsets are calculated,
> making them identical to what an OSPM would do and allowing
> multiple HEST entries without causing migration issues. It open
> space to add HEST support for non-arm architectures, as now
> the number and type of HEST notification entries are not
> hardcoded at ghes.c. Instead, they're passed as a parameter
> from the arch-dependent init code.
> 
> With such issue addressed, it adds a new notification type and
> add support to inject errors via a Python script. The script
> itself is at the final patch.
> 
> ---
> v8:
>   - added a patch to revert recently-added changeset causing a
>     conflict with these. All remaining patches are identical.
> 
> v7:
>   - minor editorial change at the patch updating HEST doc spec
>    with the new workflow
> 
> v6:
> - some minor nits addressed:
>    - use GPA instead of offset;
>    - merged two patches;
>    - fixed a couple of long line coding style issues;
>    - the HEST/DSDT diff inside a patch was changed to avoid troubles
>      applying it.
> 
> v5:
> - make checkpatch happier;
> - HEST table is now tested;
> - some changes at HEST spec documentation to align with code changes;
> - extra care was taken with regards to git bisectability.
> 
> v4:
> - added an extra comment for AcpiGhesState structure;
> - patches reordered;
> - no functional changes, just code shift between the patches in this series.
> 
> v3:
> - addressed more nits;
> - hest_add_le now points to the beginning of HEST table;
> - removed HEST from tests/data/acpi;
> - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> 
> v2: 
> - address some nits;
> - improved ags cleanup patch and removed ags.present field;
> - added some missing le*_to_cpu() calls;
> - update date at copyright for new files to 2024-2025;
> - qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
> - added HEST and DSDT tables after the changes to make check target happy.
>   (two patches: first one whitelisting such tables; second one removing from
>    whitelist and updating/adding such tables to tests/data/acpi)
> 
> Mauro Carvalho Chehab (20):
>   tests/acpi: virt: add an empty HEST file
>   tests/qtest/bios-tables-test: extend to also check HEST table
>   tests/acpi: virt: update HEST file with its current data
>   Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
>   acpi/ghes: Cleanup the code which gets ghes ged state
>   acpi/ghes: prepare to change the way HEST offsets are calculated
>   acpi/ghes: add a firmware file with HEST address
>   acpi/ghes: Use HEST table offsets when preparing GHES records
>   acpi/ghes: don't hard-code the number of sources for HEST table
>   acpi/ghes: add a notifier to notify when error data is ready
>   acpi/generic_event_device: Update GHES migration to cover hest addr
>   acpi/generic_event_device: add logic to detect if HEST addr is
>     available
>   acpi/generic_event_device: add an APEI error device
>   tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
>   arm/virt: Wire up a GED error device for ACPI / GHES
>   qapi/acpi-hest: add an interface to do generic CPER error injection
>   acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
>   tests/acpi: virt: update HEST and DSDT tables
>   docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
>   scripts/ghes_inject: add a script to generate GHES error inject
> 
>  MAINTAINERS                                   |  10 +
>  docs/specs/acpi_hest_ghes.rst                 |  28 +-
>  hw/acpi/Kconfig                               |   5 +
>  hw/acpi/aml-build.c                           |  10 +
>  hw/acpi/generic_event_device.c                |  44 ++
>  hw/acpi/ghes-stub.c                           |   7 +-
>  hw/acpi/ghes.c                                | 233 ++++--
>  hw/acpi/ghes_cper.c                           |  38 +
>  hw/acpi/ghes_cper_stub.c                      |  19 +
>  hw/acpi/meson.build                           |   2 +
>  hw/arm/virt-acpi-build.c                      |  35 +-
>  hw/arm/virt.c                                 |  19 +-
>  hw/core/machine.c                             |   2 +
>  include/hw/acpi/acpi_dev_interface.h          |   1 +
>  include/hw/acpi/aml-build.h                   |   2 +
>  include/hw/acpi/generic_event_device.h        |   1 +
>  include/hw/acpi/ghes.h                        |  51 +-
>  include/hw/arm/virt.h                         |   2 +
>  qapi/acpi-hest.json                           |  35 +
>  qapi/meson.build                              |   1 +
>  qapi/qapi-schema.json                         |   1 +
>  scripts/arm_processor_error.py                | 476 ++++++++++++
>  scripts/ghes_inject.py                        |  51 ++
>  scripts/qmp_helper.py                         | 703 ++++++++++++++++++
>  target/arm/kvm.c                              |   7 +-
>  tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
>  .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
>  tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
>  tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
>  tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
>  tests/data/acpi/aarch64/virt/HEST             | Bin 0 -> 224 bytes
>  tests/qtest/bios-tables-test.c                |   2 +-
>  32 files changed, 1695 insertions(+), 90 deletions(-)
>  create mode 100644 hw/acpi/ghes_cper.c
>  create mode 100644 hw/acpi/ghes_cper_stub.c
>  create mode 100644 qapi/acpi-hest.json
>  create mode 100644 scripts/arm_processor_error.py
>  create mode 100755 scripts/ghes_inject.py
>  create mode 100755 scripts/qmp_helper.py
>  create mode 100644 tests/data/acpi/aarch64/virt/HEST
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject
  2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
                   ` (20 preceding siblings ...)
  2025-03-25  1:55 ` [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for " Mauro Carvalho Chehab
@ 2025-05-11 13:44 ` Michael S. Tsirkin
  2025-05-11 13:45   ` Michael S. Tsirkin
  21 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2025-05-11 13:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Igor Mammedov, Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Philippe Mathieu-Daudé, Gavin Shan, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Fri, Mar 07, 2025 at 08:14:29PM +0100, Mauro Carvalho Chehab wrote:
> Hi Michael,
> 
> I'm sending v8 to avoid a merge conflict with v7 due to this
> changeset:
> 
>    611f3bdb20f7 ("hw/acpi/ghes: Make  static")



Applied 1-13.
Patch 14 needs to apply compat to 10.0 machine type as well.


> As ghes_record_cper_errors() was written since the beginning
> to be public and used by ghes-cper.c. It ended being meged
> earlier because the error-injection series become too big,
> so it was decided last year to split in two to make easier for
> reviewers and maintainers to discuss.
> 
> Anyway, as mentioned on v7, I guess we're ready to merge this
> series, as patches here have been thoughfully reviewed mainly 
> by Igor and Jonathan over the last 5-6 months.
> 
> The only change from v7 is a minor editorial change at HEST doc
> spec, and the addition of Igor and Jonathan's A-B/R-B.
> 
> This series change the way HEST table offsets are calculated,
> making them identical to what an OSPM would do and allowing
> multiple HEST entries without causing migration issues. It open
> space to add HEST support for non-arm architectures, as now
> the number and type of HEST notification entries are not
> hardcoded at ghes.c. Instead, they're passed as a parameter
> from the arch-dependent init code.
> 
> With such issue addressed, it adds a new notification type and
> add support to inject errors via a Python script. The script
> itself is at the final patch.
> 
> ---
> v8:
>   - added a patch to revert recently-added changeset causing a
>     conflict with these. All remaining patches are identical.
> 
> v7:
>   - minor editorial change at the patch updating HEST doc spec
>    with the new workflow
> 
> v6:
> - some minor nits addressed:
>    - use GPA instead of offset;
>    - merged two patches;
>    - fixed a couple of long line coding style issues;
>    - the HEST/DSDT diff inside a patch was changed to avoid troubles
>      applying it.
> 
> v5:
> - make checkpatch happier;
> - HEST table is now tested;
> - some changes at HEST spec documentation to align with code changes;
> - extra care was taken with regards to git bisectability.
> 
> v4:
> - added an extra comment for AcpiGhesState structure;
> - patches reordered;
> - no functional changes, just code shift between the patches in this series.
> 
> v3:
> - addressed more nits;
> - hest_add_le now points to the beginning of HEST table;
> - removed HEST from tests/data/acpi;
> - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> 
> v2: 
> - address some nits;
> - improved ags cleanup patch and removed ags.present field;
> - added some missing le*_to_cpu() calls;
> - update date at copyright for new files to 2024-2025;
> - qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
> - added HEST and DSDT tables after the changes to make check target happy.
>   (two patches: first one whitelisting such tables; second one removing from
>    whitelist and updating/adding such tables to tests/data/acpi)
> 
> Mauro Carvalho Chehab (20):
>   tests/acpi: virt: add an empty HEST file
>   tests/qtest/bios-tables-test: extend to also check HEST table
>   tests/acpi: virt: update HEST file with its current data
>   Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
>   acpi/ghes: Cleanup the code which gets ghes ged state
>   acpi/ghes: prepare to change the way HEST offsets are calculated
>   acpi/ghes: add a firmware file with HEST address
>   acpi/ghes: Use HEST table offsets when preparing GHES records
>   acpi/ghes: don't hard-code the number of sources for HEST table
>   acpi/ghes: add a notifier to notify when error data is ready
>   acpi/generic_event_device: Update GHES migration to cover hest addr
>   acpi/generic_event_device: add logic to detect if HEST addr is
>     available
>   acpi/generic_event_device: add an APEI error device
>   tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
>   arm/virt: Wire up a GED error device for ACPI / GHES
>   qapi/acpi-hest: add an interface to do generic CPER error injection
>   acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
>   tests/acpi: virt: update HEST and DSDT tables
>   docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
>   scripts/ghes_inject: add a script to generate GHES error inject
> 
>  MAINTAINERS                                   |  10 +
>  docs/specs/acpi_hest_ghes.rst                 |  28 +-
>  hw/acpi/Kconfig                               |   5 +
>  hw/acpi/aml-build.c                           |  10 +
>  hw/acpi/generic_event_device.c                |  44 ++
>  hw/acpi/ghes-stub.c                           |   7 +-
>  hw/acpi/ghes.c                                | 233 ++++--
>  hw/acpi/ghes_cper.c                           |  38 +
>  hw/acpi/ghes_cper_stub.c                      |  19 +
>  hw/acpi/meson.build                           |   2 +
>  hw/arm/virt-acpi-build.c                      |  35 +-
>  hw/arm/virt.c                                 |  19 +-
>  hw/core/machine.c                             |   2 +
>  include/hw/acpi/acpi_dev_interface.h          |   1 +
>  include/hw/acpi/aml-build.h                   |   2 +
>  include/hw/acpi/generic_event_device.h        |   1 +
>  include/hw/acpi/ghes.h                        |  51 +-
>  include/hw/arm/virt.h                         |   2 +
>  qapi/acpi-hest.json                           |  35 +
>  qapi/meson.build                              |   1 +
>  qapi/qapi-schema.json                         |   1 +
>  scripts/arm_processor_error.py                | 476 ++++++++++++
>  scripts/ghes_inject.py                        |  51 ++
>  scripts/qmp_helper.py                         | 703 ++++++++++++++++++
>  target/arm/kvm.c                              |   7 +-
>  tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
>  .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
>  tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
>  tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
>  tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
>  tests/data/acpi/aarch64/virt/HEST             | Bin 0 -> 224 bytes
>  tests/qtest/bios-tables-test.c                |   2 +-
>  32 files changed, 1695 insertions(+), 90 deletions(-)
>  create mode 100644 hw/acpi/ghes_cper.c
>  create mode 100644 hw/acpi/ghes_cper_stub.c
>  create mode 100644 qapi/acpi-hest.json
>  create mode 100644 scripts/arm_processor_error.py
>  create mode 100755 scripts/ghes_inject.py
>  create mode 100755 scripts/qmp_helper.py
>  create mode 100644 tests/data/acpi/aarch64/virt/HEST
> 
> -- 
> 2.48.1
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject
  2025-05-11 13:44 ` Michael S. Tsirkin
@ 2025-05-11 13:45   ` Michael S. Tsirkin
  2025-05-20  6:48     ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2025-05-11 13:45 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Igor Mammedov, Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Philippe Mathieu-Daudé, Gavin Shan, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Sun, May 11, 2025 at 09:45:04AM -0400, Michael S. Tsirkin wrote:
> On Fri, Mar 07, 2025 at 08:14:29PM +0100, Mauro Carvalho Chehab wrote:
> > Hi Michael,
> > 
> > I'm sending v8 to avoid a merge conflict with v7 due to this
> > changeset:
> > 
> >    611f3bdb20f7 ("hw/acpi/ghes: Make  static")
> 
> 
> 
> Applied 1-13.
> Patch 14 needs to apply compat to 10.0 machine type as well.

Sorry i meant 1-11.

> 
> > As ghes_record_cper_errors() was written since the beginning
> > to be public and used by ghes-cper.c. It ended being meged
> > earlier because the error-injection series become too big,
> > so it was decided last year to split in two to make easier for
> > reviewers and maintainers to discuss.
> > 
> > Anyway, as mentioned on v7, I guess we're ready to merge this
> > series, as patches here have been thoughfully reviewed mainly 
> > by Igor and Jonathan over the last 5-6 months.
> > 
> > The only change from v7 is a minor editorial change at HEST doc
> > spec, and the addition of Igor and Jonathan's A-B/R-B.
> > 
> > This series change the way HEST table offsets are calculated,
> > making them identical to what an OSPM would do and allowing
> > multiple HEST entries without causing migration issues. It open
> > space to add HEST support for non-arm architectures, as now
> > the number and type of HEST notification entries are not
> > hardcoded at ghes.c. Instead, they're passed as a parameter
> > from the arch-dependent init code.
> > 
> > With such issue addressed, it adds a new notification type and
> > add support to inject errors via a Python script. The script
> > itself is at the final patch.
> > 
> > ---
> > v8:
> >   - added a patch to revert recently-added changeset causing a
> >     conflict with these. All remaining patches are identical.
> > 
> > v7:
> >   - minor editorial change at the patch updating HEST doc spec
> >    with the new workflow
> > 
> > v6:
> > - some minor nits addressed:
> >    - use GPA instead of offset;
> >    - merged two patches;
> >    - fixed a couple of long line coding style issues;
> >    - the HEST/DSDT diff inside a patch was changed to avoid troubles
> >      applying it.
> > 
> > v5:
> > - make checkpatch happier;
> > - HEST table is now tested;
> > - some changes at HEST spec documentation to align with code changes;
> > - extra care was taken with regards to git bisectability.
> > 
> > v4:
> > - added an extra comment for AcpiGhesState structure;
> > - patches reordered;
> > - no functional changes, just code shift between the patches in this series.
> > 
> > v3:
> > - addressed more nits;
> > - hest_add_le now points to the beginning of HEST table;
> > - removed HEST from tests/data/acpi;
> > - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> > 
> > v2: 
> > - address some nits;
> > - improved ags cleanup patch and removed ags.present field;
> > - added some missing le*_to_cpu() calls;
> > - update date at copyright for new files to 2024-2025;
> > - qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
> > - added HEST and DSDT tables after the changes to make check target happy.
> >   (two patches: first one whitelisting such tables; second one removing from
> >    whitelist and updating/adding such tables to tests/data/acpi)
> > 
> > Mauro Carvalho Chehab (20):
> >   tests/acpi: virt: add an empty HEST file
> >   tests/qtest/bios-tables-test: extend to also check HEST table
> >   tests/acpi: virt: update HEST file with its current data
> >   Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
> >   acpi/ghes: Cleanup the code which gets ghes ged state
> >   acpi/ghes: prepare to change the way HEST offsets are calculated
> >   acpi/ghes: add a firmware file with HEST address
> >   acpi/ghes: Use HEST table offsets when preparing GHES records
> >   acpi/ghes: don't hard-code the number of sources for HEST table
> >   acpi/ghes: add a notifier to notify when error data is ready
> >   acpi/generic_event_device: Update GHES migration to cover hest addr
> >   acpi/generic_event_device: add logic to detect if HEST addr is
> >     available
> >   acpi/generic_event_device: add an APEI error device
> >   tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
> >   arm/virt: Wire up a GED error device for ACPI / GHES
> >   qapi/acpi-hest: add an interface to do generic CPER error injection
> >   acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
> >   tests/acpi: virt: update HEST and DSDT tables
> >   docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
> >   scripts/ghes_inject: add a script to generate GHES error inject
> > 
> >  MAINTAINERS                                   |  10 +
> >  docs/specs/acpi_hest_ghes.rst                 |  28 +-
> >  hw/acpi/Kconfig                               |   5 +
> >  hw/acpi/aml-build.c                           |  10 +
> >  hw/acpi/generic_event_device.c                |  44 ++
> >  hw/acpi/ghes-stub.c                           |   7 +-
> >  hw/acpi/ghes.c                                | 233 ++++--
> >  hw/acpi/ghes_cper.c                           |  38 +
> >  hw/acpi/ghes_cper_stub.c                      |  19 +
> >  hw/acpi/meson.build                           |   2 +
> >  hw/arm/virt-acpi-build.c                      |  35 +-
> >  hw/arm/virt.c                                 |  19 +-
> >  hw/core/machine.c                             |   2 +
> >  include/hw/acpi/acpi_dev_interface.h          |   1 +
> >  include/hw/acpi/aml-build.h                   |   2 +
> >  include/hw/acpi/generic_event_device.h        |   1 +
> >  include/hw/acpi/ghes.h                        |  51 +-
> >  include/hw/arm/virt.h                         |   2 +
> >  qapi/acpi-hest.json                           |  35 +
> >  qapi/meson.build                              |   1 +
> >  qapi/qapi-schema.json                         |   1 +
> >  scripts/arm_processor_error.py                | 476 ++++++++++++
> >  scripts/ghes_inject.py                        |  51 ++
> >  scripts/qmp_helper.py                         | 703 ++++++++++++++++++
> >  target/arm/kvm.c                              |   7 +-
> >  tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
> >  .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
> >  tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
> >  tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
> >  tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
> >  tests/data/acpi/aarch64/virt/HEST             | Bin 0 -> 224 bytes
> >  tests/qtest/bios-tables-test.c                |   2 +-
> >  32 files changed, 1695 insertions(+), 90 deletions(-)
> >  create mode 100644 hw/acpi/ghes_cper.c
> >  create mode 100644 hw/acpi/ghes_cper_stub.c
> >  create mode 100644 qapi/acpi-hest.json
> >  create mode 100644 scripts/arm_processor_error.py
> >  create mode 100755 scripts/ghes_inject.py
> >  create mode 100755 scripts/qmp_helper.py
> >  create mode 100644 tests/data/acpi/aarch64/virt/HEST
> > 
> > -- 
> > 2.48.1
> > 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table
  2025-03-07 19:14 ` [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
@ 2025-05-12 19:42   ` Michael S. Tsirkin
  2025-05-20  6:47     ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2025-05-12 19:42 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Igor Mammedov, Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Gavin Shan, Philippe Mathieu-Daudé, Ani Sinha, Dongjiu Geng,
	Peter Maydell, Shannon Zhao, linux-kernel

On Fri, Mar 07, 2025 at 08:14:38PM +0100, Mauro Carvalho Chehab wrote:
> The current code is actually dependent on having just one error
> structure with a single source, as any change there would cause
> migration issues.
> 
> As the number of sources should be arch-dependent, as it will depend on
> what kind of notifications will exist, and how many errors can be
> reported at the same time, change the logic to be more flexible,
> allowing the number of sources to be defined when building the
> HEST table by the caller.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>

Breaks CI:

https://gitlab.com/mstredhat/qemu/-/jobs/10007863974

I dropped the patchset for now, pls rebase and repost.

> ---
>  hw/acpi/ghes.c           | 39 +++++++++++++++++++++------------------
>  hw/arm/virt-acpi-build.c |  8 +++++++-
>  include/hw/acpi/ghes.h   | 17 ++++++++++++-----
>  3 files changed, 40 insertions(+), 24 deletions(-)
> 
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 668ca72587c7..f49d0d628fc4 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -238,17 +238,17 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
>   * See docs/specs/acpi_hest_ghes.rst for blobs format.
>   */
>  static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
> -                                   BIOSLinker *linker)
> +                                   BIOSLinker *linker, int num_sources)
>  {
>      int i, error_status_block_offset;
>  
>      /* Build error_block_address */
> -    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> +    for (i = 0; i < num_sources; i++) {
>          build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
>      }
>  
>      /* Build read_ack_register */
> -    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> +    for (i = 0; i < num_sources; i++) {
>          /*
>           * Initialize the value of read_ack_register to 1, so GHES can be
>           * writable after (re)boot.
> @@ -263,13 +263,13 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
>  
>      /* Reserve space for Error Status Data Block */
>      acpi_data_push(hardware_errors,
> -        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
> +        ACPI_GHES_MAX_RAW_DATA_LENGTH * num_sources);
>  
>      /* Tell guest firmware to place hardware_errors blob into RAM */
>      bios_linker_loader_alloc(linker, ACPI_HW_ERROR_FW_CFG_FILE,
>                               hardware_errors, sizeof(uint64_t), false);
>  
> -    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
> +    for (i = 0; i < num_sources; i++) {
>          /*
>           * Tell firmware to patch error_block_address entries to point to
>           * corresponding "Generic Error Status Block"
> @@ -295,12 +295,14 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
>  }
>  
>  /* Build Generic Hardware Error Source version 2 (GHESv2) */
> -static void build_ghes_v2(GArray *table_data,
> -                          BIOSLinker *linker,
> -                          enum AcpiGhesNotifyType notify,
> -                          uint16_t source_id)
> +static void build_ghes_v2_entry(GArray *table_data,
> +                                BIOSLinker *linker,
> +                                const AcpiNotificationSourceId *notif_src,
> +                                uint16_t index, int num_sources)
>  {
>      uint64_t address_offset;
> +    const uint16_t notify = notif_src->notify;
> +    const uint16_t source_id = notif_src->source_id;
>  
>      /*
>       * Type:
> @@ -331,7 +333,7 @@ static void build_ghes_v2(GArray *table_data,
>                                     address_offset + GAS_ADDR_OFFSET,
>                                     sizeof(uint64_t),
>                                     ACPI_HW_ERROR_FW_CFG_FILE,
> -                                   source_id * sizeof(uint64_t));
> +                                   index * sizeof(uint64_t));
>  
>      /* Notification Structure */
>      build_ghes_hw_error_notification(table_data, notify);
> @@ -351,8 +353,7 @@ static void build_ghes_v2(GArray *table_data,
>                                     address_offset + GAS_ADDR_OFFSET,
>                                     sizeof(uint64_t),
>                                     ACPI_HW_ERROR_FW_CFG_FILE,
> -                                   (ACPI_GHES_ERROR_SOURCE_COUNT + source_id)
> -                                   * sizeof(uint64_t));
> +                                   (num_sources + index) * sizeof(uint64_t));
>  
>      /*
>       * Read Ack Preserve field
> @@ -368,22 +369,26 @@ static void build_ghes_v2(GArray *table_data,
>  void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>                       GArray *hardware_errors,
>                       BIOSLinker *linker,
> +                     const AcpiNotificationSourceId *notif_source,
> +                     int num_sources,
>                       const char *oem_id, const char *oem_table_id)
>  {
>      AcpiTable table = { .sig = "HEST", .rev = 1,
>                          .oem_id = oem_id, .oem_table_id = oem_table_id };
>      uint32_t hest_offset;
> +    int i;
>  
>      hest_offset = table_data->len;
>  
> -    build_ghes_error_table(ags, hardware_errors, linker);
> +    build_ghes_error_table(ags, hardware_errors, linker, num_sources);
>  
>      acpi_table_begin(&table, table_data);
>  
>      /* Error Source Count */
> -    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
> -    build_ghes_v2(table_data, linker,
> -                  ACPI_GHES_NOTIFY_SEA, ACPI_HEST_SRC_ID_SEA);
> +    build_append_int_noprefix(table_data, num_sources, 4);
> +    for (i = 0; i < num_sources; i++) {
> +        build_ghes_v2_entry(table_data, linker, &notif_source[i], i, num_sources);
> +    }
>  
>      acpi_table_end(linker, &table);
>  
> @@ -515,8 +520,6 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>          return;
>      }
>  
> -    assert(ACPI_GHES_ERROR_SOURCE_COUNT == 1);
> -
>      if (!ags->use_hest_addr) {
>          get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
>                               &cper_addr, &read_ack_register_addr);
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 040d875d4e83..ea9682ee2662 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -893,6 +893,10 @@ static void acpi_align_size(GArray *blob, unsigned align)
>      g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
>  }
>  
> +static const AcpiNotificationSourceId hest_ghes_notify[] = {
> +    { ACPI_HEST_SRC_ID_SYNC, ACPI_GHES_NOTIFY_SEA },
> +};
> +
>  static
>  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>  {
> @@ -954,7 +958,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>          if (ags) {
>              acpi_add_table(table_offsets, tables_blob);
>              acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> -                            tables->linker, vms->oem_id, vms->oem_table_id);
> +                            tables->linker, hest_ghes_notify,
> +                            ARRAY_SIZE(hest_ghes_notify),
> +                            vms->oem_id, vms->oem_table_id);
>          }
>      }
>  
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 5265102ba51f..8c4b08433760 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -57,13 +57,18 @@ enum AcpiGhesNotifyType {
>      ACPI_GHES_NOTIFY_RESERVED = 12
>  };
>  
> -enum {
> -    ACPI_HEST_SRC_ID_SEA = 0,
> -    /* future ids go here */
> -
> -    ACPI_GHES_ERROR_SOURCE_COUNT
> +/*
> + * ID numbers used to fill HEST source ID field
> + */
> +enum AcpiGhesSourceID {
> +    ACPI_HEST_SRC_ID_SYNC,
>  };
>  
> +typedef struct AcpiNotificationSourceId {
> +    enum AcpiGhesSourceID source_id;
> +    enum AcpiGhesNotifyType notify;
> +} AcpiNotificationSourceId;
> +
>  /*
>   * AcpiGhesState stores GPA values that will be used to fill HEST entries.
>   *
> @@ -84,6 +89,8 @@ typedef struct AcpiGhesState {
>  void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>                       GArray *hardware_errors,
>                       BIOSLinker *linker,
> +                     const AcpiNotificationSourceId * const notif_source,
> +                     int num_sources,
>                       const char *oem_id, const char *oem_table_id);
>  void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
>                            GArray *hardware_errors);
> -- 
> 2.48.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table
  2025-05-12 19:42   ` Michael S. Tsirkin
@ 2025-05-20  6:47     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-05-20  6:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Igor Mammedov, Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Gavin Shan, Philippe Mathieu-Daudé, Ani Sinha, Dongjiu Geng,
	Peter Maydell, Shannon Zhao, linux-kernel

Em Mon, 12 May 2025 15:42:27 -0400
"Michael S. Tsirkin" <mst@redhat.com> escreveu:

> On Fri, Mar 07, 2025 at 08:14:38PM +0100, Mauro Carvalho Chehab wrote:
> > The current code is actually dependent on having just one error
> > structure with a single source, as any change there would cause
> > migration issues.
> > 
> > As the number of sources should be arch-dependent, as it will depend on
> > what kind of notifications will exist, and how many errors can be
> > reported at the same time, change the logic to be more flexible,
> > allowing the number of sources to be defined when building the
> > HEST table by the caller.
> > 
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Igor Mammedov <imammedo@redhat.com>  
> 
> Breaks CI:
> 
> https://gitlab.com/mstredhat/qemu/-/jobs/10007863974
> 
> I dropped the patchset for now, pls rebase and repost.

Compilation was happening fine for me. I suspect that some rebase
dropped a change here:

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index a256f9d817cd..a1492bcaeb8d 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2360,7 +2360,7 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
              */
             if (code == BUS_MCEERR_AR) {
                 kvm_cpu_synchronize_state(c);
-                if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
+                if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
                                              paddr)) {
                     kvm_inject_arm_sea(c);
                 } else {

but somehow, ninja didn't detect and was keep using an already
compiled kvm.o object. Anyway, fixed at the new series:

	https://lore.kernel.org/qemu-devel/9404b305e0fefdb14dce4839c3d96ccedf0533e0.1747722973.git.mchehab+huawei@kernel.org/T/#u

And, just in case, I removed ./build directory and recompiled it
again.

Thanks,
Mauro

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject
  2025-05-11 13:45   ` Michael S. Tsirkin
@ 2025-05-20  6:48     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 31+ messages in thread
From: Mauro Carvalho Chehab @ 2025-05-20  6:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Igor Mammedov, Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Philippe Mathieu-Daudé, Gavin Shan, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

Em Sun, 11 May 2025 09:45:55 -0400
"Michael S. Tsirkin" <mst@redhat.com> escreveu:

> On Sun, May 11, 2025 at 09:45:04AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Mar 07, 2025 at 08:14:29PM +0100, Mauro Carvalho Chehab wrote:  
> > > Hi Michael,
> > > 
> > > I'm sending v8 to avoid a merge conflict with v7 due to this
> > > changeset:
> > > 
> > >    611f3bdb20f7 ("hw/acpi/ghes: Make  static")  
> > 
> > 
> > 
> > Applied 1-13.
> > Patch 14 needs to apply compat to 10.0 machine type as well.  
> 
> Sorry i meant 1-11.

Changed on the newest version I just submitted:

	https://lore.kernel.org/qemu-devel/cover.1747722973.git.mchehab+huawei@kernel.org/

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-05-20  6:48 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-07 19:14 [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 01/20] tests/acpi: virt: add an empty HEST file Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 02/20] tests/qtest/bios-tables-test: extend to also check HEST table Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 03/20] tests/acpi: virt: update HEST file with its current data Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 04/20] Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static" Mauro Carvalho Chehab
2025-03-07 21:02   ` Philippe Mathieu-Daudé
2025-03-07 19:14 ` [PATCH v8 05/20] acpi/ghes: Cleanup the code which gets ghes ged state Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 06/20] acpi/ghes: prepare to change the way HEST offsets are calculated Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 07/20] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 08/20] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 09/20] acpi/ghes: don't hard-code the number of sources for HEST table Mauro Carvalho Chehab
2025-05-12 19:42   ` Michael S. Tsirkin
2025-05-20  6:47     ` Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 10/20] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 11/20] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 12/20] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 13/20] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 14/20] tests/acpi: virt: allow acpi table changes at DSDT and HEST tables Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 15/20] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 16/20] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 17/20] acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 18/20] tests/acpi: virt: update HEST and DSDT tables Mauro Carvalho Chehab
2025-03-07 21:04   ` Philippe Mathieu-Daudé
2025-03-07 19:14 ` [PATCH v8 19/20] docs: hest: add new "etc/acpi_table_hest_addr" and update workflow Mauro Carvalho Chehab
2025-03-07 19:14 ` [PATCH v8 20/20] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
2025-03-07 21:05   ` Philippe Mathieu-Daudé
2025-03-08  9:22     ` Mauro Carvalho Chehab
2025-03-25  1:55 ` [PATCH v8 00/20] Change ghes to use HEST-based offsets and add support for " Mauro Carvalho Chehab
2025-05-11 13:44 ` Michael S. Tsirkin
2025-05-11 13:45   ` Michael S. Tsirkin
2025-05-20  6:48     ` Mauro Carvalho Chehab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).