* [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
@ 2025-02-21 14:35 Mauro Carvalho Chehab
  2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Ani Sinha,
	Cleber Rosa, Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

Now that the ghes preparation patches were merged, let's add support
for error injection.

In this series, the first 6 patches change the math used to calculate offsets in the HEST
table and the hardware_error firmware file, together with its migration code. Migration was tested
with both the latest released QEMU and upstream, in both directions.

The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using this QAPI
to inject ARM Processor Error records.

---
v4:
- added an extra comment for AcpiGhesState structure;
- patches reordered;
- no functional changes, just code shift between the patches in this series.

v3:
- addressed more nits;
- hest_addr_le now points to the beginning of HEST table;
- removed HEST from tests/data/acpi;
- added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le

v2: 
- address some nits;
- improved ags cleanup patch and removed ags.present field;
- added some missing le*_to_cpu() calls;
- update date at copyright for new files to 2024-2025;
- qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
- added HEST and DSDT tables after the changes to make check target happy.
  (two patches: first one whitelisting such tables; second one removing from
   whitelist and updating/adding such tables to tests/data/acpi)



Mauro Carvalho Chehab (14):
  acpi/ghes: prepare to change the way HEST offsets are calculated
  acpi/ghes: add a firmware file with HEST address
  acpi/ghes: Use HEST table offsets when preparing GHES records
  acpi/ghes: don't hard-code the number of sources for HEST table
  acpi/ghes: add a notifier to notify when error data is ready
  acpi/ghes: create an ancillary acpi_ghes_get_state() function
  acpi/generic_event_device: Update GHES migration to cover hest addr
  acpi/generic_event_device: add logic to detect if HEST addr is
    available
  acpi/generic_event_device: add an APEI error device
  tests/acpi: virt: allow acpi table changes for a new table: HEST
  arm/virt: Wire up a GED error device for ACPI / GHES
  tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
  qapi/acpi-hest: add an interface to do generic CPER error injection
  scripts/ghes_inject: add a script to generate GHES error inject

 MAINTAINERS                                   |  10 +
 hw/acpi/Kconfig                               |   5 +
 hw/acpi/aml-build.c                           |  10 +
 hw/acpi/generic_event_device.c                |  43 ++
 hw/acpi/ghes-stub.c                           |   7 +-
 hw/acpi/ghes.c                                | 231 ++++--
 hw/acpi/ghes_cper.c                           |  38 +
 hw/acpi/ghes_cper_stub.c                      |  19 +
 hw/acpi/meson.build                           |   2 +
 hw/arm/virt-acpi-build.c                      |  37 +-
 hw/arm/virt.c                                 |  19 +-
 hw/core/machine.c                             |   2 +
 include/hw/acpi/acpi_dev_interface.h          |   1 +
 include/hw/acpi/aml-build.h                   |   2 +
 include/hw/acpi/generic_event_device.h        |   1 +
 include/hw/acpi/ghes.h                        |  54 +-
 include/hw/arm/virt.h                         |   2 +
 qapi/acpi-hest.json                           |  35 +
 qapi/meson.build                              |   1 +
 qapi/qapi-schema.json                         |   1 +
 scripts/arm_processor_error.py                | 476 ++++++++++++
 scripts/ghes_inject.py                        |  51 ++
 scripts/qmp_helper.py                         | 702 ++++++++++++++++++
 target/arm/kvm.c                              |   7 +-
 tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
 .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
 tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
 tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
 tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
 29 files changed, 1677 insertions(+), 79 deletions(-)
 create mode 100644 hw/acpi/ghes_cper.c
 create mode 100644 hw/acpi/ghes_cper_stub.c
 create mode 100644 qapi/acpi-hest.json
 create mode 100644 scripts/arm_processor_error.py
 create mode 100755 scripts/ghes_inject.py
 create mode 100755 scripts/qmp_helper.py

-- 
2.48.1




* [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function
  2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
  2025-02-26 15:27   ` Igor Mammedov
  2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
  2025-02-27  9:54 ` Igor Mammedov
  2 siblings, 1 reply; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Paolo Bonzini,
	Peter Maydell, kvm, linux-kernel

Instead of having a function to check if ACPI is enabled
(acpi_ghes_present), change its logic to be more generic,
returning a pointer to AcpiGhesState.

This change allows cleaning up the GHES GED state code, avoiding
reading it multiple times and simplifying the code.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by:  Igor Mammedov <imammedo@redhat.com>
---
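For readers skimming the thread, the calling pattern this patch moves to can
be sketched in isolation. This is a minimal, self-contained sketch and not
QEMU code: MockGhesState, mock_ged_exists and the mock_* functions are
hypothetical stand-ins for AcpiGhesState, the GED object lookup,
acpi_ghes_get_state() and acpi_ghes_memory_errors().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for AcpiGhesState. Only the two fields that the
 * patch tests are modelled; both are zero until an address has been set.
 */
typedef struct MockGhesState {
    uint64_t hw_error_le;
    uint64_t hest_addr_le;
} MockGhesState;

/* Hypothetical globals standing in for the GED device and its GHES state. */
static MockGhesState mock_state;
static int mock_ged_exists;

/*
 * Same shape as acpi_ghes_get_state() in the patch: return NULL when
 * the device is missing or neither address is set, otherwise return the
 * state so that callers do not have to look it up again.
 */
static MockGhesState *mock_ghes_get_state(void)
{
    if (!mock_ged_exists) {
        return NULL;
    }
    if (!mock_state.hw_error_le && !mock_state.hest_addr_le) {
        return NULL;
    }
    return &mock_state;
}

/* Stand-in for a recording function that now receives the state pointer. */
static int mock_record_memory_error(MockGhesState *ags, uint64_t paddr)
{
    (void)paddr; /* a real implementation would build a CPER record here */
    return ags ? 0 : -1;
}

/* Caller: one lookup, then the pointer is passed down the call chain. */
static int mock_handle_sigbus(uint64_t paddr)
{
    MockGhesState *ags = mock_ghes_get_state();

    if (!ags) {
        return -1;
    }
    return mock_record_memory_error(ags, paddr);
}
```

The point of the sketch is that a NULL return replaces the old boolean
"present" answer, while a non-NULL return doubles as the argument for the
recording function, so the state is resolved once per error instead of once
in the check and again in the recorder.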
 hw/acpi/ghes-stub.c    |  7 ++++---
 hw/acpi/ghes.c         | 38 ++++++++++----------------------------
 include/hw/acpi/ghes.h | 14 ++++++++------
 target/arm/kvm.c       |  7 +++++--
 4 files changed, 27 insertions(+), 39 deletions(-)

diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..40f660c246fe 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -11,12 +11,13 @@
 #include "qemu/osdep.h"
 #include "hw/acpi/ghes.h"
 
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+                            uint64_t physical_address)
 {
     return -1;
 }
 
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
-    return false;
+    return NULL;
 }
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index f2d1cc7369f4..401789259f60 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -425,10 +425,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
                                  uint64_t *cper_addr,
                                  uint64_t *read_ack_register_addr)
 {
-    if (!ghes_addr) {
-        return;
-    }
-
     /*
      * non-HEST version supports only one source, so no need to change
      * the start offset based on the source ID. Also, we can't validate
@@ -517,27 +513,16 @@ static void get_ghes_source_offsets(uint16_t source_id,
 NotifierList acpi_generic_error_notifiers =
     NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
 
-void ghes_record_cper_errors(const void *cper, size_t len,
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
                              uint16_t source_id, Error **errp)
 {
     uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
-    AcpiGedState *acpi_ged_state;
-    AcpiGhesState *ags;
 
     if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
         error_setg(errp, "GHES CPER record is too big: %zd", len);
         return;
     }
 
-    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
-                                                       NULL));
-    if (!acpi_ged_state) {
-        error_setg(errp, "Can't find ACPI_GED object");
-        return;
-    }
-    ags = &acpi_ged_state->ghes_state;
-
-
     if (!ags->use_hest_addr) {
         get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
                              &cper_addr, &read_ack_register_addr);
@@ -546,11 +531,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                                 &cper_addr, &read_ack_register_addr, errp);
     }
 
-    if (!cper_addr) {
-        error_setg(errp, "can not find Generic Error Status Block");
-        return;
-    }
-
     cpu_physical_memory_read(read_ack_register_addr,
                              &read_ack_register, sizeof(read_ack_register));
 
@@ -576,7 +556,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
     notifier_list_notify(&acpi_generic_error_notifiers, NULL);
 }
 
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+                            uint64_t physical_address)
 {
     /* Memory Error Section Type */
     const uint8_t guid[] =
@@ -602,7 +583,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     acpi_ghes_build_append_mem_cper(block, physical_address);
 
     /* Report the error */
-    ghes_record_cper_errors(block->data, block->len, source_id, &errp);
+    ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
 
     g_array_free(block, true);
 
@@ -614,7 +595,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     return 0;
 }
 
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
     AcpiGedState *acpi_ged_state;
     AcpiGhesState *ags;
@@ -623,11 +604,12 @@ bool acpi_ghes_present(void)
                                                        NULL));
 
     if (!acpi_ged_state) {
-        return false;
+        return NULL;
     }
     ags = &acpi_ged_state->ghes_state;
-    if (!ags->hw_error_le && !ags->hest_addr_le)
-        return false;
 
-    return true;
+    if (!ags->hw_error_le && !ags->hest_addr_le) {
+        return NULL;
+    }
+    return ags;
 }
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 219aa7ab4fe0..276f9dc076d9 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -99,15 +99,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
                      const char *oem_id, const char *oem_table_id);
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
-void ghes_record_cper_errors(const void *cper, size_t len,
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+                            uint64_t error_physical_addr);
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
                              uint16_t source_id, Error **errp);
 
 /**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer for ACPI ghes state
  *
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to ghes state if the system has an ACPI GHES table,
+ * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
+ * a memory error. Returns NULL otherwise.
  */
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
 #endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index da30bdbb2349..80ca7779797b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 {
     ram_addr_t ram_addr;
     hwaddr paddr;
+    AcpiGhesState *ags;
 
     assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 
-    if (acpi_ghes_present() && addr) {
+    ags = acpi_ghes_get_state();
+    if (ags && addr) {
         ram_addr = qemu_ram_addr_from_host(addr);
         if (ram_addr != RAM_ADDR_INVALID &&
             kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
@@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
              */
             if (code == BUS_MCEERR_AR) {
                 kvm_cpu_synchronize_state(c);
-                if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+                if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
+                                             paddr)) {
                     kvm_inject_arm_sea(c);
                 } else {
                     error_report("failed to record the error");
-- 
2.48.1



* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
  2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
  2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-26 14:16 ` Igor Mammedov
  2025-02-26 14:39   ` Mauro Carvalho Chehab
  2025-02-27  9:54 ` Igor Mammedov
  2 siblings, 1 reply; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:16 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Fri, 21 Feb 2025 15:35:09 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Now that the ghes preparation patches were merged, let's add support
> for error injection.
> 
> In this series, the first 6 patches change the math used to calculate offsets in the HEST
> table and the hardware_error firmware file, together with its migration code. Migration was tested
> with both the latest released QEMU and upstream, in both directions.
> 
> The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using this QAPI
> to inject ARM Processor Error records.

Please run ./scripts/checkpatch on patches before submitting them.
As it stands now, the series cannot be merged due to failing checkpatch.




* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
  2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
@ 2025-02-26 14:39   ` Mauro Carvalho Chehab
  2025-02-26 14:51     ` Igor Mammedov
  0 siblings, 1 reply; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-26 14:39 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Wed, 26 Feb 2025 15:16:56 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Fri, 21 Feb 2025 15:35:09 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> 
> > Now that the ghes preparation patches were merged, let's add support
> > for error injection.
> > 
> > In this series, the first 6 patches change the math used to calculate offsets in the HEST
> > table and the hardware_error firmware file, together with its migration code. Migration was tested
> > with both the latest released QEMU and upstream, in both directions.
> > 
> > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using this QAPI
> > to inject ARM Processor Error records.
> 
> Please run ./scripts/checkpatch on patches before submitting them.
> As it stands now, the series cannot be merged due to failing checkpatch.

Weird... checkpatch runs as a pre-commit hook, as recommended in the QEMU
documentation. It is actually a little harder to manage this way, as it
sometimes causes trouble with binary files.

Anyway, I'll run it by hand before sending the next version.



* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
  2025-02-26 14:39   ` Mauro Carvalho Chehab
@ 2025-02-26 14:51     ` Igor Mammedov
  2025-02-26 16:00       ` Igor Mammedov
  0 siblings, 1 reply; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:51 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Wed, 26 Feb 2025 15:39:13 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> On Wed, 26 Feb 2025 15:16:56 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
> 
> > On Fri, 21 Feb 2025 15:35:09 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >   
> > > Now that the ghes preparation patches were merged, let's add support
> > > for error injection.
> > > 
> > > In this series, the first 6 patches change the math used to calculate offsets in the HEST
> > > table and the hardware_error firmware file, together with its migration code. Migration was tested
> > > with both the latest released QEMU and upstream, in both directions.
> > > 
> > > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using this QAPI
> > > to inject ARM Processor Error records.
> > 
> > Please run ./scripts/checkpatch on patches before submitting them.
> > As it stands now, the series cannot be merged due to failing checkpatch.
> 
> Weird... checkpatch runs as a pre-commit hook, as recommended in the QEMU
> documentation. It is actually a little harder to manage this way, as it
> sometimes causes trouble with binary files.
> 
> Anyway, I'll run it by hand before sending the next version.

I've just applied v4 => format-patch => checkpatch.
Maybe I did something wrong (I don't see how), but it complains over here.


PS: do not respin until I've finished this review.
 



* Re: [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function
  2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-26 15:27   ` Igor Mammedov
  0 siblings, 0 replies; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Ani Sinha, Dongjiu Geng, Paolo Bonzini, Peter Maydell,
	kvm, linux-kernel

On Fri, 21 Feb 2025 15:35:15 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Instead of having a function to check if ACPI is enabled
> (acpi_ghes_present), change its logic to be more generic,
> returning a pointer to AcpiGhesState.
> 
> Such a change allows cleaning up the ghes GED state code, avoiding
> reading it multiple times, and simplifying the code.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by:  Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/acpi/ghes-stub.c    |  7 ++++---
>  hw/acpi/ghes.c         | 38 ++++++++++----------------------------
>  include/hw/acpi/ghes.h | 14 ++++++++------
>  target/arm/kvm.c       |  7 +++++--
>  4 files changed, 27 insertions(+), 39 deletions(-)
> 
> diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
> index 7cec1812dad9..40f660c246fe 100644
> --- a/hw/acpi/ghes-stub.c
> +++ b/hw/acpi/ghes-stub.c
> @@ -11,12 +11,13 @@
>  #include "qemu/osdep.h"
>  #include "hw/acpi/ghes.h"
>  
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> +                            uint64_t physical_address)
>  {
>      return -1;
>  }
>  
> -bool acpi_ghes_present(void)
> +AcpiGhesState *acpi_ghes_get_state(void)
>  {
> -    return false;
> +    return NULL;
>  }
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index f2d1cc7369f4..401789259f60 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -425,10 +425,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
>                                   uint64_t *cper_addr,
>                                   uint64_t *read_ack_register_addr)
>  {
> -    if (!ghes_addr) {
> -        return;
> -    }
> -
>      /*
>       * non-HEST version supports only one source, so no need to change
>       * the start offset based on the source ID. Also, we can't validate
> @@ -517,27 +513,16 @@ static void get_ghes_source_offsets(uint16_t source_id,
>  NotifierList acpi_generic_error_notifiers =
>      NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
>  
> -void ghes_record_cper_errors(const void *cper, size_t len,
> +void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>                               uint16_t source_id, Error **errp)
>  {
>      uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
> -    AcpiGedState *acpi_ged_state;
> -    AcpiGhesState *ags;
>  
>      if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
>          error_setg(errp, "GHES CPER record is too big: %zd", len);
>          return;
>      }
>  
> -    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> -                                                       NULL));
> -    if (!acpi_ged_state) {
> -        error_setg(errp, "Can't find ACPI_GED object");
> -        return;
> -    }
> -    ags = &acpi_ged_state->ghes_state;
> -
> -
>      if (!ags->use_hest_addr) {
>          get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
>                               &cper_addr, &read_ack_register_addr);
> @@ -546,11 +531,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
>                                  &cper_addr, &read_ack_register_addr, errp);
>      }
>  
> -    if (!cper_addr) {
> -        error_setg(errp, "can not find Generic Error Status Block");
> -        return;
> -    }
> -
>      cpu_physical_memory_read(read_ack_register_addr,
>                               &read_ack_register, sizeof(read_ack_register));
>  
> @@ -576,7 +556,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
>      notifier_list_notify(&acpi_generic_error_notifiers, NULL);
>  }
>  
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> +                            uint64_t physical_address)
>  {
>      /* Memory Error Section Type */
>      const uint8_t guid[] =
> @@ -602,7 +583,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
>      acpi_ghes_build_append_mem_cper(block, physical_address);
>  
>      /* Report the error */
> -    ghes_record_cper_errors(block->data, block->len, source_id, &errp);
> +    ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
>  
>      g_array_free(block, true);
>  
> @@ -614,7 +595,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
>      return 0;
>  }
>  
> -bool acpi_ghes_present(void)
> +AcpiGhesState *acpi_ghes_get_state(void)
>  {
>      AcpiGedState *acpi_ged_state;
>      AcpiGhesState *ags;
> @@ -623,11 +604,12 @@ bool acpi_ghes_present(void)
>                                                         NULL));
>  
>      if (!acpi_ged_state) {
> -        return false;
> +        return NULL;
>      }
>      ags = &acpi_ged_state->ghes_state;
> -    if (!ags->hw_error_le && !ags->hest_addr_le)
> -        return false;
>  
> -    return true;
> +    if (!ags->hw_error_le && !ags->hest_addr_le) {
> +        return NULL;
> +    }
> +    return ags;
>  }
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 219aa7ab4fe0..276f9dc076d9 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -99,15 +99,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>                       const char *oem_id, const char *oem_table_id);
>  void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
>                            GArray *hardware_errors);
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
> -void ghes_record_cper_errors(const void *cper, size_t len,
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> +                            uint64_t error_physical_addr);
> +void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>                               uint16_t source_id, Error **errp);
>  
>  /**
> - * acpi_ghes_present: Report whether ACPI GHES table is present
> + * acpi_ghes_get_state: Get a pointer for ACPI ghes state
>   *
> - * Returns: true if the system has an ACPI GHES table and it is
> - * safe to call acpi_ghes_memory_errors() to record a memory error.
> + * Returns: a pointer to ghes state if the system has an ACPI GHES table,
> + * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
      ^^^^^^^^^^^^^ can't link 'it' with anything, I'd drop this

> + * a memory error. Returns false, otherwise.
                              ^^^ NULL ??

>   */
> -bool acpi_ghes_present(void);
> +AcpiGhesState *acpi_ghes_get_state(void);
>  #endif
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index da30bdbb2349..80ca7779797b 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>  {
>      ram_addr_t ram_addr;
>      hwaddr paddr;
> +    AcpiGhesState *ags;
>  
>      assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
>  
> -    if (acpi_ghes_present() && addr) {
> +    ags = acpi_ghes_get_state();
> +    if (ags && addr) {
>          ram_addr = qemu_ram_addr_from_host(addr);
>          if (ram_addr != RAM_ADDR_INVALID &&
>              kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
> @@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>               */
>              if (code == BUS_MCEERR_AR) {
>                  kvm_cpu_synchronize_state(c);
> -                if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
> +                if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
> +                                             paddr)) {
>                      kvm_inject_arm_sea(c);
>                  } else {
>                      error_report("failed to record the error");
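
For readers following the thread, here is a minimal, self-contained sketch of
the calling pattern this patch moves to: look the state up once, treat NULL as
"GHES not usable", and pass the pointer down instead of re-resolving it in
each helper. Everything prefixed Mock/mock_ is hypothetical and only mirrors
the shape of the code quoted above; it is not QEMU code, and the two fields
are simply the ones the quoted acpi_ghes_get_state() tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AcpiGhesState: only the two tested fields. */
typedef struct MockGhesState {
    uint64_t hw_error_le;
    uint64_t hest_addr_le;
} MockGhesState;

/*
 * Mirrors the shape of acpi_ghes_get_state(): return the state when at
 * least one of the two addresses is set, NULL otherwise. The device
 * lookup is replaced by a plain pointer argument (an assumption made to
 * keep the sketch standalone).
 */
MockGhesState *mock_ghes_get_state(MockGhesState *candidate)
{
    if (!candidate) {
        return NULL;
    }
    if (!candidate->hw_error_le && !candidate->hest_addr_le) {
        return NULL;
    }
    return candidate;
}

/* The helper no longer looks the state up; the caller must pass it in. */
int mock_record_memory_error(MockGhesState *ags, uint64_t paddr)
{
    assert(ags != NULL);
    (void)paddr; /* a real implementation would build a CPER record here */
    return 0;
}

/* Caller side: one lookup, one NULL check, then reuse the pointer. */
bool mock_handle_sigbus(MockGhesState *device, uint64_t paddr)
{
    MockGhesState *ags = mock_ghes_get_state(device);

    if (!ags) {
        return false;
    }
    return mock_record_memory_error(ags, paddr) == 0;
}
```

With this shape, the NULL return replaces the old boolean result, which is
also why the "Returns false, otherwise" wording in the header comment needs
updating.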



* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
  2025-02-26 14:51     ` Igor Mammedov
@ 2025-02-26 16:00       ` Igor Mammedov
  0 siblings, 0 replies; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 16:00 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Wed, 26 Feb 2025 15:51:43 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Wed, 26 Feb 2025 15:39:13 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
[...]
> 
> PS: do not respin until I've finish this review.

finished




* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
  2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
  2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
  2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
@ 2025-02-27  9:54 ` Igor Mammedov
  2025-02-27 11:05   ` Mauro Carvalho Chehab
  2 siblings, 1 reply; 9+ messages in thread
From: Igor Mammedov @ 2025-02-27  9:54 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Fri, 21 Feb 2025 15:35:09 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Now that the ghes preparation patches were merged, let's add support
> for error injection.
> 
> In this series, the first 6 patches change the math used to calculate offsets in the HEST
> table and hardware_error firmware file, together with its migration code. Migration was tested
> with both latest QEMU released kernel and upstream, in both directions.
> 
> The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using that QAPI
> to inject ARM Processor Error records.
> 
> ---
> v4:
> - added an extra comment for AcpiGhesState structure;
> - patches reordered;
> - no functional changes, just code shift between the patches in this series.
> 
> v3:
> - addressed more nits;
> - hest_add_le now points to the beginning of HEST table;
> - removed HEST from tests/data/acpi;
> - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> 
> v2: 
> - address some nits;
> - improved ags cleanup patch and removed ags.present field;
> - added some missing le*_to_cpu() calls;
> - update date at copyright for new files to 2024-2025;
> - qmp command changed to: inject-ghes-v2-error and 'since' updated to 10.0;
> - added HEST and DSDT tables after the changes to make check target happy.
>   (two patches: first one whitelisting such tables; second one removing from
>    whitelist and updating/adding such tables to tests/data/acpi)
> 
> 
> 
> Mauro Carvalho Chehab (14):
>   acpi/ghes: prepare to change the way HEST offsets are calculated
>   acpi/ghes: add a firmware file with HEST address
>   acpi/ghes: Use HEST table offsets when preparing GHES records
>   acpi/ghes: don't hard-code the number of sources for HEST table
>   acpi/ghes: add a notifier to notify when error data is ready
>   acpi/ghes: create an ancillary acpi_ghes_get_state() function
>   acpi/generic_event_device: Update GHES migration to cover hest addr
>   acpi/generic_event_device: add logic to detect if HEST addr is
>     available
>   acpi/generic_event_device: add an APEI error device
>   tests/acpi: virt: allow acpi table changes for a new table: HEST
>   arm/virt: Wire up a GED error device for ACPI / GHES
>   tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
>   qapi/acpi-hest: add an interface to do generic CPER error injection
>   scripts/ghes_inject: add a script to generate GHES error inject
> 
>  MAINTAINERS                                   |  10 +
>  hw/acpi/Kconfig                               |   5 +
>  hw/acpi/aml-build.c                           |  10 +
>  hw/acpi/generic_event_device.c                |  43 ++
>  hw/acpi/ghes-stub.c                           |   7 +-
>  hw/acpi/ghes.c                                | 231 ++++--
>  hw/acpi/ghes_cper.c                           |  38 +
>  hw/acpi/ghes_cper_stub.c                      |  19 +
>  hw/acpi/meson.build                           |   2 +
>  hw/arm/virt-acpi-build.c                      |  37 +-
>  hw/arm/virt.c                                 |  19 +-
>  hw/core/machine.c                             |   2 +
>  include/hw/acpi/acpi_dev_interface.h          |   1 +
>  include/hw/acpi/aml-build.h                   |   2 +
>  include/hw/acpi/generic_event_device.h        |   1 +
>  include/hw/acpi/ghes.h                        |  54 +-
>  include/hw/arm/virt.h                         |   2 +
>  qapi/acpi-hest.json                           |  35 +
>  qapi/meson.build                              |   1 +
>  qapi/qapi-schema.json                         |   1 +
>  scripts/arm_processor_error.py                | 476 ++++++++++++
>  scripts/ghes_inject.py                        |  51 ++
>  scripts/qmp_helper.py                         | 702 ++++++++++++++++++
>  target/arm/kvm.c                              |   7 +-
>  tests/data/acpi/aarch64/virt/DSDT             | Bin 5196 -> 5240 bytes
>  .../data/acpi/aarch64/virt/DSDT.acpihmatvirt  | Bin 5282 -> 5326 bytes
>  tests/data/acpi/aarch64/virt/DSDT.memhp       | Bin 6557 -> 6601 bytes
>  tests/data/acpi/aarch64/virt/DSDT.pxb         | Bin 7679 -> 7723 bytes
>  tests/data/acpi/aarch64/virt/DSDT.topology    | Bin 5398 -> 5442 bytes
>  29 files changed, 1677 insertions(+), 79 deletions(-)
>  create mode 100644 hw/acpi/ghes_cper.c
>  create mode 100644 hw/acpi/ghes_cper_stub.c
>  create mode 100644 qapi/acpi-hest.json
>  create mode 100644 scripts/arm_processor_error.py
>  create mode 100755 scripts/ghes_inject.py
>  create mode 100755 scripts/qmp_helper.py
> 

Once you enable RAS in the tests in the first patches and fix up the minor issues,
please try to do patch-by-patch compile/bios-tables-test testing, to avoid an
unnecessary respin in case a table change crept in somewhere unnoticed.



* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
  2025-02-27  9:54 ` Igor Mammedov
@ 2025-02-27 11:05   ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:05 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
	Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
	Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
	linux-kernel

On Thu, 27 Feb 2025 10:54:54 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> 
> Once you enable RAS in the tests in the first patches and fix up the minor issues,
> please try to do patch-by-patch compile/bios-tables-test testing, to avoid an
> unnecessary respin in case a table change crept in somewhere unnoticed.

Just submitted v5.

I took some extra care to avoid bisect issues. checkpatch still
reported some warnings, but they appear to be false positives.

Thanks,
Mauro


end of thread, other threads:[~2025-02-27 11:05 UTC | newest]

Thread overview: 9+ messages
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
2025-02-26 15:27   ` Igor Mammedov
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
2025-02-26 14:39   ` Mauro Carvalho Chehab
2025-02-26 14:51     ` Igor Mammedov
2025-02-26 16:00       ` Igor Mammedov
2025-02-27  9:54 ` Igor Mammedov
2025-02-27 11:05   ` Mauro Carvalho Chehab
