* [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
@ 2025-02-21 14:35 Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Philippe Mathieu-Daudé, Ani Sinha,
Cleber Rosa, Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
Now that the ghes preparation patches were merged, let's add support
for error injection.
In this series, the first 6 patches change the math used to calculate offsets in the HEST
table and the hardware_error firmware file, together with its migration code. Migration was tested
with both the latest released QEMU and upstream, in both directions.
The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
to inject ARM Processor Error records.
---
v4:
- added an extra comment for AcpiGhesState structure;
- patches reordered;
- no functional changes, just code shift between the patches in this series.
v3:
- addressed more nits;
- hest_addr_le now points to the beginning of HEST table;
- removed HEST from tests/data/acpi;
- added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
v2:
- address some nits;
- improved ags cleanup patch and removed ags.present field;
- added some missing le*_to_cpu() calls;
- update date at copyright for new files to 2024-2025;
- qmp command changed to: inject-ghes-v2-error and since updated to 10.0;
- added HEST and DSDT tables after the changes to make check target happy.
(two patches: first one whitelisting such tables; second one removing from
whitelist and updating/adding such tables to tests/data/acpi)
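As a side note on the le*_to_cpu() item above: fields such as hw_error_le hold an address in little-endian byte order, so the value has to be converted before it is used as a host integer. The sketch below is only a portable illustration of that conversion, not QEMU's implementation; demo_le64_to_cpu() and demo_le64_check() are hypothetical names used here for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a le64_to_cpu()-style helper: interpret the
 * in-memory bytes of 'stored' as a little-endian 64-bit value. Works the
 * same way on little-endian and big-endian hosts.
 */
static uint64_t demo_le64_to_cpu(uint64_t stored)
{
    uint8_t bytes[sizeof(stored)];
    uint64_t value = 0;
    int i;

    memcpy(bytes, &stored, sizeof(bytes));
    /* Byte 0 is the least significant one, so walk from the top down. */
    for (i = (int)sizeof(bytes) - 1; i >= 0; i--) {
        value = (value << 8) | bytes[i];
    }
    return value;
}

/* Usage example: a guest-visible little-endian field read back on the host. */
static void demo_le64_check(void)
{
    const uint8_t raw[8] = { 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
    uint64_t field_le;

    memcpy(&field_le, raw, sizeof(field_le));
    assert(demo_le64_to_cpu(field_le) == 0x4000);
}
```

Skipping the conversion happens to work on a little-endian host, which is presumably why such calls are easy to miss.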
Mauro Carvalho Chehab (14):
acpi/ghes: prepare to change the way HEST offsets are calculated
acpi/ghes: add a firmware file with HEST address
acpi/ghes: Use HEST table offsets when preparing GHES records
acpi/ghes: don't hard-code the number of sources for HEST table
acpi/ghes: add a notifier to notify when error data is ready
acpi/ghes: create an ancillary acpi_ghes_get_state() function
acpi/generic_event_device: Update GHES migration to cover hest addr
acpi/generic_event_device: add logic to detect if HEST addr is
available
acpi/generic_event_device: add an APEI error device
tests/acpi: virt: allow acpi table changes for a new table: HEST
arm/virt: Wire up a GED error device for ACPI / GHES
tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
qapi/acpi-hest: add an interface to do generic CPER error injection
scripts/ghes_inject: add a script to generate GHES error inject
MAINTAINERS | 10 +
hw/acpi/Kconfig | 5 +
hw/acpi/aml-build.c | 10 +
hw/acpi/generic_event_device.c | 43 ++
hw/acpi/ghes-stub.c | 7 +-
hw/acpi/ghes.c | 231 ++++--
hw/acpi/ghes_cper.c | 38 +
hw/acpi/ghes_cper_stub.c | 19 +
hw/acpi/meson.build | 2 +
hw/arm/virt-acpi-build.c | 37 +-
hw/arm/virt.c | 19 +-
hw/core/machine.c | 2 +
include/hw/acpi/acpi_dev_interface.h | 1 +
include/hw/acpi/aml-build.h | 2 +
include/hw/acpi/generic_event_device.h | 1 +
include/hw/acpi/ghes.h | 54 +-
include/hw/arm/virt.h | 2 +
qapi/acpi-hest.json | 35 +
qapi/meson.build | 1 +
qapi/qapi-schema.json | 1 +
scripts/arm_processor_error.py | 476 ++++++++++++
scripts/ghes_inject.py | 51 ++
scripts/qmp_helper.py | 702 ++++++++++++++++++
target/arm/kvm.c | 7 +-
tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
.../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
29 files changed, 1677 insertions(+), 79 deletions(-)
create mode 100644 hw/acpi/ghes_cper.c
create mode 100644 hw/acpi/ghes_cper_stub.c
create mode 100644 qapi/acpi-hest.json
create mode 100644 scripts/arm_processor_error.py
create mode 100755 scripts/ghes_inject.py
create mode 100755 scripts/qmp_helper.py
--
2.48.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
@ 2025-02-21 14:35 ` Mauro Carvalho Chehab
2025-02-26 15:27 ` Igor Mammedov
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
2025-02-27 9:54 ` Igor Mammedov
2 siblings, 1 reply; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-21 14:35 UTC (permalink / raw)
To: Igor Mammedov, Michael S . Tsirkin
Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Paolo Bonzini,
Peter Maydell, kvm, linux-kernel
Instead of having a function to check if ACPI is enabled
(acpi_ghes_present), change its logic to be more generic,
returning a pointer to AcpiGhesState.
Such a change allows cleaning up the ghes GED state code, avoiding
reading it multiple times, and simplifying the code.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/ghes-stub.c | 7 ++++---
hw/acpi/ghes.c | 38 ++++++++++----------------------------
include/hw/acpi/ghes.h | 14 ++++++++------
target/arm/kvm.c | 7 +++++--
4 files changed, 27 insertions(+), 39 deletions(-)
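For readers skimming the diff below, the calling pattern it introduces can be sketched in isolation: the accessor returns a state pointer (or NULL), and the caller passes that pointer on instead of having each callee look the device up again. This is a simplified stand-alone model, not the QEMU code: DemoGhesState, demo_device_state and the demo_* functions are hypothetical stand-ins, and only the two address fields are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AcpiGhesState, reduced to two fields. */
typedef struct DemoGhesState {
    uint64_t hw_error_le;
    uint64_t hest_addr_le;
} DemoGhesState;

/* Hypothetical stand-in for the device lookup; NULL means "no device". */
static DemoGhesState *demo_device_state;

/* Return the state only when it is usable, NULL otherwise. */
static DemoGhesState *demo_ghes_get_state(void)
{
    DemoGhesState *ags = demo_device_state;

    if (!ags) {
        return NULL;
    }
    if (!ags->hw_error_le && !ags->hest_addr_le) {
        return NULL;
    }
    return ags;
}

/* The callee receives the state and does no lookup. Returns 0 on success. */
static int demo_memory_error(DemoGhesState *ags, uint16_t source_id,
                             uint64_t paddr)
{
    assert(ags != NULL);
    (void)source_id;
    (void)paddr;
    return 0;
}

/* One lookup: the pointer is both the presence test and the argument. */
static int demo_handle_fault(uint64_t paddr)
{
    DemoGhesState *ags = demo_ghes_get_state();

    if (!ags) {
        return -1;
    }
    return demo_memory_error(ags, 0, paddr);
}
```

With this shape a NULL check in the caller is the only presence test, which matches the removal of the later lookups and zero-address checks in the hunks that follow.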
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..40f660c246fe 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -11,12 +11,13 @@
#include "qemu/osdep.h"
#include "hw/acpi/ghes.h"
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address)
{
return -1;
}
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
{
- return false;
+ return NULL;
}
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index f2d1cc7369f4..401789259f60 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -425,10 +425,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
uint64_t *cper_addr,
uint64_t *read_ack_register_addr)
{
- if (!ghes_addr) {
- return;
- }
-
/*
* non-HEST version supports only one source, so no need to change
* the start offset based on the source ID. Also, we can't validate
@@ -517,27 +513,16 @@ static void get_ghes_source_offsets(uint16_t source_id,
NotifierList acpi_generic_error_notifiers =
NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
-void ghes_record_cper_errors(const void *cper, size_t len,
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
- AcpiGedState *acpi_ged_state;
- AcpiGhesState *ags;
if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
error_setg(errp, "GHES CPER record is too big: %zd", len);
return;
}
- acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
- NULL));
- if (!acpi_ged_state) {
- error_setg(errp, "Can't find ACPI_GED object");
- return;
- }
- ags = &acpi_ged_state->ghes_state;
-
-
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
@@ -546,11 +531,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
&cper_addr, &read_ack_register_addr, errp);
}
- if (!cper_addr) {
- error_setg(errp, "can not find Generic Error Status Block");
- return;
- }
-
cpu_physical_memory_read(read_ack_register_addr,
&read_ack_register, sizeof(read_ack_register));
@@ -576,7 +556,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
notifier_list_notify(&acpi_generic_error_notifiers, NULL);
}
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address)
{
/* Memory Error Section Type */
const uint8_t guid[] =
@@ -602,7 +583,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
acpi_ghes_build_append_mem_cper(block, physical_address);
/* Report the error */
- ghes_record_cper_errors(block->data, block->len, source_id, &errp);
+ ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
g_array_free(block, true);
@@ -614,7 +595,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
return 0;
}
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
{
AcpiGedState *acpi_ged_state;
AcpiGhesState *ags;
@@ -623,11 +604,12 @@ bool acpi_ghes_present(void)
NULL));
if (!acpi_ged_state) {
- return false;
+ return NULL;
}
ags = &acpi_ged_state->ghes_state;
- if (!ags->hw_error_le && !ags->hest_addr_le)
- return false;
- return true;
+ if (!ags->hw_error_le && !ags->hest_addr_le) {
+ return NULL;
+ }
+ return ags;
}
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 219aa7ab4fe0..276f9dc076d9 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -99,15 +99,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
GArray *hardware_errors);
-int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
-void ghes_record_cper_errors(const void *cper, size_t len,
+int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t error_physical_addr);
+void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp);
/**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer for ACPI ghes state
*
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to ghes state if the system has an ACPI GHES table,
+ * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
+ * a memory error. Returns false, otherwise.
*/
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
#endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index da30bdbb2349..80ca7779797b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
{
ram_addr_t ram_addr;
hwaddr paddr;
+ AcpiGhesState *ags;
assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
- if (acpi_ghes_present() && addr) {
+ ags = acpi_ghes_get_state();
+ if (ags && addr) {
ram_addr = qemu_ram_addr_from_host(addr);
if (ram_addr != RAM_ADDR_INVALID &&
kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
@@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
*/
if (code == BUS_MCEERR_AR) {
kvm_cpu_synchronize_state(c);
- if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+ if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
+ paddr)) {
kvm_inject_arm_sea(c);
} else {
error_report("failed to record the error");
--
2.48.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-26 14:16 ` Igor Mammedov
2025-02-26 14:39 ` Mauro Carvalho Chehab
2025-02-27 9:54 ` Igor Mammedov
2 siblings, 1 reply; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:16 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Fri, 21 Feb 2025 15:35:09 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Now that the ghes preparation patches were merged, let's add support
> for error injection.
>
> In this series, the first 6 patches change the math used to calculate offsets in the HEST
> table and the hardware_error firmware file, together with its migration code. Migration was tested
> with both the latest released QEMU and upstream, in both directions.
>
> The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> to inject ARM Processor Error records.
Please run ./scripts/checkpatch.pl on the patches before submitting them.
As it stands now, the series cannot be merged because checkpatch fails.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
@ 2025-02-26 14:39 ` Mauro Carvalho Chehab
2025-02-26 14:51 ` Igor Mammedov
0 siblings, 1 reply; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-26 14:39 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Wed, 26 Feb 2025 15:16:56 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Fri, 21 Feb 2025 15:35:09 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Now that the ghes preparation patches were merged, let's add support
> > for error injection.
> >
> > In this series, the first 6 patches change the math used to calculate offsets in the HEST
> > table and the hardware_error firmware file, together with its migration code. Migration was tested
> > with both the latest released QEMU and upstream, in both directions.
> >
> > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> > to inject ARM Processor Error records.
>
> Please run ./scripts/checkpatch.pl on the patches before submitting them.
> As it stands now, the series cannot be merged because checkpatch fails.
Weird... checkpatch is in my pre-commit hook, as recommended in the QEMU
documentation. It is actually a little harder to manage this way, as it
sometimes causes trouble with binary files.
Anyway, I'll run it by hand before sending the next version.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-26 14:39 ` Mauro Carvalho Chehab
@ 2025-02-26 14:51 ` Igor Mammedov
2025-02-26 16:00 ` Igor Mammedov
0 siblings, 1 reply; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 14:51 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Wed, 26 Feb 2025 15:39:13 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> On Wed, 26 Feb 2025 15:16:56 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
>
> > On Fri, 21 Feb 2025 15:35:09 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
> > > Now that the ghes preparation patches were merged, let's add support
> > > for error injection.
> > >
> > > In this series, the first 6 patches change the math used to calculate offsets in the HEST
> > > table and the hardware_error firmware file, together with its migration code. Migration was tested
> > > with both the latest released QEMU and upstream, in both directions.
> > >
> > > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> > > to inject ARM Processor Error records.
> >
> > Please run ./scripts/checkpatch.pl on the patches before submitting them.
> > As it stands now, the series cannot be merged because checkpatch fails.
>
> Weird... checkpatch is in my pre-commit hook, as recommended in the QEMU
> documentation. It is actually a little harder to manage this way, as it
> sometimes causes trouble with binary files.
>
> Anyway, I'll run it by hand before sending the next version.
I've just applied v4 => format-patch => checkpatch.
Maybe I did something wrong (I don't see how), but it complains over here.
PS: do not respin until I've finished this review.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
@ 2025-02-26 15:27 ` Igor Mammedov
0 siblings, 0 replies; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 15:27 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Ani Sinha, Dongjiu Geng, Paolo Bonzini, Peter Maydell,
kvm, linux-kernel
On Fri, 21 Feb 2025 15:35:15 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Instead of having a function to check if ACPI is enabled
> (acpi_ghes_present), change its logic to be more generic,
> returning a pointer to AcpiGhesState.
>
> Such a change allows cleaning up the ghes GED state code, avoiding
> reading it multiple times, and simplifying the code.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> ---
> hw/acpi/ghes-stub.c | 7 ++++---
> hw/acpi/ghes.c | 38 ++++++++++----------------------------
> include/hw/acpi/ghes.h | 14 ++++++++------
> target/arm/kvm.c | 7 +++++--
> 4 files changed, 27 insertions(+), 39 deletions(-)
>
> diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
> index 7cec1812dad9..40f660c246fe 100644
> --- a/hw/acpi/ghes-stub.c
> +++ b/hw/acpi/ghes-stub.c
> @@ -11,12 +11,13 @@
> #include "qemu/osdep.h"
> #include "hw/acpi/ghes.h"
>
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> + uint64_t physical_address)
> {
> return -1;
> }
>
> -bool acpi_ghes_present(void)
> +AcpiGhesState *acpi_ghes_get_state(void)
> {
> - return false;
> + return NULL;
> }
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index f2d1cc7369f4..401789259f60 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -425,10 +425,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
> uint64_t *cper_addr,
> uint64_t *read_ack_register_addr)
> {
> - if (!ghes_addr) {
> - return;
> - }
> -
> /*
> * non-HEST version supports only one source, so no need to change
> * the start offset based on the source ID. Also, we can't validate
> @@ -517,27 +513,16 @@ static void get_ghes_source_offsets(uint16_t source_id,
> NotifierList acpi_generic_error_notifiers =
> NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
>
> -void ghes_record_cper_errors(const void *cper, size_t len,
> +void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> uint16_t source_id, Error **errp)
> {
> uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
> - AcpiGedState *acpi_ged_state;
> - AcpiGhesState *ags;
>
> if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
> error_setg(errp, "GHES CPER record is too big: %zd", len);
> return;
> }
>
> - acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
> - NULL));
> - if (!acpi_ged_state) {
> - error_setg(errp, "Can't find ACPI_GED object");
> - return;
> - }
> - ags = &acpi_ged_state->ghes_state;
> -
> -
> if (!ags->use_hest_addr) {
> get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
> &cper_addr, &read_ack_register_addr);
> @@ -546,11 +531,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> &cper_addr, &read_ack_register_addr, errp);
> }
>
> - if (!cper_addr) {
> - error_setg(errp, "can not find Generic Error Status Block");
> - return;
> - }
> -
> cpu_physical_memory_read(read_ack_register_addr,
> &read_ack_register, sizeof(read_ack_register));
>
> @@ -576,7 +556,8 @@ void ghes_record_cper_errors(const void *cper, size_t len,
> notifier_list_notify(&acpi_generic_error_notifiers, NULL);
> }
>
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> + uint64_t physical_address)
> {
> /* Memory Error Section Type */
> const uint8_t guid[] =
> @@ -602,7 +583,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> acpi_ghes_build_append_mem_cper(block, physical_address);
>
> /* Report the error */
> - ghes_record_cper_errors(block->data, block->len, source_id, &errp);
> + ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
>
> g_array_free(block, true);
>
> @@ -614,7 +595,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
> return 0;
> }
>
> -bool acpi_ghes_present(void)
> +AcpiGhesState *acpi_ghes_get_state(void)
> {
> AcpiGedState *acpi_ged_state;
> AcpiGhesState *ags;
> @@ -623,11 +604,12 @@ bool acpi_ghes_present(void)
> NULL));
>
> if (!acpi_ged_state) {
> - return false;
> + return NULL;
> }
> ags = &acpi_ged_state->ghes_state;
> - if (!ags->hw_error_le && !ags->hest_addr_le)
> - return false;
>
> - return true;
> + if (!ags->hw_error_le && !ags->hest_addr_le) {
> + return NULL;
> + }
> + return ags;
> }
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index 219aa7ab4fe0..276f9dc076d9 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -99,15 +99,17 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> const char *oem_id, const char *oem_table_id);
> void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
> GArray *hardware_errors);
> -int acpi_ghes_memory_errors(uint16_t source_id, uint64_t error_physical_addr);
> -void ghes_record_cper_errors(const void *cper, size_t len,
> +int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
> + uint64_t error_physical_addr);
> +void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
> uint16_t source_id, Error **errp);
>
> /**
> - * acpi_ghes_present: Report whether ACPI GHES table is present
> + * acpi_ghes_get_state: Get a pointer for ACPI ghes state
> *
> - * Returns: true if the system has an ACPI GHES table and it is
> - * safe to call acpi_ghes_memory_errors() to record a memory error.
> + * Returns: a pointer to ghes state if the system has an ACPI GHES table,
> + * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
^^^^^^^^^^^^^ "it is enabled": I can't tell what 'it' refers to here, so I'd drop this part
> + * a memory error. Returns false, otherwise.
^^^^^ should this be NULL? The function now returns a pointer, not a bool.
> */
> -bool acpi_ghes_present(void);
> +AcpiGhesState *acpi_ghes_get_state(void);
> #endif
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index da30bdbb2349..80ca7779797b 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -2366,10 +2366,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
> {
> ram_addr_t ram_addr;
> hwaddr paddr;
> + AcpiGhesState *ags;
>
> assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
>
> - if (acpi_ghes_present() && addr) {
> + ags = acpi_ghes_get_state();
> + if (ags && addr) {
> ram_addr = qemu_ram_addr_from_host(addr);
> if (ram_addr != RAM_ADDR_INVALID &&
> kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
> @@ -2387,7 +2389,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
> */
> if (code == BUS_MCEERR_AR) {
> kvm_cpu_synchronize_state(c);
> - if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
> + if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SEA,
> + paddr)) {
> kvm_inject_arm_sea(c);
> } else {
> error_report("failed to record the error");
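To make the NULL-versus-false remark concrete, here is a minimal standalone sketch of the "look the state up once, then pass it down" pattern this patch introduces. It is not QEMU code: GedStub, the *_sketch function names and SRC_ID_SEA_SKETCH are hypothetical stand-ins, the struct only models the two fields the patch tests, and the real lookup goes through object_resolve_path_type(). The doc comment shows one possible wording that returns NULL and drops the unclear clause.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for AcpiGhesState: only the two fields that the
 * quoted patch checks are modelled (assumption, not the real layout). */
typedef struct AcpiGhesState {
    uint64_t hw_error_le;
    uint64_t hest_addr_le;
} AcpiGhesState;

/* Hypothetical stand-in for the GED device that embeds the GHES state. */
typedef struct GedStub {
    AcpiGhesState ghes_state;
} GedStub;

/* Hypothetical source id, standing in for ACPI_HEST_SRC_ID_SEA. */
enum { SRC_ID_SEA_SKETCH = 0 };

/**
 * ghes_get_state_sketch: Get a pointer to the GHES state
 *
 * Returns: a pointer to the GHES state if the device exists and at
 * least one of the firmware-provided addresses is set, so that it is
 * safe to record a memory error. Returns NULL otherwise.
 */
static AcpiGhesState *ghes_get_state_sketch(GedStub *ged)
{
    AcpiGhesState *ags;

    if (!ged) {
        return NULL;
    }
    ags = &ged->ghes_state;
    if (!ags->hw_error_le && !ags->hest_addr_le) {
        return NULL;
    }
    return ags;
}

/* Stand-in for acpi_ghes_memory_errors(): 0 on success, -1 on failure.
 * The real function builds and records a CPER block; this one only
 * checks that a state pointer was handed in. */
static int record_memory_error_sketch(AcpiGhesState *ags, uint16_t source_id,
                                      uint64_t physical_address)
{
    (void)source_id;
    (void)physical_address;
    return ags ? 0 : -1;
}

/* Caller shape modelled on the target/arm/kvm.c hunk: one lookup, and
 * the resulting pointer is both the presence check and the argument. */
static int handle_sigbus_sketch(GedStub *ged, uint64_t paddr)
{
    AcpiGhesState *ags = ghes_get_state_sketch(ged);

    if (!ags) {
        return -1;
    }
    return record_memory_error_sketch(ags, SRC_ID_SEA_SKETCH, paddr);
}
```

With this shape the callee no longer needs to resolve the device a second time, which is what allows the lookup to be removed from ghes_record_cper_errors() in the hunk quoted earlier.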
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-26 14:51 ` Igor Mammedov
@ 2025-02-26 16:00 ` Igor Mammedov
0 siblings, 0 replies; 9+ messages in thread
From: Igor Mammedov @ 2025-02-26 16:00 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Wed, 26 Feb 2025 15:51:43 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Wed, 26 Feb 2025 15:39:13 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
[...]
>
> PS: do not respin until I've finish this review.
finished
>
> > >
> > > >
> > > > ---
> > > > v4:
> > > > - added an extra comment for AcpiGhesState structure;
> > > > - patches reordered;
> > > > - no functional changes, just code shift between the patches in this series.
> > > >
> > > > v3:
> > > > - addressed more nits;
> > > > - hest_add_le now points to the beginning of HEST table;
> > > > - removed HEST from tests/data/acpi;
> > > > - added an extra patch to not use fw_cfg with virt-10.0 for hw_error_le
> > > >
> > > > v2:
> > > > - address some nits;
> > > > - improved ags cleanup patch and removed ags.present field;
> > > > - added some missing le*_to_cpu() calls;
> > > > - update date at copyright for new files to 2024-2025;
> > > > - qmp command changed to: inject-ghes-v2-error ans since updated to 10.0;
> > > > - added HEST and DSDT tables after the changes to make check target happy.
> > > > (two patches: first one whitelisting such tables; second one removing from
> > > > whitelist and updating/adding such tables to tests/data/acpi)
> > > >
> > > >
> > > >
> > > > Mauro Carvalho Chehab (14):
> > > > acpi/ghes: prepare to change the way HEST offsets are calculated
> > > > acpi/ghes: add a firmware file with HEST address
> > > > acpi/ghes: Use HEST table offsets when preparing GHES records
> > > > acpi/ghes: don't hard-code the number of sources for HEST table
> > > > acpi/ghes: add a notifier to notify when error data is ready
> > > > acpi/ghes: create an ancillary acpi_ghes_get_state() function
> > > > acpi/generic_event_device: Update GHES migration to cover hest addr
> > > > acpi/generic_event_device: add logic to detect if HEST addr is
> > > > available
> > > > acpi/generic_event_device: add an APEI error device
> > > > tests/acpi: virt: allow acpi table changes for a new table: HEST
> > > > arm/virt: Wire up a GED error device for ACPI / GHES
> > > > tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT
> > > > qapi/acpi-hest: add an interface to do generic CPER error injection
> > > > scripts/ghes_inject: add a script to generate GHES error inject
> > > >
> > > > MAINTAINERS | 10 +
> > > > hw/acpi/Kconfig | 5 +
> > > > hw/acpi/aml-build.c | 10 +
> > > > hw/acpi/generic_event_device.c | 43 ++
> > > > hw/acpi/ghes-stub.c | 7 +-
> > > > hw/acpi/ghes.c | 231 ++++--
> > > > hw/acpi/ghes_cper.c | 38 +
> > > > hw/acpi/ghes_cper_stub.c | 19 +
> > > > hw/acpi/meson.build | 2 +
> > > > hw/arm/virt-acpi-build.c | 37 +-
> > > > hw/arm/virt.c | 19 +-
> > > > hw/core/machine.c | 2 +
> > > > include/hw/acpi/acpi_dev_interface.h | 1 +
> > > > include/hw/acpi/aml-build.h | 2 +
> > > > include/hw/acpi/generic_event_device.h | 1 +
> > > > include/hw/acpi/ghes.h | 54 +-
> > > > include/hw/arm/virt.h | 2 +
> > > > qapi/acpi-hest.json | 35 +
> > > > qapi/meson.build | 1 +
> > > > qapi/qapi-schema.json | 1 +
> > > > scripts/arm_processor_error.py | 476 ++++++++++++
> > > > scripts/ghes_inject.py | 51 ++
> > > > scripts/qmp_helper.py | 702 ++++++++++++++++++
> > > > target/arm/kvm.c | 7 +-
> > > > tests/data/acpi/aarch64/virt/DSDT | Bin 5196 -> 5240 bytes
> > > > .../data/acpi/aarch64/virt/DSDT.acpihmatvirt | Bin 5282 -> 5326 bytes
> > > > tests/data/acpi/aarch64/virt/DSDT.memhp | Bin 6557 -> 6601 bytes
> > > > tests/data/acpi/aarch64/virt/DSDT.pxb | Bin 7679 -> 7723 bytes
> > > > tests/data/acpi/aarch64/virt/DSDT.topology | Bin 5398 -> 5442 bytes
> > > > 29 files changed, 1677 insertions(+), 79 deletions(-)
> > > > create mode 100644 hw/acpi/ghes_cper.c
> > > > create mode 100644 hw/acpi/ghes_cper_stub.c
> > > > create mode 100644 qapi/acpi-hest.json
> > > > create mode 100644 scripts/arm_processor_error.py
> > > > create mode 100755 scripts/ghes_inject.py
> > > > create mode 100755 scripts/qmp_helper.py
> > > >
> > >
> >
>
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
@ 2025-02-27 9:54 ` Igor Mammedov
2025-02-27 11:05 ` Mauro Carvalho Chehab
2 siblings, 1 reply; 9+ messages in thread
From: Igor Mammedov @ 2025-02-27 9:54 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Fri, 21 Feb 2025 15:35:09 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> [...]
>
Once you enable RAS in the tests in the first patches and fix up the minor issues,
please do patch-by-patch compile and bios-tables-test testing, to avoid an
unnecessary respin in case a table change crept in somewhere unnoticed.
* Re: [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject
2025-02-27 9:54 ` Igor Mammedov
@ 2025-02-27 11:05 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-27 11:05 UTC (permalink / raw)
To: Igor Mammedov
Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
qemu-devel, Philippe Mathieu-Daudé, Ani Sinha, Cleber Rosa,
Dongjiu Geng, Eduardo Habkost, Eric Blake, John Snow,
Marcel Apfelbaum, Markus Armbruster, Michael Roth, Paolo Bonzini,
Peter Maydell, Shannon Zhao, Yanan Wang, Zhao Liu, kvm,
linux-kernel
On Thu, 27 Feb 2025 10:54:54 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Fri, 21 Feb 2025 15:35:09 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > [...]
>
> Once you enable RAS in the tests in the first patches and fix up the minor issues,
> please do patch-by-patch compile and bios-tables-test testing, to avoid an
> unnecessary respin in case a table change crept in somewhere unnoticed.
Just submitted v5.
I took some extra care to avoid bisect issues. checkpatch still
reported some warnings, but they seem to be false positives.
Thanks,
Mauro
end of thread, other threads:[~2025-02-27 11:05 UTC | newest]
Thread overview: 9+ messages
2025-02-21 14:35 [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-02-21 14:35 ` [PATCH v4 06/14] acpi/ghes: create an ancillary acpi_ghes_get_state() function Mauro Carvalho Chehab
2025-02-26 15:27 ` Igor Mammedov
2025-02-26 14:16 ` [PATCH v4 00/14] Change ghes to use HEST-based offsets and add support for error inject Igor Mammedov
2025-02-26 14:39 ` Mauro Carvalho Chehab
2025-02-26 14:51 ` Igor Mammedov
2025-02-26 16:00 ` Igor Mammedov
2025-02-27 9:54 ` Igor Mammedov
2025-02-27 11:05 ` Mauro Carvalho Chehab