* [PATCH v2 1/5] acpi/ghes: Automate data block cleanup in acpi_ghes_memory_errors()
2025-12-01 14:17 [PATCH v2 0/5] acpi/ghes: Error object handling improvement Gavin Shan
@ 2025-12-01 14:17 ` Gavin Shan
2025-12-01 14:18 ` [PATCH v2 2/5] acpi/ghes: Abort in acpi_ghes_memory_errors() if necessary Gavin Shan
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Gavin Shan @ 2025-12-01 14:17 UTC (permalink / raw)
To: qemu-arm
Cc: qemu-devel, mst, jonathan.cameron, mchehab+huawei, imammedo,
armbru, anisinha, gengdongjiu1, peter.maydell, pbonzini,
shan.gavin
Use g_autoptr() to automate data block cleanup in the function so
that the block no longer has to be freed by hand.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/ghes.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
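For readers unfamiliar with g_autoptr(): it builds on the GCC/Clang
cleanup attribute. The sketch below illustrates that mechanism without
GLib. AUTO_BUFFER, free_buffer(), use_buffer() and cleanup_calls are
hypothetical names made up for this note; they are not part of GLib or
QEMU.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter so that runs of the cleanup handler are observable. */
static int cleanup_calls;

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
static void free_buffer(char **p)
{
    assert(p != NULL);
    free(*p);            /* free(NULL) is a no-op, so a failed calloc() is fine */
    cleanup_calls++;
}

/* Hypothetical stand-in for g_autoptr(); relies on a GCC/Clang extension. */
#define AUTO_BUFFER __attribute__((cleanup(free_buffer))) char *

/* Returns -1 if allocation fails, 0 otherwise; neither path calls free(). */
static int use_buffer(size_t len)
{
    AUTO_BUFFER buf = calloc(len, 1);

    if (!buf) {
        return -1;       /* free_buffer() still runs on this early return */
    }
    buf[0] = 'x';
    return 0;            /* free_buffer() runs here as well */
}
```

The handler runs whenever buf goes out of scope, which is why the
explicit g_array_free() call can be dropped from the function.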
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 06555905ce..6366c74248 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -565,9 +565,7 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
0xED, 0x7C, 0x83, 0xB1);
Error *errp = NULL;
int data_length;
- GArray *block;
-
- block = g_array_new(false, true /* clear */, 1);
+ g_autoptr(GArray) block = g_array_new(false, true /* clear */, 1);
data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
/*
@@ -585,8 +583,6 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
/* Report the error */
ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
- g_array_free(block, true);
-
if (errp) {
error_report_err(errp);
return -1;
--
2.51.1
* [PATCH v2 2/5] acpi/ghes: Abort in acpi_ghes_memory_errors() if necessary
From: Gavin Shan @ 2025-12-01 14:18 UTC (permalink / raw)
To: qemu-arm
Cc: qemu-devel, mst, jonathan.cameron, mchehab+huawei, imammedo,
armbru, anisinha, gengdongjiu1, peter.maydell, pbonzini,
shan.gavin
The stub acpi_ghes_memory_errors() in hw/acpi/ghes-stub.c should never
be called. Use g_assert_not_reached() to make that explicit.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/ghes-stub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 40f660c246..b54f1b093c 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -14,7 +14,7 @@
int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
uint64_t physical_address)
{
- return -1;
+ g_assert_not_reached();
}
AcpiGhesState *acpi_ghes_get_state(void)
--
2.51.1
* [PATCH v2 3/5] target/arm/kvm: Exit on error from acpi_ghes_memory_errors()
From: Gavin Shan @ 2025-12-01 14:18 UTC (permalink / raw)
To: qemu-arm
Cc: qemu-devel, mst, jonathan.cameron, mchehab+huawei, imammedo,
armbru, anisinha, gengdongjiu1, peter.maydell, pbonzini,
shan.gavin
A core dump makes no sense here because errors from
acpi_ghes_memory_errors() are not caused by programming bugs.
Exit instead of aborting when the function returns an error, and drop
the redundant error message.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
target/arm/kvm.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 0d57081e69..acda0b3fb4 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2460,8 +2460,7 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
paddr)) {
kvm_inject_arm_sea(c);
} else {
- error_report("failed to record the error");
- abort();
+ exit(1);
}
}
return;
--
2.51.1
* [PATCH v2 4/5] acpi/ghes: Bail early on error from get_ghes_source_offsets()
From: Gavin Shan @ 2025-12-01 14:18 UTC (permalink / raw)
To: qemu-arm
Cc: qemu-devel, mst, jonathan.cameron, mchehab+huawei, imammedo,
armbru, anisinha, gengdongjiu1, peter.maydell, pbonzini,
shan.gavin
In ghes_record_cper_errors(), get_ghes_source_offsets() can return
an error set by error_setg(). If the caller does not bail out on this
error, it can reach a second error_setg() because of the unexpected
value read from the read acknowledgement register. That second
error_setg() can trigger assert(*errp == NULL) in its callee
error_setv(), which isn't expected.
Bail early in ghes_record_cper_errors() when get_ghes_source_offsets()
returns an error to avoid the unexpected behavior.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/ghes.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
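The failure mode described above can be modelled in a few lines of
plain C. This is a simplified sketch, not QEMU's error API: the Error
struct, set_error(), find_source() and record_error() are hypothetical
stand-ins for Error, error_setg(), get_ghes_source_offsets() and
ghes_record_cper_errors().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for QEMU's Error object. */
typedef struct Error {
    const char *msg;
} Error;

static Error error_slot;    /* one static slot instead of heap allocation */

/* Models error_setg(): setting an error on top of another one asserts. */
static void set_error(Error **errp, const char *msg)
{
    if (!errp) {
        return;             /* caller is not interested in the error */
    }
    assert(*errp == NULL);
    error_slot.msg = msg;
    *errp = &error_slot;
}

/* First step: returns false and sets an error for an unknown source. */
static bool find_source(int source_id, Error **errp)
{
    if (source_id != 0) {
        set_error(errp, "source not found");
        return false;
    }
    return true;
}

/* Caller: bails out before the second set_error() can be reached. */
static void record_error(int source_id, bool acked, Error **errp)
{
    if (!find_source(source_id, errp)) {
        return;             /* without this, the check below could assert */
    }
    if (!acked) {
        set_error(errp, "previous error not acknowledged");
    }
}
```

With the early return, a failing first step leaves exactly one error in
*errp, and the assertion in set_error() cannot fire for that path.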
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 6366c74248..c35883dfa9 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -443,7 +443,7 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
*read_ack_register_addr = ghes_addr + sizeof(uint64_t);
}
-static void get_ghes_source_offsets(uint16_t source_id,
+static bool get_ghes_source_offsets(uint16_t source_id,
uint64_t hest_addr,
uint64_t *cper_addr,
uint64_t *read_ack_start_addr,
@@ -474,7 +474,7 @@ static void get_ghes_source_offsets(uint16_t source_id,
/* For now, we only know the size of GHESv2 table */
if (type != ACPI_GHES_SOURCE_GENERIC_ERROR_V2) {
error_setg(errp, "HEST: type %d not supported.", type);
- return;
+ return false;
}
/* Compare CPER source ID at the GHESv2 structure */
@@ -488,7 +488,7 @@ static void get_ghes_source_offsets(uint16_t source_id,
}
if (i == num_sources) {
error_setg(errp, "HEST: Source %d not found.", source_id);
- return;
+ return false;
}
/* Navigate through table address pointers */
@@ -508,6 +508,8 @@ static void get_ghes_source_offsets(uint16_t source_id,
cpu_physical_memory_read(hest_read_ack_addr, read_ack_start_addr,
sizeof(*read_ack_start_addr));
*read_ack_start_addr = le64_to_cpu(*read_ack_start_addr);
+
+ return true;
}
NotifierList acpi_generic_error_notifiers =
@@ -526,9 +528,10 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
if (!ags->use_hest_addr) {
get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
&cper_addr, &read_ack_register_addr);
- } else {
- get_ghes_source_offsets(source_id, le64_to_cpu(ags->hest_addr_le),
- &cper_addr, &read_ack_register_addr, errp);
+ } else if (!get_ghes_source_offsets(source_id,
+ le64_to_cpu(ags->hest_addr_le),
+ &cper_addr, &read_ack_register_addr, errp)) {
+ return;
}
cpu_physical_memory_read(read_ack_register_addr,
--
2.51.1
* [PATCH v2 5/5] acpi/ghes: Use error_fatal in acpi_ghes_memory_errors()
From: Gavin Shan @ 2025-12-01 14:18 UTC (permalink / raw)
To: qemu-arm
Cc: qemu-devel, mst, jonathan.cameron, mchehab+huawei, imammedo,
armbru, anisinha, gengdongjiu1, peter.maydell, pbonzini,
shan.gavin
Pass &error_fatal to acpi_ghes_memory_errors() so that the caller
needn't explicitly call exit(). The return type of
acpi_ghes_memory_errors() and ghes_record_cper_errors() is changed to
'bool', returning false when an error has been raised and true
otherwise, to match the convention documented in error.h.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
hw/acpi/ghes-stub.c | 4 ++--
hw/acpi/ghes.c | 26 ++++++++++----------------
include/hw/acpi/ghes.h | 6 +++---
target/arm/kvm.c | 9 +++------
4 files changed, 18 insertions(+), 27 deletions(-)
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index b54f1b093c..5f9313cce9 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -11,8 +11,8 @@
#include "qemu/osdep.h"
#include "hw/acpi/ghes.h"
-int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
- uint64_t physical_address)
+bool acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address, Error **errp)
{
g_assert_not_reached();
}
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index c35883dfa9..3033e93d65 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -515,14 +515,14 @@ static bool get_ghes_source_offsets(uint16_t source_id,
NotifierList acpi_generic_error_notifiers =
NOTIFIER_LIST_INITIALIZER(acpi_generic_error_notifiers);
-void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
+bool ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp)
{
uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
error_setg(errp, "GHES CPER record is too big: %zd", len);
- return;
+ return false;
}
if (!ags->use_hest_addr) {
@@ -531,7 +531,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
} else if (!get_ghes_source_offsets(source_id,
le64_to_cpu(ags->hest_addr_le),
&cper_addr, &read_ack_register_addr, errp)) {
- return;
+ return false;
}
cpu_physical_memory_read(read_ack_register_addr,
@@ -542,7 +542,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
error_setg(errp,
"OSPM does not acknowledge previous error,"
" so can not record CPER for current error anymore");
- return;
+ return false;
}
read_ack_register = cpu_to_le64(0);
@@ -557,16 +557,17 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
cpu_physical_memory_write(cper_addr, cper, len);
notifier_list_notify(&acpi_generic_error_notifiers, &source_id);
+
+ return true;
}
-int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
- uint64_t physical_address)
+bool acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t physical_address, Error **errp)
{
/* Memory Error Section Type */
const uint8_t guid[] =
UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
0xED, 0x7C, 0x83, 0xB1);
- Error *errp = NULL;
int data_length;
g_autoptr(GArray) block = g_array_new(false, true /* clear */, 1);
@@ -583,15 +584,8 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
/* Build the memory section CPER for above new generic error data entry */
acpi_ghes_build_append_mem_cper(block, physical_address);
- /* Report the error */
- ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
-
- if (errp) {
- error_report_err(errp);
- return -1;
- }
-
- return 0;
+ return ghes_record_cper_errors(ags, block->data, block->len,
+ source_id, errp);
}
AcpiGhesState *acpi_ghes_get_state(void)
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index df2ecbf6e4..5b29aae4dd 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -98,9 +98,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
const char *oem_id, const char *oem_table_id);
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
GArray *hardware_errors);
-int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
- uint64_t error_physical_addr);
-void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
+bool acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
+ uint64_t error_physical_addr, Error **errp);
+bool ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
uint16_t source_id, Error **errp);
/**
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index acda0b3fb4..76aa09810f 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2456,12 +2456,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
*/
if (code == BUS_MCEERR_AR) {
kvm_cpu_synchronize_state(c);
- if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
- paddr)) {
- kvm_inject_arm_sea(c);
- } else {
- exit(1);
- }
+ acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
+ paddr, &error_fatal);
+ kvm_inject_arm_sea(c);
}
return;
}
--
2.51.1
* Re: [PATCH v2 0/5] acpi/ghes: Error object handling improvement
From: Gavin Shan @ 2025-12-04 11:09 UTC (permalink / raw)
To: qemu-arm
Cc: qemu-devel, mst, jonathan.cameron, mchehab+huawei, imammedo,
armbru, anisinha, gengdongjiu1, peter.maydell, pbonzini,
shan.gavin
Hi Michael,
On 12/2/25 12:17 AM, Gavin Shan wrote:
> This series is pulled from the series for memory error handling
> improvement [1] to improve the error object handling in various
> aspects.
>
> [1] https://lists.nongnu.org/archive/html/qemu-arm/2025-11/msg00534.html
>
> This series doesn't have any dependencies and can be merged on
> its own.
>
Could you help to merge this series if it looks good to you? :)
Thanks,
Gavin
> Changelog
> =========
> v2:
> * v1: https://lists.nongnu.org/archive/html/qemu-arm/2025-11/msg00969.html
> * Commit log improvement on PATCH[v1 4/5] (Igor)
> * Collected RBs (Gavin)
>
> Gavin Shan (5):
> acpi/ghes: Automate data block cleanup in acpi_ghes_memory_errors()
> acpi/ghes: Abort in acpi_ghes_memory_errors() if necessary
> target/arm/kvm: Exit on error from acpi_ghes_memory_errors()
> acpi/ghes: Bail early on error from get_ghes_source_offsets()
> acpi/ghes: Use error_fatal in acpi_ghes_memory_errors()
>
> hw/acpi/ghes-stub.c | 6 +++---
> hw/acpi/ghes.c | 45 ++++++++++++++++++------------------------
> include/hw/acpi/ghes.h | 6 +++---
> target/arm/kvm.c | 10 +++-------
> 4 files changed, 28 insertions(+), 39 deletions(-)
>