From: Gavin Shan <gshan@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org,
jonathan.cameron@huawei.com, mchehab+huawei@kernel.org,
gengdongjiu1@gmail.com, mst@redhat.com, imammedo@redhat.com,
anisinha@redhat.com, peter.maydell@linaro.org,
pbonzini@redhat.com, shan.gavin@gmail.com
Subject: Re: [PATCH v3 6/8] acpi/ghes: Use error_abort in acpi_ghes_memory_errors()
Date: Tue, 11 Nov 2025 16:02:24 +1000
Message-ID: <b673bf36-cf1b-4103-bce8-0465a1385403@redhat.com>
In-Reply-To: <87o6p9gmy4.fsf@pond.sub.org>
Hi Markus,
On 11/11/25 3:25 PM, Markus Armbruster wrote:
> Gavin Shan <gshan@redhat.com> writes:
>
>> Use error_abort in acpi_ghes_memory_errors() so that the caller needn't
>> explicitly call abort() on errors. With this change, its return value
>> isn't needed any more.
>>
>> Suggested-by: Igor Mammedov <imammedo@redhat.com>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>> hw/acpi/ghes-stub.c | 6 +++---
>> hw/acpi/ghes.c | 15 ++++-----------
>> include/hw/acpi/ghes.h | 5 +++--
>> target/arm/kvm.c | 10 +++-------
>> 4 files changed, 13 insertions(+), 23 deletions(-)
>>
>> diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
>> index 4faf573aeb..4ef914ffc5 100644
>> --- a/hw/acpi/ghes-stub.c
>> +++ b/hw/acpi/ghes-stub.c
>> @@ -11,10 +11,10 @@
>> #include "qemu/osdep.h"
>> #include "hw/acpi/ghes.h"
>>
>> -int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> - uint64_t *addresses, uint32_t num_of_addresses)
>> +void acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> + uint64_t *addresses, uint32_t num_of_addresses,
>> + Error **errp)
>> {
>> - return -1;
>> }
>
> Before the patch, this function always fails: it returns -1.
>
> Afterwards, it always succeeds: it doesn't set @errp.
>
> Which one is correct?
>
Both are correct. This variant is only built for !CONFIG_ACPI_APEI. In that case,
its only caller, kvm_arch_on_sigbus_vcpu(), can never reach it, because
acpi_ghes_get_state() returns NULL on !CONFIG_ACPI_APEI.
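
A minimal sketch of that argument, assuming the caller checks the returned state
before calling into the GHES code. All sketch_* names are hypothetical stand-ins
and not the QEMU definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for AcpiGhesState; not the QEMU type. */
typedef struct SketchGhesState {
    int unused;
} SketchGhesState;

/* Counts how often the stub body runs. */
int sketch_stub_calls;

/* Models the !CONFIG_ACPI_APEI build: there is never any GHES state. */
SketchGhesState *sketch_ghes_get_state(void)
{
    return NULL;
}

/* Models the stub; it must never be entered without a valid state. */
void sketch_ghes_memory_errors(SketchGhesState *ags)
{
    assert(ags != NULL);
    sketch_stub_calls++;
}

/*
 * Models the shape of the only caller (assumed): bail out when no
 * state exists. Returns 1 if an error was recorded, 0 otherwise.
 */
int sketch_on_sigbus(void)
{
    SketchGhesState *ags = sketch_ghes_get_state();

    if (!ags) {
        return 0;   /* the stub is never reached */
    }
    sketch_ghes_memory_errors(ags);
    return 1;
}
```

Since the state is always NULL in this configuration, what the stub returns or
sets cannot be observed by the caller.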
>>
>> AcpiGhesState *acpi_ghes_get_state(void)
>> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
>> index 055e5d719a..aa469c03f2 100644
>> --- a/hw/acpi/ghes.c
>> +++ b/hw/acpi/ghes.c
>> @@ -543,8 +543,9 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>> notifier_list_notify(&acpi_generic_error_notifiers, &source_id);
>> }
>>
>> -int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> - uint64_t *addresses, uint32_t num_of_addresses)
>> +void acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> + uint64_t *addresses, uint32_t num_of_addresses,
>> + Error **errp)
>
> qapi/error.h:
>
> * - Whenever practical, also return a value that indicates success /
> * failure. This can make the error checking more concise, and can
> * avoid useless error object creation and destruction. Note that
> * we still have many functions returning void. We recommend
> * • bool-valued functions return true on success / false on failure,
> * • pointer-valued functions return non-null / null pointer, and
> * • integer-valued functions return non-negative / negative.
>
Question: If the caller always passes @error_abort or @error_fatal to
acpi_ghes_memory_errors(), QEMU aborts with a core dump or exits before the
caller can check the return value. In that case, is it still preferred for
acpi_ghes_memory_errors() to return a value indicating success or failure?
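
For reference, the bool-valued convention quoted above would look roughly like
the sketch below. SketchErr, sketch_record_cper() and sketch_memory_errors() are
hypothetical stand-ins, not the QEMU types or functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct SketchErr {
    const char *msg;
} SketchErr;

SketchErr sketch_busy = { "previous error not acknowledged" };

/* Leaf function: returns true on success, false on failure with *errp set. */
bool sketch_record_cper(bool ospm_acked, SketchErr **errp)
{
    /* An errp that already holds an error must not be reused. */
    assert(errp == NULL || *errp == NULL);

    if (!ospm_acked) {
        if (errp) {
            *errp = &sketch_busy;
        }
        return false;
    }
    return true;
}

/* Intermediate function: forwards errp and passes the result through. */
bool sketch_memory_errors(bool ospm_acked, SketchErr **errp)
{
    return sketch_record_cper(ospm_acked, errp);
}
```

A caller that passes a local error pointer can test the return value without
inspecting the error object. A caller that always passes an aborting sentinel
never sees a false return, which is the case the question is about.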
>> {
>> /* Memory Error Section Type */
>> const uint8_t guid[] =
>> @@ -555,7 +556,6 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> * Table 17-13 Generic Error Data Entry
>> */
>> QemuUUID fru_id = {};
>> - Error *errp = NULL;
>> int data_length;
>> GArray *block;
>> uint32_t block_status, i;
>> @@ -592,16 +592,9 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> }
>>
>> /* Report the error */
>> - ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
>> + ghes_record_cper_errors(ags, block->data, block->len, source_id, errp);
>>
>> g_array_free(block, true);
>> -
>> - if (errp) {
>> - error_report_err(errp);
>> - return -1;
>> - }
>> -
>> - return 0;
>> }
>
> The error reporting moves into the caller.
>
A similar question to the one above. If the caller always passes @error_abort
or @error_fatal to acpi_ghes_memory_errors(), which forwards it to ghes_record_cper_errors(),
QEMU aborts with a core dump or exits before error_report_err(errp) can be executed.
In that case, is it still preferred to have error_report_err(&error_abort) or
error_report_err(&error_fatal) in the caller?
>>
>> AcpiGhesState *acpi_ghes_get_state(void)
>> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
>> index f73908985d..35c7bbbb01 100644
>> --- a/include/hw/acpi/ghes.h
>> +++ b/include/hw/acpi/ghes.h
>> @@ -98,8 +98,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>> const char *oem_id, const char *oem_table_id);
>> void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
>> GArray *hardware_errors);
>> -int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> - uint64_t *addresses, uint32_t num_of_addresses);
>> +void acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> + uint64_t *addresses, uint32_t num_of_addresses,
>> + Error **errp);
>> void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>> uint16_t source_id, Error **errp);
>>
>> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
>> index 459ca4a9b0..a889315606 100644
>> --- a/target/arm/kvm.c
>> +++ b/target/arm/kvm.c
>> @@ -2458,13 +2458,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>> addresses[0] = paddr;
>> if (code == BUS_MCEERR_AR) {
>> kvm_cpu_synchronize_state(c);
>> - if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
>> - addresses, 1)) {
>> - kvm_inject_arm_sea(c);
>> - } else {
>> - error_report("failed to record the error");
>> - abort();
>> - }
>> + acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
>> + addresses, 1, &error_abort);
>> + kvm_inject_arm_sea(c);
>
> Before the patch, we get two error reports, like this:
>
> qemu-system-FOO: OSPM does not acknowledge previous error, so can not record CPER for current error anymore
> qemu-system-FOO: failed to record the error
> Aborted (core dumped)
>
> Such error cascades should be avoided.
>
> Afterwards, we get one:
>
> Unexpected error at SOURCE-FILE:LINE-NUMBER:
> qemu-system-FOO: OSPM does not acknowledge previous error, so can not record CPER for current error anymore
> Aborted (core dumped)
>
> Are all errors reported by acpi_ghes_memory_errors() programming errors,
> i.e. when an error is reported, there's a bug for us to fix?
>
> If not, abort() is wrong before the patch, and &error_abort is wrong
> afterwards.
>
> You can use &error_fatal for fatal errors that are not programming
> errors.
>
No, these are not programming errors, and the core dump doesn't really make sense.
It makes more sense for the caller to pass @error_fatal down to acpi_ghes_memory_errors().
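
To spell out the difference, here is a simplified model of how the two sentinels
are told apart by address. The sketch_* names are hypothetical, and the outcome
is returned as an enum instead of actually calling abort() or exit():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Error, error_abort and error_fatal. */
typedef struct SketchError {
    const char *msg;
} SketchError;

SketchError *sketch_error_abort;   /* sentinel: programming error */
SketchError *sketch_error_fatal;   /* sentinel: fatal, but not a bug */

typedef enum {
    SKETCH_IGNORED,      /* errp == NULL: the caller does not care */
    SKETCH_STORED,       /* error handed back to the caller */
    SKETCH_WOULD_ABORT,  /* real code: report, abort(), core dump */
    SKETCH_WOULD_EXIT    /* real code: report, exit(1), no core dump */
} SketchOutcome;

/* Decide what happens to an error based on which errp was passed. */
SketchOutcome sketch_error_set(SketchError **errp, SketchError *err)
{
    assert(err != NULL);

    if (errp == NULL) {
        return SKETCH_IGNORED;
    }
    if (errp == &sketch_error_abort) {
        return SKETCH_WOULD_ABORT;
    }
    if (errp == &sketch_error_fatal) {
        return SKETCH_WOULD_EXIT;
    }
    *errp = err;
    return SKETCH_STORED;
}
```

Under this model, switching the caller from the abort sentinel to the fatal one
keeps the single error report but replaces the core dump with a plain exit.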
>> }
>> return;
>> }
>
Thanks,
Gavin