qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Gavin Shan <gshan@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	jonathan.cameron@huawei.com, mchehab+huawei@kernel.org,
	gengdongjiu1@gmail.com, mst@redhat.com, imammedo@redhat.com,
	anisinha@redhat.com, peter.maydell@linaro.org,
	pbonzini@redhat.com, shan.gavin@gmail.com
Subject: Re: [PATCH v3 6/8] acpi/ghes: Use error_abort in acpi_ghes_memory_errors()
Date: Tue, 11 Nov 2025 16:02:24 +1000	[thread overview]
Message-ID: <b673bf36-cf1b-4103-bce8-0465a1385403@redhat.com> (raw)
In-Reply-To: <87o6p9gmy4.fsf@pond.sub.org>

Hi Markus,

On 11/11/25 3:25 PM, Markus Armbruster wrote:
> Gavin Shan <gshan@redhat.com> writes:
> 
>> Use error_abort in acpi_ghes_memory_errors() so that the caller needn't
>> explicitly call abort() on errors. With this change, its return value
>> isn't needed any more.
>>
>> Suggested-by: Igor Mammedov <imammedo@redhat.com>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/acpi/ghes-stub.c    |  6 +++---
>>   hw/acpi/ghes.c         | 15 ++++-----------
>>   include/hw/acpi/ghes.h |  5 +++--
>>   target/arm/kvm.c       | 10 +++-------
>>   4 files changed, 13 insertions(+), 23 deletions(-)
>>
>> diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
>> index 4faf573aeb..4ef914ffc5 100644
>> --- a/hw/acpi/ghes-stub.c
>> +++ b/hw/acpi/ghes-stub.c
>> @@ -11,10 +11,10 @@
>>   #include "qemu/osdep.h"
>>   #include "hw/acpi/ghes.h"
>>   
>> -int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> -                            uint64_t *addresses, uint32_t num_of_addresses)
>> +void acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> +                             uint64_t *addresses, uint32_t num_of_addresses,
>> +                             Error **errp)
>>   {
>> -    return -1;
>>   }
> 
> Before the patch, this function always fails: it returns -1.
> 
> Afterwards, it always succeeds: it doesn't set @errp.
> 
> Which one is correct?
> 

Both are correct. This variant is only used on !CONFIG_ACPI_APEI. In this case,
there is no chance to call this variant in the only caller kvm_arch_on_sigbus_vcpu().
acpi_ghes_get_state() returns NULL on !CONFIG_ACPI_APEI there.

>>   
>>   AcpiGhesState *acpi_ghes_get_state(void)
>> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
>> index 055e5d719a..aa469c03f2 100644
>> --- a/hw/acpi/ghes.c
>> +++ b/hw/acpi/ghes.c
>> @@ -543,8 +543,9 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>>       notifier_list_notify(&acpi_generic_error_notifiers, &source_id);
>>   }
>>   
>> -int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> -                            uint64_t *addresses, uint32_t num_of_addresses)
>> +void acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> +                             uint64_t *addresses, uint32_t num_of_addresses,
>> +                             Error **errp)
> 
> qapi/error.h:
> 
>   * - Whenever practical, also return a value that indicates success /
>   *   failure.  This can make the error checking more concise, and can
>   *   avoid useless error object creation and destruction.  Note that
>   *   we still have many functions returning void.  We recommend
>   *   • bool-valued functions return true on success / false on failure,
>   *   • pointer-valued functions return non-null / null pointer, and
>   *   • integer-valued functions return non-negative / negative.
> 

Question: If it's deterministic that caller passes @error_abort or @error_fatal
to acpi_ghes_memory_errors(), QEMU crashes with a core dump or exit before its
caller to check the return value. In this case, it's still preferred for
acpi_ghes_memory_errors() returns a value to indicate the success or failure?

>>   {
>>       /* Memory Error Section Type */
>>       const uint8_t guid[] =
>> @@ -555,7 +556,6 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>>        * Table 17-13 Generic Error Data Entry
>>        */
>>       QemuUUID fru_id = {};
>> -    Error *errp = NULL;
>>       int data_length;
>>       GArray *block;
>>       uint32_t block_status, i;
>> @@ -592,16 +592,9 @@ int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>>       }
>>   
>>       /* Report the error */
>> -    ghes_record_cper_errors(ags, block->data, block->len, source_id, &errp);
>> +    ghes_record_cper_errors(ags, block->data, block->len, source_id, errp);
>>   
>>       g_array_free(block, true);
>> -
>> -    if (errp) {
>> -        error_report_err(errp);
>> -        return -1;
>> -    }
>> -
>> -    return 0;
>>   }
> 
> The error reporting moves into the caller.
> 

Similar question as above. If it's deterministic for the caller passes @error_abort
or @error_fatal to acpi_ghes_memory_errors() and then to ghes_record_cper_errors(),
QEMU crashes with a core dump or exit before error_report_err(errp) can be executed.
In this case, it's still preferred to have error_report_err(&error_abort) or
error_report_err(&error_fatal) in its caller?

>>   
>>   AcpiGhesState *acpi_ghes_get_state(void)
>> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
>> index f73908985d..35c7bbbb01 100644
>> --- a/include/hw/acpi/ghes.h
>> +++ b/include/hw/acpi/ghes.h
>> @@ -98,8 +98,9 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>>                        const char *oem_id, const char *oem_table_id);
>>   void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
>>                             GArray *hardware_errors);
>> -int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> -                            uint64_t *addresses, uint32_t num_of_addresses);
>> +void acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>> +                             uint64_t *addresses, uint32_t num_of_addresses,
>> +                             Error **errp);
>>   void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>>                                uint16_t source_id, Error **errp);
>>   
>> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
>> index 459ca4a9b0..a889315606 100644
>> --- a/target/arm/kvm.c
>> +++ b/target/arm/kvm.c
>> @@ -2458,13 +2458,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>>               addresses[0] = paddr;
>>               if (code == BUS_MCEERR_AR) {
>>                   kvm_cpu_synchronize_state(c);
>> -                if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
>> -                                             addresses, 1)) {
>> -                    kvm_inject_arm_sea(c);
>> -                } else {
>> -                    error_report("failed to record the error");
>> -                    abort();
>> -                }
>> +                acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
>> +                                        addresses, 1, &error_abort);
>> +                kvm_inject_arm_sea(c);
> 
> Before the patch, we get two error reports, like this:
> 
>      qemu-system-FOO: OSPM does not acknowledge previous error, so can not record CPER for current error anymore
>      qemu-system-FOO: failed to record the error
>      Aborted (core dumped)
> 
> Such error cascades should be avoided.
> 
> Afterwards, we get one:
> 
>      Unexpected error at SOURCE-FILE:LINE-NUMBER:
>      qemu-system-FOO: OSPM does not acknowledge previous error, so can not record CPER for current error anymore
>      Aborted (core dumped)
> 
> Are all errors reported by acpi_ghes_memory_errors() programming errors,
> i.e. when an error is reported, there's a bug for us to fix?
> 
> If not, abort() is wrong before the patch, and &error_abort is wrong
> afterwards.
> 
> You can use &error_fatal for fatal errors that are not programming
> errors.
> 

No, there is no programming errors and the core dump is actually no sense.

It makes more sense for the caller to pass @error_fatal down to acpi_ghes_memory_errors().

>>               }
>>               return;
>>           }
> 

Thanks,
Gavin



  reply	other threads:[~2025-11-11  6:03 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-05 11:44 [PATCH v3 0/8] target/arm/kvm: Improve memory error handling Gavin Shan
2025-11-05 11:44 ` [PATCH v3 1/8] tests/qtest/bios-tables-test: Prepare for changes in the HEST table Gavin Shan
2025-11-05 14:16   ` Jonathan Cameron via
2025-11-05 11:44 ` [PATCH v3 2/8] acpi/ghes: Increase GHES raw data maximal length to 4KiB Gavin Shan
2025-11-05 14:16   ` Jonathan Cameron via
2025-11-10 14:11   ` Igor Mammedov
2025-11-11  4:05     ` Gavin Shan
2025-11-12 12:32       ` Igor Mammedov
2025-11-12 17:41         ` Gavin Shan
2025-11-05 11:44 ` [PATCH v3 3/8] tests/qtest/bios-tables-test: Update HEST table Gavin Shan
2025-11-05 14:17   ` Jonathan Cameron via
2025-11-05 11:44 ` [PATCH v3 4/8] acpi/ghes: Extend acpi_ghes_memory_errors() to support multiple CPERs Gavin Shan
2025-11-05 14:14   ` Jonathan Cameron via
2025-11-06  3:15     ` Gavin Shan
2025-11-10 14:49       ` Igor Mammedov
2025-11-11  4:08         ` Gavin Shan
2025-11-11 10:07           ` Jonathan Cameron via
2025-11-11 10:55             ` Gavin Shan
2025-11-11 11:55               ` Jonathan Cameron via
2025-11-11 12:19                 ` Gavin Shan
2025-11-11 13:12                   ` Jonathan Cameron via
2025-11-10 14:38   ` Igor Mammedov
2025-11-11  4:40     ` Gavin Shan
2025-11-12 13:12       ` Igor Mammedov
2025-11-12 17:36         ` Gavin Shan
2025-11-10 14:43   ` Philippe Mathieu-Daudé
2025-11-10 23:38     ` Gavin Shan
2025-11-11  3:40       ` Gavin Shan
2025-11-10 14:48   ` Philippe Mathieu-Daudé
2025-11-11  3:44     ` Gavin Shan
2025-11-05 11:44 ` [PATCH v3 5/8] acpi/ghes: Bail early on error from get_ghes_source_offsets() Gavin Shan
2025-11-05 14:17   ` Jonathan Cameron via
2025-11-10 14:50   ` Philippe Mathieu-Daudé
2025-11-11  3:48     ` Gavin Shan
2025-11-10 14:51   ` Igor Mammedov
2025-11-05 11:44 ` [PATCH v3 6/8] acpi/ghes: Use error_abort in acpi_ghes_memory_errors() Gavin Shan
2025-11-05 14:18   ` Jonathan Cameron via
2025-11-10 14:53   ` Igor Mammedov
2025-11-10 14:54   ` Philippe Mathieu-Daudé
2025-11-11  3:58     ` Gavin Shan
2025-11-12 12:49       ` Igor Mammedov
2025-11-12 17:38         ` Gavin Shan
2025-11-11  5:08     ` Markus Armbruster
2025-11-11  5:25   ` Markus Armbruster
2025-11-11  6:02     ` Gavin Shan [this message]
2025-11-11  7:31       ` Markus Armbruster
2025-11-05 11:44 ` [PATCH v3 7/8] kvm/arm/kvm: Introduce helper push_ghes_memory_errors() Gavin Shan
2025-11-05 14:19   ` Jonathan Cameron via
2025-11-10 14:56   ` Igor Mammedov
2025-11-11  4:09     ` Gavin Shan
2025-11-05 11:44 ` [PATCH v3 8/8] target/arm/kvm: Support multiple memory CPERs injection Gavin Shan
2025-11-05 14:37   ` Jonathan Cameron via
2025-11-06  3:26     ` Gavin Shan
2025-11-11 10:12       ` Jonathan Cameron via

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b673bf36-cf1b-4103-bce8-0465a1385403@redhat.com \
    --to=gshan@redhat.com \
    --cc=anisinha@redhat.com \
    --cc=armbru@redhat.com \
    --cc=gengdongjiu1@gmail.com \
    --cc=imammedo@redhat.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=mchehab+huawei@kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shan.gavin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).