Re: [PATCH RESEND v2 1/3] acpi/ghes: Extend acpi_ghes_memory_errors() to support multiple CPERs

From: Jonathan Cameron via <qemu-arm@nongnu.org>
To: Gavin Shan <gshan@redhat.com>
Cc: <qemu-arm@nongnu.org>, <qemu-devel@nongnu.org>, <mst@redhat.com>,
	<imammedo@redhat.com>, <anisinha@redhat.com>,
	<gengdongjiu1@gmail.com>, <peter.maydell@linaro.org>,
	<pbonzini@redhat.com>, <mchehab+huawei@kernel.org>,
	<shan.gavin@gmail.com>
Subject: Re: [PATCH RESEND v2 1/3] acpi/ghes: Extend acpi_ghes_memory_errors() to support multiple CPERs
Date: Fri, 31 Oct 2025 09:58:50 +0000
Message-ID: <20251031095850.00002589@huawei.com>
In-Reply-To: <20251007060810.258536-2-gshan@redhat.com>

On Tue,  7 Oct 2025 16:08:08 +1000
Gavin Shan <gshan@redhat.com> wrote:

> In the situation where host and guest have 64KB and 4KB page sizes, one
> error on the host's page affects 16 guest pages. We need to send 16
> consecutive errors in this specific case.

Hi Gavin,

Sorry, this one has been on my to-review list for far too long.

> 
> Extend acpi_ghes_memory_errors() to support multiple CPERs after the
> hunk of code to generate the GHES error status is pulled out from
> ghes_gen_err_data_uncorrectable_recoverable().

I think this description needs to be more detailed with respect to how
those multiple CPERs are handled.  Specifically, that they are in a single
error status block (so should only represent related errors).

This is to make it clear this isn't queuing events, but instead just
presenting them as one block.
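
To illustrate the distinction, here is a minimal sketch of how the size of
one error status block grows with the number of CPERs it carries.  The sizes
are assumptions based on the ACPI Generic Error Status Block and Generic
Error Data Entry (revision 0x300) layouts plus an 80-byte memory error
section; the names are hypothetical and are not the ones used in
hw/acpi/ghes.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes in bytes; illustrative names, not QEMU's. */
enum {
    GESB_HEADER_SIZE = 20, /* one status block header per notification */
    GED_ENTRY_SIZE   = 72, /* one data entry header per CPER */
    MEM_CPER_SIZE    = 80, /* one memory error section per CPER */
};

/* Value of the Data Length field for a block carrying num_cpers entries. */
static uint32_t sketch_data_length(size_t num_cpers)
{
    assert(num_cpers > 0);
    return (uint32_t)(num_cpers * (GED_ENTRY_SIZE + MEM_CPER_SIZE));
}

/* Total bytes written for the block: one header plus all entries. */
static uint32_t sketch_block_size(size_t num_cpers)
{
    return GESB_HEADER_SIZE + sketch_data_length(num_cpers);
}
```

The status header appears once regardless of the count, which is the sense
in which the errors are presented together rather than queued.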

> 
> No functional changes intended.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  hw/acpi/ghes-stub.c    |  2 +-
>  hw/acpi/ghes.c         | 27 ++++++++++++++-------------
>  include/hw/acpi/ghes.h |  2 +-
>  target/arm/kvm.c       |  7 ++++++-
>  4 files changed, 22 insertions(+), 16 deletions(-)

> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 06555905ce..045b77715f 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -214,18 +214,13 @@ static void acpi_ghes_build_append_mem_cper(GArray *table,
>  
>  static void
>  ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
> -                                            const uint8_t *section_type,
> -                                            int data_length)
> +                                            const uint8_t *section_type)
>  {
>      /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
>       * Table 17-13 Generic Error Data Entry
>       */
>      QemuUUID fru_id = {};
>  
> -    /* Build the new generic error status block header */
> -    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
> -        0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
> -

With this bit gone, is it worth having the helper?  Perhaps just move
the remains to where it is called.

>      /* Build this new generic error data entry header */
>      acpi_ghes_generic_error_data(block, section_type,
>          ACPI_CPER_SEV_RECOVERABLE, 0, 0,

> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 4f769d69b3..9a47ac9e3a 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -2434,6 +2434,7 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>      ram_addr_t ram_addr;
>      hwaddr paddr;
>      AcpiGhesState *ags;
> +    GArray *addresses;
>  
>      assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
>  
> @@ -2442,6 +2443,7 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>          ram_addr = qemu_ram_addr_from_host(addr);
>          if (ram_addr != RAM_ADDR_INVALID &&
>              kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
> +            addresses = g_array_new(false, false, sizeof(paddr));

Given you are going to free in all paths, maybe a g_autofree?

Also, we know this only grows to a fixed max size (16 after patch 3), so maybe just
provide a hwaddr paddrs[16]; and pass forwards the count?
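
Something along these lines is what I have in mind; a rough sketch only.
The helper name, the MAX_PADDRS constant and the hwaddr typedef are
stand-ins for illustration, and aligning down to the host page boundary is
my assumption about what patch 3 does, not taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t hwaddr; /* stand-in for QEMU's hwaddr */

#define MAX_PADDRS 16 /* 64KB host page / 4KB guest page */

/*
 * Fill out[] with the guest page addresses covered by the host page that
 * contains paddr and return how many were written.  Both page sizes are
 * assumed to be powers of two.
 */
static size_t sketch_collect_paddrs(hwaddr paddr, uint64_t host_page_size,
                                    uint64_t guest_page_size,
                                    hwaddr out[MAX_PADDRS])
{
    /* When the guest page is the larger one, a single entry covers it. */
    uint64_t span = host_page_size > guest_page_size ?
                    host_page_size : guest_page_size;
    hwaddr base = paddr & ~(span - 1);
    size_t count = (size_t)(span / guest_page_size);
    size_t i;

    assert(count <= MAX_PADDRS);
    for (i = 0; i < count; i++) {
        out[i] = base + i * guest_page_size;
    }
    return count;
}
```

The caller would then pass the array and the returned count on, with
nothing to allocate or free.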

>              kvm_hwpoison_page_add(ram_addr);
>              /*
>               * If this is a BUS_MCEERR_AR, we know we have been called
> @@ -2454,16 +2456,19 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>               * later from the main thread, so doing the injection of
>               * the error would be more complicated.
>               */
> +            g_array_append_vals(addresses, &paddr, 1);
>              if (code == BUS_MCEERR_AR) {
>                  kvm_cpu_synchronize_state(c);
>                  if (!acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
> -                                             paddr)) {
> +                                             addresses)) {
>                      kvm_inject_arm_sea(c);
>                  } else {
>                      error_report("failed to record the error");
>                      abort();
>                  }
>              }
> +
> +            g_array_free(addresses, true);
>              return;
>          }
>          if (code == BUS_MCEERR_AO) {
