Re: [PATCH v3 8/8] target/arm/kvm: Support multiple memory CPERs injection

From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Gavin Shan <gshan@redhat.com>
Cc: <qemu-arm@nongnu.org>, <qemu-devel@nongnu.org>,
	<mchehab+huawei@kernel.org>, <gengdongjiu1@gmail.com>,
	<mst@redhat.com>, <imammedo@redhat.com>, <anisinha@redhat.com>,
	<peter.maydell@linaro.org>, <pbonzini@redhat.com>,
	<shan.gavin@gmail.com>
Subject: Re: [PATCH v3 8/8] target/arm/kvm: Support multiple memory CPERs injection
Date: Wed, 5 Nov 2025 14:37:10 +0000
Message-ID: <20251105143710.000041f5@huawei.com>
In-Reply-To: <20251105114453.2164073-9-gshan@redhat.com>

On Wed,  5 Nov 2025 21:44:53 +1000
Gavin Shan <gshan@redhat.com> wrote:

> In the combination of 64KiB host and 4KiB guest, a problematic host
> page affects 16x guest pages that can be owned by different threads.
> It means 16x memory errors can be raised at once due to the parallel
> accesses to those 16x guest pages on the guest. Unfortunately, QEMU
> can't deliver them one by one because we just one GHES error block,

we have just one

> corresponding one read acknowledgement register. It can eventually
> cause QEMU crash dump due to the contention on that register, meaning
> the current memory error can't be delivered before the previous error
> isn't acknowledged.
> 
> Imporve push_ghes_memory_errors() to push 16x consecutive memory errors
Improve

> under this situation to avoid the contention on the read acknowledgement
> register.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
Hi Gavin

Silly question that never occurred to me before:
What happens if we just report a single larger error?

The CPER record has a Physical Address Mask that I think lets us say we
are only reporting at a 64KiB granularity.

In Linux, drivers/edac/ghes_edac.c seems to handle this via e->grain.
https://elixir.bootlin.com/linux/v6.18-rc4/source/drivers/edac/ghes_edac.c#L346

I haven't chased the whole path through to check whether this results in
appropriate poisoning in the guest, though.
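
To make the single-record idea concrete, here is a minimal sketch of the mask
arithmetic. The helper names are hypothetical and not taken from QEMU or Linux.
It also assumes the consumer recovers the grain as the two's complement of the
Physical Address Mask, which is my reading of the e->grain handling linked
above and should be verified against that source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a CPER Physical Address Mask for a
 * power-of-two reporting granularity (e.g. 64KiB for a 64KiB host page).
 */
static inline uint64_t cper_pa_mask_for(uint64_t granularity)
{
    /* The granularity must be a non-zero power of two. */
    assert(granularity && (granularity & (granularity - 1)) == 0);
    return ~(granularity - 1);
}

/*
 * Assumption: the consumer derives the error grain from the mask as
 * its two's complement, so a 64KiB mask yields a 64KiB grain.
 */
static inline uint64_t grain_from_pa_mask(uint64_t mask)
{
    return ~mask + 1;
}

/* Address that a single, coarser record would cover from. */
static inline uint64_t cper_masked_paddr(uint64_t paddr, uint64_t mask)
{
    return paddr & mask;
}
```

With a 64KiB granularity the mask is 0xFFFFFFFFFFFF0000, so one record starting
at the masked address would cover all 16 of the 4KiB guest pages, provided the
guest honours the mask when poisoning.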

> ---
>  target/arm/kvm.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 50 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 5b151eda3c..d7de8262da 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -11,6 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/units.h"
>  #include <sys/ioctl.h>
>  
>  #include <linux/kvm.h>
> @@ -2432,12 +2433,59 @@ int kvm_arch_get_registers(CPUState *cs, Error **errp)
>  static void push_ghes_memory_errors(CPUState *c, AcpiGhesState *ags,
>                                      uint64_t paddr, Error **errp)
>  {
> +    uint64_t val, start, end, guest_pgsz, host_pgsz;
>      uint64_t addresses[16];
> +    uint32_t num_of_addresses;
> +    int ret;
> +
> +    /*
> +     * Sort out the guest page size from TCR_EL1, which can be modified
> +     * by the guest from time to time. So we have to sort it out dynamically.
> +     */
> +    ret = read_sys_reg64(c->kvm_fd, &val, ARM64_SYS_REG(3, 0, 2, 0, 2));
> +    if (ret) {
> +        error_setg(errp, "Error %" PRId32 " to read TCR_EL1 register", ret);
> +        return;
> +    }
> +
> +    switch (extract64(val, 14, 2)) {
> +    case 0:
> +        guest_pgsz = 4 * KiB;
> +        break;
> +    case 1:
> +        guest_pgsz = 64 * KiB;
> +        break;
> +    case 2:
> +        guest_pgsz = 16 * KiB;
> +        break;
> +    default:
> +        error_setg(errp, "Unknown page size from TCR_EL1 (0x%" PRIx64 ")", val);
> +        return;
> +    }
> +
> +    host_pgsz = qemu_real_host_page_size();
> +    start = paddr & ~(host_pgsz - 1);
> +    end = start + host_pgsz;
> +    num_of_addresses = 0;
>  
> -    addresses[0] = paddr;
> +    while (start < end) {
> +        /*
> +         * The precise physical address is provided for the affected
> +         * guest page that contains @paddr. Otherwise, the starting
> +         * address of the guest page is provided.
> +         */
> +        if (paddr >= start && paddr < (start + guest_pgsz)) {
> +            addresses[num_of_addresses++] = paddr;
> +        } else {
> +            addresses[num_of_addresses++] = start;
> +        }
> +
> +        start += guest_pgsz;
> +    }
>  
>      kvm_cpu_synchronize_state(c);
> -    acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC, addresses, 1, errp);
> +    acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
> +                            addresses, num_of_addresses, errp);
>      kvm_inject_arm_sea(c);
>  }
>
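
For anyone wanting to poke at the address fan-out outside QEMU, the logic in
the quoted hunk can be sketched standalone as below. The function names, the
KiB macro definition, and the explicit capacity check are assumptions added for
illustration; they are not part of the patch, which uses QEMU's own helpers and
a fixed 16-entry array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KiB 1024ULL

/*
 * Decode TCR_EL1.TG0 (bits [15:14]) into a page size in bytes,
 * using the same mapping as the patch. Returns 0 for the reserved
 * encoding.
 */
static uint64_t tg0_to_page_size(uint64_t tcr_el1)
{
    switch ((tcr_el1 >> 14) & 0x3) {
    case 0:
        return 4 * KiB;
    case 1:
        return 64 * KiB;
    case 2:
        return 16 * KiB;
    default:
        return 0;
    }
}

/*
 * Fill addrs[] with one entry per guest page inside the host page
 * that contains paddr. The guest page holding paddr gets the precise
 * address; the others get their page start address.
 * Returns the number of entries, or 0 if they do not fit in max.
 */
static size_t fan_out_guest_pages(uint64_t paddr, uint64_t host_pgsz,
                                  uint64_t guest_pgsz,
                                  uint64_t *addrs, size_t max)
{
    uint64_t start = paddr & ~(host_pgsz - 1);
    uint64_t end = start + host_pgsz;
    size_t n = 0;

    assert(guest_pgsz != 0);

    for (; start < end; start += guest_pgsz) {
        if (n == max) {
            return 0; /* would overflow the caller's array */
        }
        if (paddr >= start && paddr < start + guest_pgsz) {
            addrs[n++] = paddr;
        } else {
            addrs[n++] = start;
        }
    }

    return n;
}
```

For a 64KiB host page and a 4KiB guest page this yields exactly 16 entries,
which matches the size of addresses[] in the patch. When the guest page is at
least as large as the host page, it yields a single entry carrying the precise
address.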
