Re: [PATCH v3 8/8] target/arm/kvm: Support multiple memory CPERs injection

From: Jonathan Cameron via <qemu-arm@nongnu.org>
To: Gavin Shan <gshan@redhat.com>
Cc: <qemu-arm@nongnu.org>, <qemu-devel@nongnu.org>,
	<mchehab+huawei@kernel.org>, <gengdongjiu1@gmail.com>,
	<mst@redhat.com>, <imammedo@redhat.com>, <anisinha@redhat.com>,
	<peter.maydell@linaro.org>, <pbonzini@redhat.com>,
	<shan.gavin@gmail.com>
Subject: Re: [PATCH v3 8/8] target/arm/kvm: Support multiple memory CPERs injection
Date: Wed, 5 Nov 2025 14:37:10 +0000
Message-ID: <20251105143710.000041f5@huawei.com>
In-Reply-To: <20251105114453.2164073-9-gshan@redhat.com>

On Wed,  5 Nov 2025 21:44:53 +1000
Gavin Shan <gshan@redhat.com> wrote:

> In the combination of 64KiB host and 4KiB guest, a problematic host
> page affects 16x guest pages that can be owned by different threads.
> It means 16x memory errors can be raised at once due to the parallel
> accesses to those 16x guest pages on the guest. Unfortunately, QEMU
> can't deliver them one by one because we just one GHES error block,

we have just one

> corresponding one read acknowledgement register. It can eventually
> cause QEMU crash dump due to the contention on that register, meaning
> the current memory error can't be delivered before the previous error
> isn't acknowledged.
> 
> Imporve push_ghes_memory_errors() to push 16x consecutive memory errors
Improve

> under this situation to avoid the contention on the read acknowledgement
> register.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>

Hi Gavin,

Silly question that never occurred to me before:
What happens if we just report a single larger error?

The CPER record has a Physical Address Mask that I think lets us say we
are only reporting at a 64KiB granularity.

In Linux, drivers/edac/ghes_edac.c seems to handle this via e->grain:
https://elixir.bootlin.com/linux/v6.18-rc4/source/drivers/edac/ghes_edac.c#L346

I haven't chased the whole path through to check whether this results
in appropriate poisoning in the guest, though.
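
To make the idea concrete, here is a minimal standalone sketch of how a
64KiB granularity could be expressed as a Physical Address Mask. It is
not taken from the patch or from the kernel: the helper names
pa_mask_for_granule() and grain_from_pa_mask() are made up for
illustration, and the "~mask + 1" derivation is my reading of how
ghes_edac.c turns the mask into e->grain, so treat it as an assumption
to be checked against the source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a mask whose low bits are cleared so that
 * only the address bits above the reporting granule are marked valid.
 * The granule must be a non-zero power of two.
 */
static uint64_t pa_mask_for_granule(uint64_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return ~(granule - 1);
}

/*
 * Hypothetical helper: recover the granule from the mask. This mirrors
 * what I believe the consumer does (grain = ~mask + 1); that is an
 * assumption, not a quote from the kernel.
 */
static uint64_t grain_from_pa_mask(uint64_t mask)
{
    return ~mask + 1;
}
```

With a 64KiB granule the mask comes out as 0xffffffffffff0000, the
recovered grain is 0x10000, and a faulting address such as 0x40012345
masks down to 0x40010000, the start of the 64KiB host page.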

> ---
>  target/arm/kvm.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 50 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 5b151eda3c..d7de8262da 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -11,6 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/units.h"
>  #include <sys/ioctl.h>
>  
>  #include <linux/kvm.h>
> @@ -2432,12 +2433,59 @@ int kvm_arch_get_registers(CPUState *cs, Error **errp)
>  static void push_ghes_memory_errors(CPUState *c, AcpiGhesState *ags,
>                                      uint64_t paddr, Error **errp)
>  {
> +    uint64_t val, start, end, guest_pgsz, host_pgsz;
>      uint64_t addresses[16];
> +    uint32_t num_of_addresses;
> +    int ret;
> +
> +    /*
> +     * Sort out the guest page size from TCR_EL1, which can be modified
> +     * by the guest from time to time. So we have to sort it out dynamically.
> +     */
> +    ret = read_sys_reg64(c->kvm_fd, &val, ARM64_SYS_REG(3, 0, 2, 0, 2));
> +    if (ret) {
> +        error_setg(errp, "Error %" PRId32 " to read TCR_EL1 register", ret);
> +        return;
> +    }
> +
> +    switch (extract64(val, 14, 2)) {
> +    case 0:
> +        guest_pgsz = 4 * KiB;
> +        break;
> +    case 1:
> +        guest_pgsz = 64 * KiB;
> +        break;
> +    case 2:
> +        guest_pgsz = 16 * KiB;
> +        break;
> +    default:
> +        error_setg(errp, "Unknown page size from TCR_EL1 (0x%" PRIx64 ")", val);
> +        return;
> +    }
> +
> +    host_pgsz = qemu_real_host_page_size();
> +    start = paddr & ~(host_pgsz - 1);
> +    end = start + host_pgsz;
> +    num_of_addresses = 0;
>  
> -    addresses[0] = paddr;
> +    while (start < end) {
> +        /*
> +         * The precise physical address is provided for the affected
> +         * guest page that contains @paddr. Otherwise, the starting
> +         * address of the guest page is provided.
> +         */
> +        if (paddr >= start && paddr < (start + guest_pgsz)) {
> +            addresses[num_of_addresses++] = paddr;
> +        } else {
> +            addresses[num_of_addresses++] = start;
> +        }
> +
> +        start += guest_pgsz;
> +    }
>  
>      kvm_cpu_synchronize_state(c);
> -    acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC, addresses, 1, errp);
> +    acpi_ghes_memory_errors(ags, ACPI_HEST_SRC_ID_SYNC,
> +                            addresses, num_of_addresses, errp);
>      kvm_inject_arm_sea(c);
>  }
>
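
For comparison with the single-record idea, the address enumeration in
the hunk above can be exercised outside QEMU with a small standalone
sketch. The names tg0_to_page_size(), split_host_page() and KIB are
hypothetical, and the explicit capacity check is an addition of mine
(the patch relies on a fixed 16-entry array); the TG0 decoding and the
loop follow the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KIB 1024ULL

/* Decode TCR_EL1.TG0 (bits [15:14]) into bytes; 0 means reserved. */
static uint64_t tg0_to_page_size(uint64_t tcr_el1)
{
    switch ((tcr_el1 >> 14) & 0x3) {
    case 0:
        return 4 * KIB;
    case 1:
        return 64 * KIB;
    case 2:
        return 16 * KIB;
    default:
        return 0;
    }
}

/*
 * Emit one address per guest page inside the host page that contains
 * paddr. The guest page holding paddr gets the precise address, every
 * other guest page gets its start address. Returns the number of
 * entries written, or 0 if guest_pgsz is 0 or addrs[] is too small.
 */
static size_t split_host_page(uint64_t paddr, uint64_t host_pgsz,
                              uint64_t guest_pgsz,
                              uint64_t *addrs, size_t max)
{
    uint64_t start, end;
    size_t n = 0;

    assert(host_pgsz != 0 && (host_pgsz & (host_pgsz - 1)) == 0);
    if (guest_pgsz == 0) {
        return 0;
    }

    start = paddr & ~(host_pgsz - 1);
    end = start + host_pgsz;

    for (; start < end; start += guest_pgsz) {
        if (n == max) {
            return 0; /* would overflow the caller's array */
        }
        if (paddr >= start && paddr - start < guest_pgsz) {
            addrs[n++] = paddr;
        } else {
            addrs[n++] = start;
        }
    }

    return n;
}
```

With a 64KiB host page, a 4KiB guest page and paddr 0x40012345 this
yields 16 entries: 0x40010000, 0x40011000, 0x40012345, and so on up to
0x4001f000. With a 4KiB host and a 64KiB guest the loop runs once and
reports only paddr itself.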
