qemu-devel.nongnu.org archive mirror
From: Gavin Shan <gshan@redhat.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com,
	imammedo@redhat.com, anisinha@redhat.com, gengdongjiu1@gmail.com,
	peter.maydell@linaro.org, pbonzini@redhat.com,
	shan.gavin@gmail.com,
	Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Subject: Re: [PATCH 0/4] target/arm: Improvement on memory error handling
Date: Mon, 17 Feb 2025 10:29:31 +1000
Message-ID: <b6699187-a720-4fbd-a57c-a7bd86d7621b@redhat.com>
In-Reply-To: <20250214095353.00007afc@huawei.com>

On 2/14/25 7:53 PM, Jonathan Cameron wrote:
> On Fri, 14 Feb 2025 14:16:31 +1000
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> Currently, there is only one CPER buffer (entry), meaning only one
>> memory error can be reported. In extreme cases, multiple memory errors
>> can be raised on different vCPUs. For example, a single memory error
>> on a 64KB page of the host can result in 16 memory errors to 4KB
>> pages of the guest. Unfortunately, the virtual machine is simply aborted
>> by multiple concurrent memory errors, as the following call trace shows.
>> A SEA exception is injected to the guest so that the CPER buffer can
>> be claimed if the error is successfully pushed by acpi_ghes_memory_errors().
>> Otherwise, abort() is triggered to crash the virtual machine.
>>
>>    kvm_vcpu_thread_fn
>>      kvm_cpu_exec
>>        kvm_arch_on_sigbus_vcpu
>>          kvm_cpu_synchronize_state
>>          acpi_ghes_memory_errors         (a)
>>          kvm_inject_arm_sea | abort
>>
>> Crashing the virtual machine in this case is arguable. The better
>> behaviour would be to retry pushing the memory errors, to keep the
>> virtual machine alive so that the administrator has a chance to chime
>> in, for example to dump the important data with luck. This series
>> adds one more parameter to acpi_ghes_memory_errors() so that pushing
>> the memory error is retried until it succeeds.
> 
> Hi Gavin,
> 
> +CC Mauro given:
> https://lore.kernel.org/all/cover.1738345063.git.mchehab+huawei@kernel.org/
> 
> is more or less reviewed subject to some requested patch reordering and
> whilst I haven't checked, seems unlikely that there won't be a
> clash with this series (might just be some fuzz)
> 

Jonathan, thanks for the pointer. I hadn't noticed that there are pending
acpi/hest changes. They clash with those included in this series, so I will
take a closer look.

Thanks,
Gavin

> Jonathan
> 
> 
> 
>>
>> Gavin Shan (4):
>>    acpi/ghes: Make ghes_record_cper_errors() static
>>    acpi/ghes: Use error_report() in ghes_record_cper_errors()
>>    acpi/ghes: Allow retry to write CPER errors
>>    target/arm: Retry pushing CPER error if necessary
>>
>>   hw/acpi/ghes-stub.c    |  3 ++-
>>   hw/acpi/ghes.c         | 45 +++++++++++++++++++++---------------------
>>   include/hw/acpi/ghes.h |  5 ++---
>>   target/arm/kvm.c       | 31 +++++++++++++++++++++++------
>>   4 files changed, 51 insertions(+), 33 deletions(-)
>>
> 



