Re: [PATCH 4/4] target/arm: Retry pushing CPER error if necessary

From: Gavin Shan <gshan@redhat.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Igor Mammedov <imammedo@redhat.com>,
	qemu-arm@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com,
	anisinha@redhat.com, gengdongjiu1@gmail.com,
	peter.maydell@linaro.org, pbonzini@redhat.com,
	shan.gavin@gmail.com,
	Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Subject: Re: [PATCH 4/4] target/arm: Retry pushing CPER error if necessary
Date: Wed, 26 Feb 2025 16:56:26 +1000
Message-ID: <f14e5c46-8205-41b4-aca7-cdcb3456c7ac@redhat.com>
In-Reply-To: <20250221110435.00004a3b@huawei.com>

On 2/21/25 9:04 PM, Jonathan Cameron wrote:
> On Fri, 21 Feb 2025 15:27:36 +1000
> Gavin Shan <gshan@redhat.com> wrote:

[...]
   
>>
>> I would say #1 is the ideal model because the read_ack_register is the bottleneck
>> and it should be scaled up to max_cpus. In that way, the bottleneck can be avoided
>> from the bottom. Another benefit with #1 is the error can be delivered immediately
>> to the vCPU where the error was raised. This matches with the syntax of SEA to me.
> 
> I don't think it helps for the bottleneck in linux at least.  A whole bunch of locks
> are taken on each SEA because of the novel use of the fixmap.  There is only one
> VA ever used to access the error status blocks we just change what PA it points to
> under a spin lock. Maybe that can be improved on if we can persuade people that error
> handling performance is a thing to care about!
> 

Right, it doesn't help with the bottleneck in the guest kernel caused by @ghes_notify_lock_sea.
With that lock, access to all existing GHES devices and error status blocks is serialized. I
was actually talking about avoiding the bottleneck on the read_ack_register, which is the
synchronization mechanism between the guest kernel and QEMU. For example, an error has been
raised on vCPU-0 but not yet acknowledged at (A). Another error raised on vCPU-1 at (B)
can't be delivered because we have only one GHES device and error status block, which
is still reserved for the error raised on vCPU-0. With solution #1, the bottleneck can
be avoided with multiple GHES devices and error status blocks.

   vCPU-0                                           vCPU-1
   ======                                           ======
   kvm_cpu_exec                                     kvm_cpu_exec
     kvm_vcpu_ioctl(RUN)                              kvm_vcpu_ioctl(RUN)
     kvm_arch_on_sigbus_vcpu                          kvm_arch_on_sigbus_vcpu
       acpi_ghes_memory_errors                          acpi_ghes_memory_errors   (B)
       kvm_inject_arm_sea
     kvm_vcpu_ioctl(RUN)
       :
     do_mem_abort
       do_sea
         apei_claim_sea
           ghes_notify_sea
             raw_spin_lock(&ghes_notify_lock_sea)
             ghes_in_nmi_spool_from_list
               ghes_in_nmi_queue_one_entry
                 ghes_clear_estatus                 (A)
             raw_spin_unlock(&ghes_notify_lock_sea)
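
To make the read_ack_register point concrete, here is a minimal sketch of the
handshake as a toy model. It is not QEMU or kernel code: struct ghes_source,
ghes_try_record(), ghes_guest_ack(), ghes_source_for() and MAX_VCPUS are all
hypothetical names made up for illustration, and the real status block and
read-ack register are reduced to a single 'pending' flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VCPUS 4   /* hypothetical, stands in for max_cpus */

/* Toy model of one GHES error source: an error status block plus its
 * read-ack state. Hypothetical, not QEMU's real data structure. */
struct ghes_source {
    bool pending;             /* error recorded but not yet acknowledged */
    unsigned int owner_vcpu;  /* vCPU whose error occupies the block */
};

/* "QEMU" side, point (B): try to record an error for @vcpu. This fails
 * while the previous error in the same block is still unacknowledged. */
bool ghes_try_record(struct ghes_source *src, unsigned int vcpu)
{
    if (src->pending) {
        return false;         /* the caller would have to retry later */
    }
    src->pending = true;
    src->owner_vcpu = vcpu;
    return true;
}

/* "Guest" side, point (A): consume the block and acknowledge it. */
void ghes_guest_ack(struct ghes_source *src)
{
    src->pending = false;
}

/* Select the source for @vcpu: the single shared one when nr_srcs is 1,
 * otherwise one source per vCPU as in solution #1. */
struct ghes_source *ghes_source_for(struct ghes_source *srcs,
                                    size_t nr_srcs, unsigned int vcpu)
{
    assert(vcpu < MAX_VCPUS && (nr_srcs == 1 || vcpu < nr_srcs));
    return &srcs[nr_srcs == 1 ? 0 : vcpu];
}
```

With a single source, recording for vCPU-1 fails until ghes_guest_ack() has
run for vCPU-0's error. With MAX_VCPUS sources, both records succeed
independently. The guest-side @ghes_notify_lock_sea serialization is
deliberately left out of the model.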

Thanks,
Gavin


