From: Peter Xu <peterx@redhat.com>
To: "William Roche" <william.roche@oracle.com>
Cc: david@redhat.com, kvm@vger.kernel.org, qemu-devel@nongnu.org,
qemu-arm@nongnu.org, pbonzini@redhat.com,
richard.henderson@linaro.org, philmd@linaro.org,
peter.maydell@linaro.org, mtosatti@redhat.com,
imammedo@redhat.com, eduardo@habkost.net,
marcel.apfelbaum@gmail.com, wangyanan55@huawei.com,
zhao1.liu@intel.com, joao.m.martins@oracle.com
Subject: Re: [PATCH v7 3/6] accel/kvm: Report the loss of a large memory page
Date: Tue, 4 Feb 2025 12:01:48 -0500
Message-ID: <Z6JH_OyppIA7WFjk@x1.local>
In-Reply-To: <20250201095726.3768796-4-william.roche@oracle.com>
On Sat, Feb 01, 2025 at 09:57:23AM +0000, William Roche wrote:
> From: William Roche <william.roche@oracle.com>
>
> In case of a large page impacted by a memory error, provide an
> information about the impacted large page before the memory
> error injection message.
>
> This message would also appear on ras enabled ARM platforms, with
> the introduction of an x86 similar error injection message.
>
> In the case of a large page impacted, we now report:
> Memory Error on large page from <backend>:<address>+<fd_offset> +<page_size>
>
> The +<fd_offset> information is only provided with a file backend.
>
> Signed-off-by: William Roche <william.roche@oracle.com>
This is still a fairly KVM- and arch-specific patch that needs some review.
I wonder whether we really need this: we could fetch the ramblock mapping
(e.g. hwaddr -> HVA) via HMP "info ramblock", and dmesg also shows the
process ID + VA. IIUC we already have all of the info below, as long as we
do some math based on the above. Would that work too?
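To make the "some math" concrete, here is a minimal sketch in plain C. It
assumes the block's host base address, used length and page size have been
copied by hand from "info ramblock", and the faulting VA from dmesg. The
struct block_map type and both helper functions are hypothetical names used
only for illustration; they are not QEMU APIs.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical record of one RAM block, filled in by hand from
 * "info ramblock" output. Field names are illustrative only.
 */
struct block_map {
    const char *idstr;    /* block name, e.g. "pc.ram" */
    uint64_t host_base;   /* HVA where the block is mapped */
    uint64_t used_length; /* size of the block in bytes */
    uint64_t page_size;   /* backing page size, assumed a power of two */
};

/*
 * Compute the block-relative offset of the large page containing hva.
 * Returns 0 on success, -1 if hva lies outside the block.
 */
static int large_page_offset(const struct block_map *b, uint64_t hva,
                             uint64_t *offset)
{
    if (hva < b->host_base || hva - b->host_base >= b->used_length) {
        return -1;
    }
    /* Align down to the start of the large page. */
    *offset = (hva - b->host_base) & ~(b->page_size - 1);
    return 0;
}

/*
 * Format a line shaped like the message proposed in the patch
 * (without the file-offset part). Returns snprintf()'s result,
 * or -1 if hva is outside the block.
 */
static int format_report(const struct block_map *b, uint64_t hva,
                         char *buf, size_t len)
{
    uint64_t off;

    if (large_page_offset(b, hva, &off) < 0) {
        return -1;
    }
    return snprintf(buf, len,
                    "Memory Error on large page from %s:%" PRIx64
                    " +%" PRIx64, b->idstr, off, b->page_size);
}
```

For example, with a block mapped at 0x7f0000000000 and backed by 2M pages,
a VA of 0x7f0000345678 falls in the large page at block offset 0x200000.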
> ---
> accel/kvm/kvm-all.c | 18 ++++++++++++++++++
> include/exec/cpu-common.h | 10 ++++++++++
> system/physmem.c | 22 ++++++++++++++++++++++
> target/arm/kvm.c | 3 +++
> 4 files changed, 53 insertions(+)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index f89568bfa3..9a0d970ce1 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1296,6 +1296,24 @@ static void kvm_unpoison_all(void *param)
>  void kvm_hwpoison_page_add(ram_addr_t ram_addr)
>  {
>      HWPoisonPage *page;
> +    struct RAMBlockInfo rb_info;
> +
> +    if (qemu_ram_block_info_from_addr(ram_addr, &rb_info)) {
> +        size_t ps = rb_info.page_size;
> +
> +        if (ps > TARGET_PAGE_SIZE) {
> +            uint64_t offset = QEMU_ALIGN_DOWN(ram_addr - rb_info.offset, ps);
> +
> +            if (rb_info.fd >= 0) {
> +                error_report("Memory Error on large page from %s:%" PRIx64
> +                             "+%" PRIx64 " +%zx", rb_info.idstr, offset,
> +                             rb_info.fd_offset, ps);
> +            } else {
> +                error_report("Memory Error on large page from %s:%" PRIx64
> +                             " +%zx", rb_info.idstr, offset, ps);
> +            }
> +        }
> +    }
>
>      QLIST_FOREACH(page, &hwpoison_page_list, list) {
>          if (page->ram_addr == ram_addr) {
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 3771b2130c..190bd4f34a 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -110,6 +110,16 @@ int qemu_ram_get_fd(RAMBlock *rb);
>  size_t qemu_ram_pagesize(RAMBlock *block);
>  size_t qemu_ram_pagesize_largest(void);
>
> +struct RAMBlockInfo {
> +    char idstr[256];
> +    ram_addr_t offset;
> +    int fd;
> +    uint64_t fd_offset;
> +    size_t page_size;
> +};
> +bool qemu_ram_block_info_from_addr(ram_addr_t ram_addr,
> +                                   struct RAMBlockInfo *block);
> +
>  /**
>   * cpu_address_space_init:
>   * @cpu: CPU to add this address space to
> diff --git a/system/physmem.c b/system/physmem.c
> index e8ff930bc9..686f569270 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -1678,6 +1678,28 @@ size_t qemu_ram_pagesize_largest(void)
>      return largest;
>  }
>
> +/* Copy RAMBlock information associated to the given ram_addr location */
> +bool qemu_ram_block_info_from_addr(ram_addr_t ram_addr,
> +                                   struct RAMBlockInfo *b_info)
> +{
> +    RAMBlock *rb;
> +
> +    assert(b_info);
> +
> +    RCU_READ_LOCK_GUARD();
> +    rb = qemu_get_ram_block(ram_addr);
> +    if (!rb) {
> +        return false;
> +    }
> +
> +    pstrcat(b_info->idstr, sizeof(b_info->idstr), rb->idstr);
> +    b_info->offset = rb->offset;
> +    b_info->fd = rb->fd;
> +    b_info->fd_offset = rb->fd_offset;
> +    b_info->page_size = rb->page_size;
> +    return true;
> +}
> +
>  static int memory_try_enable_merging(void *addr, size_t len)
>  {
>      if (!machine_mem_merge(current_machine)) {
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index da30bdbb23..d9dedc6d74 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -2389,6 +2389,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
> kvm_cpu_synchronize_state(c);
> if (!acpi_ghes_memory_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
> kvm_inject_arm_sea(c);
> + error_report("Guest Memory Error at QEMU addr %p and "
> + "GUEST addr 0x%" HWADDR_PRIx " of type %s injected",
> + addr, paddr, "BUS_MCEERR_AR");
> } else {
> error_report("failed to record the error");
> abort();
> --
> 2.43.5
>
--
Peter Xu