qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: "“William Roche" <william.roche@oracle.com>
Cc: qemu-devel@nongnu.org, lizhijian@fujitsu.com,
	pbonzini@redhat.com, quintela@redhat.com, leobras@redhat.com,
	joao.m.martins@oracle.com, lidongchen@tencent.com
Subject: Re: [PATCH v4 2/2] migration: prevent migration when a poisoned page is unknown from the VM
Date: Mon, 16 Oct 2023 12:48:09 -0400	[thread overview]
Message-ID: <ZS1pSeL3hj4/73lk@x1n> (raw)
In-Reply-To: <20231013150839.867164-3-william.roche@oracle.com>

On Fri, Oct 13, 2023 at 03:08:39PM +0000, “William Roche wrote:
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 5e95c496bb..e8db6380c1 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -1158,7 +1158,6 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>          ram_addr = qemu_ram_addr_from_host(addr);
>          if (ram_addr != RAM_ADDR_INVALID &&
>              kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
> -            kvm_hwpoison_page_add(ram_addr);
>              /*
>               * If this is a BUS_MCEERR_AR, we know we have been called
>               * synchronously from the vCPU thread, so we can easily
> @@ -1169,7 +1168,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>               * called synchronously from the vCPU thread, or a bit
>               * later from the main thread, so doing the injection of
>               * the error would be more complicated.
> +             * In this case, BUS_MCEERR_AO errors are unknown from the
> +             * guest, and we will prevent migration as long as this
> +             * poisoned page hasn't generated a BUS_MCEERR_AR error
> +             * that the guest takes into account.
>               */
> +            kvm_hwpoison_page_add(ram_addr, (code == BUS_MCEERR_AR));

I'm curious why ARM doesn't forward this event to guest even if it's AO.
X86 does it, and makes more sense to me.  Not familiar with arm, do you
know the reason?

I think this patch needs review from ARM and/or KVM side.  Do you want to
have the 1st patch merged, or rather wait for the whole set?

Another thing to mention: feel free to look at a recent addition of ioctl
from userfault, where it can inject poisoned ptes:

https://lore.kernel.org/r/20230707215540.2324998-1-axelrasmussen@google.com

I'm wondering if that'll be helpful to qemu too, where we can migrate
hwpoison_page_list and enforce the poisoning on dest.  Then even for AO
when accessed by guest it'll generated another MCE on dest.

>              if (code == BUS_MCEERR_AR) {
>                  kvm_cpu_synchronize_state(c);
>                  if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {

-- 
Peter Xu



  reply	other threads:[~2023-10-16 16:48 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-06 13:59 [PATCH 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-06 13:59 ` [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-09-06 14:19   ` Joao Martins
2023-09-06 15:16     ` Peter Xu
2023-09-06 21:29       ` William Roche
2023-09-09 14:57         ` Joao Martins
2023-09-11 19:48           ` Peter Xu
2023-09-12 18:44             ` Peter Xu
2023-09-14 20:20               ` [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-14 20:20                 ` [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-09-15  3:13                   ` Zhijian Li (Fujitsu)
2023-09-15 11:31                     ` William Roche
2023-09-18  3:47                       ` Zhijian Li (Fujitsu)
2023-09-20 10:04                       ` Zhijian Li (Fujitsu)
2023-09-20 12:11                         ` William Roche
2023-09-20 23:53                         ` [PATCH v3 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-20 23:53                           ` [PATCH v3 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-10-13 15:08                           ` [PATCH v4 0/2] Qemu crashes on VM migration after an handled memory error “William Roche
2023-10-13 15:08                             ` [PATCH v4 1/2] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-10-13 15:08                             ` [PATCH v4 2/2] migration: prevent migration when a poisoned page is unknown from the VM “William Roche
2023-10-16 16:48                               ` Peter Xu [this message]
2023-10-17  0:38                                 ` William Roche
2023-10-17 15:13                                   ` Peter Xu
2023-11-06 21:38                                     ` William Roche
2023-11-08 21:45                                       ` Peter Xu
2023-11-10 19:22                                         ` William Roche
2023-11-06 22:03                                     ` [PATCH v5 0/2] Qemu crashes on VM migration after an handled memory error “William Roche
2023-11-06 22:03                                       ` [PATCH v5 1/2] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-11-06 22:03                                       ` [PATCH v5 2/2] migration: prevent migration when a poisoned page is unknown from the VM “William Roche
2023-11-08 21:49                                       ` [PATCH v5 0/2] Qemu crashes on VM migration after an handled memory error Peter Xu
2024-01-30 19:06                                         ` [PATCH v1 0/1] " “William Roche
2024-01-30 19:06                                           ` [PATCH v1 1/1] migration: prevent migration when VM has poisoned memory “William Roche
2024-01-31  1:48                                             ` Peter Xu
2023-09-14 21:50                 ` [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZS1pSeL3hj4/73lk@x1n \
    --to=peterx@redhat.com \
    --cc=joao.m.martins@oracle.com \
    --cc=leobras@redhat.com \
    --cc=lidongchen@tencent.com \
    --cc=lizhijian@fujitsu.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=william.roche@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).