From: Peter Xu <peterx@redhat.com>
To: "“William Roche" <william.roche@oracle.com>
Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org,
lizhijian@fujitsu.com, pbonzini@redhat.com, quintela@redhat.com,
leobras@redhat.com, joao.m.martins@oracle.com,
lidongchen@tencent.com
Subject: Re: [PATCH v5 0/2] Qemu crashes on VM migration after an handled memory error
Date: Wed, 8 Nov 2023 16:49:25 -0500 [thread overview]
Message-ID: <ZUwCZdZj-vZD1NJC@x1n> (raw)
In-Reply-To: <20231106220319.456765-1-william.roche@oracle.com>
On Mon, Nov 06, 2023 at 10:03:17PM +0000, “William Roche wrote:
> From: William Roche <william.roche@oracle.com>
>
>
> Note about ARM specificities:
> This code has a small part impacting more specificaly ARM machines,
> that's the reason why I added qemu-arm@nongnu.org -- see description.
>
>
> A Qemu VM can survive a memory error, as qemu can relay the error to the
> VM kernel which could also deal with it -- poisoning/off-lining the impacted
> page.
> This situation creates a hole in the VM memory address space that the VM kernel
> knows about (an unreadable page or set of pages).
>
> But the migration of this VM (live migration through the network or
> pseudo-migration with the creation of a state file) will crash Qemu when
> it sequentially reads the memory address space and stumbles on the
> existing hole.
>
> In order to thoroughly correct this problem, the poison information should
> follow the migration which represents several difficulties:
> - poisoning a page on the destination machine to replicate the source
> poison requires CAP_SYS_ADMIN priviledges, and qemu process may not
> always run as a root process
> - the destination kernel needs to be configured with CONFIG_MEMORY_FAILURE
> - the poison information would require a memory transfer protocol
> enhancement to provide this information
> (The current patches don't provide any of that)
>
> But if we rely on the fact that the a running VM kernel is correctly
> dealing with memory poison it is informed about: marking the poison page
> as inaccessible, we could count on the VM kernel to make sure that
> poisoned pages are not used, even after a migration.
> In this case, I suggest to treat the poisoned pages as if they were
> zero-pages for the migration copy.
> This fix also works with underlying large pages, taking into account the
> RAMBlock segment "page-size".
>
> Now, it leaves a case that we have to deal with: if a memory error is
> reported to qemu but not injected into the running kernel...
> As the migration will go from a poisoned page to an all-zero page, if
> the VM kernel doesn't prevent the access to this page, a memory read
> that would generate a BUS_MCEERR_AR error on the source platform, could
> be reading zeros on the destination. This is a memory corruption.
>
> So we have to ensure that all poisoned pages we set to zero are known by
> the running kernel. But we have a problem with platforms where BUS_MCEERR_AO
> errors are ignored, which means that qemu knows about the poison but the VM
> doesn't. For the moment it's only the case for ARM, but could later be
> also needed for AMD VMs.
> See https://lore.kernel.org/all/20230912211824.90952-3-john.allen@amd.com/
>
> In order to avoid this possible silent data corruption situation, we should
> prevent the migration when we know that a poisoned page is ignored from the VM.
>
> Which is, according to me, the smallest fix we need to avoid qemu crashes
> on migration after an handled memory error, without introducing a possible
> corruption situation.
>
> This fix is scripts/checkpatch.pl clean.
> Unit test: Migration blocking succesfully tested on ARM -- injected AO error
> blocks it. On x86 the same type of error being relayed doesn't block.
>
> v2:
> - adding compressed transfer handling of poisoned pages
>
> v3:
> - Included the Reviewed-by and Tested-by information on first patch
> - added a TODO comment above control_save_page()
> mentioning Zhijian's feedback about RDMA migration failure.
>
> v4:
> - adding a patch to deal with unknown poison tracking (impacting ARM)
> (not using migrate_add_blocker as this is not devices related and
> we want to avoid the interaction with --only-migratable mechanism)
>
> v5:
> - Updating the code to the latest version
> - adding qemu-arm@nongnu.org for a complementary review
>
>
> William Roche (2):
> migration: skip poisoned memory pages on "ram saving" phase
> migration: prevent migration when a poisoned page is unknown from the
> VM
I hope someone from arch-specific can have a quick look at patch 2..
One thing to mention is unfortunately waiting on patch 2 means we'll miss
this release. Actually it is already missed.. softfreeze yesterday [1]. So
it may likely need to wait for 9.0.
[1] https://wiki.qemu.org/Planning/8.2
--
Peter Xu
next prev parent reply other threads:[~2023-11-08 21:50 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-06 13:59 [PATCH 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-06 13:59 ` [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-09-06 14:19 ` Joao Martins
2023-09-06 15:16 ` Peter Xu
2023-09-06 21:29 ` William Roche
2023-09-09 14:57 ` Joao Martins
2023-09-11 19:48 ` Peter Xu
2023-09-12 18:44 ` Peter Xu
2023-09-14 20:20 ` [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-14 20:20 ` [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-09-15 3:13 ` Zhijian Li (Fujitsu)
2023-09-15 11:31 ` William Roche
2023-09-18 3:47 ` Zhijian Li (Fujitsu)
2023-09-20 10:04 ` Zhijian Li (Fujitsu)
2023-09-20 12:11 ` William Roche
2023-09-20 23:53 ` [PATCH v3 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-20 23:53 ` [PATCH v3 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-10-13 15:08 ` [PATCH v4 0/2] Qemu crashes on VM migration after an handled memory error “William Roche
2023-10-13 15:08 ` [PATCH v4 1/2] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-10-13 15:08 ` [PATCH v4 2/2] migration: prevent migration when a poisoned page is unknown from the VM “William Roche
2023-10-16 16:48 ` Peter Xu
2023-10-17 0:38 ` William Roche
2023-10-17 15:13 ` Peter Xu
2023-11-06 21:38 ` William Roche
2023-11-08 21:45 ` Peter Xu
2023-11-10 19:22 ` William Roche
2023-11-06 22:03 ` [PATCH v5 0/2] Qemu crashes on VM migration after an handled memory error “William Roche
2023-11-06 22:03 ` [PATCH v5 1/2] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-11-06 22:03 ` [PATCH v5 2/2] migration: prevent migration when a poisoned page is unknown from the VM “William Roche
2023-11-08 21:49 ` Peter Xu [this message]
2024-01-30 19:06 ` [PATCH v1 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2024-01-30 19:06 ` [PATCH v1 1/1] migration: prevent migration when VM has poisoned memory “William Roche
2024-01-31 1:48 ` Peter Xu
2023-09-14 21:50 ` [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZUwCZdZj-vZD1NJC@x1n \
--to=peterx@redhat.com \
--cc=joao.m.martins@oracle.com \
--cc=leobras@redhat.com \
--cc=lidongchen@tencent.com \
--cc=lizhijian@fujitsu.com \
--cc=pbonzini@redhat.com \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=william.roche@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).