From: William Roche <william.roche@oracle.com>
To: David Hildenbrand <david@redhat.com>,
kvm@vger.kernel.org, qemu-devel@nongnu.org, qemu-arm@nongnu.org
Cc: peterx@redhat.com, pbonzini@redhat.com,
richard.henderson@linaro.org, philmd@linaro.org,
peter.maydell@linaro.org, mtosatti@redhat.com,
joao.m.martins@oracle.com
Subject: Re: [PATCH v1 3/4] system/physmem: Largepage punch hole before reset of memory pages
Date: Sat, 26 Oct 2024 01:27:31 +0200
Message-ID: <e9f8e404-50db-4e0f-a5e1-749acad49325@oracle.com>
In-Reply-To: <0cda6b34-d62c-49c7-b30c-33f171985817@redhat.com>
On 10/23/24 09:30, David Hildenbrand wrote:
> On 22.10.24 23:35, William Roche wrote:
>> From: William Roche <william.roche@oracle.com>
>>
>> When the VM reboots, a memory reset is performed calling
>> qemu_ram_remap() on all hwpoisoned pages.
>> While we take into account the recorded page sizes to repair the
>> memory locations, a large page also needs to punch a hole in the
>> backend file to regenerate a usable memory, cleaning the HW
>> poisoned section. This is mandatory for hugetlbfs case for example.
>>
>> Signed-off-by: William Roche <william.roche@oracle.com>
>> ---
>> system/physmem.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/system/physmem.c b/system/physmem.c
>> index 3757428336..3f6024a92d 100644
>> --- a/system/physmem.c
>> +++ b/system/physmem.c
>> @@ -2211,6 +2211,14 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
>>                   prot = PROT_READ;
>>                   prot |= block->flags & RAM_READONLY ? 0 : PROT_WRITE;
>>                   if (block->fd >= 0) {
>> +                    if (length > TARGET_PAGE_SIZE && fallocate(block->fd,
>> +                        FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
>> +                        offset + block->fd_offset, length) != 0) {
>> +                        error_report("Could not recreate the file hole for "
>> +                                     "addr: " RAM_ADDR_FMT "@" RAM_ADDR_FMT "",
>> +                                     length, addr);
>> +                        exit(1);
>> +                    }
>>                       area = mmap(vaddr, length, prot, flags, block->fd,
>>                                   offset + block->fd_offset);
>>                   } else {
>
> Ah! Just what I commented to patch #3; we should be using
> ram_discard_range(). It might be better to avoid the mmap() completely
> if ram_discard_range() worked.

I think you are referring to ram_block_discard_range() here, as
ram_discard_range() seems to relate to VM migration rather than to a
VM reset.

Remapping the page is needed to get rid of the poison. So if we want to
avoid the mmap(), we have to shrink the memory address space -- which
can be a real problem if we imagine a VM with 1G large pages, for
example. qemu_ram_remap() is used to regenerate the lost memory, and the
mmap() call looks mandatory in the reset phase.
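
To make the sequence concrete, here is a minimal standalone sketch of
"punch a hole, then map the same range again". It is not the QEMU code:
punch_and_remap() and demo_punch_and_remap() are hypothetical names, the
file is a memfd rather than hugetlbfs, and no page is actually poisoned,
so it only shows that the old contents are dropped and fresh zeroed
memory comes back at the same address.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: drop the backing pages of [offset, offset + length)
 * in fd, then map that file range again at the same virtual address.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
static int punch_and_remap(void *vaddr, int fd, off_t offset, size_t length)
{
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, (off_t)length) != 0) {
        return -1;
    }
    /* MAP_FIXED replaces the existing mapping at vaddr in place. */
    void *area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_FIXED, fd, offset);
    return area == vaddr ? 0 : -1;
}

/*
 * Demo on a one-page memfd: fill the page, punch and remap it, and
 * return the first byte seen afterwards (or -1 on any error).
 */
static int demo_punch_and_remap(void)
{
    const size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int fd = memfd_create("remap-demo", 0);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, 0xAA, len);                 /* stale contents to discard */
    int first = punch_and_remap(p, fd, 0, len) == 0 ? p[0] : -1;
    munmap(p, len);
    close(fd);
    return first;
}
```

The hole punch is what releases the old backing page; the mmap() with
MAP_FIXED keeps the guest-visible address range unchanged.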
>
> And as raised, there is the problem with memory preallocation (where
> we should fail if it doesn't work) and ram discards being disabled
> because something relies on long-term page pinning ...

Yes. Do you suggest that we add a call to qemu_prealloc_mem() for the
remapped area when backend->prealloc is true?

Or, as this piece of code only runs on POSIX machines (ifndef _WIN32),
maybe we could simply add a MAP_POPULATE flag to the mmap() call done in
qemu_ram_remap() when the backend requires 'prealloc'? Can you confirm
whether this flag can be used on all systems running this code?
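
As a rough sketch of that second option, assuming a hypothetical
remap_flags() helper and a boolean 'prealloc' obtained from somewhere:
MAP_POPULATE is a Linux-specific flag, so the sketch guards it with
#ifdef. Also note that, according to the Linux mmap(2) man page, mmap()
does not fail when the mapping cannot be populated, so this alone would
not give a hard failure if preallocation does not work.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: add MAP_POPULATE only when requested and available. */
static int remap_flags(int base_flags, bool prealloc)
{
#ifdef MAP_POPULATE
    if (prealloc) {
        return base_flags | MAP_POPULATE;
    }
#else
    (void)prealloc;   /* flag not available on this host */
#endif
    return base_flags;
}

/* Demo: map one anonymous page with the computed flags and touch it. */
static int populate_demo(void)
{
    const size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int flags = remap_flags(MAP_PRIVATE | MAP_ANONYMOUS, true);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    p[0] = 1;
    int ok = (p[0] == 1) ? 0 : -1;
    munmap(p, len);
    return ok;
}
```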

Unfortunately, I don't know how to get the MEMORY_BACKEND corresponding
to a given memory block. I'm not sure that MEMORY_BACKEND(block->mr) is
a valid way to retrieve the backend object and its 'prealloc' property
here. Could you please point me in the right direction?

I can send a new version using ram_block_discard_range(), as you
suggested, to replace the direct call to fallocate(), if you think it
would be better.

Please let me know what other enhancement(s) you'd like to see in this
code change.

Thanks in advance,
William.