public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: William Roche <william.roche@oracle.com>
To: David Hildenbrand <david@redhat.com>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org, qemu-arm@nongnu.org
Cc: peterx@redhat.com, pbonzini@redhat.com, philmd@linaro.org,
	richard.henderson@linaro.org, peter.maydell@linaro.org,
	mtosatti@redhat.com, joao.m.martins@oracle.com
Subject: Re: [PATCH v1 2/4] accel/kvm: Keep track of the HWPoisonPage page_size
Date: Sat, 26 Oct 2024 01:30:55 +0200	[thread overview]
Message-ID: <3404dceb-2baa-4ac3-8168-c87f3ed50b20@oracle.com> (raw)
In-Reply-To: <a0fda9e7-d55b-455b-aeaa-27162b6cdc65@redhat.com>

On 10/23/24 09:28, David Hildenbrand wrote:
> On 22.10.24 23:35, “William Roche wrote:
>> From: William Roche <william.roche@oracle.com>
>>
>> Add the page size information to the hwpoison_page_list elements.
>> As the kernel doesn't always report the actual poisoned page size,
>> we adjust this size from the backend real page size.
>> We take into account the recorded page size to adjust the size
>> and location of the memory hole.
>>
>> Signed-off-by: William Roche <william.roche@oracle.com>
>> ---
>>   accel/kvm/kvm-all.c       | 14 ++++++++++----
>>   include/exec/cpu-common.h |  1 +
>>   include/sysemu/kvm.h      |  3 ++-
>>   include/sysemu/kvm_int.h  |  3 ++-
>>   system/physmem.c          | 20 ++++++++++++++++++++
>>   target/arm/kvm.c          |  8 ++++++--
>>   target/i386/kvm/kvm.c     |  8 ++++++--
>>   7 files changed, 47 insertions(+), 10 deletions(-)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index 2adc4d9c24..40117eefa7 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -1266,6 +1266,7 @@ int kvm_vm_check_extension(KVMState *s, unsigned 
>> int extension)
>>    */
>>   typedef struct HWPoisonPage {
>>       ram_addr_t ram_addr;
>> +    size_t     page_size;
>>       QLIST_ENTRY(HWPoisonPage) list;
>>   } HWPoisonPage;
>> @@ -1278,15 +1279,18 @@ static void kvm_unpoison_all(void *param)
>>       QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
>>           QLIST_REMOVE(page, list);
>> -        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
>> +        qemu_ram_remap(page->ram_addr, page->page_size);
> 
> Can't we just use the page size from the RAMBlock in qemu_ram_remap? 
> There we lookup the RAMBlock, and all pages in a RAMBlock have the same 
> size.

Yes, we could use the page size from the RAMBlock in qemu_ram_remap() 
that is called when the VM is resetting. I think that knowing the 
information about the size of poisoned chunk of memory when the poison 
is created is useful to give a trace of what is going on, before seeing 
maybe other pages being reported as poisoned. That's the 4th patch goal 
to give an information as soon as we get it.
It also helps to filter the new errors reported and only create an entry 
in the hwpoison_page_list for new large pages.

Now we could delay the page size retrieval until we are resetting and 
present the information (post mortem). I do think that having the 
information earlier is better in this case.



> 
> I'll note that qemu_ram_remap() is rather stupid and optimized only for 
> private memory (not shmem etc).
> 
> mmap(MAP_FIXED|MAP_SHARED, fd) will give you the same poisoned page from 
> the pagecache; you'd have to punch a hole instead.
> 
> It might be better to use ram_block_discard_range() in the long run. 
> Memory preallocation + page pinning is tricky, but we could simply bail 
> out in these cases (preallocation failing, ram discard being disabled).

I see that ram_block_discard_range() adds more control before discarding 
the RAM region and can also call madvise() in addition to the fallocate 
punch hole for standard sized memory pages. Now as the range is supposed 
to be recreated, I'm not convinced that these madvise calls are necessary.

But we can also notice that this function will report the following 
warning in all cases of not shared file backends:
"ram_block_discard_range: Discarding RAM in private file mappings is 
possibly dangerous, because it will modify the underlying file and will 
affect other users of the file"
Which means that hugetlbfs configurations do see this new cryptic 
warning message on reboot if it is impacted by a memory poisoning.
So I would prefer to leave the fallocate call in the qemu_ram_remap() 
function. Or would you prefer to enhance ram_block_discard_range() code 
to avoid the message in a reset situation (when called from 
qemu_ram_remap) ?


> 
> qemu_ram_remap() might be problematic with page pinning (vfio) as is in 
> any way :(
> 

I agree. If qemu_ram_remap() fails, Qemu is ended either abort() or 
exit(1). Do you say that memory pinning could be detected by 
ram_block_discard_range() or maybe mmap call for the impacted region and 
make one of them fail ? This would be an additional reason to call 
ram_block_discard_range() from qemu_ram_remap().   Is it what you are 
suggesting ?

  reply	other threads:[~2024-10-25 23:32 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240910090747.2741475-1-william.roche@oracle.com>
2024-09-10 10:02 ` [RFC RESEND 0/6] hugetlbfs largepage RAS project “William Roche
2024-09-10 10:02   ` [RFC RESEND 1/6] accel/kvm: SIGBUS handler should also deal with si_addr_lsb “William Roche
2024-09-10 10:02   ` [RFC RESEND 2/6] accel/kvm: Keep track of the HWPoisonPage sizes “William Roche
2024-09-10 10:02   ` [RFC RESEND 3/6] system/physmem: Remap memory pages on reset based on the page size “William Roche
2024-09-10 10:02   ` [RFC RESEND 4/6] system: Introducing hugetlbfs largepage RAS feature “William Roche
2024-09-10 10:02   ` [RFC RESEND 5/6] system/hugetlb_ras: Handle madvise SIGBUS signal on listener “William Roche
2024-09-10 10:02   ` [RFC RESEND 6/6] system/hugetlb_ras: Replay lost BUS_MCEERR_AO signals on VM resume “William Roche
2024-09-10 11:36   ` [RFC RESEND 0/6] hugetlbfs largepage RAS project David Hildenbrand
2024-09-10 16:24     ` William Roche
2024-09-11 22:07       ` David Hildenbrand
2024-09-12 17:07         ` William Roche
2024-09-19 16:52           ` William Roche
2024-10-09 15:45             ` Peter Xu
2024-10-10 20:35               ` William Roche
2024-10-22 21:34               ` [PATCH v1 0/4] hugetlbfs memory HW error fixes “William Roche
2024-10-22 21:35                 ` [PATCH v1 1/4] accel/kvm: SIGBUS handler should also deal with si_addr_lsb “William Roche
2024-10-22 21:35                 ` [PATCH v1 2/4] accel/kvm: Keep track of the HWPoisonPage page_size “William Roche
2024-10-23  7:28                   ` David Hildenbrand
2024-10-25 23:30                     ` William Roche [this message]
     [not found]                     ` <9b17600d-4473-4bb6-841f-00f93d86f720@oracle.com>
2024-10-28 16:42                       ` David Hildenbrand
2024-10-30  1:56                         ` William Roche
2024-11-04 14:10                           ` David Hildenbrand
2024-10-22 21:35                 ` [PATCH v1 3/4] system/physmem: Largepage punch hole before reset of memory pages “William Roche
2024-10-23  7:30                   ` David Hildenbrand
2024-10-25 23:27                     ` William Roche
2024-10-28 17:01                       ` David Hildenbrand
2024-10-30  1:56                         ` William Roche
2024-11-04 13:30                           ` David Hildenbrand
2024-11-07 10:21                             ` [PATCH v2 0/7] hugetlbfs memory HW error fixes “William Roche
2024-11-07 10:21                               ` [PATCH v2 1/7] accel/kvm: Keep track of the HWPoisonPage page_size “William Roche
2024-11-12 10:30                                 ` David Hildenbrand
2024-11-12 18:17                                   ` William Roche
2024-11-12 21:35                                     ` David Hildenbrand
2024-11-07 10:21                               ` [PATCH v2 2/7] system/physmem: poisoned memory discard on reboot “William Roche
2024-11-12 11:07                                 ` David Hildenbrand
2024-11-12 18:17                                   ` William Roche
2024-11-12 22:06                                     ` David Hildenbrand
2024-11-07 10:21                               ` [PATCH v2 3/7] accel/kvm: Report the loss of a large memory page “William Roche
2024-11-12 11:13                                 ` David Hildenbrand
2024-11-12 18:17                                   ` William Roche
2024-11-12 22:22                                     ` David Hildenbrand
2024-11-15 21:03                                       ` William Roche
2024-11-18  9:45                                         ` David Hildenbrand
2024-11-07 10:21                               ` [PATCH v2 4/7] numa: Introduce and use ram_block_notify_remap() “William Roche
2024-11-07 10:21                               ` [PATCH v2 5/7] hostmem: Factor out applying settings “William Roche
2024-11-07 10:21                               ` [PATCH v2 6/7] hostmem: Handle remapping of RAM “William Roche
2024-11-12 13:45                                 ` David Hildenbrand
2024-11-12 18:17                                   ` William Roche
2024-11-12 22:24                                     ` David Hildenbrand
2024-11-07 10:21                               ` [PATCH v2 7/7] system/physmem: Memory settings applied on remap notification “William Roche
2024-10-22 21:35                 ` [PATCH v1 4/4] accel/kvm: Report the loss of a large memory page “William Roche
2024-10-28 16:32             ` [RFC RESEND 0/6] hugetlbfs largepage RAS project David Hildenbrand
2024-11-25 14:27         ` [PATCH v3 0/7] hugetlbfs memory HW error fixes “William Roche
2024-11-25 14:27           ` [PATCH v3 1/7] hwpoison_page_list and qemu_ram_remap are based of pages “William Roche
2024-11-25 14:27           ` [PATCH v3 2/7] system/physmem: poisoned memory discard on reboot “William Roche
2024-11-25 14:27           ` [PATCH v3 3/7] accel/kvm: Report the loss of a large memory page “William Roche
2024-11-25 14:27           ` [PATCH v3 4/7] numa: Introduce and use ram_block_notify_remap() “William Roche
2024-11-25 14:27           ` [PATCH v3 5/7] hostmem: Factor out applying settings “William Roche
2024-11-25 14:27           ` [PATCH v3 6/7] hostmem: Handle remapping of RAM “William Roche
2024-11-25 14:27           ` [PATCH v3 7/7] system/physmem: Memory settings applied on remap notification “William Roche
2024-12-02 15:41           ` [PATCH v3 0/7] hugetlbfs memory HW error fixes William Roche
2024-12-02 16:00             ` David Hildenbrand
2024-12-03  0:15               ` William Roche
2024-12-03 14:08                 ` David Hildenbrand
2024-12-03 14:39                   ` William Roche
2024-12-03 15:00                     ` David Hildenbrand
2024-12-06 18:26                       ` William Roche
2024-12-09 21:25                         ` David Hildenbrand
2024-12-14 13:45         ` [PATCH v4 0/7] Poisoned memory recovery on reboot “William Roche
2024-12-14 13:45           ` [PATCH v4 1/7] hwpoison_page_list and qemu_ram_remap are based on pages “William Roche
2025-01-08 21:34             ` David Hildenbrand
2025-01-10 20:56               ` William Roche
2025-01-14 13:56                 ` David Hildenbrand
2024-12-14 13:45           ` [PATCH v4 2/7] system/physmem: poisoned memory discard on reboot “William Roche
2025-01-08 21:44             ` David Hildenbrand
2025-01-10 20:56               ` William Roche
2025-01-14 14:00                 ` David Hildenbrand
2025-01-27 21:15                   ` William Roche
2024-12-14 13:45           ` [PATCH v4 3/7] accel/kvm: Report the loss of a large memory page “William Roche
2024-12-14 13:45           ` [PATCH v4 4/7] numa: Introduce and use ram_block_notify_remap() “William Roche
2024-12-14 13:45           ` [PATCH v4 5/7] hostmem: Factor out applying settings “William Roche
2025-01-08 21:58             ` David Hildenbrand
2025-01-10 20:56               ` William Roche
2024-12-14 13:45           ` [PATCH v4 6/7] hostmem: Handle remapping of RAM “William Roche
2025-01-08 21:51             ` [PATCH v4 6/7] c David Hildenbrand
2025-01-10 20:57               ` [PATCH v4 6/7] hostmem: Handle remapping of RAM William Roche
2024-12-14 13:45           ` [PATCH v4 7/7] system/physmem: Memory settings applied on remap notification “William Roche
2025-01-08 21:53             ` David Hildenbrand
2025-01-10 20:57               ` William Roche
2025-01-14 14:01                 ` David Hildenbrand
2025-01-08 21:22           ` [PATCH v4 0/7] Poisoned memory recovery on reboot David Hildenbrand
2025-01-10 20:55             ` William Roche
2025-01-10 21:13         ` [PATCH v5 0/6] " “William Roche
2025-01-10 21:14           ` [PATCH v5 1/6] system/physmem: handle hugetlb correctly in qemu_ram_remap() “William Roche
2025-01-14 14:02             ` David Hildenbrand
2025-01-27 21:16               ` William Roche
2025-01-28 18:41                 ` David Hildenbrand
2025-01-10 21:14           ` [PATCH v5 2/6] system/physmem: poisoned memory discard on reboot “William Roche
2025-01-14 14:07             ` David Hildenbrand
2025-01-27 21:16               ` William Roche
2025-01-10 21:14           ` [PATCH v5 3/6] accel/kvm: Report the loss of a large memory page “William Roche
2025-01-14 14:09             ` David Hildenbrand
2025-01-27 21:16               ` William Roche
2025-01-28 18:45                 ` David Hildenbrand
2025-01-10 21:14           ` [PATCH v5 4/6] numa: Introduce and use ram_block_notify_remap() “William Roche
2025-01-10 21:14           ` [PATCH v5 5/6] hostmem: Factor out applying settings “William Roche
2025-01-10 21:14           ` [PATCH v5 6/6] hostmem: Handle remapping of RAM “William Roche
2025-01-14 14:11             ` David Hildenbrand
2025-01-27 21:16               ` William Roche
2025-01-14 14:12           ` [PATCH v5 0/6] Poisoned memory recovery on reboot David Hildenbrand
2025-01-27 21:16             ` William Roche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3404dceb-2baa-4ac3-8168-c87f3ed50b20@oracle.com \
    --to=william.roche@oracle.com \
    --cc=david@redhat.com \
    --cc=joao.m.martins@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox