Re: [PATCH v7 3/6] accel/kvm: Report the loss of a large memory page

From: Peter Xu <peterx@redhat.com>
To: William Roche <william.roche@oracle.com>
Cc: david@redhat.com, kvm@vger.kernel.org, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, pbonzini@redhat.com,
	richard.henderson@linaro.org, philmd@linaro.org,
	peter.maydell@linaro.org, mtosatti@redhat.com,
	imammedo@redhat.com, eduardo@habkost.net,
	marcel.apfelbaum@gmail.com, wangyanan55@huawei.com,
	zhao1.liu@intel.com, joao.m.martins@oracle.com
Subject: Re: [PATCH v7 3/6] accel/kvm: Report the loss of a large memory page
Date: Tue, 11 Feb 2025 16:45:55 -0500
Message-ID: <Z6vFEwS6EjDXHsFc@x1.local>
In-Reply-To: <6c891caf-fbc0-4f5e-8e21-e87c3348c9fa@oracle.com>

On Tue, Feb 11, 2025 at 10:22:38PM +0100, William Roche wrote:
> On 2/10/25 17:48, Peter Xu wrote:
> > On Fri, Feb 07, 2025 at 07:02:22PM +0100, William Roche wrote:
> > > [...]
> > > So the main reason is a KVM "weakness" with kvm_send_hwpoison_signal(), and
> > > the second reason is to have richer error messages.
> > 
> > This seems true, and I also remember noticing it when I looked at this
> > previously, but maybe nobody tried to fix it.  ARM, otoh, seems to fill
> > in that field correctly.
> > 
> > Is it possible we fix KVM on x86?
> 
> Yes, very probably, and it would be a kernel fix.
> This kernel change would first have to be running on the hypervisor before
> new QEMU code could use the SIGBUS siginfo information to identify the size
> of the impacted page (instead of relying on an internal addition to the KVM
> API).
> But this mechanism could help to generate a message specific to large page
> memory errors when SIGBUS is received.

Yes, QEMU should probably be able to work on both old and new kernels,
even once this is fixed.
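
To make the siginfo idea concrete, here is a minimal sketch (not QEMU code)
of how a Linux SIGBUS memory-error report could be turned into an affected
address range.  BUS_MCEERR_AR, BUS_MCEERR_AO and si_addr_lsb are the
standard Linux siginfo fields; struct poison_range and the three helper
names are hypothetical, made up for this illustration.  On a kernel where
si_addr_lsb only reflects the base page size, the computed range will be
smaller than the real loss, so the page size would still have to come from
somewhere else.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the memory span described by a SIGBUS report. */
struct poison_range {
    uintptr_t start; /* first byte of the affected range */
    uint64_t  len;   /* size in bytes, always a power of two */
};

/* True if this si_code is one of the hardware memory error codes. */
bool is_mce_sigbus(int si_code)
{
    return si_code == BUS_MCEERR_AR || si_code == BUS_MCEERR_AO;
}

/*
 * si_addr_lsb is the least significant valid bit of the reported address,
 * so the affected span is 2^lsb bytes, aligned down to that size.
 */
struct poison_range poison_range_from_lsb(uintptr_t addr, unsigned lsb)
{
    assert(lsb < 64);
    uint64_t len = UINT64_C(1) << lsb;
    struct poison_range r = { addr & ~(uintptr_t)(len - 1), len };
    return r;
}

/*
 * Fill *out from a siginfo_t as delivered to a SA_SIGINFO handler.
 * Returns false if the signal is not a memory error report or if the
 * reported lsb is not usable.
 */
bool poison_range_from_siginfo(const siginfo_t *si, struct poison_range *out)
{
    if (si->si_signo != SIGBUS || !is_mce_sigbus(si->si_code)) {
        return false;
    }
    if (si->si_addr_lsb < 0 || si->si_addr_lsb >= 64) {
        return false;
    }
    *out = poison_range_from_lsb((uintptr_t)si->si_addr,
                                 (unsigned)si->si_addr_lsb);
    return true;
}
```

With a report of lsb 21 the helper yields a 2 MiB aligned range, and with
lsb 12 a 4 KiB one, which is the distinction a large page specific message
would rely on.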

> 
> 
> > > > 
> > > > I feel like when hwpoison becomes a serious topic, we need some more
> > > > serious reporting facility than error reports.  So that we could have this
> > > > as separate topic to be revisited.  It might speed up your prior patches
> > > > from not being blocked on this.
> > > 
> > > I explained why I think that error messages are important, but I don't want
> > > to get blocked on fixing the hugepage memory recovery because of that.
> > 
> > What is the major benefit of reporting in QEMU's stderr in this case?
> 
> Such messages can be collected into a VM-specific log file, like any other
> error_report() message, such as the existing x86 error injection messages
> reported by QEMU.
> These messages should help the administrator to better understand the
> behavior of the VM.

I'd still put "better understand the behavior of the VM" into the debugging
category. :)

But I agree this can be important information.  That's also why I was
curious whether it should be something like a QMP event instead.  That's a
much more formal way of sending important messages.

> 
> 
> > For example, how should we consume the error reports that this patch
> > introduces?  Is it still for debugging purpose?
> 
> It's not only for debugging; it's a trace of a significant event that can
> have major consequences on the VM.
> 
> > 
> > I agree it's always better to dump something in QEMU when this happens,
> > but IIUC what I mentioned above (monitoring QEMU ramblock setups, and
> > monitoring host dmesg for any vaddr reported as hwpoison) should also
> > allow anyone to deduce the page size of the affected vaddr, especially if
> > it's for debugging purposes.  However I could possibly have missed the
> > goal here..
> 
> You're right that knowing the address, the administrator can deduce what
> memory area was impacted and the associated page size. But the goal of these
> large page specific messages was to give details on the event type and
> immediately qualify the consequences.
> Using large pages can also have drawbacks, and a large page specific message
> on memory error makes that more obvious!  It's not only a debug message, but
> an indication that the VM lost an unusually large amount of its memory.
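
The deduction discussed above (from a poisoned vaddr to the memory area and
its page size) can be sketched as a simple table lookup.  This is only an
illustration under stated assumptions: struct ram_region, find_region() and
lost_bytes() are hypothetical names, not QEMU's RAMBlock API, and the table
is assumed to have been rebuilt by hand from the ramblock setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical description of one guest RAM region, as an administrator
 * might reconstruct it from QEMU's memory setup.  Not a QEMU structure.
 */
struct ram_region {
    const char *name;
    uintptr_t   host_start; /* host virtual address of the mapping */
    uint64_t    length;     /* size of the mapping in bytes */
    uint64_t    page_size;  /* backing page size, a power of two */
};

/* Return the region containing vaddr, or NULL if none does. */
const struct ram_region *find_region(const struct ram_region *regions,
                                     size_t n, uintptr_t vaddr)
{
    for (size_t i = 0; i < n; i++) {
        if (vaddr >= regions[i].host_start &&
            vaddr - regions[i].host_start < regions[i].length) {
            return &regions[i];
        }
    }
    return NULL;
}

/*
 * Return how many bytes are lost when the page backing vaddr is poisoned,
 * and store the start of that page in *page_start.  Returns 0 if vaddr
 * does not belong to any known region.
 */
uint64_t lost_bytes(const struct ram_region *regions, size_t n,
                    uintptr_t vaddr, uintptr_t *page_start)
{
    const struct ram_region *r = find_region(regions, n, vaddr);
    if (r == NULL) {
        return 0;
    }
    assert(r->page_size != 0 && (r->page_size & (r->page_size - 1)) == 0);
    *page_start = vaddr & ~(uintptr_t)(r->page_size - 1);
    return r->page_size;
}
```

For a region backed by 2 MiB pages, any poisoned address inside it maps to a
2 MiB loss, which is the consequence a large page specific message would
state directly instead of leaving it to be deduced.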
> 
> > > 
> > > If you think that not displaying a specific message for large page loss can
> > > help to get the recovery fixed, then I can change my proposal to do so.
> > > 
> > > Early next week, I'll send a simplified version of my first 3 patches
> > > without these specific messages and without the preallocation handling in all
> > > remap cases, so you can evaluate this possibility.
> > 
> > Yes IMHO it'll always be helpful to separate it if possible.
> 
> I'm sending now a v8 version, without the specific messages and the remap
> notification. It should fix the main recovery bug we currently have. More
> messages and a notification dealing with pre-allocation can be added in a
> second step.
> 
> Please let me know if this v8 version can be integrated without the prealloc
> and specific messages?

IMHO fixing hugetlb pages is still progress on its own, even without any
added error message or proactive allocation during reset.

One issue is that v8 still contains patch 3, which is for ARM KVM.  You may
need to post it separately for the ARM maintainers to review & collect.  I'll
be able to queue patches 1-2.

Thanks,

-- 
Peter Xu
