qemu-devel.nongnu.org archive mirror
* Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon
@ 2022-05-25 20:16 Jue Wang
  2022-05-26 18:37 ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Jue Wang @ 2022-05-25 20:16 UTC
  To: pizhenwei
  Cc: Andrew Morton, David Hildenbrand, jasowang, LKML, Linux MM, mst,
	HORIGUCHI NAOYA(堀口 直也), Paolo Bonzini,
	Peter Xu, qemu-devel, virtualization

Some points to consider:

The injected MCE has already _done_ its damage to the guest workload.
Recovering the poisoned guest memory doesn't undo the memory corruption,
data loss, or interruption that the injected MCE has already caused.

The hypervisor _must_ emulate poisons identified in the guest physical
address space (possibly transported from the source VM) in order to
prevent silent data corruption in the guest. With a paravirtual
approach like this patch series, the hypervisor can clear some of the
poisoned HVAs, knowing for certain that the guest OS has isolated the
poisoned page. I wonder how much value this provides to the guest if the
guest and its workload are _not_ in pressing need of the extra KB/MB
of memory.

Thanks,
-Jue


* [PATCH 0/3] recover hardware corrupted page by virtio balloon
@ 2022-05-20  7:06 zhenwei pi
  2022-05-24 18:59 ` David Hildenbrand
  2022-05-27  3:47 ` zhenwei pi
  0 siblings, 2 replies; 14+ messages in thread
From: zhenwei pi @ 2022-05-20  7:06 UTC
  To: akpm, naoya.horiguchi, mst, david
  Cc: linux-mm, linux-kernel, jasowang, virtualization, pbonzini,
	peterx, qemu-devel, zhenwei pi

Hi,

I'm trying to recover hardware-corrupted pages through virtio balloon. The
workflow of this feature looks like this:

Guest              5.MF -> 6.RVQ FE    10.Unpoison page
                    /           \            /
-------------------+-------------+----------+-----------
                   |             |          |
                4.MCE        7.RVQ BE   9.RVQ Event
 QEMU             /               \       /
             3.SIGBUS              8.Remap
                /
----------------+------------------------------------
                |
            +--2.MF
 Host       /
       1.HW error

1, A hardware page error occurs randomly.
2, The host handles the corrupted page through the memory failure mechanism
   and sends SIGBUS to the user process if early-kill is enabled.
3, QEMU handles the SIGBUS; if the address belongs to guest RAM, then:
4, QEMU tries to inject an MCE into the guest.
5, The guest handles the memory failure again.

Steps 1-5 have been supported for a long time. The following steps are
added by this patch set (together with the related driver patch):

6, The guest balloon driver is notified of the corrupted PFN and sends a
   request to the host through the Recover VQ frontend.
7, QEMU handles the request in the Recover VQ backend, then:
8, QEMU remaps the corrupted HVA to fix the memory failure, then:
9, QEMU acks the result to the guest through the Recover VQ.
10, The guest unpoisons the page if the corrupted page was recovered
    successfully.
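
Steps 6-10 can be read as a simple request/ack exchange. Below is a minimal,
user-space sketch of that exchange, not the driver code from this series. All
names in it (recover_msg, guest_page, backend_handle, guest_recover) are
hypothetical, the message layout is NOT that of 'struct virtio_balloon_recover',
and the remap result is passed in as a flag instead of performing a real remap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical message; NOT the layout of struct virtio_balloon_recover. */
struct recover_msg {
	uint64_t pfn;       /* guest PFN reported as corrupted (step 6) */
	bool recovered;     /* result filled in by the backend (step 9) */
};

/* Hypothetical guest-side record of one poisoned page. */
struct guest_page {
	uint64_t pfn;
	bool poisoned;
};

/*
 * Steps 7-9: the backend handles the request and acks the result.
 * remap_ok stands in for the outcome of remapping the corrupted HVA.
 */
static void backend_handle(struct recover_msg *msg, bool remap_ok)
{
	msg->recovered = remap_ok;
}

/*
 * Steps 6 and 10: the guest sends the corrupted PFN and unpoisons the
 * page only if the backend reports a successful recovery.
 */
static bool guest_recover(struct guest_page *page, bool remap_ok)
{
	struct recover_msg msg = { .pfn = page->pfn, .recovered = false };

	assert(page->poisoned);   /* only poisoned pages are submitted */
	backend_handle(&msg, remap_ok);
	if (msg.recovered)
		page->poisoned = false;
	return msg.recovered;
}
```

The point of the sketch is the ordering: the page stays poisoned in the guest
until the host has confirmed that the backing memory was replaced.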

Test:
This patch set can be tested with QEMU (also under development):
https://github.com/pizhenwei/qemu/tree/balloon-recover

Emulate an MCE with QEMU (normal guest RAM pages only; hugepages are not supported):
virsh qemu-monitor-command vm --hmp mce 0 9 0xbd000000000000c0 0xd 0x61646678 0x8c

The guest works fine (on an Intel Platinum 8260):
 mce: [Hardware Error]: Machine check events logged
 Memory failure: 0x61646: recovery action for dirty LRU page: Recovered
 virtio_balloon virtio5: recovered pfn 0x61646
 Unpoison: Unpoisoned page 0x61646 by virtio-balloon
 MCE: Killing stress:24502 due to hardware memory corruption fault at 7f5be2e5a010

'HardwareCorrupted' in /proc/meminfo also shows 0 kB.
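
To relate the 'mce' monitor command above to the PFN in the guest log, here is
a small sketch that decodes the injected values. It assumes the HMP argument
order is cpu, bank, status, mcg_status, addr, misc, and that the status bit
positions follow the MCi_STATUS layout in the Intel SDM; the macro and function
names are defined locally for illustration and are not taken from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT  12              /* 4K pages */

/* MCi_STATUS bits, per the Intel SDM layout (assumed here). */
#define STATUS_VAL    (1ULL << 63)       /* register contents valid */
#define STATUS_UC     (1ULL << 61)       /* uncorrected error */
#define STATUS_ADDRV  (1ULL << 58)       /* MCi_ADDR holds a valid address */
#define STATUS_AR     (1ULL << 55)       /* action required */

/* Convert the injected physical address to a 4K page frame number. */
static uint64_t mce_addr_to_pfn(uint64_t addr)
{
	return addr >> BASE_PAGE_SHIFT;
}

/* True if the status reports a valid, uncorrected error with an address. */
static bool mce_has_uc_addr(uint64_t status)
{
	const uint64_t need = STATUS_VAL | STATUS_UC | STATUS_ADDRV;

	return (status & need) == need;
}

/* True if the error demands immediate action from the consuming context. */
static bool mce_action_required(uint64_t status)
{
	assert(status & STATUS_VAL);
	return (status & STATUS_AR) != 0;
}
```

With addr 0x61646678, mce_addr_to_pfn() yields 0x61646, which matches the PFN
in the "Memory failure" and "recovered pfn" log lines.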

The protocol of the virtio balloon Recover VQ is not yet defined and is
still under development:
- 'struct virtio_balloon_recover' defines the structure used to exchange
  messages between guest and host.
- '__le32 corrupted_pages' in struct virtio_balloon_config is intended for
  the next step:
  1, A VM uses RAM backed by 2M huge pages. Once an MCE occurs, the whole 2M
     becomes inaccessible. By reporting 512 * 4K 'corrupted_pages' to the
     guest, the guest has a chance to isolate the 512 pages ahead of time.

  2, After migrating to another host, the corrupted pages are actually recovered.
     Once the guest reads 'corrupted_pages' as 0, it can unpoison all the
     poisoned pages recorded in the balloon driver.
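
The "512 * 4K" figure in item 1 is simply 2M / 4K. As a rough illustration,
the sketch below computes that count and the range of 4K PFNs covered by the
2M huge page containing a given address. The helper names are hypothetical,
and the sample address in the note after the code is borrowed from the test
command purely as a number (that test used normal pages, not huge pages).

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SIZE  4096ULL                  /* 4K */
#define HUGE_PAGE_SIZE  (2ULL * 1024 * 1024)     /* 2M */

/* Number of 4K pages that become inaccessible with one 2M huge page. */
static uint64_t base_pages_per_huge_page(void)
{
	assert(HUGE_PAGE_SIZE % BASE_PAGE_SIZE == 0);
	return HUGE_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* First 4K PFN of the 2M-aligned region that contains addr. */
static uint64_t huge_page_first_pfn(uint64_t addr)
{
	return (addr & ~(HUGE_PAGE_SIZE - 1)) / BASE_PAGE_SIZE;
}

/* Last 4K PFN of the 2M-aligned region that contains addr. */
static uint64_t huge_page_last_pfn(uint64_t addr)
{
	return huge_page_first_pfn(addr) + base_pages_per_huge_page() - 1;
}
```

For address 0x61646678 this gives PFNs 0x61600 through 0x617ff, i.e. the 512
pages the guest would have a chance to isolate ahead of time.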

zhenwei pi (3):
  memory-failure: Introduce memory failure notifier
  mm/memory-failure.c: support reset PTE during unpoison
  virtio_balloon: Introduce memory recover

 drivers/virtio/virtio_balloon.c     | 243 ++++++++++++++++++++++++++++
 include/linux/mm.h                  |   4 +-
 include/uapi/linux/virtio_balloon.h |  16 ++
 mm/hwpoison-inject.c                |   2 +-
 mm/memory-failure.c                 |  59 ++++++-
 5 files changed, 315 insertions(+), 9 deletions(-)

-- 
2.20.1




