Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Xu <peterx@redhat.com>
To: Joao Martins <joao.m.martins@oracle.com>,
	William Roche <william.roche@oracle.com>
Cc: William Roche <william.roche@oracle.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Leonardo Bras <leobras@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase
Date: Wed, 6 Sep 2023 11:16:49 -0400	[thread overview]
Message-ID: <ZPiX4YLAT5HjxUAJ@x1n> (raw)
In-Reply-To: <e2adce18-7aef-5445-352a-01e789619fac@oracle.com>

On Wed, Sep 06, 2023 at 03:19:32PM +0100, Joao Martins wrote:
> On 06/09/2023 14:59, “William Roche wrote:
> > From: William Roche <william.roche@oracle.com>
> > 
> > A memory page poisoned from the hypervisor level is no longer readable.
> > Thus, it is now treated as a zero-page for the ram saving migration phase.
> > 
> > The migration of a VM will crash Qemu when it tries to read the
> > memory address space and stumbles on the poisoned page with a similar
> > stack trace:
> > 
> > Program terminated with signal SIGBUS, Bus error.
> > #0  _mm256_loadu_si256
> > #1  buffer_zero_avx2
> > #2  select_accel_fn
> > #3  buffer_is_zero
> > #4  save_zero_page_to_file
> > #5  save_zero_page
> > #6  ram_save_target_page_legacy
> > #7  ram_save_host_page
> > #8  ram_find_and_save_block
> > #9  ram_save_iterate
> > #10 qemu_savevm_state_iterate
> > #11 migration_iteration_run
> > #12 migration_thread
> > #13 qemu_thread_start
> > 
> > Fix it by considering poisoned pages as if they were zero-pages for
> > the migration copy. This fix also works with underlying large pages,
> > taking into account the RAMBlock segment "page-size".
> > 
> > Signed-off-by: William Roche <william.roche@oracle.com>
> 
> You forgot to CC the maintainers; Adding them now
> 
> ./scripts/get_maintainer.pl is your friend for the next version :)
> 
> > ---
> >  accel/kvm/kvm-all.c    | 14 ++++++++++++++
> >  accel/stubs/kvm-stub.c |  5 +++++
> >  include/sysemu/kvm.h   | 10 ++++++++++
> >  migration/ram.c        |  3 ++-
> >  4 files changed, 31 insertions(+), 1 deletion(-)
> > 
> > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > index 2ba7521695..24a7709495 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -1152,6 +1152,20 @@ static void kvm_unpoison_all(void *param)
> >      }
> >  }
> >  
> > +bool kvm_hwpoisoned_page(RAMBlock *block, void *offset)
> > +{
> > +    HWPoisonPage *pg;
> > +    ram_addr_t ram_addr = (ram_addr_t) offset;
> > +
> > +    QLIST_FOREACH(pg, &hwpoison_page_list, list) {
> > +        if ((ram_addr >= pg->ram_addr) &&
> > +            (ram_addr - pg->ram_addr < block->page_size)) {

Just a note..

Probably fine for now to reuse block page size, but IIUC the right thing to
do is to fetch it from the signal info (in QEMU's sigbus_handler()) of
kernel_siginfo.si_addr_lsb.

At least for x86 I think that stores the "shift" of covered poisoned page
(one needs to track the Linux handling of VM_FAULT_HWPOISON_LARGE for a
huge page, though.. not aware of any man page for that).  It'll then work
naturally when Linux huge pages will start to support sub-huge-page-size
poisoning someday.  We can definitely leave that for later.

> > +            return true;
> > +        }
> > +    }
> > +    return false;
> > +}
> > +
> >  void kvm_hwpoison_page_add(ram_addr_t ram_addr)
> >  {
> >      HWPoisonPage *page;
> > diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> > index 235dc661bc..c0a31611df 100644
> > --- a/accel/stubs/kvm-stub.c
> > +++ b/accel/stubs/kvm-stub.c
> > @@ -133,3 +133,8 @@ uint32_t kvm_dirty_ring_size(void)
> >  {
> >      return 0;
> >  }
> > +
> > +bool kvm_hwpoisoned_page(RAMBlock *block, void *ram_addr)
> > +{
> > +    return false;
> > +}
> > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> > index ebdca41052..a2196e9e6b 100644
> > --- a/include/sysemu/kvm.h
> > +++ b/include/sysemu/kvm.h
> > @@ -580,4 +580,14 @@ bool kvm_arch_cpu_check_are_resettable(void);
> >  bool kvm_dirty_ring_enabled(void);
> >  
> >  uint32_t kvm_dirty_ring_size(void);
> > +
> > +/**
> > + * kvm_hwpoisoned_page - indicate if the given page is poisoned
> > + * @block: memory block of the given page
> > + * @ram_addr: offset of the page
> > + *
> > + * Returns: true: page is poisoned
> > + *          false: page not yet poisoned
> > + */
> > +bool kvm_hwpoisoned_page(RAMBlock *block, void *ram_addr);
> >  #endif
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 9040d66e61..48d875b12d 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1145,7 +1145,8 @@ static int save_zero_page_to_file(PageSearchStatus *pss, QEMUFile *file,
> >      uint8_t *p = block->host + offset;
> >      int len = 0;
> >  
> > -    if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
> > +    if ((kvm_enabled() && kvm_hwpoisoned_page(block, (void *)offset)) ||

Can we move this out of zero page handling?  Zero detection is not
guaranteed to always be the 1st thing to do when processing a guest page.
Currently it'll already skip either rdma or when compression enabled, so
it'll keep crashing there.

Perhaps at the entry of ram_save_target_page_legacy()?

> > +        buffer_is_zero(p, TARGET_PAGE_SIZE)) {
> >          len += save_page_header(pss, file, block, offset | RAM_SAVE_FLAG_ZERO);
> >          qemu_put_byte(file, 0);
> >          len += 1;
> 

-- 
Peter Xu

next prev parent reply	other threads:[~2023-09-06 15:17 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-06 13:59 [PATCH 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-06 13:59 ` [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-09-06 14:19   ` Joao Martins
2023-09-06 15:16     ` Peter Xu [this message]
2023-09-06 21:29       ` William Roche
2023-09-09 14:57         ` Joao Martins
2023-09-11 19:48           ` Peter Xu
2023-09-12 18:44             ` Peter Xu
2023-09-14 20:20               ` [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-14 20:20                 ` [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-09-15  3:13                   ` Zhijian Li (Fujitsu)
2023-09-15 11:31                     ` William Roche
2023-09-18  3:47                       ` Zhijian Li (Fujitsu)
2023-09-20 10:04                       ` Zhijian Li (Fujitsu)
2023-09-20 12:11                         ` William Roche
2023-09-20 23:53                         ` [PATCH v3 0/1] Qemu crashes on VM migration after an handled memory error “William Roche
2023-09-20 23:53                           ` [PATCH v3 1/1] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-10-13 15:08                           ` [PATCH v4 0/2] Qemu crashes on VM migration after an handled memory error “William Roche
2023-10-13 15:08                             ` [PATCH v4 1/2] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-10-13 15:08                             ` [PATCH v4 2/2] migration: prevent migration when a poisoned page is unknown from the VM “William Roche
2023-10-16 16:48                               ` Peter Xu
2023-10-17  0:38                                 ` William Roche
2023-10-17 15:13                                   ` Peter Xu
2023-11-06 21:38                                     ` William Roche
2023-11-08 21:45                                       ` Peter Xu
2023-11-10 19:22                                         ` William Roche
2023-11-06 22:03                                     ` [PATCH v5 0/2] Qemu crashes on VM migration after an handled memory error “William Roche
2023-11-06 22:03                                       ` [PATCH v5 1/2] migration: skip poisoned memory pages on "ram saving" phase “William Roche
2023-11-06 22:03                                       ` [PATCH v5 2/2] migration: prevent migration when a poisoned page is unknown from the VM “William Roche
2023-11-08 21:49                                       ` [PATCH v5 0/2] Qemu crashes on VM migration after an handled memory error Peter Xu
2024-01-30 19:06                                         ` [PATCH v1 0/1] " “William Roche
2024-01-30 19:06                                           ` [PATCH v1 1/1] migration: prevent migration when VM has poisoned memory “William Roche
2024-01-31  1:48                                             ` Peter Xu
2023-09-14 21:50                 ` [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZPiX4YLAT5HjxUAJ@x1n \
    --to=peterx@redhat.com \
    --cc=joao.m.martins@oracle.com \
    --cc=leobras@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=william.roche@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.