qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Xiaohui Li <xiaohli@redhat.com>,
	qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH v2 0/6] migration/postcopy: Sync faulted addresses after network recovered
Date: Fri, 25 Sep 2020 12:50:26 +0100
Message-ID: <20200925115026.GA2874@work-vm>
In-Reply-To: <20200923174311.GA124840@work-vm>

* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > v2:
> 
> Queued

Hi Peter,
  I've had to unqueue most of this; it doesn't build on 32-bit.
I fixed up the trace_ stuff easily (_del can take a void*, add just
needs to use PRIX64) but there are other places where it doesn't like
the casting from pointers to uint64_t's etc.
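
For illustration only, a minimal sketch of the portable pattern for this
kind of 32-bit build failure. It is not taken from the series, and the
helper names are hypothetical. It assumes the usual approach: convert
pointers through uintptr_t before widening to uint64_t, and format 64-bit
values with PRIX64 rather than a long-sized specifier.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch series): widen a host pointer
 * to a 64-bit integer.  Casting a pointer straight to uint64_t triggers
 * a "cast from pointer to integer of different size" warning on 32-bit
 * hosts; going through uintptr_t avoids it on both 32-bit and 64-bit.
 */
static inline uint64_t host_ptr_to_u64(const void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}

/* Inverse conversion, again narrowing through uintptr_t. */
static inline void *u64_to_host_ptr(uint64_t value)
{
    return (void *)(uintptr_t)value;
}

/*
 * Hypothetical formatter: PRIX64 expands to the right length modifier
 * for uint64_t on every host, unlike a hard-coded "%lX".
 */
static int format_fault_addr(char *buf, size_t len, uint64_t addr)
{
    return snprintf(buf, len, "fault addr=0x%" PRIX64, addr);
}
```

A caller that only needs to log the pointer itself can skip the integer
conversion and pass it as a void * with "%p".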

  I've kept the first couple of commits.

Dave

> > - add r-bs for Dave
> > - add patch "migration: Properly destroy variables on incoming side" as patch 1
> > - destroy page_request_mutex in migration_incoming_state_destroy() too [Dave]
> > - use WITH_QEMU_LOCK_GUARD in two places where we can [Dave]
> > 
> > We've seen conditional guest hangs on the destination VM after a postcopy
> > recovery.  However, the hang resolves itself after a few minutes.
> > 
> > The problem is that, after a postcopy recovery, the prioritized postcopy
> > queue on the source VM is missing.  So all the threads that faulted before
> > the recovery stay halted until the page happens to be copied by the
> > background precopy migration stream.
> > 
> > The solution is to also refresh this information after postcopy recovery.  To
> > achieve this, we need to maintain a list of faulted addresses on the
> > destination node, so that we can resend the list when necessary.  This work is
> > done in patches 2-5.
> > 
> > With that, the last thing we need to do is send this extra information to
> > the source VM after recovery.  Luckily, this synchronization can be
> > "emulated" by sending a bunch of page requests (although these pages have been
> > sent previously!) to the source VM, just as when we get a page fault.  Even
> > the 1st version of the postcopy code handles duplicated pages well.  So
> > this fix does not even need a new capability bit, and it will work smoothly
> > with old QEMUs when we migrate from them to new QEMUs.
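
The idea described above can be sketched roughly as follows. This is an
illustrative toy, not the code from the series: the type and function
names are hypothetical, the fixed-size array is an arbitrary choice, and
locking is omitted (the real code would have to protect the list, for
example with the page_request_mutex mentioned in the changelog).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64  /* arbitrary cap for this sketch */

/* Hypothetical set of faulted addresses still waiting for their page. */
typedef struct {
    uint64_t addrs[MAX_PENDING];
    size_t count;
} PendingFaults;

/* Hypothetical callback that sends one page request to the source. */
typedef void (*SendPageReq)(uint64_t addr, void *opaque);

/* Record a faulted address; duplicates are tracked only once. */
static bool pending_add(PendingFaults *pf, uint64_t addr)
{
    for (size_t i = 0; i < pf->count; i++) {
        if (pf->addrs[i] == addr) {
            return true;
        }
    }
    if (pf->count == MAX_PENDING) {
        return false;  /* sketch only: no growth strategy */
    }
    pf->addrs[pf->count++] = addr;
    return true;
}

/* Forget an address once its page has arrived. */
static void pending_del(PendingFaults *pf, uint64_t addr)
{
    for (size_t i = 0; i < pf->count; i++) {
        if (pf->addrs[i] == addr) {
            pf->addrs[i] = pf->addrs[--pf->count];
            return;
        }
    }
}

/* After recovery: re-request every address that is still pending. */
static size_t pending_resend_all(const PendingFaults *pf,
                                 SendPageReq send, void *opaque)
{
    for (size_t i = 0; i < pf->count; i++) {
        send(pf->addrs[i], opaque);
    }
    return pf->count;
}

/* Stand-in sender for demonstration: just counts the requests. */
static void count_request(uint64_t addr, void *opaque)
{
    (void)addr;
    (*(size_t *)opaque)++;
}
```

Because duplicated pages are already handled on the receiving side, the
resend step needs no check for whether a request was delivered before the
network failure.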
> > 
> > Please review, thanks.
> > 
> > Peter Xu (6):
> >   migration: Properly destroy variables on incoming side
> >   migration: Rework migrate_send_rp_req_pages() function
> >   migration: Pass incoming state into qemu_ufd_copy_ioctl()
> >   migration: Introduce migrate_send_rp_message_req_pages()
> >   migration: Maintain postcopy faulted addresses
> >   migration: Sync requested pages after postcopy recovery
> > 
> >  migration/migration.c    | 79 +++++++++++++++++++++++++++++++++++-----
> >  migration/migration.h    | 23 +++++++++++-
> >  migration/postcopy-ram.c | 46 ++++++++++-------------
> >  migration/savevm.c       | 57 +++++++++++++++++++++++++++++
> >  migration/trace-events   |  3 ++
> >  5 files changed, 171 insertions(+), 37 deletions(-)
> > 
> > -- 
> > 2.26.2
> > 
> > 
> > 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




Thread overview: 12+ messages
2020-09-08 20:30 [PATCH v2 0/6] migration/postcopy: Sync faulted addresses after network recovered Peter Xu
2020-09-08 20:30 ` [PATCH v2 1/6] migration: Properly destroy variables on incoming side Peter Xu
2020-09-09 10:21   ` Dr. David Alan Gilbert
2020-09-08 20:30 ` [PATCH v2 2/6] migration: Rework migrate_send_rp_req_pages() function Peter Xu
2020-09-08 20:30 ` [PATCH v2 3/6] migration: Pass incoming state into qemu_ufd_copy_ioctl() Peter Xu
2020-09-08 20:30 ` [PATCH v2 4/6] migration: Introduce migrate_send_rp_message_req_pages() Peter Xu
2020-09-08 20:30 ` [PATCH v2 5/6] migration: Maintain postcopy faulted addresses Peter Xu
2020-09-10  9:44   ` Dr. David Alan Gilbert
2020-09-08 20:30 ` [PATCH v2 6/6] migration: Sync requested pages after postcopy recovery Peter Xu
2020-09-23 17:43 ` [PATCH v2 0/6] migration/postcopy: Sync faulted addresses after network recovered Dr. David Alan Gilbert
2020-09-25 11:50   ` Dr. David Alan Gilbert [this message]
2020-09-25 13:46     ` Peter Xu
