qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] migration/postcopy: Sync faulted addresses after network recovered
@ 2020-09-03 15:26 Peter Xu
  2020-09-03 15:26 ` [PATCH 1/5] migration: Rework migrate_send_rp_req_pages() function Peter Xu
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Peter Xu @ 2020-09-03 15:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: Xiaohui Li, Dr . David Alan Gilbert, peterx, Juan Quintela

We've seen conditional guest hangs on destination VM after postcopy recovered.
However the hang will resolve itself after a few minutes.

The problem is: after a postcopy recovery, the prioritized postcopy queue on
the source VM is actually missing.  So all the faulted threads before the
postcopy recovery happened will keep halted until (accidentally) the page got
copied by the background precopy migration stream.

The solution is to also refresh this information after postcopy recovery.  To
achieve this, we need to maintain a list of faulted addresses on the
destination node, so that we can resend the list when necessary.  This work is
done via patch 1-4.

With that, the last thing we need to do is to send this extra information to
source VM after recovered.  Very luckily, this synchronization can be
"emulated" by sending a bunch of page requests (although these pages have been
sent previously!) to source VM just like when we've got a page fault.  Even in
the 1st version of the postcopy code we'll handle duplicated pages well.  So
this fix does not even need a new capability bit and it'll work smoothly on old
QEMUs when we migrate from them to the new QEMUs.

Please review, thanks.

Peter Xu (5):
  migration: Rework migrate_send_rp_req_pages() function
  migration: Introduce migrate_send_rp_message_req_pages()
  migration: Pass incoming state into qemu_ufd_copy_ioctl()
  migration: Maintain postcopy faulted addresses
  migration: Sync requested pages after postcopy recovery

 migration/migration.c    | 71 +++++++++++++++++++++++++++++++++++-----
 migration/migration.h    | 23 +++++++++++--
 migration/postcopy-ram.c | 46 +++++++++++---------------
 migration/savevm.c       | 56 +++++++++++++++++++++++++++++++
 migration/trace-events   |  3 ++
 5 files changed, 163 insertions(+), 36 deletions(-)

-- 
2.26.2




^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-09-08 20:21 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-09-03 15:26 [PATCH 0/5] migration/postcopy: Sync faulted addresses after network recovered Peter Xu
2020-09-03 15:26 ` [PATCH 1/5] migration: Rework migrate_send_rp_req_pages() function Peter Xu
2020-09-08  9:18   ` Dr. David Alan Gilbert
2020-09-03 15:26 ` [PATCH 2/5] migration: Introduce migrate_send_rp_message_req_pages() Peter Xu
2020-09-08  9:57   ` Dr. David Alan Gilbert
2020-09-08 20:20     ` Peter Xu
2020-09-03 15:26 ` [PATCH 3/5] migration: Pass incoming state into qemu_ufd_copy_ioctl() Peter Xu
2020-09-08  9:30   ` Dr. David Alan Gilbert
2020-09-03 15:26 ` [PATCH 4/5] migration: Maintain postcopy faulted addresses Peter Xu
2020-09-08 11:00   ` Dr. David Alan Gilbert
2020-09-08 19:42     ` Peter Xu
2020-09-03 15:26 ` [PATCH 5/5] migration: Sync requested pages after postcopy recovery Peter Xu
2020-09-08 11:03   ` Dr. David Alan Gilbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).