xen-devel.lists.xenproject.org archive mirror
From: Joshua Otto <jtotto@uwaterloo.ca>
To: Wei Liu <wei.liu2@citrix.com>, xen-devel@lists.xenproject.org
Cc: andrew.cooper3@citrix.com, hjarmstr@uwaterloo.ca,
	ian.jackson@eu.citrix.com, czylin@uwaterloo.ca,
	imhy.yang@gmail.com
Subject: Re: [PATCH RFC 00/20] Add postcopy live migration support
Date: Thu, 30 Mar 2017 00:13:51 -0400
Message-ID: <20170330041351.GD3038@eagle>
In-Reply-To: <20170328144102.mg424gy3kw2poilo@citrix.com>

On Tue, Mar 28, 2017 at 03:41:02PM +0100, Wei Liu wrote:
> Hi Harley, Chester and Joshua
> 
> This is really nice work. I took a brief look at all the patches, they
> look really high quality.

Thank you!

> 
> We're currently approaching freeze for a Xen release. We've got a lot on
> our plate. I think maintainers will get to this series at some point.

Understood.  We're currently approaching our final exams so that's probably for
the best :)

> 
> From the look of things some patches can go in because they're generally
> useful.
> 
> On Mon, Mar 27, 2017 at 05:06:12AM -0400, Joshua Otto wrote:
> > Hi,
> > 
> > We're a team of three fourth-year undergraduate software engineering students at
> > the University of Waterloo in Canada.  In late 2015 we posted on the list [1] to
> > ask for a project to undertake for our program's capstone design project, and
> > Andrew Cooper pointed us in the direction of the live migration implementation
> > as an area that could use some attention.  We were particularly interested in
> > post-copy live migration (as evaluated by [2] and discussed on the list at [3]),
> > and have been working on an implementation of this on-and-off since then.
> > 
> > We now have a working implementation of this scheme, and are submitting it for
> > comment.  The changes are also available as the 'postcopy' branch of the GitHub
> > repository at [4]
> > 
> > As a brief overview of our approach:
> > - We introduce a mechanism by which libxl can indicate to the libxc stream
> >   helper process that the iterative migration precopy loop should be terminated
> >   and postcopy should begin.
> > - At this point, we suspend the domain, collect the final set of dirty pfns and
> >   write these pfns (and _not_ their contents) into the stream.
> > - At the destination, the xc restore logic registers itself as a pager for the
> >   migrating domain, 'evicts' all of the pfns indicated by the sender as
> >   outstanding, and then resumes the domain at the destination.
> > - As the domain executes, the migration sender continues to push the remaining
> >   outstanding pages to the receiver in the background.  The receiver
> >   monitors both the stream for incoming page data and the paging ring event
> >   channel for page faults triggered by the guest.  Page faults are forwarded on
> >   the back-channel migration stream to the migration sender, which prioritizes
> >   these pages for transmission.
> > 
> > By leveraging the existing paging API, we are able to implement the postcopy
> > scheme without any hypervisor modifications - all of our changes are confined to
> > the userspace toolstack.  However, we inherit from the paging API the
> > requirement that the domains be HVM and that the host have HAP/EPT support.
> > 
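
To make the last bullet above concrete, here is a minimal sketch of how a
sender might order outstanding pages so that faulted pages jump ahead of the
background push.  It is a toy model in Python, not the code from the series:
the class name, the method names and the batch size are all hypothetical.

```python
from collections import deque


class PostcopySender:
    """Toy model of a postcopy transmit queue (all names are hypothetical)."""

    def __init__(self, outstanding_pfns):
        # Pages still owed to the receiver, in background-push order.
        self.background = deque(sorted(outstanding_pfns))
        self.outstanding = set(outstanding_pfns)
        # Pages the receiver has faulted on; these are served first.
        self.faulted = deque()

    def on_fault(self, pfn):
        """Record a fault request forwarded over the back-channel."""
        if pfn in self.outstanding and pfn not in self.faulted:
            self.faulted.append(pfn)

    def next_batch(self, max_pages):
        """Pick up to max_pages pfns, faulted pages before background ones."""
        batch = []
        while len(batch) < max_pages and self.outstanding:
            queue = self.faulted if self.faulted else self.background
            pfn = queue.popleft()
            if pfn in self.outstanding:  # skip pages that were already sent
                self.outstanding.discard(pfn)
                batch.append(pfn)
        return batch


sender = PostcopySender([5, 1, 3, 2, 4])
sender.on_fault(4)              # the guest touched pfn 4 at the destination
first = sender.next_batch(2)    # pfn 4 is sent ahead of the background order
rest = sender.next_batch(10)    # the remaining pages follow in order
```

A page requested by fault stays in the background queue until it is reached
there, at which point it is skipped because it is no longer outstanding.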
> 
> Please consider writing a design document for this feature and stick it
> at the beginning of your series in the future. You can find examples
> under docs/designs.

Absolutely, I'll submit one with v2.

> 
> The restriction is a bit unfortunate, but we shouldn't block useful work
> because it's incomplete. We just need to make sure that, should someone
> decide to implement similar functionality for PV guests, they are able to
> do so.
> 
> You might want to check whether shadow paging can be used with the paging
> API, so that you can relax the requirement to HVM guest support in general.
> 
> > We haven't yet had the opportunity to perform a quantitative evaluation of the
> > performance trade-offs between the traditional pre-copy and our post-copy
> > strategies, but intend to.  Informally, we've been testing our implementation by
> > migrating a domain running the x86 memtest program (which is obviously a
> > tremendously write-heavy workload), and have observed a substantial reduction in
> > total time required for migration completion (at the expense of a visually
> > obvious 'slowdown' in the execution of the program).  We've also noticed that,
> > when performing a postcopy without any leading precopy iterations, the time
> > required at the destination to 'evict' all of the outstanding pages is
> > substantial - possibly because there is no batching mechanism by which pages can
> > be evicted - so this area in particular might require further attention.
> > 
> 
> Please do post numbers when you have them. For now, please be patient
> and wait for people to comment.

Will do.  As a general question for those following the thread, are there any
application workloads/benchmarks that people would find particularly
interesting?

The experiment that we've planned, but haven't yet had time to carry out in full,
is to mount a ramdisk inside the guest and use Jens Axboe's fio to test every
entry in the (read/write mix) x (working set size) x (access pattern) matrix.
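
As a rough illustration of that matrix, the sketch below enumerates one fio
command line per cell.  The specific mixes, sizes and the ramdisk mount point
are placeholder assumptions, not values we have settled on; only the fio
options themselves (--name, --directory, --rw, --rwmixread, --size) are real.

```python
from itertools import product

# Placeholder axis values, chosen only for illustration.
READ_PERCENTAGES = [0, 50, 100]            # share of reads in the I/O mix
WORKING_SET_SIZES = ["64m", "256m", "1g"]  # per-job file size
ACCESS_PATTERNS = ["sequential", "random"]


def fio_args(read_pct, size, pattern, directory="/mnt/ramdisk"):
    """Build the fio argv for one cell; the directory is an assumed mount."""
    prefix = "" if pattern == "sequential" else "rand"
    if read_pct == 100:
        mode = prefix + "read"
    elif read_pct == 0:
        mode = prefix + "write"
    else:
        mode = prefix + "rw"
    args = [
        "fio",
        f"--name=r{read_pct}-{size}-{pattern}",
        f"--directory={directory}",
        f"--rw={mode}",
        f"--size={size}",
    ]
    if 0 < read_pct < 100:
        # Only mixed workloads need an explicit read percentage.
        args.append(f"--rwmixread={read_pct}")
    return args


def test_matrix():
    """Return one fio argv per (mix, working set size, pattern) combination."""
    return [
        fio_args(read_pct, size, pattern)
        for read_pct, size, pattern in product(
            READ_PERCENTAGES, WORKING_SET_SIZES, ACCESS_PATTERNS
        )
    ]
```

Each cell would then be run once during a precopy migration and once during a
postcopy migration, so that the two strategies can be compared per workload.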

Thank you again for your feedback!

Josh


Thread overview: 53+ messages
2017-03-27  9:06 [PATCH RFC 00/20] Add postcopy live migration support Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 01/20] tools: rename COLO 'postcopy' to 'aftercopy' Joshua Otto
2017-03-28 16:34   ` Wei Liu
2017-04-11  6:19     ` Zhang Chen
2017-03-27  9:06 ` [PATCH RFC 02/20] libxc/xc_sr: parameterise write_record() on fd Joshua Otto
2017-03-28 18:53   ` Andrew Cooper
2017-03-31 14:19   ` Wei Liu
2017-03-27  9:06 ` [PATCH RFC 03/20] libxc/xc_sr_restore.c: use write_record() in send_checkpoint_dirty_pfn_list() Joshua Otto
2017-03-28 18:56   ` Andrew Cooper
2017-03-31 14:19   ` Wei Liu
2017-03-27  9:06 ` [PATCH RFC 04/20] libxc/xc_sr_save.c: add WRITE_TRIVIAL_RECORD_FN() Joshua Otto
2017-03-28 19:03   ` Andrew Cooper
2017-03-30  4:28     ` Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 05/20] libxc/xc_sr: factor out filter_pages() Joshua Otto
2017-03-28 19:27   ` Andrew Cooper
2017-03-30  4:42     ` Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 06/20] libxc/xc_sr: factor helpers out of handle_page_data() Joshua Otto
2017-03-28 19:52   ` Andrew Cooper
2017-03-30  4:49     ` Joshua Otto
2017-04-12 15:16       ` Wei Liu
2017-03-27  9:06 ` [PATCH RFC 07/20] migration: defer precopy policy to libxl Joshua Otto
2017-03-29 18:54   ` Jennifer Herbert
2017-03-30  5:28     ` Joshua Otto
2017-03-29 20:18   ` Andrew Cooper
2017-03-30  5:19     ` Joshua Otto
2017-04-12 15:16       ` Wei Liu
2017-04-18 17:56         ` Ian Jackson
2017-03-27  9:06 ` [PATCH RFC 08/20] libxl/migration: add precopy tuning parameters Joshua Otto
2017-03-29 21:08   ` Andrew Cooper
2017-03-30  6:03     ` Joshua Otto
2017-04-12 15:37       ` Wei Liu
2017-04-27 22:51         ` Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 09/20] libxc/xc_sr_save: introduce save batch types Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 10/20] libxc/xc_sr_save.c: initialise rec.data before free() Joshua Otto
2017-03-28 19:59   ` Andrew Cooper
2017-03-29 17:47     ` Wei Liu
2017-03-27  9:06 ` [PATCH RFC 11/20] libxc/migration: correct hvm record ordering specification Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 12/20] libxc/migration: specify postcopy live migration Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 13/20] libxc/migration: add try_read_record() Joshua Otto
2017-04-12 15:16   ` Wei Liu
2017-03-27  9:06 ` [PATCH RFC 14/20] libxc/migration: implement the sender side of postcopy live migration Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 15/20] libxc/migration: implement the receiver " Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 16/20] libxl/libxl_stream_write.c: track callback chains with an explicit phase Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 17/20] libxl/libxl_stream_read.c: " Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 18/20] libxl/migration: implement the sender side of postcopy live migration Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 19/20] libxl/migration: implement the receiver " Joshua Otto
2017-03-27  9:06 ` [PATCH RFC 20/20] tools: expose postcopy live migration support in libxl and xl Joshua Otto
2017-03-28 14:41 ` [PATCH RFC 00/20] Add postcopy live migration support Wei Liu
2017-03-30  4:13   ` Joshua Otto [this message]
2017-03-31 14:19     ` Wei Liu
2017-03-29 22:50 ` Andrew Cooper
2017-03-31  4:51   ` Joshua Otto
2017-04-12 15:38     ` Wei Liu
