From: Paul Durrant <Paul.Durrant@citrix.com>
To: Ian Jackson <Ian.Jackson@citrix.com>,
Wei Liu <wei.liu2@citrix.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"jtotto@uwaterloo.ca" <jtotto@uwaterloo.ca>
Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
Subject: Re: [PATCH v2 2/3] Introduce migration precopy policy
Date: Wed, 20 Sep 2017 08:35:46 +0000 [thread overview]
Message-ID: <03ce7ef17d89418cb1af913309bb9233@AMSPEX02CL03.citrite.net> (raw)
In-Reply-To: <1505844387-2224-3-git-send-email-Jennifer.Herbert@citrix.com>
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Jennifer Herbert
> Sent: 19 September 2017 19:06
> To: Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wei.liu2@citrix.com>;
> xen-devel@lists.xenproject.org; jtotto@uwaterloo.ca
> Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
> Subject: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy
>
> This Patch allows a migration precopy policy to be specified.
>
> The precopy phase of the xc_domain_save() live migration algorithm has
> historically been implemented to run until either a) (almost) no pages
> are dirty or b) some fixed, hard-coded maximum number of precopy
> iterations has been exceeded. This policy and its implementation are
> less than ideal for a few reasons:
> - the logic of the policy is intertwined with the control flow of the
> mechanism of the precopy stage
> - it can't take into account facts external to the immediate
> migration context, such external state transfer state, interactive
> user input, or the passage of wall-clock time.
> - it does not permit the user to change their mind, over time, about
> what to do at the end of the precopy (they get an unconditional
> transition into the stop-and-copy phase of the migration)
>
> To permit callers to implement arbitrary higher-level policies governing
> when the live migration precopy phase should end, and what should be
> done next:
> - add a precopy_policy() callback to the xc_domain_save() user-supplied
> callbacks
> - during the precopy phase of live migrations, consult this policy after
> each batch of pages transmitted and take the dictated action, which
> may be to a) abort the migration entirely, b) continue with the
> precopy, or c) proceed to the stop-and-copy phase.
> - provide an implementation of the old policy, used when
> precopy_policy callback is not provided.
>
> Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
> Signed-off-by: Joshua Otto <jtotto@uwaterloo.ca>
>
> ---
>
> v2:
>
> Have made a few formatting corrections, added typedef as suggested.
>
> v1:
>
> This is updated/modified subset of patch 7/20, part of
> Joshua Otto's "Add postcopy live migration support." patch,
> dated 27th March 2017. As indicated on the original thread,
> I wish to make use of this this within the XenServer product.
> I hope this will aid Josh in pushing the remainder of his series.
>
> ---
> tools/libxc/include/xenguest.h | 31 ++++++++++++--
> tools/libxc/xc_sr_common.h | 6 +--
> tools/libxc/xc_sr_save.c | 97 +++++++++++++++++++++++++++++--------
> -----
> 3 files changed, 97 insertions(+), 37 deletions(-)
>
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 6626f0c..a2a654c 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -39,6 +39,16 @@
> */
> struct xenevtchn_handle;
>
> +/* For save's precopy_policy(). */
> +struct precopy_stats
> +{
> + unsigned int iteration;
> + unsigned int total_written;
> + long dirty_count; /* -1 if unknown */
> +};
> +
> +typedef int (*precopy_policy_t)(struct precopy_stats, void *);
> +
> /* callbacks provided by xc_domain_save */
> struct save_callbacks {
> /* Called after expiration of checkpoint interval,
> @@ -46,7 +56,20 @@ struct save_callbacks {
> */
> int (*suspend)(void* data);
>
> - /* Called after the guest's dirty pages have been
> + /*
> + * Called after every batch of page data sent during the precopy
> + * phase of a live migration to ask the caller what to do next
> + * based on the current state of the precopy migration.
> + */
> +#define XGS_POLICY_ABORT (-1) /* Abandon the migration entirely
> + * and tidy up. */
> +#define XGS_POLICY_CONTINUE_PRECOPY 0 /* Remain in the precopy
> phase. */
> +#define XGS_POLICY_STOP_AND_COPY 1 /* Immediately suspend and
> transmit the
> + * remaining dirty pages. */
> + precopy_policy_t precopy_policy;
> +
> + /*
> + * Called after the guest's dirty pages have been
> * copied into an output buffer.
> * Callback function resumes the guest & the device model,
> * returns to xc_domain_save.
> @@ -55,7 +78,8 @@ struct save_callbacks {
> */
> int (*postcopy)(void* data);
>
> - /* Called after the memory checkpoint has been flushed
> + /*
> + * Called after the memory checkpoint has been flushed
> * out into the network. Typical actions performed in this
> * callback include:
> * (a) send the saved device model state (for HVM guests),
> @@ -65,7 +89,8 @@ struct save_callbacks {
> *
> * returns:
> * 0: terminate checkpointing gracefully
> - * 1: take another checkpoint */
> + * 1: take another checkpoint
> + */
> int (*checkpoint)(void* data);
>
> /*
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index a83f22a..3635704 100644
> --- a/tools/libxc/xc_sr_common.h
> +++ b/tools/libxc/xc_sr_common.h
> @@ -198,12 +198,10 @@ struct xc_sr_context
> /* Further debugging information in the stream. */
> bool debug;
>
> - /* Parameters for tweaking live migration. */
> - unsigned max_iterations;
> - unsigned dirty_threshold;
> -
> unsigned long p2m_size;
>
> + struct precopy_stats stats;
> +
> xen_pfn_t *batch_pfns;
> unsigned nr_batch_pfns;
> unsigned long *deferred_pages;
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 1e7502d..f58c008 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -452,8 +452,7 @@ static int update_progress_string(struct
> xc_sr_context *ctx,
> xc_interface *xch = ctx->xch;
> char *new_str = NULL;
>
> - if ( asprintf(&new_str, "Frames iteration %u of %u",
> - iter, ctx->save.max_iterations) == -1 )
> + if ( asprintf(&new_str, "Frames iteration %u", iter) == -1 )
> {
> PERROR("Unable to allocate new progress string");
> return -1;
> @@ -467,6 +466,24 @@ static int update_progress_string(struct
> xc_sr_context *ctx,
> }
>
> /*
> + * This is the live migration precopy policy - it's called periodically during
> + * the precopy phase of live migrations, and is responsible for deciding
> when
> + * the precopy phase should terminate and what should be done next.
> + *
> + * The policy implemented here behaves identically to the policy previously
> + * hard-coded into xc_domain_save() - it proceeds to the stop-and-copy
> phase of
> + * the live migration when there are either fewer than 50 dirty pages, or
> more
> + * than 5 precopy rounds have completed.
> + */
> +static int simple_precopy_policy(struct precopy_stats stats, void *user)
> +{
> + return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
> + stats.iteration >= 5)
> + ? XGS_POLICY_STOP_AND_COPY
> + : XGS_POLICY_CONTINUE_PRECOPY;
> +}
> +
> +/*
> * Send memory while guest is running.
> */
> static int send_memory_live(struct xc_sr_context *ctx)
> @@ -474,21 +491,58 @@ static int send_memory_live(struct xc_sr_context
> *ctx)
> xc_interface *xch = ctx->xch;
> xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
> char *progress_str = NULL;
> - unsigned x;
> + unsigned int x = 0;
> int rc;
> + int policy_decision;
> +
> + DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> + &ctx->save.dirty_bitmap_hbuf);
> +
> + precopy_policy_t precopy_policy = ctx->save.callbacks->precopy_policy;
> + void *data = ctx->save.callbacks->data;
> +
> + struct precopy_stats *policy_stats;
>
> rc = update_progress_string(ctx, &progress_str, 0);
> if ( rc )
> goto out;
>
> - rc = send_all_pages(ctx);
> - if ( rc )
> - goto out;
> + ctx->save.stats = (struct precopy_stats)
> + { .dirty_count = ctx->save.p2m_size };
> + policy_stats = &ctx->save.stats;
> +
> + if ( precopy_policy == NULL )
> + precopy_policy = simple_precopy_policy;
> +
> + bitmap_set(dirty_bitmap, ctx->save.p2m_size);
> +
> + do {
> + policy_decision = precopy_policy(*policy_stats, data);
> + x++;
> +
> + if ( stats.dirty_count > 0 && policy_decision != XGS_POLICY_ABORT )
> + {
> + rc = update_progress_string(ctx, &progress_str, x);
> + if ( rc )
> + goto out;
> +
> + rc = send_dirty_pages(ctx, stats.dirty_count);
> + if ( rc )
> + goto out;
> + }
> +
> + if ( policy_decision != XGS_POLICY_CONTINUE_PRECOPY )
> + break;
> +
> + policy_stats->iteration = x;
> + policy_stats->total_written += policy_stats->dirty_count;
> + policy_stats->dirty_count = -1;
> +
> + policy_decision = precopy_policy(*policy_stats, data);
> +
> + if ( policy_decision != XGS_POLICY_CONTINUE_PRECOPY )
> + break;
>
> - for ( x = 1;
> - ((x < ctx->save.max_iterations) &&
> - (stats.dirty_count > ctx->save.dirty_threshold)); ++x )
> - {
> if ( xc_shadow_control(
> xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
> &ctx->save.dirty_bitmap_hbuf, ctx->save.p2m_size,
> @@ -499,17 +553,9 @@ static int send_memory_live(struct xc_sr_context
> *ctx)
> goto out;
> }
>
> - if ( stats.dirty_count == 0 )
> - break;
> + policy_stats->dirty_count = stats.dirty_count;
>
> - rc = update_progress_string(ctx, &progress_str, x);
> - if ( rc )
> - goto out;
> -
> - rc = send_dirty_pages(ctx, stats.dirty_count);
> - if ( rc )
> - goto out;
> - }
> + } while ( true );
I'm sure any compiler worth its salt will optimise to an unconditional jump, but I tend to prefer using for (;;) for infinite loops. Doesn't really matter so...
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
>
> out:
> xc_set_progress_prefix(xch, NULL);
> @@ -601,7 +647,7 @@ static int suspend_and_send_dirty(struct
> xc_sr_context *ctx)
> if ( ctx->save.live )
> {
> rc = update_progress_string(ctx, &progress_str,
> - ctx->save.max_iterations);
> + ctx->save.stats.iteration);
> if ( rc )
> goto out;
> }
> @@ -937,15 +983,6 @@ int xc_domain_save(xc_interface *xch, int io_fd,
> uint32_t dom,
> stream_type == XC_MIG_STREAM_REMUS ||
> stream_type == XC_MIG_STREAM_COLO);
>
> - /*
> - * TODO: Find some time to better tweak the live migration algorithm.
> - *
> - * These parameters are better than the legacy algorithm especially for
> - * busy guests.
> - */
> - ctx.save.max_iterations = 5;
> - ctx.save.dirty_threshold = 50;
> -
> /* Sanity checks for callbacks. */
> if ( hvm )
> assert(callbacks->switch_qemu_logdirty);
> --
> 1.8.3.1
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
next prev parent reply other threads:[~2017-09-20 8:35 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-19 18:06 [PATCH v2 0/3] Introduce migration precopy policy Jennifer Herbert
2017-09-19 18:06 ` [PATCH v2 1/3] Tidy libxc xc_domain_save Jennifer Herbert
2017-09-19 18:06 ` [PATCH v2 2/3] Introduce migration precopy policy Jennifer Herbert
2017-09-20 8:35 ` Paul Durrant [this message]
2017-09-20 10:20 ` Roger Pau Monné
2017-09-20 16:18 ` Jennifer Herbert
2017-09-21 11:08 ` Roger Pau Monné
2017-09-21 11:13 ` Wei Liu
2017-09-21 14:44 ` Roger Pau Monné
2017-09-19 18:06 ` [PATCH v2 3/3] RFC: migration: defer precopy policy to libxl Jennifer Herbert
2017-09-20 10:37 ` Roger Pau Monné
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=03ce7ef17d89418cb1af913309bb9233@AMSPEX02CL03.citrite.net \
--to=paul.durrant@citrix.com \
--cc=Ian.Jackson@citrix.com \
--cc=jennifer.herbert@citrix.com \
--cc=jtotto@uwaterloo.ca \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).