xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lars Kurth <lars.kurth@citrix.com>,
	Changlong Xie <xiecl.fnst@cn.fujitsu.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	xen devel <xen-devel@lists.xen.org>,
	Dong Eddie <eddie.dong@intel.com>,
	Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Shriram Rajagopalan <rshriram@cs.ubc.ca>,
	Yang Hongyang <hongyang.yang@easystack.cn>
Subject: Re: [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc
Date: Mon, 25 Jan 2016 14:41:47 -0500	[thread overview]
Message-ID: <20160125194147.GR14977@char.us.oracle.com> (raw)
In-Reply-To: <1451442548-26974-13-git-send-email-wency@cn.fujitsu.com>

On Wed, Dec 30, 2015 at 10:29:02AM +0800, Wen Congyang wrote:
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical.  After some time, they fall out of
> sync.
> 
> At this point, the two VMs have definitely diverged.  Lets call the
> primary dirty bitmap set A, while the secondary dirty bitmap set B.
> 
> Sets A and B are different.
> 
> Under normal migration, the page data for set A will be sent form the

s/form/from/

> primary to the secondary.
> 
> However, the set difference B - A (lets call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary.  The secondary

s/primary/primary (to secondary)/

> needs the page data for C to reconstruct an exact copy of the primary at

s/the page data/C page data/

> the checkpoint.
> 
> The secondary cannot calculate C as it doesn't know A.  Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (lets call this D) which is all the
> pages dirtied by both the primary and the secondary, and sends all page
> data covered by D.

You could invert this - the primary could send A to secondary? I presume
this non-optimal as the 'A' set is much much bigger than 'C' set?

It may be good to include this in the commit description.

> 
> In the general case, D is a superset of both A and B.  Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.
> 
> We transfer the dirty bitmap on libxc side, so we need to introduce back
> channel to libxc.

> 
> Note: it is different from the paper. We change the original design to
> the current one, according to our following concerns:
> 1. The original design needs extra memory on Secondary host. When there's
>    multiple backups on one host, the memory cost is high.
> 2. The memory cache code will be another 1k+, it will make the review
>    more time consuming.

Well, that 2) is a very good reason :-)
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> commit message:

? Huh?

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

.. snip..
> index 05159bb..d4dc501 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -722,7 +722,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        unsigned long *console_gfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
>                        int checkpointed_stream,
> -                      struct restore_callbacks *callbacks)
> +                      struct restore_callbacks *callbacks, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 8ffd71d..a49d083 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -824,7 +824,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
>                     uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>                     struct save_callbacks* callbacks, int hvm,
> -                   int checkpointed_stream)
> +                   int checkpointed_stream, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {


But where is the code?

Or is that suppose to be done in another patch? If so you may want to
mention that in the commit description?

  reply	other threads:[~2016-01-25 19:41 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
2015-12-30  2:28 ` [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback Wen Congyang
2016-01-25 17:29   ` Konrad Rzeszutek Wilk
2015-12-30  2:28 ` [PATCH v6 02/18] tools/libxl: move remus code into libxl_remus.c Wen Congyang
2015-12-30  2:28 ` [PATCH v6 03/18] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
2015-12-30  2:28 ` [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
2016-01-25 17:29   ` Konrad Rzeszutek Wilk
2016-01-26  2:23     ` Wen Congyang
2016-01-26 14:32       ` Konrad Rzeszutek Wilk
2015-12-30  2:28 ` [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
2016-01-25 18:21   ` Konrad Rzeszutek Wilk
2016-01-26  2:53     ` Wen Congyang
2015-12-30  2:28 ` [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
2016-01-25 18:30   ` Konrad Rzeszutek Wilk
2015-12-30  2:28 ` [PATCH v6 07/18] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
2015-12-30  2:28 ` [PATCH v6 08/18] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Wen Congyang
2015-12-30  2:28 ` [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Wen Congyang
2016-01-25 18:59   ` Konrad Rzeszutek Wilk
2016-01-26  7:04     ` Wen Congyang
2016-01-26 14:27       ` Konrad Rzeszutek Wilk
2016-01-27  0:53         ` Wen Congyang
2016-01-27  0:55           ` Wen Congyang
2016-01-27  2:06         ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 10/18] tools/libxl: export logdirty_init Wen Congyang
2016-01-25 19:01   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back Wen Congyang
2016-01-25 19:17   ` Konrad Rzeszutek Wilk
2016-01-26  7:48     ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc Wen Congyang
2016-01-25 19:41   ` Konrad Rzeszutek Wilk [this message]
2016-01-26  8:03     ` Wen Congyang
2016-01-26 14:29       ` Konrad Rzeszutek Wilk
2016-01-27  0:52         ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device Wen Congyang
2016-01-25 19:42   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 14/18] tools/libxl: fix backword compatibility after the automatic renaming Wen Congyang
2015-12-30  2:29 ` [PATCH v6 15/18] tools/libxl: adjust the indentation Wen Congyang
2016-01-25 19:44   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
2016-01-25 19:55   ` Konrad Rzeszutek Wilk
2016-01-26  8:07     ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure Wen Congyang
2016-01-25 19:59   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
2016-01-25 20:01   ` Konrad Rzeszutek Wilk
2016-01-25 17:12 ` [PATCH v6 00/18] Prerequisite patches for COLO Konrad Rzeszutek Wilk
2016-01-25 20:06   ` Konrad Rzeszutek Wilk
2016-01-26  3:18     ` Wen Congyang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160125194147.GR14977@char.us.oracle.com \
    --to=konrad.wilk@oracle.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=eddie.dong@intel.com \
    --cc=guijianfeng@cn.fujitsu.com \
    --cc=hongyang.yang@easystack.cn \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=lars.kurth@citrix.com \
    --cc=rshriram@cs.ubc.ca \
    --cc=wei.liu2@citrix.com \
    --cc=wency@cn.fujitsu.com \
    --cc=xen-devel@lists.xen.org \
    --cc=xiecl.fnst@cn.fujitsu.com \
    --cc=yunhong.jiang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).