From: Ian Campbell <Ian.Campbell@citrix.com>
To: "rshriram@cs.ubc.ca" <rshriram@cs.ubc.ca>
Cc: "brendan@cs.ubc.ca" <brendan@cs.ubc.ca>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Subject: Re: [PATCH 1 of 2 V3] libxl: Remus - suspend/postflush/commit callbacks
Date: Thu, 9 Feb 2012 12:38:45 +0000
Message-ID: <1328791125.6133.155.camel@zakaz.uk.xensource.com>
In-Reply-To: <90e59c643c00c079996e.1328252414@athos.nss.cs.ubc.ca>

On Fri, 2012-02-03 at 07:00 +0000, rshriram@cs.ubc.ca wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@cs.ubc.ca>
> # Date 1328251593 28800
> # Node ID 90e59c643c00c079996e13b75f89d1f0cd931a02
> # Parent  c7abecc14cceb18140335ebe20faad826282cd1f
> libxl: Remus - suspend/postflush/commit callbacks
> 
>  * Add libxl callback functions for Remus checkpoint suspend, postflush
>    (aka resume) and checkpoint commit callbacks.
>  * suspend callback is a stub that just bounces off
>    libxl__domain_suspend_common_callback - which suspends the domain and
>    saves the devices model state to a file.
>  * resume callback currently just resumes the domain (and the device model).
>  * commit callback just writes out the saved device model state to the
>    network and sleeps for the checkpoint interval.
>  * Introduce a new public API, libxl_domain_remus_start (currently a stub)
>    that sets up the network and disk buffer and initiates continuous
>    checkpointing.
> 
>  * Future patches will augument these callbacks/functions with more functionalities

                        "augment"

>    like issuing network buffer plug/unplug commands, disk checkpoint commands, etc.
> 
> Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> 
> diff -r c7abecc14cce -r 90e59c643c00 tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c	Thu Feb 02 22:46:33 2012 -0800
> +++ b/tools/libxl/libxl.c	Thu Feb 02 22:46:33 2012 -0800
> @@ -471,6 +471,41 @@ libxl_vminfo * libxl_list_vm(libxl_ctx *
>      return ptr;
>  }
>  
> +/* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
> +int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> +                             uint32_t domid, int send_fd, int recv_fd)
> +{
> +    GC_INIT(ctx);
> +    libxl_domain_type type = libxl__domain_type(gc, domid);
> +    int rc = 0;
> +
> +    if (info == NULL) {
> +        LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> +                   "No remus_info structure supplied for domain %d", domid);
> +        rc = ERROR_INVAL;
> +        goto remus_fail;
> +    }
> +
> +    /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */

Is it worth checking that the domain has no disks or network devices?
(IOW, is this dangerous if it does?)
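
If the answer is yes, a guard along these lines might be enough until the
real setup code exists. This is only a sketch: the function name, the error
constant and its value are hypothetical, and it takes the device counts as
plain arguments because how they would be obtained from libxl is left out
here.

```c
#include <assert.h>

/* Hypothetical stand-in for libxl's ERROR_INVAL; the value is made up. */
#define SKETCH_ERROR_INVAL (-1)

/*
 * Refuse to start checkpointing while disk/network buffering is still
 * unimplemented and the domain has devices whose I/O would go unbuffered.
 * Returns 0 if it is safe to proceed, SKETCH_ERROR_INVAL otherwise.
 */
int remus_precheck_sketch(int nr_disks, int nr_nics)
{
    assert(nr_disks >= 0 && nr_nics >= 0);

    if (nr_disks > 0 || nr_nics > 0)
        return SKETCH_ERROR_INVAL;

    return 0;
}
```

The check would sit where the "TBD: Remus setup" comment is now and could be
dropped once the qdisc and disk buffering are actually attached.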

[...]
> @@ -791,7 +837,27 @@ int libxl__domain_suspend_common(libxl__
>      }
>  
>      memset(&callbacks, 0, sizeof(callbacks));
> -    callbacks.suspend = libxl__domain_suspend_common_callback;
> +    if (r_info != NULL) {
> +        /* save_callbacks:
> +         * suspend - called after expiration of checkpoint interval,
> +         *           to *suspend* the domain.
> +         *
> +         * postcopy - called after the domain's dirty pages have been
> +         *            copied into an output buffer. We *resume* the domain
> +         *            & the device model, return to the caller. Caller then
> +         *            flushes the output buffer, while the domain continues to run.
> +         *
> +         * checkpoint - called after the memory checkpoint has been flushed out
> +         *              into the network. Send the saved device state, *wait*
> +         *              for checkpoint ack and *release* the network buffer (TBD).
> +         *              Then *sleep* for the checkpoint interval.
> +         */

I think this comment would be more useful in xenguest.h next to the
callback struct.
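
A rough sketch of how that could look, with each part of the comment moved
onto the member it describes. It is illustrative only: the struct below is a
hypothetical stand-in rather than the actual definition in xenguest.h, and
the return convention (nonzero means success/continue) is an assumption. The
small driver and the demo callbacks just show the order in which the three
hooks would fire during one checkpoint round.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the callback struct; not the real xenguest.h
 * definition, whose names and signatures may differ. */
struct remus_callbacks_sketch {
    /* Called after expiration of the checkpoint interval, to *suspend*
     * the domain. Assumed to return nonzero on success. */
    int (*suspend)(void *data);

    /* Called after the domain's dirty pages have been copied into an
     * output buffer. *Resume* the domain and the device model, then
     * return; the caller flushes the buffer while the domain runs. */
    int (*postcopy)(void *data);

    /* Called after the memory checkpoint has been flushed out to the
     * network. Send the saved device state, *wait* for the checkpoint
     * ack and *release* the network buffer (TBD), then *sleep* for the
     * checkpoint interval. Assumed: nonzero means "take another
     * checkpoint", zero means "stop". */
    int (*checkpoint)(void *data);

    void *data; /* opaque pointer handed back to every callback */
};

/* Run one checkpoint round in the order described above.
 * Returns nonzero to continue, 0 to stop or on failure. */
int run_one_checkpoint(const struct remus_callbacks_sketch *cb)
{
    assert(cb != NULL);

    if (!cb->suspend(cb->data))
        return 0;
    /* ... dirty pages are copied into the output buffer here ... */
    if (!cb->postcopy(cb->data))
        return 0;
    /* ... the output buffer is flushed to the network here ... */
    return cb->checkpoint(cb->data);
}

/* Demo callbacks that only record the order in which they were called. */
struct trace {
    char order[4];
    int n;
};

void record(void *data, char tag)
{
    struct trace *t = data;
    if (t->n < 3)
        t->order[t->n++] = tag;
}

int demo_suspend(void *data)    { record(data, 's'); return 1; }
int demo_postcopy(void *data)   { record(data, 'p'); return 1; }
int demo_checkpoint(void *data) { record(data, 'c'); return 1; }
```

Filling a struct with the three demo callbacks and a zeroed trace, then
calling run_one_checkpoint() once, records suspend, postcopy and checkpoint
in that order.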

Otherwise the patch looks good.

Ian.
