From: Hongyang Yang <yanghy@cn.fujitsu.com>
To: rshriram@cs.ubc.ca
Cc: ian.campbell@citrix.com, wency@cn.fujitsu.com,
stefano.stabellini@eu.citrix.com, ian.jackson@eu.citrix.com,
Jiang Yunhong <yunhong.jiang@intel.com>,
eddie.dong@intel.com, xen-devel@lists.xen.org,
andrew.cooper3@citrix.com, laijs@cn.fujitsu.com,
Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH v13 3/7] remus: introduce remus device
Date: Fri, 27 Jun 2014 11:40:31 +0800
Message-ID: <53ACE7AF.1080408@cn.fujitsu.com>
In-Reply-To: <CAP8mzPPUHkLOks+ptRd_pZ6WcHwvjg6Uv1PXjnzOdFPE4kcJsw@mail.gmail.com>
On 06/27/2014 11:29 AM, Shriram Rajagopalan wrote:
>
> On Jun 27, 2014 7:29 AM, "Yang Hongyang" <yanghy@cn.fujitsu.com> wrote:
> >
> > Introduce remus device, an abstraction layer for remus devices (nic, disk,
> > etc.). It provides the following APIs for libxl:
> > >libxl__remus_device_setup
> > setup remus devices, like attach qdisc, enable disk buffering, etc
> > >libxl__remus_device_teardown
> > teardown devices
> > >libxl__remus_device_postsuspend
> > >libxl__remus_device_preresume
> > >libxl__remus_device_commit
> > above three are for checkpoint.
> > through remus device layer, the remus execution flow will be like
> > this:
> > xl remus -> remus device setup
> > |-> remus checkpoint(postsuspend, preresume, commit)
> > ...
> > |-> remus device teardown, failover or abort
> > the remus device layer provides an interface
> > libxl__remus_device_ops
> > which a remus device must implement. the whole remus structure:
> > |remus|
> > |
> > |remus device|
> > |
> > |nic| |drbd disks| |qemu disks| ...
> > a device(nic, drbd disks, qemu disks, etc) must implement
> > libxl__remus_device_ops to support remus.
> >
> > Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> > ---
> > tools/libxl/Makefile | 2 +
> > tools/libxl/libxl.c | 34 +++-
> > tools/libxl/libxl_dom.c | 132 +++++++++++++--
> > tools/libxl/libxl_internal.h | 182 +++++++++++++++++++++
> > tools/libxl/libxl_remus_device.c | 340 +++++++++++++++++++++++++++++++++++++++
> > tools/libxl/libxl_types.idl | 1 +
> > 6 files changed, 675 insertions(+), 16 deletions(-)
> > create mode 100644 tools/libxl/libxl_remus_device.c
> >
> > diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> > index fdffff3..cb2efdf 100644
> > --- a/tools/libxl/Makefile
> > +++ b/tools/libxl/Makefile
> > @@ -56,6 +56,8 @@ else
> > LIBXL_OBJS-y += libxl_nonetbuffer.o
> > endif
> >
> > +LIBXL_OBJS-y += libxl_remus_device.o
> > +
> > LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o
> > LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o
> >
> > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > index 62e251a..f99477d 100644
> > --- a/tools/libxl/libxl.c
> > +++ b/tools/libxl/libxl.c
> > @@ -733,6 +733,31 @@ out:
> > static void remus_failover_cb(libxl__egc *egc,
> > libxl__domain_suspend_state *dss, int rc);
> >
> > +static void libxl__remus_setup_failed(libxl__egc *egc,
> > + libxl__remus_state *rs, int rc)
> > +{
> > + STATE_AO_GC(rs->ao);
> > + libxl__ao_complete(egc, ao, rc);
> > +}
> > +
> > +static void libxl__remus_setup_done(libxl__egc *egc,
> > + libxl__remus_state *rs, int rc)
> > +{
> > + libxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> > + STATE_AO_GC(rs->ao);
> > +
> > + if (!rc) {
> > + libxl__domain_suspend(egc, dss);
> > + return;
> > + }
> > +
> > + LOG(ERROR, "Remus: failed to setup device for guest with domid %u",
> > + dss->domid);
> > + rs->saved_rc = rc;
> > + rs->callback = libxl__remus_setup_failed;
> > + libxl__remus_device_teardown(egc, rs);
> > +}
> > +
> > /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
> > int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> > uint32_t domid, int send_fd, int recv_fd,
> > @@ -761,10 +786,15 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> >
> > assert(info);
> >
> > - /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
> > + /* Convenience aliases */
> > + libxl__remus_state *const rs = &dss->rs;
> > + rs->ao = ao;
> > + rs->domid = domid;
> > + rs->saved_rc = 0;
> > + rs->callback = libxl__remus_setup_done;
> >
> > /* Point of no return */
> > - libxl__domain_suspend(egc, dss);
> > + libxl__remus_device_setup(egc, rs);
> > return AO_INPROGRESS;
> >
> > out:
> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> > index c11993d..dde8bf6 100644
> > --- a/tools/libxl/libxl_dom.c
> > +++ b/tools/libxl/libxl_dom.c
> > @@ -1426,6 +1426,17 @@ static void libxl__domain_suspend_callback(void *data)
> > domain_suspend_callback_common(egc, dss);
> > }
> >
> > +static void remus_device_postsuspend_cb(libxl__egc *egc,
> > + libxl__remus_state *rs, int rc)
> > +{
> > + int ok = 0;
> > + libxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> > +
> > + if (!rc)
> > + ok = 1;
> > + libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
> > +}
> > +
> > static void domain_suspend_callback_common_done(libxl__egc *egc,
> > libxl__domain_suspend_state *dss, int ok)
> > {
> > @@ -1447,32 +1458,51 @@ static void libxl__remus_domain_suspend_callback(void *data)
> > static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
> > libxl__domain_suspend_state *dss, int ok)
> > {
> > - /* REMUS TODO: Issue disk and network checkpoint reqs. */
> > + if (!ok)
> > + goto out;
> > +
> > + libxl__remus_state *const rs = &dss->rs;
> > + rs->callback = remus_device_postsuspend_cb;
> > + libxl__remus_device_postsuspend(egc, rs);
> > + return;
> > +
> > +out:
> > libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
> > }
> >
> > -static void libxl__remus_domain_resume_callback(void *data)
> > +static void remus_device_preresume_cb(libxl__egc *egc,
> > + libxl__remus_state *rs, int rc)
> > {
> > int ok = 0;
> > + libxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> > + STATE_AO_GC(dss->ao);
> > +
> > + if (!rc) {
> > + /* Resumes the domain and the device model */
> > + if (!libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
> > + ok = 1;
> > + }
> > + libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
> > +}
> > +
> > +static void libxl__remus_domain_resume_callback(void *data)
> > +{
> > libxl__save_helper_state *shs = data;
> > libxl__egc *egc = shs->egc;
> > libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
> > STATE_AO_GC(dss->ao);
> >
> > - /* Resumes the domain and the device model */
> > - if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
> > - goto out;
> > -
> > - /* REMUS TODO: Deal with disk. Start a new network output buffer */
> > - ok = 1;
> > -out:
> > - libxl__xc_domain_saverestore_async_callback_done(egc, shs, ok);
> > + libxl__remus_state *const rs = &dss->rs;
> > + rs->callback = remus_device_preresume_cb;
> > + libxl__remus_device_preresume(egc, rs);
> > }
> >
> > /*----- remus asynchronous checkpoint callback -----*/
> >
> > static void remus_checkpoint_dm_saved(libxl__egc *egc,
> > libxl__domain_suspend_state *dss, int rc);
> > +static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
> > + const struct timeval *requested_abs);
> >
> > static void libxl__remus_domain_checkpoint_callback(void *data)
> > {
> > @@ -1489,13 +1519,67 @@ static void libxl__remus_domain_checkpoint_callback(void *data)
> > }
> > }
> >
> > +static void remus_device_commit_cb(libxl__egc *egc,
> > + libxl__remus_state *rs, int rc)
> > +{
> > + libxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> > +
> > + STATE_AO_GC(dss->ao);
> > +
> > + if (rc) {
> > + LOG(ERROR, "Failed to do device commit op."
> > + " Terminating Remus..");
> > + goto out;
> > + } else {
> > + /* Set checkpoint interval timeout */
> > + rc = libxl__ev_time_register_rel(gc, &rs->timeout,
> > + remus_next_checkpoint,
> > + dss->interval);
> > + if (rc) {
> > + LOG(ERROR, "unable to register timeout for next epoch."
> > + " Terminating Remus..");
> > + goto out;
> > + }
> > + }
> > + return;
> > +
> > +out:
> > + libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
> > +}
> > +
> > static void remus_checkpoint_dm_saved(libxl__egc *egc,
> > libxl__domain_suspend_state *dss, int rc)
> > {
> > - /* REMUS TODO: Wait for disk and memory ack, release network buffer */
> > - /* REMUS TODO: make this asynchronous */
> > - assert(!rc); /* REMUS TODO handle this error properly */
> > - usleep(dss->interval * 1000);
> > + /* Convenience aliases */
> > + libxl__remus_state *const rs = &dss->rs;
> > +
> > + STATE_AO_GC(dss->ao);
> > +
> > + if (rc) {
> > + LOG(ERROR, "Failed to save device model. Terminating Remus..");
> > + goto out;
> > + }
> > +
> > + rs->callback = remus_device_commit_cb;
> > + libxl__remus_device_commit(egc, rs);
> > +
> > + return;
> > +
> > +out:
> > + libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
> > +}
> > +
> > +static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
> > + const struct timeval *requested_abs)
> > +{
> > + libxl__remus_state *rs = CONTAINER_OF(ev, *rs, timeout);
> > +
> > + /* Convenience aliases */
> > + libxl__domain_suspend_state *const dss = CONTAINER_OF(rs, *dss, rs);
> > +
> > + STATE_AO_GC(dss->ao);
> > +
> > + libxl__ev_time_deregister(gc, &rs->timeout);
> > libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
> > }
> >
> > @@ -1720,6 +1804,13 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
> > dss->save_dm_callback(egc, dss, our_rc);
> > }
> >
> > +static void libxl__remus_teardown_done(libxl__egc *egc,
> > + libxl__remus_state *rs, int rc)
> > +{
> > + libxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> > + dss->callback(egc, dss, rc);
> > +}
> > +
> > static void domain_suspend_done(libxl__egc *egc,
> > libxl__domain_suspend_state *dss, int rc)
> > {
> > @@ -1734,6 +1825,19 @@ static void domain_suspend_done(libxl__egc *egc,
> > xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
> > dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
> >
> > + if (dss->remus) {
> > + /*
> > + * With Remus, if we reach this point, it means either
> > + * backup died or some network error occurred preventing us
> > + * from sending checkpoints. Teardown the network buffers and
> > + * release netlink resources. This is an async op.
> > + */
> > + dss->rs.saved_rc = rc;
> > + dss->rs.callback = libxl__remus_teardown_done;
> > + libxl__remus_device_teardown(egc, &dss->rs);
> > + return;
> > + }
> > +
> > dss->callback(egc, dss, rc);
> > }
> >
> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> > index 3fc90e2..5521a42 100644
> > --- a/tools/libxl/libxl_internal.h
> > +++ b/tools/libxl/libxl_internal.h
> > @@ -2470,6 +2470,187 @@ typedef struct libxl__save_helper_state {
> > * marshalling and xc callback functions */
> > } libxl__save_helper_state;
> >
> > +/*----- remus device related state structure -----*/
> > +/* remus device is an abstraction layer for remus devices (nic, disk,
> > + * etc.). It provides the following APIs for libxl:
> > + * >libxl__remus_device_setup
> > + * setup remus devices, like attach qdisc, enable disk buffering, etc
> > + * >libxl__remus_device_teardown
> > + * teardown devices
> > + * >libxl__remus_device_postsuspend
> > + * >libxl__remus_device_preresume
> > + * >libxl__remus_device_commit
> > + * above three are for checkpoint.
> > + * through remus device layer, the remus execution flow will be like
> > + * this:
> > + * xl remus -> remus device setup
> > + * |-> remus checkpoint(postsuspend, preresume, commit)
> > + * ...
> > + * |-> remus device teardown, failover or abort
> > + * the remus device layer provides an interface
> > + * libxl__remus_device_ops
> > + * which a remus device must implement. the whole remus structure:
> > + * |remus|
> > + * |
> > + * |remus device|
> > + * |
> > + * |nic| |drbd disks| |qemu disks| ...
> > + * a device(nic, drbd disks, qemu disks, etc) must implement
> > + * libxl__remus_device_ops to support remus.
> > + */
> > +
> > +typedef enum libxl__remus_device_kind {
> > + LIBXL__REMUS_DEVICE_NIC,
> > + LIBXL__REMUS_DEVICE_DISK,
> > +} libxl__remus_device_kind;
> > +
> > +typedef struct libxl__remus_state libxl__remus_state;
> > +typedef struct libxl__remus_device libxl__remus_device;
> > +typedef struct libxl__remus_device_state libxl__remus_device_state;
> > +typedef struct libxl__remus_device_ops libxl__remus_device_ops;
> > +
> > +struct libxl__remus_device_ops {
> > + /*
> > + * init() and destroy() APIs are produced by a device type and
> > + * consumed by the main remus code, a device type must implement
> > + * these two APIs.
> > + */
> > + /* init device ops private data, etc. must implement */
> > + int (*init)(libxl__remus_device_ops *self,
> > + libxl__remus_state *rs);
> > + /* free device ops private data, etc. must implement */
> > + void (*destroy)(libxl__remus_device_ops *self);
> > + /*
> > + * This is device ops's private data, for different device types,
> > + * the data structs are different
> > + */
> > + void *data;
> > +
> > + /*
> > + * checkpoint callbacks, these are async ops, call dev->callback
> > + * when done. These function pointers may be NULL, means the op is
> > + * not implemented, and it will do nothing when checkpoint.
> > + * The callers of these APIs must check the function pointer first.
> > + * These callbacks can be implemented synchronously, call
> > + * dev->callback at last directly.
> > + */
> > + void (*postsuspend)(libxl__remus_device *dev);
> > + void (*preresume)(libxl__remus_device *dev);
> > + void (*commit)(libxl__remus_device *dev);
> > +
> > + /*
> > + * This API determines whether the ops matches a specific device. In the
> > + * implementation, we first init all device ops, for example, NIC ops,
> > + * DRBD ops ... Then we find the libxl devices and match each device
> > + * against the ops. If the device is a drbd disk, it will be matched
> > + * with DRBD ops, and further ops (such as checkpoint ops) of this
> > + * device will use DRBD ops. This API is mainly for disks,
> > + * because we must use an external script to determine whether a
> > + * libxl_disk is a DRBD disk. A device type must implement this API.
> > + * It's an async op and must be implemented asynchronously,
> > + * call dev->callback when done.
> > + */
> > + void (*match)(libxl__remus_device_ops *self,
> > + libxl__remus_device *dev);
> > +
> > + /*
> > + * setup() and teardown() refer to the actual remus device,
> > + * a device type must implement these two APIs. They are async
> > + * ops, and call dev->callback when done.
> > + * These callbacks can be implemented synchronously, call
> > + * dev->callback at last directly.
> > + */
> > + /* setup the remus device */
> > + void (*setup)(libxl__remus_device *dev);
> > +
> > + /* teardown the remus device */
> > + void (*teardown)(libxl__remus_device *dev);
> > +};
> > +
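For readers following the ops contract described in the comments above (checkpoint ops may be NULL, callers must check the pointer first, and an op may complete synchronously by calling dev->callback as its last step), here is a minimal standalone sketch. It does not use libxl; every name in it (ckpt_ops, ckpt_dev, buf_commit, run_commit, and so on) is hypothetical and only mirrors the shape of libxl__remus_device_ops.

```c
#include <stddef.h>

struct ckpt_dev;
typedef void ckpt_cb(struct ckpt_dev *dev, int rc);

/* Hypothetical stand-in for the checkpoint part of the ops table. */
struct ckpt_ops {
    void (*postsuspend)(struct ckpt_dev *dev); /* may be NULL */
    void (*preresume)(struct ckpt_dev *dev);   /* may be NULL */
    void (*commit)(struct ckpt_dev *dev);      /* may be NULL */
};

struct ckpt_dev {
    const struct ckpt_ops *ops;
    ckpt_cb *callback; /* set by the layer before it invokes an op */
    int commits;       /* device-private: how many commits ran */
    int last_rc;       /* last rc delivered through callback */
};

/* A device type that only cares about commit and finishes synchronously:
 * it does its work, then calls dev->callback as the last step. */
static void buf_commit(struct ckpt_dev *dev)
{
    dev->commits++;
    dev->callback(dev, 0);
}

static const struct ckpt_ops buf_ops = { NULL, NULL, buf_commit };

/* Example completion callback: just remember the rc. */
static void note_rc(struct ckpt_dev *dev, int rc)
{
    dev->last_rc = rc;
}

/* Layer-side dispatch: check the pointer first. Returns 1 if the op ran
 * (and will report through cb), 0 if the device has no such op. */
static int run_postsuspend(struct ckpt_dev *dev, ckpt_cb *cb)
{
    dev->callback = cb;
    if (!dev->ops->postsuspend)
        return 0;
    dev->ops->postsuspend(dev);
    return 1;
}

static int run_commit(struct ckpt_dev *dev, ckpt_cb *cb)
{
    dev->callback = cb;
    if (!dev->ops->commit)
        return 0;
    dev->ops->commit(dev);
    return 1;
}
```

With a device built on buf_ops, run_postsuspend() returns 0 and never touches the callback, while run_commit() runs the op and the callback sees rc 0. In the patch the "op is absent" case is folded into the same completion counter as the callback path rather than returned to the caller.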
> > +/*
> > + * This structure is for the remus device layer; it records remus devices
> > + * that have been set up.
> > + */
> > +struct libxl__remus_device_state {
> > + libxl__ao *ao;
> > + libxl__egc *egc;
> > +
> > + /* devices that have been set up */
> > + libxl__remus_device **dev;
> > +
> > + int num_nics;
> > + int num_disks;
> > +
> > + /* for counting devices that have been handled */
> > + int num_devices;
> > + /* for counting devices that were matched and set up */
> > + int num_setuped;
> > +};
> > +
> > +typedef void libxl__remus_device_callback(libxl__egc *,
> > + libxl__remus_device *,
> > + int rc);
> > +/*
> > + * This structure is initialized and set up by the remus device abstraction
> > + * layer, and passed to remus device ops
> > + */
> > +struct libxl__remus_device {
> > + /* set by remus device abstraction layer */
> > + int devid;
> > + /* libxl__device_* which this remus device related to */
> > + const void *backend_dev;
> > + libxl__remus_device_kind kind;
> > + /*
> > + * This is for matching, we must go through all device ops until we
> > + * find a matched op for the device. The ops_index record which ops
> > + * we are matching.
> > + */
> > + int ops_index;
> > + libxl__remus_device_ops *ops;
> > + libxl__remus_device_callback *callback;
> > + libxl__remus_device_state *rds;
> > +
> > + /* used by remus device implementation */
> > + /* *kind* of device's private data */
> > + void *data;
> > + /* for calling scripts, eg. setup|teardown|match scripts */
> > + libxl__async_exec_state aes;
> > + /*
> > + * for async func calls, in the implementation of device ops, we
> > + * may use fork to do async ops. this is owned by device-specific
> > + * ops methods
> > + */
> > + libxl__ev_child child;
> > +};
> > +
> > +typedef void libxl__remus_callback(libxl__egc *,
> > + libxl__remus_state *, int rc);
> > +
> > +struct libxl__remus_state {
> > + /* must set by caller of libxl__remus_device_(setup|teardown) */
> > + libxl__ao *ao;
> > + uint32_t domid;
> > + libxl__remus_callback *callback;
> > +
> > + /* private */
> > + int saved_rc;
> > + /* context containing device related stuff */
> > + libxl__remus_device_state dev_state;
> > +
> > + libxl__ev_time timeout; /* used for checkpoint */
> > +};
> > +
> > +/* the following 5 APIs are async ops, call rs->callback when done */
> > +_hidden void libxl__remus_device_setup(libxl__egc *egc,
> > + libxl__remus_state *rs);
> > +_hidden void libxl__remus_device_teardown(libxl__egc *egc,
> > + libxl__remus_state *rs);
> > +_hidden void libxl__remus_device_postsuspend(libxl__egc *egc,
> > + libxl__remus_state *rs);
> > +_hidden void libxl__remus_device_preresume(libxl__egc *egc,
> > + libxl__remus_state *rs);
> > +_hidden void libxl__remus_device_commit(libxl__egc *egc,
> > + libxl__remus_state *rs);
> > _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
> >
> > /*----- Domain suspend (save) state structure -----*/
> > @@ -2500,6 +2681,7 @@ struct libxl__domain_suspend_state {
> > int live;
> > int debug;
> > const libxl_domain_remus_info *remus;
> > + libxl__remus_state rs;
> > /* private */
> > libxl__ev_evtchn guest_evtchn;
> > int guest_evtchn_lockfd;
> > diff --git a/tools/libxl/libxl_remus_device.c b/tools/libxl/libxl_remus_device.c
> > new file mode 100644
> > index 0000000..07e298b
> > --- /dev/null
> > +++ b/tools/libxl/libxl_remus_device.c
> > @@ -0,0 +1,340 @@
> > +/*
> > + * Copyright (C) 2014
> > + * Author: Lai Jiangshan <laijs@cn.fujitsu.com>
> > + * Yang Hongyang <yanghy@cn.fujitsu.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU Lesser General Public License as published
> > + * by the Free Software Foundation; version 2.1 only. with the special
> > + * exception on linking described in file LICENSE.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU Lesser General Public License for more details.
> > + */
> > +
> > +#include "libxl_osdeps.h" /* must come before any other headers */
> > +
> > +#include "libxl_internal.h"
> > +
> > +static libxl__remus_device_ops *dev_ops[] = {
> > +};
> > +
> > +static void device_common_cb(libxl__egc *egc,
> > + libxl__remus_device *dev,
> > + int rc)
> > +{
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = dev->rds;
> > + libxl__remus_state *const rs = CONTAINER_OF(rds, *rs, dev_state);
> > +
> > + STATE_AO_GC(rs->ao);
> > +
> > + rds->num_devices++;
> > +
> > + if (rc)
> > + rs->saved_rc = ERROR_FAIL;
> > +
> > + if (rds->num_devices == rds->num_setuped)
> > + rs->callback(egc, rs, rs->saved_rc);
> > +}
> > +
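The completion counting in device_common_cb above can be read in isolation as a small fan-out/fan-in pattern: reset a counter, let every device report once, and fire the layer-level callback when the count reaches the number of devices. The sketch below is a standalone illustration under that reading, not libxl code; the names fanout, fanout_start, fanout_report and record_done are hypothetical. One deliberate difference: it keeps the first non-zero rc, whereas the patch collapses any failure to ERROR_FAIL.

```c
#include <stddef.h>

struct fanout;
typedef void fanout_done_fn(struct fanout *f, int rc);

/* Hypothetical stand-in for the counters kept in libxl__remus_device_state. */
struct fanout {
    int num_total;        /* devices taking part (num_setuped in the patch) */
    int num_handled;      /* devices that have reported (num_devices) */
    int saved_rc;         /* first failure seen, 0 if none */
    fanout_done_fn *done; /* fired once, after the last report */
};

/* Reset the counters before dispatching one op to every device. */
static void fanout_start(struct fanout *f, int num_total, fanout_done_fn *done)
{
    f->num_total = num_total;
    f->num_handled = 0;
    f->saved_rc = 0;
    f->done = done;
}

/* Called once per device, whether it ran an op or had none to run. */
static void fanout_report(struct fanout *f, int rc)
{
    f->num_handled++;
    if (rc && !f->saved_rc)
        f->saved_rc = rc;
    if (f->num_handled == f->num_total)
        f->done(f, f->saved_rc);
}

/* Example completion callback that records what it was told. */
static int done_calls;
static int done_rc;

static void record_done(struct fanout *f, int rc)
{
    (void)f;
    done_calls++;
    done_rc = rc;
}
```

Note that with num_total == 0 nothing ever calls fanout_report(), so done never fires; that appears to be why the patch's postsuspend, preresume and commit entry points special-case num_setuped == 0 and invoke rs->callback directly.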
> > +void libxl__remus_device_postsuspend(libxl__egc *egc, libxl__remus_state *rs)
> > +{
> > + int i;
> > + libxl__remus_device *dev;
> > + STATE_AO_GC(rs->ao);
> > +
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = &rs->dev_state;
> > +
> > + rds->num_devices = 0;
> > + rs->saved_rc = 0;
> > +
> > + if(rds->num_setuped == 0)
> > + goto out;
> > +
> > + for (i = 0; i < rds->num_setuped; i++) {
> > + dev = rds->dev[i];
> > + dev->callback = device_common_cb;
> > + if (dev->ops->postsuspend) {
> > + dev->ops->postsuspend(dev);
> > + } else {
> > + rds->num_devices++;
> > + if (rds->num_devices == rds->num_setuped)
> > + rs->callback(egc, rs, rs->saved_rc);
> > + }
> > + }
> > +
> > + return;
> > +
> > +out:
> > + rs->callback(egc, rs, rs->saved_rc);
> > +}
> > +
> > +void libxl__remus_device_preresume(libxl__egc *egc, libxl__remus_state *rs)
> > +{
> > + int i;
> > + libxl__remus_device *dev;
> > + STATE_AO_GC(rs->ao);
> > +
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = &rs->dev_state;
> > +
> > + rds->num_devices = 0;
> > + rs->saved_rc = 0;
> > +
> > + if(rds->num_setuped == 0)
> > + goto out;
> > +
> > + for (i = 0; i < rds->num_setuped; i++) {
> > + dev = rds->dev[i];
> > + dev->callback = device_common_cb;
> > + if (dev->ops->preresume) {
> > + dev->ops->preresume(dev);
> > + } else {
> > + rds->num_devices++;
> > + if (rds->num_devices == rds->num_setuped)
> > + rs->callback(egc, rs, rs->saved_rc);
> > + }
> > + }
> > +
> > + return;
> > +
> > +out:
> > + rs->callback(egc, rs, rs->saved_rc);
> > +}
> > +
> > +void libxl__remus_device_commit(libxl__egc *egc, libxl__remus_state *rs)
> > +{
> > + int i;
> > + libxl__remus_device *dev;
> > + STATE_AO_GC(rs->ao);
> > +
> > + /*
> > + * REMUS TODO: Wait for disk and explicit memory ack (through restore
> > + * callback from remote) before releasing network buffer.
> > + */
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = &rs->dev_state;
> > +
> > + rds->num_devices = 0;
> > + rs->saved_rc = 0;
> > +
> > + if(rds->num_setuped == 0)
> > + goto out;
> > +
> > + for (i = 0; i < rds->num_setuped; i++) {
> > + dev = rds->dev[i];
> > + dev->callback = device_common_cb;
> > + if (dev->ops->commit) {
> > + dev->ops->commit(dev);
> > + } else {
> > + rds->num_devices++;
> > + if (rds->num_devices == rds->num_setuped)
> > + rs->callback(egc, rs, rs->saved_rc);
> > + }
> > + }
> > +
> > + return;
> > +
> > +out:
> > + rs->callback(egc, rs, rs->saved_rc);
> > +}
> > +
> > +static void device_setup_cb(libxl__egc *egc,
> > + libxl__remus_device *dev,
> > + int rc)
> > +{
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = dev->rds;
> > + libxl__remus_state *const rs = CONTAINER_OF(rds, *rs, dev_state);
> > +
> > + STATE_AO_GC(rs->ao);
> > +
> > + rds->num_devices++;
> > + /*
> > + * We add each device to the array whether its setup succeeded
> > + * or failed, because a device whose setup failed must still be
> > + * torn down. If any device setup fails, we will quit remus,
> > + * but before we exit, we will tear down the devices that have
> > + * been added to **dev
> > + */
> > + rds->dev[rds->num_setuped++] = dev;
> > + if (rc) {
> > + /* setup failed */
> > + rs->saved_rc = ERROR_FAIL;
> > + }
> > +
> > + if (rds->num_devices == (rds->num_nics + rds->num_disks))
> > + rs->callback(egc, rs, rs->saved_rc);
> > +}
> > +
> > +static void device_match_cb(libxl__egc *egc,
> > + libxl__remus_device *dev,
> > + int rc)
> > +{
> > + libxl__remus_device_state *const rds = dev->rds;
> > + libxl__remus_state *rs = CONTAINER_OF(rds, *rs, dev_state);
> > +
> > + STATE_AO_GC(rs->ao);
> > +
> > + if (rc) {
> > + if (++dev->ops_index >= ARRAY_SIZE(dev_ops) ||
> > + rc != ERROR_NOT_MATCH) {
> > + /* the device can not be matched */
> > + rds->num_devices++;
> > + rs->saved_rc = ERROR_FAIL;
> > + if (rds->num_devices == (rds->num_nics + rds->num_disks))
> > + rs->callback(egc, rs, rs->saved_rc);
> > + return;
> > + }
> > + /* the ops does not match, try next ops */
> > + dev->ops = dev_ops[dev->ops_index];
> > + dev->ops->match(dev->ops, dev);
> > + } else {
> > + /* the ops matched, setup the device */
> > + dev->callback = device_setup_cb;
> > + dev->ops->setup(dev);
> > + }
> > +}
> > +
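The matching logic in device_match_cb above walks the dev_ops table until one entry accepts the device, treats ERROR_NOT_MATCH as "try the next entry", and treats any other error or an exhausted table as failure. A synchronous, standalone sketch of that control flow follows; the patch itself does this asynchronously through callbacks. All names here (find_ops, nic_ops, disk_ops, ERR_NOT_MATCH, ERR_FAIL) are hypothetical, and the numeric value of ERR_FAIL is a placeholder.

```c
#include <stddef.h>

/* Placeholder error codes; only -15 (NOT_MATCH) is taken from the patch. */
enum { ERR_FAIL = -3, ERR_NOT_MATCH = -15 };

enum dev_kind { KIND_NIC, KIND_DISK };

struct dev {
    enum dev_kind kind;
};

struct dev_ops {
    const char *name;
    /* 0: these ops handle the device; ERR_NOT_MATCH: try the next ops;
     * anything else: hard error, stop searching. */
    int (*match)(const struct dev_ops *self, const struct dev *d);
};

static int nic_match(const struct dev_ops *self, const struct dev *d)
{
    (void)self;
    return d->kind == KIND_NIC ? 0 : ERR_NOT_MATCH;
}

static int disk_match(const struct dev_ops *self, const struct dev *d)
{
    (void)self;
    return d->kind == KIND_DISK ? 0 : ERR_NOT_MATCH;
}

static const struct dev_ops nic_ops = { "nic", nic_match };
static const struct dev_ops disk_ops = { "disk", disk_match };

/* Return the first ops whose match() succeeds, or NULL. *rc_out is 0 on
 * success and ERR_FAIL when the table is exhausted or a match() call
 * reports something other than ERR_NOT_MATCH. */
static const struct dev_ops *find_ops(const struct dev_ops *const *table,
                                      size_t n, const struct dev *d,
                                      int *rc_out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int rc = table[i]->match(table[i], d);
        if (rc == 0) {
            *rc_out = 0;
            return table[i];
        }
        if (rc != ERR_NOT_MATCH)
            break; /* hard error: do not try further ops */
    }
    *rc_out = ERR_FAIL;
    return NULL;
}
```

Looking up a KIND_DISK device in a table of { &nic_ops, &disk_ops } skips the NIC entry and returns &disk_ops; looking it up in a table holding only &nic_ops returns NULL with *rc_out set to ERR_FAIL, which corresponds to the "device can not be matched" branch in the patch.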
> > +static void device_teardown_cb(libxl__egc *egc,
> > + libxl__remus_device *dev,
> > + int rc)
> > +{
> > + int i;
> > + libxl__remus_device_ops *ops;
> > + libxl__remus_device_state *const rds = dev->rds;
> > + libxl__remus_state *rs = CONTAINER_OF(rds, *rs, dev_state);
> > +
> > + STATE_AO_GC(rs->ao);
> > +
> > + /* ignore teardown errors to teardown as many devs as possible*/
> > + rds->num_setuped--;
> > +
> > + if (rds->num_setuped == 0) {
> > + /* clean device ops */
> > + for (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
> > + ops = dev_ops[i];
> > + ops->destroy(ops);
> > + }
> > + rs->callback(egc, rs, rs->saved_rc);
> > + }
> > +}
> > +
> > +static __attribute__((unused)) void libxl__remus_device_init(libxl__egc *egc,
> > + libxl__remus_device_state *rds,
> > + libxl__remus_device_kind kind,
> > + void *libxl_dev)
> > +{
> > + libxl__remus_device *dev = NULL;
> > + libxl_device_nic *nic = NULL;
> > + libxl_device_disk *disk = NULL;
> > +
> > + STATE_AO_GC(rds->ao);
> > + GCNEW(dev);
> > + dev->ops_index = 0; /* we will match the ops later */
> > + dev->backend_dev = libxl_dev;
> > + dev->kind = kind;
> > + dev->rds = rds;
> > +
> > + switch (kind) {
> > + case LIBXL__REMUS_DEVICE_NIC:
> > + nic = libxl_dev;
> > + dev->devid = nic->devid;
> > + break;
> > + case LIBXL__REMUS_DEVICE_DISK:
> > + disk = libxl_dev;
> > + /* there is no devid for disk devices */
> > + dev->devid = -1;
> > + break;
> > + default:
> > + return;
> > + }
> > +
> > + libxl__async_exec_init(&dev->aes);
> > + libxl__ev_child_init(&dev->child);
> > +
> > + /* match the ops begin */
> > + dev->callback = device_match_cb;
> > + dev->ops = dev_ops[dev->ops_index];
> > + dev->ops->match(dev->ops, dev);
> > +}
> > +
> > +void libxl__remus_device_setup(libxl__egc *egc, libxl__remus_state *rs)
> > +{
> > + int i;
> > + libxl__remus_device_ops *ops;
> > +
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = &rs->dev_state;
> > +
> > + STATE_AO_GC(rs->ao);
> > +
> > + if (ARRAY_SIZE(dev_ops) == 0)
> > + goto out;
> > +
> > + for (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
> > + ops = dev_ops[i];
> > + if (ops->init(ops, rs)) {
> > + rs->saved_rc = ERROR_FAIL;
> > + goto out;
> > + }
> > + }
> > +
> > + rds->ao = rs->ao;
> > + rds->egc = egc;
> > + rds->num_devices = 0;
> > + rds->num_nics = 0;
> > + rds->num_disks = 0;
> > +
> > + /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
> > +
> > + if (rds->num_nics == 0 && rds->num_disks == 0)
> > + goto out;
> > +
> > + GCNEW_ARRAY(rds->dev, rds->num_nics + rds->num_disks);
> > +
> > + /* TBD: CALL libxl__remus_device_init to init remus devices */
> > +
> > + return;
> > +
> > +out:
> > + rs->callback(egc, rs, rs->saved_rc);
> > + return;
> > +}
> > +
> > +void libxl__remus_device_teardown(libxl__egc *egc, libxl__remus_state *rs)
> > +{
> > + int i;
> > + libxl__remus_device *dev;
> > + libxl__remus_device_ops *ops;
> > +
> > + STATE_AO_GC(rs->ao);
> > +
> > + /* Convenience aliases */
> > + libxl__remus_device_state *const rds = &rs->dev_state;
> > +
> > + if (rds->num_setuped == 0) {
> > + /* clean device ops */
> > + for (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
> > + ops = dev_ops[i];
> > + ops->destroy(ops);
> > + }
> > + goto out;
> > + }
> > +
> > + for (i = 0; i < rds->num_setuped; i++) {
> > + dev = rds->dev[i];
> > + dev->callback = device_teardown_cb;
> > + dev->ops->teardown(dev);
> > + }
> > +
> > + return;
> > +
> > +out:
> > + rs->callback(egc, rs, rs->saved_rc);
> > + return;
> > +}
> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> > index 1018142..cc5d390 100644
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -43,6 +43,7 @@ libxl_error = Enumeration("error", [
> > (-12, "OSEVENT_REG_FAIL"),
> > (-13, "BUFFERFULL"),
> > (-14, "UNKNOWN_CHILD"),
> > + (-15, "NOT_MATCH"),
> > ], value_namespace = "")
> >
> > libxl_domain_type = Enumeration("domain_type", [
> > --
> > 1.9.1
> >
>
> As far as the Remus logic is concerned, I am fine with this patch. You can add
> my acked-by if it matters here. I'll defer it to IanJ to make the final call on
> the coding style, etc.
>
Thanks!
--
Thanks,
Yang.
Thread overview: 14+ messages
2014-06-27 1:59 [PATCH v13 0/7] Remus/Libxl: Remus network buffering and drbd disk Yang Hongyang
2014-06-27 1:59 ` [PATCH v13 1/7] remus: make postcopy callback asynchronous Yang Hongyang
2014-06-27 1:59 ` [PATCH v13 2/7] remus: add libnl3 dependency for network buffering support Yang Hongyang
2014-06-27 1:59 ` [PATCH v13 3/7] remus: introduce remus device Yang Hongyang
2014-06-27 3:29 ` Shriram Rajagopalan
2014-06-27 3:40 ` Hongyang Yang [this message]
2014-06-27 1:59 ` [PATCH v13 4/7] remus netbuffer: implement remus network buffering for nic devices Yang Hongyang
2014-06-27 17:54 ` Ian Jackson
2014-06-30 5:17 ` Hongyang Yang
2014-07-02 1:55 ` Hongyang Yang
2014-06-27 1:59 ` [PATCH v13 5/7] remus drbd: Implement remus drbd replicated disk Yang Hongyang
2014-06-27 1:59 ` [PATCH v13 6/7] libxl: network buffering cmdline switch Yang Hongyang
2014-06-27 1:59 ` [PATCH v13 7/7] libxl: disk " Yang Hongyang
2014-06-27 3:33 ` Shriram Rajagopalan