Re: [PATCH v18 04/11] libxl/remus: introduce an abstract Remus device layer

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Hongyang Yang <yanghy@cn.fujitsu.com>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: laijs@cn.fujitsu.com, wency@cn.fujitsu.com,
	andrew.cooper3@citrix.com, yunhong.jiang@intel.com,
	eddie.dong@intel.com, xen-devel@lists.xen.org,
	rshriram@cs.ubc.ca, ian.campbell@citrix.com
Subject: Re: [PATCH v18 04/11] libxl/remus: introduce an abstract Remus device layer
Date: Wed, 27 Aug 2014 09:46:22 +0800	[thread overview]
Message-ID: <53FD386E.90800@cn.fujitsu.com> (raw)
In-Reply-To: <21475.50623.284472.82134@mariner.uk.xensource.com>

Hi Ian,

   Thanks for the review, I'm addressing these comments. What do you think of
the rest of the patches? Do you intend to review them all this time or just
stop here and review next version?

在 08/08/2014 02:30 AM, Ian Jackson 写道:
> Yang Hongyang writes ("[PATCH v18 04/11] libxl/remus: introduce an abstract Remus device layer"):
>> Introduce an abstract device layer that allows the Remus
>> logic in libxl to control a guest's devices in a device-agnostic
>> manner. The device layer also exposes a set of internal interfaces
>> that a device type must implement, if it wishes to support Remus.
>
> Thanks.  I think this is converging.  I have mostly nits as comments
> now.  I have only two nontrivial comments: one about your use of
> multidev which I think needs to be improved, and the other is about
> the libxl__remus_device_kind enum (which you are already aware of).
>
>
>
>> +static void remus_devices_preresume_cb(libxl__egc *egc,
>> +                                       libxl__remus_devices_state *rds,
>> +                                       int rc)
>> +{
>> +    int ok = 0;
>> +    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
>> +    STATE_AO_GC(dss->ao);
>> +
>> +    if (rc)
>>           goto out;
>>
>> -    /* REMUS TODO: Deal with disk. Start a new network output buffer */
>> -    ok = 1;
>> +    /* Resumes the domain and the device model */
>> +    if (!libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
>> +        ok = 1;
>
> Again, this should use the standard `goto out' error handling style.
> In this case that means:
>
>       rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
>       if (rc) goto out;
>
>       ok = 1;
>     out:
>
>
>> +static void remus_devices_commit_cb(libxl__egc *egc,
>> +                                    libxl__remus_devices_state *rds,
>> +                                    int rc)
>> +{
> ...
>> +    /* Set checkpoint interval timeout */
>> +    rc = libxl__ev_time_register_rel(gc, &dss->checkpoint_timeout,
>> +                                     remus_next_checkpoint,
>> +                                     dss->interval);
>> +
>> +    if (rc) {
>> +        LOG(ERROR, "unable to register timeout for next epoch."
>> +            " Terminating Remus..");
>> +        goto out;
>> +    }
>
> There is no need to log failures of libxl__ev_time_register_rel et
> al.  See the comment in libxl_internal.h near line 691.  It is
> sufficient to do
>
>        if (rc) goto out;
>
>> +typedef enum libxl__remus_device_kind {
>> +    LIBXL__REMUS_DEVICE_NIC  = (1 << 0),
>> +    LIBXL__REMUS_DEVICE_DISK = (1 << 1),
>> +} libxl__remus_device_kind;
>
> We still need to talk about this, and the comments I had about the
> vtables.
>
>> +typedef struct libxl__remus_device libxl__remus_device;
>> +typedef struct libxl__remus_devices_state libxl__remus_devices_state;
>> +typedef struct libxl__remus_device_subkind_ops libxl__remus_device_subkind_ops;
>> +
>> +/*
>> + * Interfaces to be implemented by every device type that wishes to
>> + * support Remus. Functions must be implemented unless otherwise
>> + * stated. Many of these functions are asynchronous. They call
>> + * dev->aodev.callback when done.  The actual implementations may be
>> + * synchronous and call dev->aodev.callback directly (as the last
>> + * thing they do).
>> + */
>> +struct libxl__remus_device_subkind_ops {
>> +    /* the device kind this ops belongs to... */
>> +    libxl__remus_device_kind kind;
>> +
>> +    /*
>> +     * init() and cleanup() relate to the subkind-specific state in
>> +     * the libxl ctx, not to any specific device.
>> +     * Synchronous. cleanup() cannot fail.
>> +     */
>> +    int (*init)(libxl__remus_devices_state *rds);
>> +    void (*cleanup)(libxl__remus_devices_state *rds);
>
> But actually they take a libxl__remus_devices_state.
>
> Either the state is global for all simultaneous remus invocations in
> with this libxl_ctx, in which case init and cleanup should not take
> any libxl__remus_devices_state.
>
> Or the state is per remus invocation, in which case the comment is
> wrong.
>
> You also need to document the error behaviour.  From the call site I
> think something like:
>
>       Before the first call to init, the subkind-specific state will be
>       all-bits-zero.  cleanup will be called whether or not init
>       succeeded.
>
> This is a similar situation to the one where I asked you to document
> the same thing about `teardown'.
>
>
> And if this is global state in the libxl_ctx, you have to also say:
>
>       init must be idempotent; it will be called multiple times,
>       possibly even if after it has been called and failed.
>
> And if that is the semantics I think something like `ensure_inited' is
> probably correct for its name.
>
>
>> +    int num_devices;
>> +    /*
>> +     * this array is allocated before setup the remus devices by the
>> +     * remus abstract layer.
>> +     * the size of this array is 'num_devices', which is the total number
>> +     * of libxl nic devices and disk devices(num_nics + num_disks).
>> +     */
>> +    libxl__remus_device **dev;
>
> (As I said before) this comment leaves some questions unananswered:
>
> What proportion of the devs array is initialised at any one time ?
> May the devs array contain null pointers and what do they mean ?  etc.
>
> (And, sorry for not noticing this last time, but I think this variable
> needs to be called `devs' rather than `dev'.)
>
>> +/*
>> + * Information about a single device being handled by remus.
>> + * Allocated by the remus abstract layer.
>> + */
>> +struct libxl__remus_device {
>> +    /*----- shared between abstract and concrete layers -----*/
>> +    /*
>> +     * if this is true, that means the subkind ops matched the
>> +     * device and we have actually set up the device no matter
>> +     * setup succeed or not.
>> +     */
>> +    int set_up;
>
> I don't understand this.  The protocol documented in
> libxl__remus_device_subkind_ops seems to be how the subkind
> communicates to the abstract layer whether the device was successfully
> set up.  Is this variable in fact solely for the abstract layer ?
>
> Also, "we have actually set up the device" and "setup succeeded" seem
> to be the same thing.
>
> (Also, can it be a boolean?)
>
>> +    /* find the error that was not ERROR_REMUS_DEVOPS_DOES_NOT_MATCH */
>> +    for (i = 0; i < rds->num_devices; i++) {
>> +        dev = rds->dev[i];
>> +
>> +        if (!dev->aodev.rc || dev->aodev.rc == ERROR_REMUS_DEVOPS_DOES_NOT_MATCH)
>
> This is quite tortuous.  I think you probably want to do it
> differently by having two layers of callback function:
>
> You should probably make the multidev->callback only when you have
> found the right subkind (or failed).
>
> So the subkind should be told to use a different callback which is
> handled here in the abstract type code.  Then your abstract code can
> iterate separately through each subkind, rather than hunting through
> the innards of multidev.
>
> (I think that accessing aodev->rc here is a layering violation.)
>
>
> Thanks,
> Ian.
> .
>

-- 
Thanks,
Yang.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

next prev parent reply	other threads:[~2014-08-27  1:46 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-28  9:23 [PATCH v18 00/11] Remus/Libxl: Remus network buffering and drbd disk Yang Hongyang
2014-07-28  9:23 ` [PATCH v18 01/11] libxl: introduce libxl__multidev_prepare_with_aodev Yang Hongyang
2014-07-29 17:21   ` Ian Jackson
2014-07-28  9:23 ` [PATCH v18 02/11] libxl: add support for async. function calls when using libxl__ao_device Yang Hongyang
2014-07-29 17:24   ` Ian Jackson
2014-07-30  8:45     ` Hongyang Yang
2014-07-28  9:23 ` [PATCH v18 03/11] autoconf: add libnl3 dependency for Remus network buffering support Yang Hongyang
2014-07-28  9:23 ` [PATCH v18 04/11] libxl/remus: introduce an abstract Remus device layer Yang Hongyang
2014-08-07 18:30   ` Ian Jackson
2014-08-27  1:46     ` Hongyang Yang [this message]
2014-08-27  2:21       ` Ian Jackson
2014-08-27  2:27         ` Hongyang Yang
2014-07-28  9:23 ` [PATCH v18 05/11] libxl/remus: setup and control network output buffering Yang Hongyang
2014-07-28  9:24 ` [PATCH v18 06/11] libxl/remus: setup and control disk replication for DRBD backends Yang Hongyang
2014-07-28  9:24 ` [PATCH v18 07/11] xl/remus: cmdline switch to explicitly enable unsafe configurations Yang Hongyang
2014-07-28  9:24 ` [PATCH v18 08/11] xl/remus: cmdline switches and config vars to control network buffering Yang Hongyang
2014-07-28  9:24 ` [PATCH v18 09/11] xl/remus: add a cmdline switch to disable disk replication Yang Hongyang
2014-07-28  9:24 ` [PATCH v18 10/11] libxl/remus: add LIBXL_HAVE_REMUS to indicate Remus support in libxl Yang Hongyang
2014-07-28  9:24 ` [PATCH v18 11/11] MAINTAINERS: update maintained files of Remus Yang Hongyang
2014-08-07  1:17 ` [PATCH v18 00/11] Remus/Libxl: Remus network buffering and drbd disk Hongyang Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53FD386E.90800@cn.fujitsu.com \
    --to=yanghy@cn.fujitsu.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=eddie.dong@intel.com \
    --cc=ian.campbell@citrix.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=rshriram@cs.ubc.ca \
    --cc=wency@cn.fujitsu.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yunhong.jiang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).