From: Hongyang Yang <yanghy@cn.fujitsu.com>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: laijs@cn.fujitsu.com, wency@cn.fujitsu.com,
andrew.cooper3@citrix.com, yunhong.jiang@intel.com,
eddie.dong@intel.com, xen-devel@lists.xen.org,
rshriram@cs.ubc.ca, ian.campbell@citrix.com
Subject: Re: [PATCH v15 2/7] remus: introduce remus device
Date: Fri, 11 Jul 2014 16:18:12 +0800 [thread overview]
Message-ID: <53BF9DC4.7000406@cn.fujitsu.com> (raw)
In-Reply-To: <21437.31542.967350.687109@mariner.uk.xensource.com>
On 07/10/2014 01:26 AM, Ian Jackson wrote:
> Yang Hongyang writes ("[PATCH v15 2/7] remus: introduce remus device"):
>> introduce remus device, an abstract layer of remus devices(nic, disk,
>> etc).It provides the following APIs for libxl:
>> >libxl__remus_device_setup
>> setup remus devices, like attach qdisc, enable disk buffering, etc
>> >libxl__remus_device_teardown
>> teardown devices
>> >libxl__remus_device_postsuspend
>> >libxl__remus_device_preresume
>> >libxl__remus_device_commit
>> above three are for checkpoint.
>
> Sorry to mention this now, but I think it would be clearer if all
> these libxl__remus_device_* functions which manipulate _all_ the
> devices for a domain used the plural of "device", ie
> libxl__remus_devices_setup, etc.
>
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index 39f1c28..bcbd02b 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -733,6 +733,31 @@ out:
>> static void remus_failover_cb(libxl__egc *egc,
>> libxl__domain_suspend_state *dss, int rc);
>>
>> +static void libxl__remus_setup_failed(libxl__egc *egc,
>> + libxl__remus_state *rs, int rc)
>> +{
>> + STATE_AO_GC(rs->ao);
>> + libxl__ao_complete(egc, ao, rc);
>> +}
>> +
>> +static void libxl__remus_setup_done(libxl__egc *egc,
>> + libxl__remus_state *rs, int rc)
>
> This callback appears after the code which sets it up, which naturally
> executes first. Things should be in chronological order, as I said.
>
>> +{
>> + libxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
>> + STATE_AO_GC(rs->ao);
>> +
>> + if (!rc) {
>> + libxl__domain_suspend(egc, dss);
>> + return;
>> + }
>> +
>> + LOG(ERROR, "Remus: failed to setup device for guest with domid %u",
>> + dss->domid);
>
> This log message should report the rc.
Ok, thanks.
>
>> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
>> index 83eb29a..8cc90dc 100644
>> --- a/tools/libxl/libxl_dom.c
>> +++ b/tools/libxl/libxl_dom.c
>> @@ -1454,6 +1454,17 @@ static void libxl__domain_suspend_callback(void *data)
>> domain_suspend_callback_common(egc, dss);
>> }
>>
>> +static void remus_device_postsuspend_cb(libxl__egc *egc,
>> + libxl__remus_state *rs, int rc)
>> +{
>
> I can't quite tell, but isn't this code not in chronological order ?
>
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index f8441f6..8ef20a0 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
> ...
>> +/*
>> + * This structure is for remus device layer, it records remus devices
>> + * that have been setuped.
>> + */
>
> Still lots of uses of "setuped". Correct would be "devices that have
> been set up".
>
>> +/*----- checkpointing APIs -----*/
>> +
>> +/* callbacks */
>> +
>> +static void device_common_cb(libxl__egc *egc,
>> + libxl__remus_device *dev,
>> + int rc)
>> +{
>
> I asked for things to be in chronological order in the file. I think
> this function therefore ought to be below these checkpoint functions.
>
> Also its name isn't very suggestive. Perhaps it should have
> "checkpoint" in the name ?
Yes, I shall put "checkpoint" in the name.
>
>> +/* API implementations */
>> +
>> +void libxl__remus_device_postsuspend(libxl__egc *egc, libxl__remus_state *rs)
> ...
>> +void libxl__remus_device_preresume(libxl__egc *egc, libxl__remus_state *rs)
> ...
>> +void libxl__remus_device_commit(libxl__egc *egc, libxl__remus_state *rs)
>
> These three functions are nearly identical. They should be combined
> somehow. Perhaps there should be a single function which takes the
> offset of the per-device operation vtable member as a parameter. Or
> perhaps the three functions should be generated by a macro. Currently
> I'm leaning towards the former.
>
>> + if(rds->num_set_up == 0)
> ^
> Coding style.
>
>> + for (i = 0; i < rds->num_set_up; i++) {
>> + dev = rds->dev[i];
>> + dev->callback = device_common_cb;
>> + if (dev->ops->commit) {
>> + dev->ops->commit(dev);
>> + } else {
>> + rds->num_devices++;
>> + if (rds->num_devices == rds->num_set_up)
>> + rs->callback(egc, rs, rs->saved_rc);
>
> Why not have this call device_common_cb rather than open-coding it ?
>
>> +/*----- main flow of control -----*/
>
> This is for setup, right ? Or teardown ? Or both ?
setup and teardown, will change this comment.
>
>> +/* callbacks */
>> +
>> +static void device_setup_cb(libxl__egc *egc,
>> + libxl__remus_device *dev,
>> + int rc);
>> +static void device_match_cb(libxl__egc *egc,
>> + libxl__remus_device *dev,
>> + int rc)
>> +{
>
> This is still not in chronological order. You must put callbacks
> after the things that set them up, since obviously the callback
> happens after it is set up.
>
> Seeing things in the wrong order makes it hard to see what's going on.
> In the interests of trying to reduce the number of iterations of this
> series I'm going to do a kind of peephole review of what I see here;
> you should bear in mind that I'm not always trying to understand what
> the purpose of these functions is so I may have structural comments on
> your next version.
>
>> + libxl__remus_device_state *const rds = dev->rds;
>> + libxl__remus_state *rs = CONTAINER_OF(rds, *rs, dev_state);
>> +
>> + STATE_AO_GC(rs->ao);
>> +
>> + if (rc) {
>> + if (++dev->ops_index >= ARRAY_SIZE(dev_ops) ||
>
> This use of ++ looks confusing to me.
>
>> + rc != ERROR_REMUS_DEVOPS_NOT_MATCH) {
>> + /* the device can not be matched */
>> + rds->num_devices++;
>> + rs->saved_rc = ERROR_FAIL;
>
> Please can we not return more ERROR_FAIL. Instead, why not return
> ERROR_REMUS_DEVICE_NOT_SUPPORTED as I suggested before ?
>
>> + rds->num_devices++;
>> + /*
>> + * we add devices that have been setuped to the array no matter
>> + * the setup process succeed or failed because we need to ensure
>> + * the device been teardown while setup failed. If any of the
>> + * device setup failed, we will quit remus, but before we exit,
>> + * we will teardown the devices that have been added to **dev
>> + */
>> + rds->dev[rds->num_set_up++] = dev;
>> + if (rc) {
>> + /* setup failed */
>> + rs->saved_rc = ERROR_FAIL;
>> + }
>
> This doesn't look right. If the setup fails, presumably we shouldn't
> put the device in the array ? If we do it will presumably be torn
> down later.
>
>> +static void destroy_device_ops(libxl__remus_state *rs);
>> +static void device_teardown_cb(libxl__egc *egc,
>> + libxl__remus_device *dev,
>> + int rc)
>> +{
>> + libxl__remus_device_state *const rds = dev->rds;
>> + libxl__remus_state *rs = CONTAINER_OF(rds, *rs, dev_state);
>> +
>> + STATE_AO_GC(rs->ao);
>> +
>> + /* ignore teardown errors to teardown as many devs as possible*/
>> + rds->num_set_up--;
>
> We should at least preserve the rc so that we don't pretend that
> everything worked if it didn't.
>
> You may need to invent a new teardown rc variable, and/or add logging,
> so that you get the right error code out, and the right log messages,
> if a setup fails and causes a teardown which also fails.
>
>> +static int init_device_ops(libxl__remus_state *rs)
>> +{
>> + int i, rc;
>> + const libxl__remus_device_ops *ops;
>> +
>> + for (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
>> + ops = dev_ops[i];
>> + if (ops->init(ops, rs)) {
>> + rc = ERROR_FAIL;
>
> In general, you should preserve the rc, not turn everything into
> ERROR_FAIL.
>
> I still don't understand why libxl__remus_device_state is not part of
> libxl__remus_state.
>
> I wonder if you need a different name for "the abstract thing which is
> represented by a libxl__remus_device_ops". Since each ops only deals
> with one kind (but perhaps a kind may have several ops), I suggest
> calling them "subkind"s or "devsubkinds" or something. At the moment
> you seem to be calling them "ops".
>
> As a result I think this function is poorly named. It doesn't
> initialise any ops. It initialises each subkind. Perhaps it should
> be called init_device_subkind.
>
> You might consider whether libxl__remus_device_ops should be
> libxl__remus_subkind_ops perhaps or libxl__remus_devsubkind_ops or
> libxl__remus_device_subkind_ops.
>
>> +void libxl__remus_device_setup(libxl__egc *egc, libxl__remus_state *rs)
>> +{
>> + STATE_AO_GC(rs->ao);
>> +
>> + /* Convenience aliases */
>> + libxl__remus_device_state *const rds = &rs->dev_state;
>> +
>> + if (ARRAY_SIZE(dev_ops) == 0)
>> + goto out;
>
> Why is this special case necessary ?
This means there're no ops present, the following process of setup are
unnecessary, so we just goto out and return.
>
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index de25f42..e56567f 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -44,6 +44,7 @@ libxl_error = Enumeration("error", [
>> (-12, "OSEVENT_REG_FAIL"),
>> (-13, "BUFFERFULL"),
>> (-14, "UNKNOWN_CHILD"),
>> + (-15, "REMUS_DEVOPS_NOT_MATCH"),
>
> I don't object to this name for this exact case and it's fine to have
> a specific error code.
>
> But if you are going to make the whole machinery return
> ERROR_REMUS_DEVICE_NOT_SUPPORTED if no subkind matches, then you can
> reuse that error code for the subkind methods.
>
> Thanks,
> Ian.
> .
>
--
Thanks,
Yang.
next prev parent reply other threads:[~2014-07-11 8:18 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-07 8:26 [PATCH v15 0/7] Remus/Libxl: Remus network buffering and drbd disk Yang Hongyang
2014-07-07 8:26 ` [PATCH v15 1/7] remus: add libnl3 dependency for network buffering support Yang Hongyang
2014-07-07 8:26 ` [PATCH v15 2/7] remus: introduce remus device Yang Hongyang
2014-07-09 17:26 ` Ian Jackson
2014-07-10 2:18 ` Hongyang Yang
2014-07-10 17:34 ` Ian Jackson
2014-07-11 2:12 ` Hongyang Yang
2014-07-11 6:16 ` Hongyang Yang
2014-07-11 8:18 ` Hongyang Yang [this message]
2014-07-07 8:26 ` [PATCH v15 3/7] remus netbuffer: implement remus network buffering for nic devices Yang Hongyang
2014-07-09 23:15 ` Ian Jackson
2014-07-10 3:03 ` Hongyang Yang
2014-07-10 17:43 ` Ian Jackson
2014-07-11 2:18 ` Hongyang Yang
2014-07-10 12:53 ` Ian Campbell
2014-07-07 8:26 ` [PATCH v15 4/7] remus drbd: Implement remus drbd replicated disk Yang Hongyang
2014-07-07 8:26 ` [PATCH v15 5/7] libxl: network buffering cmdline switch Yang Hongyang
2014-07-07 8:26 ` [PATCH v15 6/7] libxl: disk " Yang Hongyang
2014-07-07 8:26 ` [PATCH v15 7/7] MAINTAINERS: Update maintained files of REMUS Yang Hongyang
-- strict thread matches above, loose matches on Subject: below --
2014-07-10 17:35 [PATCH v15 2/7] remus: introduce remus device Ian Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53BF9DC4.7000406@cn.fujitsu.com \
--to=yanghy@cn.fujitsu.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=eddie.dong@intel.com \
--cc=ian.campbell@citrix.com \
--cc=laijs@cn.fujitsu.com \
--cc=rshriram@cs.ubc.ca \
--cc=wency@cn.fujitsu.com \
--cc=xen-devel@lists.xen.org \
--cc=yunhong.jiang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.