From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hongyang Yang Subject: Re: [PATCH v15 3/7] remus netbuffer: implement remus network buffering for nic devices Date: Thu, 10 Jul 2014 11:03:48 +0800 Message-ID: <53BE0294.7060508@cn.fujitsu.com> References: <1404721584-17862-1-git-send-email-yanghy@cn.fujitsu.com> <1404721584-17862-4-git-send-email-yanghy@cn.fujitsu.com> <21437.52511.69050.855247@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <21437.52511.69050.855247@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Jackson Cc: laijs@cn.fujitsu.com, wency@cn.fujitsu.com, andrew.cooper3@citrix.com, yunhong.jiang@intel.com, eddie.dong@intel.com, xen-devel@lists.xen.org, rshriram@cs.ubc.ca, ian.campbell@citrix.com List-Id: xen-devel@lists.xenproject.org On 07/10/2014 07:15 AM, Ian Jackson wrote: > Yang Hongyang writes ("[PATCH v15 3/7] remus netbuffer: implement remus network buffering for nic devices"): >> diff --git a/tools/hotplug/Linux/remus-netbuf-setup b/tools/hotplug/Linux/remus-netbuf-setup >> new file mode 100644 >> index 0000000..58c46f3 >> --- /dev/null >> +++ b/tools/hotplug/Linux/remus-netbuf-setup >> @@ -0,0 +1,203 @@ >> +#!/bin/bash >> +#============================================================================ >> +# ${XEN_SCRIPT_DIR}/remus-netbuf-setup > ... >> +# IFB ifb interface to be cleaned up (required). [for teardown op only] > > It occurs to me to ask: Is it wise to use such a bare name for the > variable IFB ? I would suggest that it might be better to use a > variable with XEN or REMUS or something in it. Otherwise perhaps > someone somewhere might think IFB stood for something else and would > set it to something irrelevant (or your setting of it might break > their usage...) Maybe REMUS_IFB? > > I had some other comments about this script in v10, which we were > having a conversation about. I'm afraid I seem to have dropped the > thread of that. I will reply separately in that thread. I have replied in that thread. > >> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c >> index bcbd02b..b7d62c1 100644 >> --- a/tools/libxl/libxl.c >> +++ b/tools/libxl/libxl.c >> @@ -788,6 +788,24 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info, > ... >> + /* Setup network buffering */ > > This comment is misleading. This code doesn't actually set up any > network buffering. It doesn't call the setup script. All it does is > some checks and computing the script path. > > It would probably be better to simply remove the comment. Or you > could replace it with something more accurate. Will remove the comment, thanks. > >> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h >> index 8ef20a0..2fe36a6 100644 >> --- a/tools/libxl/libxl_internal.h >> +++ b/tools/libxl/libxl_internal.h >> @@ -310,6 +310,10 @@ struct libxl__gc { >> libxl_ctx *owner; >> }; >> >> +/* remus device ops specific structures start */ >> +typedef struct libxl__remus_netbuf_state libxl__remus_netbuf_state; >> +/* remus device ops specific structures end */ > > I don't think these comments add anything. You should move these > struct typedefs alongside the other struct typedefs (libxl__ao et al). > There is then no need to comment what they are. > >> struct libxl__ctx { >> xentoollog_logger *lg; >> xc_interface *xch; >> @@ -374,6 +378,9 @@ struct libxl__ctx { >> LIBXL_LIST_ENTRY(libxl_ctx) sigchld_users_entry; >> >> libxl_version_info version_info; >> + >> + /* remus device ops specific structures */ >> + libxl__remus_netbuf_state *rns; > > Again, I think this comment adds nothing that isn't in the type name. > There is no need to say in a comment anything that is plain from the > code. > > But before you delete it (and other comments that I'm going to quibble > about), please wait for a comment from the other tools maintainer, Ian > Campbell, who may disagree. > >> @@ -2631,6 +2639,8 @@ struct libxl__remus_state { >> libxl__ao *ao; >> uint32_t domid; >> libxl__remus_callback *callback; >> + /* Script to setup/teardown network buffers */ >> + const char *netbufscript; > > And again. > >> diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c >> index 52d593c..025ee89 100644 >> --- a/tools/libxl/libxl_netbuffer.c >> +++ b/tools/libxl/libxl_netbuffer.c >> @@ -17,11 +17,486 @@ >> >> #include "libxl_internal.h" >> >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +struct libxl__remus_netbuf_state { >> + libxl__ao *ao; >> + uint32_t domid; >> + const char *netbufscript; > > Why is there a copy of netbufscript here ? We use netbuf state in device concrete layer, so we made a copy of this. > >> +/*----- init() and destroy() -----*/ >> + >> +static int nic_init(const libxl__remus_device_ops *self, >> + libxl__remus_state *rs) >> +{ > > This function seems to leak the things it allocates, on error. > The `out' section should probably call destroy. If this failed, we will call teardown, and in teardown, the nic_destroy will be called. > >> + ns->nlsock = nl_socket_alloc(); >> + if (!ns->nlsock) { >> + LOG(ERROR, "cannot allocate nl socket"); >> + goto out; >> + } > > Last time I commented on the error handling in this patch. I see that > you have added an nl_geterror in some places. > > When I make a comment like that, you need to apply it whereever > relevant in the code, and to the whole patch series. Yes, actually I've checked the whole patch about this and corrected them. for this special case, nl_socket_alloc() doesn't return an error number, so we can't use nl_geterror, it returns NULL if alloc failed, so I think this error message should be ok. > >> +static void nic_destroy(const libxl__remus_device_ops *self, >> + libxl__remus_state *rs) > ... >> + /* free qdisc cache */ >> + if (ns->qdisc_cache) { >> + nl_cache_clear(ns->qdisc_cache); >> + nl_cache_free(ns->qdisc_cache); > > Can these functions fail ? If so, what does it mean and what should > the program do ? No, they both return void. > >> +/*----- checkpointing APIs -----*/ >> + >> +/* The buffer_op's value, not the value passed to kernel */ > > I would say "The value of buffer_op, not the value passed to kernel". Ok, thanks. > >> +static int remus_netbuf_op(libxl__remus_device_nic *remus_nic, >> + libxl__remus_netbuf_state *netbuf_state, >> + int buffer_op) >> +{ >> + int rc; >> + >> + STATE_AO_GC(netbuf_state->ao); >> + >> + if (buffer_op == tc_buffer_start) >> + rc = rtnl_qdisc_plug_buffer(remus_nic->qdisc); >> + else >> + rc = rtnl_qdisc_plug_release_one(remus_nic->qdisc); > > Error handling again. I can't quite follow you, what's wrong with the error handling in this function except the name of the "rc"? > > Also do not put anything other than a libxl error code in a variable > called "rc". Ok. > >> +/*----- main flow of control -----*/ >> + >> +/* helper functions */ >> + >> +/* >> + * If the device has a vifname, then use that instead of >> + * the vifX.Y format. >> + * it must ONLY be used for remus because if driver domains >> + * were in use it would constitute a security vulnerability. >> + */ >> +static const char *get_vifname(libxl__remus_device *dev, >> + const libxl_device_nic *nic) >> +{ > ... >> + path = libxl__sprintf(gc, "%s/backend/vif/%d/%d/vifname", >> + libxl__xs_get_dompath(gc, 0), domid, nic->devid); > > Please use GCSPRINTF where applicable. Ok, thanks. > >> + rc = libxl__xs_read_checked(gc, XBT_NULL, path, &vifname); >> + if (!rc && !vifname) { >> + /* use the default name */ > > This comment adds nothing to the comment above. > >> + vifname = libxl__device_nic_devname(gc, domid, >> + nic->devid, >> + nic->nictype); > >> + rc = ERROR_FAIL; >> + ifindex = rtnl_link_get_ifindex(ifb); >> + if (!ifindex) { >> + LOG(ERROR, "interface %s has no index", remus_nic->ifb); >> + goto out; > > How does rtnl_link_get_ifindex report its error code ? Or can it fail > only one way ? It Returns Interface index or 0 if not set. No error code will be returned. > >> + qdisc = rtnl_qdisc_get_by_parent(netbuf_state->qdisc_cache, ifindex, >> + TC_H_ROOT); >> + >> + if (qdisc) { >> + const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc)); >> + /* Sanity check: Ensure that the root qdisc is a plug qdisc. */ >> + if (!tc_kind || strcmp(tc_kind, "plug")) { >> + nl_object_put((struct nl_object *)qdisc); > > This freeing of qdisc should be in the `out' section. > > You can do this either by arranging that the `out' section is not used > at all on successful completion, or by saying something like > if (rc) { > if (qdisc) nl_object_put((struct nl_object *)qdisc); > } Ok thanks. > >> + LOG(ERROR, "plug qdisc is not installed on %s", remus_nic->ifb); >> + goto out; >> + } >> + remus_nic->qdisc = qdisc; >> + rc = 0; >> + } else { >> + LOG(ERROR, "Cannot get qdisc handle from ifb %s", remus_nic->ifb); > > This path out of the function seems not to set rc. > > ... In fact, I see that you initialise rc to ERROR_FAIL above. IMO > this is not a good programming paradigm. rc should be set, obviously, > on every exit path, at the point the error is generated. Ok thanks. > >> +/* >> + * In return, the script writes the name of IFB device (during setup) to be >> + * used for output buffering into XENBUS_PATH/ifb >> + */ >> +static void netbuf_setup_script_cb(libxl__egc *egc, >> + libxl__async_exec_state *aes, >> + int status) > > This callback is not in chronological order. > > I'm afraid I've run out of time now so I'm going to stop here. I hope > these comments have been helpful. Thank you for patiently reviewing this. > > Thanks, > Ian. > . > -- Thanks, Yang.