From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hongyang Yang Subject: Re: [PATCH v13 4/7] remus netbuffer: implement remus network buffering for nic devices Date: Wed, 2 Jul 2014 09:55:14 +0800 Message-ID: <53B36682.3020004@cn.fujitsu.com> References: <1403834381-1473-1-git-send-email-yanghy@cn.fujitsu.com> <1403834381-1473-5-git-send-email-yanghy@cn.fujitsu.com> <21421.44993.176133.317319@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <21421.44993.176133.317319@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Jackson Cc: ian.campbell@citrix.com, wency@cn.fujitsu.com, stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com, yunhong.jiang@intel.com, eddie.dong@intel.com, xen-devel@lists.xen.org, rshriram@cs.ubc.ca, roger.pau@citrix.com, laijs@cn.fujitsu.com List-Id: xen-devel@lists.xenproject.org On 06/28/2014 01:54 AM, Ian Jackson wrote: > Yang Hongyang writes ("[PATCH v13 4/7] remus netbuffer: implement remus network buffering for nic devices"): >> 1.Add two members in libxl_domain_remus_info: > > Thanks. > > I'm going to make mostly general comments on this. > >> /* >> + * LIBXL_HAVE_REMUS_NETBUF 1 >> + * >> + * If this is defined, then the libxl_domain_remus_info structure will >> + * have a boolean field (netbuf) and a string field (netbufscript). >> + * >> + * netbuf, if true, indicates that network buffering should be enabled. >> + * >> + * netbufscript, if set, indicates the path to the hotplug script to >> + * setup or teardown network buffers. >> + */ >> +#define LIBXL_HAVE_REMUS_NETBUF 1 > > This seems to imply that the default at the API level is for network > buffering to be turned off. I don't think that's right, is it ? > >> +/* If the device has a vifname, then use that instead of >> + * the vifX.Y format. > > I don't think it is right in general to look in the backend directory > for the vifname. If we are using driver domains, the backend xenstore > directory is writeable by the driver domain. > > But I think Remus is not at all compatible with driver domains right > now - am I right ? > > If so then at the very least this function should have something in > the comment saying that it must ONLY be used for remus because if > driver domains were in use it would constitute a security > vulnerability. > >> + ret = nl_cache_refill(netbuf_state->nlsock, netbuf_state->qdisc_cache); >> + if (ret < 0) { >> + LOG(ERROR, "cannot refill qdisc cache"); > > Do these functions return error codes or set errno or something ? I > think you should print the actual error that was encountered. That > might be as simple as using LOGE instead but it depends on what they > do. It returns a negative error code on error. > >> + rc = libxl__xs_read_checked(gc, XBT_NULL, >> + GCSPRINTF("%s/hotplug-error", out_path_base), >> + &hotplug_error); >> + if (rc) { >> + rc = ERROR_FAIL; > > I think you can just "goto out" without setting rc again. That's true, will fix that. > >> +static void netbuf_teardown_script_cb(libxl__egc *egc, >> + libxl__async_exec_state *aes, >> + int status) > > As I said in my other mail, can you reorganise this so that the > functions are the file in the same order as the control flow, with > appropriate comments dividing the different flows up ? > >> +static void async_call_done(libxl__egc *egc, >> + libxl__ev_child *child, >> + pid_t pid, int status) >> +{ >> + libxl__remus_device *dev = CONTAINER_OF(child, *dev, child); >> + libxl__remus_device_state *rds = dev->rds; >> + STATE_AO_GC(rds->ao); >> + >> + if (WIFEXITED(status)) { >> + dev->callback(egc, dev, -WEXITSTATUS(status)); >> + } else { >> + dev->callback(egc, dev, ERROR_FAIL); >> + } > > This is really quite odd. These nonzero statuses are surely always > errors ? Why leave it to the next layer to interpret ? Returning > either a positive ERROR_* or a -WEXITSTATUS is particularly odd. > > And if you are going to do this you need to make a crystal-clear doc > comment about it. > >> +static void nic_match_async(const libxl__remus_device_ops *self, >> + libxl__remus_device *dev) >> +{ >> + if (dev->kind == LIBXL__REMUS_DEVICE_NIC) >> + _exit(0); >> + >> + _exit(-ERROR_NOT_MATCH); >> +} >> + >> +static void nic_match(libxl__remus_device_ops *self, >> + libxl__remus_device *dev) >> +{ >> + int pid = -1; >> + STATE_AO_GC(dev->rds->ao); >> + >> + /* Fork and call */ >> + pid = libxl__ev_child_fork(gc, &dev->child, async_call_done); >> + if (pid == -1) { >> + LOG(ERROR, "unable to fork"); >> + goto out; >> + } >> + >> + if (!pid) { >> + /* child */ >> + nic_match_async(self, dev); >> + /* notreached */ >> + abort(); > > This is very odd. Why are you spawning a subprocess to > test dev->kind ? > > >> +static void nic_setup(libxl__remus_device *dev) >> +{ >> + libxl__remus_device_nic *remus_nic; >> + libxl__remus_netbuf_state *ns = dev->ops->data; >> + const libxl_device_nic *nic = dev->backend_dev; >> + >> + STATE_AO_GC(ns->ao); >> + >> + GCNEW(remus_nic); >> + dev->data = remus_nic; >> + remus_nic->vif = get_vifname(dev, nic); >> + if (!remus_nic->vif) >> + goto out; > > I think you should set rc here rather than (implicitly) in out. > >> + setup_async_exec(&dev->aes, "setup", dev); >> + if (libxl__async_exec_start(gc, &dev->aes)) { > > And you should propagate the rc from libxl__async_exec_start. Yes, will fix that. > >> +/* >> + * Note: This function will be called in the same gc context as >> + * libxl__remus_netbuf_setup, created during the libxl_domain_remus_start >> + * API call. >> + */ >> +static void nic_teardown(libxl__remus_device *dev) > > This comment is interesting but I don't understand the implications. > Also isn't this a feature of the abstract/concrete API which should be > documented there ? > >> +static void nic_match_async(const libxl__remus_device_ops *self, >> + libxl__remus_device *dev) >> +{ >> + if (dev->kind == LIBXL__REMUS_DEVICE_NIC) >> + _exit(-ERROR_FAIL); >> + >> + _exit(-ERROR_NOT_MATCH); > > Perhaps each libxl__remus_device_ops could mention the kind it > supports, so that the abstract code could check that ? That would > avoid duplicating this check in every concrete kind's code. > >> +extern libxl__remus_device_ops remus_device_nic; > > This needs to be const. > > > I'm afraid I'm going to leave the drbd patch for another day... > > Thanks, > Ian. > . > -- Thanks, Yang.