From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hongyang Yang Subject: Re: [PATCH v10 4/5] remus: implement remus network buffering for nic devices Date: Tue, 10 Jun 2014 15:33:19 +0800 Message-ID: <5396B4BF.2060209@cn.fujitsu.com> References: <1401932069-16460-1-git-send-email-yanghy@cn.fujitsu.com> <1401932069-16460-5-git-send-email-yanghy@cn.fujitsu.com> <21392.42936.240674.451323@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <21392.42936.240674.451323@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Jackson Cc: ian.campbell@citrix.com, wency@cn.fujitsu.com, stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com, yunhong.jiang@intel.com, eddie.dong@intel.com, xen-devel@lists.xen.org, rshriram@cs.ubc.ca, roger.pau@citrix.com, laijs@cn.fujitsu.com List-Id: xen-devel@lists.xenproject.org Hi Ian, Sorry for the late reply, just notice this comment... On 06/06/2014 01:24 AM, Ian Jackson wrote: > Yang Hongyang writes ("[PATCH v10 4/5] remus: implement remus network buffering for nic devices"): >> 1.Add two members in libxl_domain_remus_info: > > Thanks for this patch. > > I'm deferring reviewing the parts of this inside libxl which use the > new libxl device interface, until we have the API documentation > comments which I discussed in my last email. I hope that's OK. > > But there i > >> netbuf: whether netbuf is enabled >> netbufscript: the path of the script which will be run to setup >> and tear down the guest's interface. >> 2.introduces remus-netbuf-setup hotplug script responsible for >> setting up and tearing down the necessary infrastructure required for >> network output buffering in Remus. This script is intended to be invoked >> by libxl for each guest interface, when starting or stopping Remus. >> >> Apart from returning success/failure indication via the usual hotplug >> entries in xenstore, this script also writes to xenstore, the name of >> the IFB device to be used to control the vif's network output. >> >> The script relies on libnl3 command line utilities to perform various >> setup/teardown functions. The script is confined to Linux platforms only >> since NetBSD does not seem to have libnl3. >> >> The following steps are taken during init: >> a) establish a dedicated remus context containing libnl related >> state (netlink sockets, qdisc caches, etc.,) >> >> The following steps are taken for each vif during setup: >> a) call the hotplug script to setup its network buffer >> >> b) Obtain handles to plug qdiscs installed on the IFB devices >> chosen by the hotplug scripts. >> >> And during teardown, the netlink resources are released, followed by >> invocation of hotplug scripts to remove the ifb devices. >> 3.implement the remus device interface. setup, teardown, etc. >> >> Signed-off-by: Shriram Rajagopalan >> Signed-off-by: Yang Hongyang >> Signed-off-by: Lai Jiangshan >> Reviewed-by: Wen Congyang >> --- >> docs/misc/xenstore-paths.markdown | 4 + >> tools/hotplug/Linux/Makefile | 1 + >> tools/hotplug/Linux/remus-netbuf-setup | 183 ++++++++++++ >> tools/libxl/libxl.c | 18 ++ >> tools/libxl/libxl.h | 13 + >> tools/libxl/libxl_internal.h | 3 + >> tools/libxl/libxl_netbuffer.c | 519 +++++++++++++++++++++++++++++++++ >> tools/libxl/libxl_nonetbuffer.c | 67 +++++ >> tools/libxl/libxl_remus_device.c | 22 +- >> tools/libxl/libxl_types.idl | 2 + >> 10 files changed, 831 insertions(+), 1 deletion(-) >> create mode 100644 tools/hotplug/Linux/remus-netbuf-setup >> > > > >> diff --git a/docs/misc/xenstore-paths.markdown b/docs/misc/xenstore-paths.markdown >> index 70ab7f4..039eaea 100644 >> --- a/docs/misc/xenstore-paths.markdown >> +++ b/docs/misc/xenstore-paths.markdown >> @@ -385,6 +385,10 @@ The guest's virtual time offset from UTC in seconds. >> >> The device model version for a domain. >> >> +#### /libxl/$DOMID/remus/netbuf/$DEVID/ifb = STRING [n,INTERNAL] >> + >> +ifb device used by Remus to buffer network output from the associated vif. >> + > > Thanks for updating the doc. Your changes to the hotplug Makefile > look good too. > >> diff --git a/tools/hotplug/Linux/remus-netbuf-setup b/tools/hotplug/Linux/remus-netbuf-setup >> new file mode 100644 >> index 0000000..aed2583 >> --- /dev/null >> +++ b/tools/hotplug/Linux/remus-netbuf-setup >> @@ -0,0 +1,183 @@ >> +#!/bin/bash >> +#============================================================================ >> +# ${XEN_SCRIPT_DIR}/remus-netbuf-setup >> +# >> +# Script for attaching a network buffer to the specified vif (in any mode). >> +# The hotplugging system will call this script when starting remus via libxl >> +# API, libxl_domain_remus_start. > > Right. Thanks for the comprehensive head comment. > >> +#============================================================================ > ... >> +setup_ifb() { >> + >> + for ifb in `ifconfig -a -s|egrep ^ifb|cut -d ' ' -f1` >> + do >> + local installed=`nl-qdisc-list -d $ifb` >> + [ -n "$installed" ] && continue >> + IFB="$ifb" >> + break >> + done > > As far as I can see this attempts to search for an ifb which is not in > use. > > I see you claim a lock to ensure that you don't fail due to races with > other copies of this script. > > But are there potentially other things (not Xen related, parhaps) in > the system which might try to allocate an ifb using a similar > approach ? How do we deal with the potential race with them ? > > Also: I think you should: > - write the IFB name to xenstore _before_ starting to configure it > - in the loop I quote above, check in xenstore that the ifb is not > in use by another domain > > Otherwise there seems to be the following risk: > 1. You pick ifbX using the loop above > 2. You start to configure ifbX, eventually resulting in a > configuration which makes it not show up as free > 3. Something bad happens and you fail, before writing the > ifb name to xenstore > > In this case, the ifb would be leaked. (I see you do try to avoid > this with xs_write_failed, but scripts can fail for other reasons.) If the setup failed for any reason, we will call teardown in the remus device framework, so the ifb won't be leaked. > >> + do_or_die tc qdisc add dev "$vif" ingress > > I'm not qualified to review these tc manipulations. I guess I'm going > to trust that they're correct. > >> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h >> index 80947c3..db30a97 100644 >> --- a/tools/libxl/libxl.h >> +++ b/tools/libxl/libxl.h >> @@ -437,6 +437,19 @@ >> #define LIBXL_HAVE_DRIVER_DOMAIN_CREATION 1 >> >> /* >> + * LIBXL_HAVE_REMUS_NETBUF 1 >> + * >> + * If this is defined, then the libxl_domain_remus_info structure will >> + * have a boolean field (netbuf) and a string field (netbufscript). >> + * >> + * netbuf, if true, indicates that network buffering should be enabled. >> + * >> + * netbufscript, if set, indicates the path to the hotplug script to >> + * setup or teardown network buffers. >> + */ >> +#define LIBXL_HAVE_REMUS_NETBUF 1 > > Good. > >> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl >> index 4278a6b..50bf1ef 100644 >> --- a/tools/libxl/libxl_types.idl >> +++ b/tools/libxl/libxl_types.idl >> @@ -566,6 +566,8 @@ libxl_domain_remus_info = Struct("domain_remus_info",[ >> ("interval", integer), >> ("blackhole", bool), >> ("compression", bool), >> + ("netbuf", bool), >> + ("netbufscript", string), > > I think netbuf should be a defbool, not a bool. Indeed, perhaps this > is true of the other options too. Is there some reason it shouldn't > default to enabled ? The netbuffering is enabled by default, this option is used to disable the netbuffering support. > > You should mention in your commit message that this is going to be > plumbed into xl and the documentation in the next patch. > > Regarding the other remus options here (and perhaps changing their > types), I think it would be OK to break API compatibility, since the > previous versions of remus exposed via xl have not been suitable for > deployment. Do you agree ? Agree. Since I've done v11 patch series, I will look into it in the next series. For rapid iteration, I will send v11 patch series for review in a short period. > > Thanks, > Ian. > . > -- Thanks, Yang.