Netdev List
* Re: [PATCH] net: wan: sbni: remove assembly crc32 code
From: David Miller @ 2013-10-22 18:38 UTC
  To: sebastian; +Cc: andi, akpm, linux-kernel, ak, netdev
In-Reply-To: <20131022183625.GA4382@breakpoint.cc>

From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Date: Tue, 22 Oct 2013 20:36:25 +0200

> 
> There is also a C function doing the same thing. Unless the asm code is
> 110% faster we could stick to the C function.
> 
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
> 
> On Tue, Oct 22, 2013 at 01:59:28PM -0400, David Miller wrote:
>> 
>> Is it really impossible to use the generic crc32 support instead of this
>> unsightly custom inline assembler?
> 
> Since you asked for this, here is step 1.

If it's that simple, this is definitely what we should do.

On the off chance, is there anyone actually able to test this
change? :-)))
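
Short of real hardware, one partial check is a user-space comparison of two CRC-32 implementations over the same buffers. The sketch below is not the sbni driver's code. It assumes the common reflected polynomial 0xEDB88320 with an all-ones initial value and a final inversion, and the function names are made up for illustration. Whether the driver's routine follows the same conventions would have to be checked against its source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32 (reflected polynomial), used as the reference. */
uint32_t crc32_bitwise(const unsigned char *p, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Table-driven variant; the table is built on first use. */
uint32_t crc32_table(const unsigned char *p, size_t len)
{
	static uint32_t tab[256];
	static int tab_ready;
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;

	if (!tab_ready) {
		uint32_t n, c;
		int k;

		for (n = 0; n < 256; n++) {
			c = n;
			for (k = 0; k < 8; k++)
				c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
			tab[n] = c;
		}
		tab_ready = 1;
	}

	for (i = 0; i < len; i++)
		crc = tab[(crc ^ p[i]) & 0xFFu] ^ (crc >> 8);
	return crc ^ 0xFFFFFFFFu;
}

/* Aborts if the two implementations disagree on the given buffer. */
void crc32_check_agree(const unsigned char *p, size_t len)
{
	assert(crc32_bitwise(p, len) == crc32_table(p, len));
}
```

Calling crc32_check_agree() on a set of captured or random frames exercises both paths. The same pattern would apply to comparing the old and new driver routines, if both can be built in user space.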


* Re: [PATCH] Revert "bridge: only expire the mdb entry when query is received"
From: David Miller @ 2013-10-22 18:41 UTC
  To: linus.luessing; +Cc: stephen, netdev, bridge, linux-kernel, amwang
In-Reply-To: <1382223537-10844-1-git-send-email-linus.luessing@web.de>

From: Linus Lüssing <linus.luessing@web.de>
Date: Sun, 20 Oct 2013 00:58:57 +0200

> While this commit was a good attempt to fix issues occurring when no
> multicast querier is present, this commit still has two more issues:
> 
> 1) There are cases where mdb entries do not expire even if there is a
> querier present. The bridge will unnecessarily continue flooding
> multicast packets on the according ports.
> 
> 2) Never removing an mdb entry could be exploited for a Denial of
> Service by an attacker on the local link, slowly, but steadily eating up
> all memory.
> 
> Actually, this commit became obsolete with
> "bridge: disable snooping if there is no querier" (b00589af3b)
> which included fixes for a few more cases.
> 
> Therefore reverting the following commits (the commit stated in the
> commit message plus three of its follow up fixes):
> 
> ---
> Revert "bridge: update mdb expiration timer upon reports."
> This reverts commit f144febd93d5ee534fdf23505ab091b2b9088edc.
> Revert "bridge: do not call setup_timer() multiple times"
> This reverts commit 1faabf2aab1fdaa1ace4e8c829d1b9cf7bfec2f1.
> Revert "bridge: fix some kernel warning in multicast timer"
> This reverts commit c7e8e8a8f7a70b343ca1e0f90a31e35ab2d16de1.
> Revert "bridge: only expire the mdb entry when query is received"
> This reverts commit 9f00b2e7cf241fa389733d41b615efdaa2cb0f5b.
> ---
> 
> CC: Cong Wang <amwang@redhat.com>
> Signed-off-by: Linus Lüssing <linus.luessing@web.de>

Applied, thanks a lot.


* Re: Big performance loss from 3.4.63 to 3.10.13 when routing ipv4
From: Wolfgang Walter @ 2013-10-22 19:07 UTC
  To: Hannes Frederic Sowa; +Cc: netdev
In-Reply-To: <20131001222002.GL10771@order.stressinduktion.org>

On Wednesday, 2 October 2013, 00:20:02, Hannes Frederic Sowa wrote:
> On Tue, Oct 01, 2013 at 06:39:32PM +0200, Wolfgang Walter wrote:
> > All network traffic over the router becomes slow and sluggish. If one pings
> > the router there is a packet loss. After about 2 minutes the traffic
> > completely stalls for about 1 minute. Then it works again as in the
> > beginning to then stall again. And so on.
> 
> Maybe dropwatch can give a first hint?
> 

I finally found the problem:

In 3.10.x and 3.11.x the value of /proc/sys/net/ipv4/xfrm4_gc_thresh is 1024.

It is much higher in 3.4.x. If I raise the value on 3.10.x to the one I
see on 3.4.x, everything works fine with 3.10.x.
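
For anyone who wants to compare the two kernels, the threshold can be read from the same file on each system. Below is a minimal C sketch. The helper name read_sysctl_long is made up for illustration, and the only path assumed is the one quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read one integer from a sysctl-style file.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 */
int read_sysctl_long(const char *path, long *out)
{
	char buf[64];
	char *end;
	long val;
	FILE *f;

	assert(path != NULL && out != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -1;

	*out = val;
	return 0;
}
```

Passing "/proc/sys/net/ipv4/xfrm4_gc_thresh" as the path returns the current threshold. Raising it means writing the new number to the same file as root.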

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts


* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: Dan Williams @ 2013-10-22 19:22 UTC
  To: Florian Fainelli; +Cc: netdev, davem, s.hauer, nbd, blogic, jogo, gary
In-Reply-To: <1382466229-15123-2-git-send-email-f.fainelli@gmail.com>

On Tue, 2013-10-22 at 11:23 -0700, Florian Fainelli wrote:
> This patch adds an Ethernet Switch generic netlink configuration API
> which allows for doing the required configuration of managed Ethernet
> switches commonly found in Wireless/Cable/DSL routers in the market.

"swconfig" probably means "switch config", but is there any way to
rename this away from the "sw" prefix, since "sw" typically means
"software" and not "switch"?

Dan

> Since this API is based on the Generic Netlink infrastructure it is very
> easy to extend a particular switch driver to support additional features
> and to adapt it to specific switches.
> 
> So far the API includes support for:
> 
> - getting/setting a port VLAN id
> - getting/setting VLAN port membership
> - getting a port's link status
> - getting a port's statistics counters
> - resetting a switch device
> - applying a configuration to a switch device
> 
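
To make the extension model concrete, here is a user-space mock of the idea: a table of named attributes with get/set callbacks, looked up by name. It is only a sketch. The mock_* types and functions are hypothetical simplifications, not the structures from the patch, and the "mirror" attribute is an invented example of a driver-specific feature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PORTS 8

/* Hypothetical, simplified stand-in for a switch instance. */
struct mock_switch {
	int pvid[MOCK_PORTS];	/* primary VLAN id per port */
	int mirror[MOCK_PORTS];	/* driver-specific: mirroring on/off */
};

/* Hypothetical, simplified stand-in for one named attribute. */
struct mock_attr {
	const char *name;
	const char *description;
	int (*set)(struct mock_switch *sw, int port, int value);
	int (*get)(struct mock_switch *sw, int port, int *value);
};

static int mock_port_ok(int port)
{
	return port >= 0 && port < MOCK_PORTS;
}

static int mock_set_pvid(struct mock_switch *sw, int port, int value)
{
	if (!mock_port_ok(port))
		return -EINVAL;
	sw->pvid[port] = value;
	return 0;
}

static int mock_get_pvid(struct mock_switch *sw, int port, int *value)
{
	if (!mock_port_ok(port))
		return -EINVAL;
	*value = sw->pvid[port];
	return 0;
}

static int mock_set_mirror(struct mock_switch *sw, int port, int value)
{
	if (!mock_port_ok(port))
		return -EINVAL;
	sw->mirror[port] = !!value;	/* normalise to 0 or 1 */
	return 0;
}

static int mock_get_mirror(struct mock_switch *sw, int port, int *value)
{
	if (!mock_port_ok(port))
		return -EINVAL;
	*value = sw->mirror[port];
	return 0;
}

/* Adding a feature means adding one row; the lookup code is unchanged. */
static const struct mock_attr mock_port_attrs[] = {
	{ "pvid", "Primary VLAN ID", mock_set_pvid, mock_get_pvid },
	{ "mirror", "Port mirroring (driver specific)",
	  mock_set_mirror, mock_get_mirror },
};

/* Linear search by name; returns NULL when the attribute is unknown. */
const struct mock_attr *mock_find_attr(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(mock_port_attrs) / sizeof(mock_port_attrs[0]); i++)
		if (strcmp(name, mock_port_attrs[i].name) == 0)
			return &mock_port_attrs[i];
	return NULL;
}
```

A caller looks an attribute up by name and invokes its callbacks, so a generic tool can discover and use "mirror" without knowing about it in advance.
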
> Unlike the Distributed Switch Architecture code, this API is much
> smaller and does not interfere with the networking stack packet flow, but
> rather focuses on the control path of managed switches.
> 
> A concrete example of a switch driver is included in subsequent patches
> to illustrate how it can be used as well as the required user-space
> controlling tools.
> 
> Signed-off-by: Felix Fietkau <nbd@openwrt.org>
> Signed-off-by: John Crispin <blogic@openwrt.org>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  Documentation/networking/swconfig.txt |  162 +++++
>  MAINTAINERS                           |   10 +
>  drivers/net/phy/Kconfig               |    6 +
>  drivers/net/phy/Makefile              |    1 +
>  drivers/net/phy/swconfig.c            | 1078 +++++++++++++++++++++++++++++++++
>  include/linux/swconfig.h              |  180 ++++++
>  include/uapi/linux/Kbuild             |    1 +
>  include/uapi/linux/swconfig.h         |  103 ++++
>  8 files changed, 1541 insertions(+)
>  create mode 100644 Documentation/networking/swconfig.txt
>  create mode 100644 drivers/net/phy/swconfig.c
>  create mode 100644 include/linux/swconfig.h
>  create mode 100644 include/uapi/linux/swconfig.h
> 
> diff --git a/Documentation/networking/swconfig.txt b/Documentation/networking/swconfig.txt
> new file mode 100644
> index 0000000..f560066
> --- /dev/null
> +++ b/Documentation/networking/swconfig.txt
> @@ -0,0 +1,162 @@
> +Generic Netlink Switch configuration API
> +
> +Introduction
> +============
> +
> +The following documentation covers the Linux Ethernet switch configuration API
> +which is based on the Generic Netlink infrastructure.
> +
> +Scope and rationale
> +===================
> +
> +Most Ethernet switches found in small routers are managed switches which allow
> +the following operations:
> +
> +- configure a port to belong to a particular set of VLANs either as tagged or
> +  untagged
> +- configure a particular port to advertise specific link/speed/duplex settings
> +- collect statistics about the number of packets/bytes transferred/received
> +- any other vendor specific feature: rate limiting, single/double tagging...
> +
> +Such switches can be connected to the controlling CPU using different hardware
> +busses, but most commonly:
> +
> +- SPI/I2C/GPIO bitbanging
> +- MDIO
> +- Memory mapped into the CPU register address space
> +
> +Until now the usual way to configure such a switch was either to write a
> +specific driver or to write a user-space application which would have to know
> +about the hardware differences and figure out a way to access the switch
> +registers (spidev, SIOCGMIIREG, mmap...) from user-space.
> +
> +This has multiple issues:
> +
> +- proliferation of ad-hoc solutions to configure a switch both open source and
> +  proprietary
> +
> +- absence of common software reference for switches commonly found on the market
> +  (Broadcom, Lantiq/Infineon/ADMTek, Marvell, Qualcomm/Atheros...) which implies
> +  a duplication effort for each implementer
> +
> +- inability to leverage existing hardware representation mechanisms such as
> +  Device Tree (spidev, i2c-dev.. do not belong in Device Tree and rely on
> +  Linux-specific "forwarder" drivers) to describe a switch device
> +
> +The goal of the switch configuration API is to provide a common basis to build
> +re-usable and extensible switch drivers with the following ideas in mind:
> +
> +- having a central point of configuration on top of which a reference user-space
> +  implementation can be provided but also allow for other user-space
> +  implementations to exist
> +
> +- ensure the Linux kernel is in control of the actual hardware access
> +
> +- be extensible enough to support per-switch features without making the generic
> +  implementation too heavyweight and without making user-space changes each
> +  and every time a new feature is added
> +
> +Based on these design goals the Generic Netlink kernel/user-space communication
> +mechanism was chosen because it allows for all design goals to be met.
> +
> +Distributed Switch Architecture vs. swconfig
> +============================================
> +
> +The Marvell Distributed Switch Architecture drivers are an existing solution
> +which is a heavy switch driver infrastructure, is Marvell centric, only
> +supports MDIO connected switches, mangles an Ethernet driver transmit/receive
> +paths and does not offer a central control path for the user.
> +
> +swconfig is vendor agnostic, does not mangle the transmit/receive path
> +of an Ethernet driver and is focused on the control path of the switch rather
> +than the data path. It is based on Generic Netlink to allow for each switch
> +driver to easily extend the swconfig API without causing major core parts rework
> +each and every time someone has a specific feature to implement and offers a
> +central configuration point with a well-defined API.
> +
> +Switch configuration API
> +========================
> +
> +The main data structure of the switch configuration API is a "struct switch_dev"
> +which contains the following members:
> +
> +- a set of common operations to all switches (struct switch_dev_ops)
> +- a network device pointer it is physically attached to
> +- a number of physical switch ports (including CPU port)
> +- a set of configured vlans
> +- a CPU specific port index
> +
> +A particular switch device is registered/unregistered using the following pair
> +of functions:
> +
> +register_switch(struct switch_dev *sw_dev, struct net_device *dev);
> +unregister_switch(struct switch_dev *sw_dev);
> +
> +A given switch driver can be backed by any kind of underlying bus driver (i2c
> +client, GPIO driver, MMIO driver, directly into the Ethernet MAC driver...).
> +
> +The set of common operations to all switches is represented by the "struct
> +switch_dev_ops" function pointers. These common operations are defined as such:
> +
> +- get the port list of a VLAN identifier
> +- set the port list of a VLAN identifier
> +- get the primary VLAN identifier of a port
> +- set the primary VLAN identifier of a port
> +- apply the changed configuration to the switch
> +- reset the switch
> +- get a port's link status
> +- get a port's statistics counters
> +
> +The switch_dev_ops structure also contains an extensible way of representing and
> +querying switch specific features, 3 different types of attributes are
> +available:
> +
> +- global attributes: attributes global to a switch (name, identifier, number of
> +  ports)
> +- port attributes: per-port specific attributes (MIB counters, enabling port
> +  mirroring...)
> +- vlan attributes: per-VLAN specific attributes (VLAN id, specific VLAN
> +  information)
> +
> +Each of these 3 categories must be represented using an array of "struct
> +switch_attr" attributes. This structure must be filled with:
> +
> +- a unique name for the operation
> +- a description for the operation
> +- a setter operation
> +- a getter operation
> +- a data type (string, integer, port)
> +- optional min/max limits to validate user input data
> +
> +The "struct switch_attr" directly maps to a Generic Netlink type of command and
> +will be automatically discovered by the "swconfig" user-space utility without
> +requiring user-space changes.
> +
> +User-space reference tool
> +=========================
> +
> +A reference user-space implementation is provided in tools/swconfig in order to
> +directly configure and use a particular switch driver. This reference
> +implementation is linking against libnl-1 for the moment.
> +
> +To build it:
> +
> +make -C tools/swconfig
> +
> +To list the available switches:
> +
> +./tools/swconfig list
> +
> +And to show a particular switch configuration for instance:
> +
> +./tools/swconfig dev eth0 show
> +
> +Fake (simulation) switch driver
> +===============================
> +
> +A fake switch driver called swconfig-hwsim is provided in order to allow for
> +easy testing of API changes and to perform regression testing. This driver will
> +automatically map to the loopback device and will create a fake switch of up to
> +8 Gigabit ports. Each of these ports can be configured with separate
> +speed/duplex/link settings. This driver is gated with the CONFIG_SWCONFIG_HWSIM
> +configuration symbol.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f169259..3a54262 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8117,6 +8117,16 @@ F:	lib/swiotlb.c
>  F:	arch/*/kernel/pci-swiotlb.c
>  F:	include/linux/swiotlb.h
>  
> +SWITCH CONFIGURATION API
> +M:	Florian Fainelli <f.fainelli@gmail.com>
> +L:	openwrt-devel@lists.openwrt.org
> +L:	netdev@vger.kernel.org
> +S:	Supported
> +F:	drivers/net/phy/swconfig*.c
> +F:	include/uapi/linux/swconfig.h
> +F:	include/linux/swconfig.h
> +F:	Documentation/networking/swconfig.txt
> +
>  SYNOPSYS ARC ARCHITECTURE
>  M:	Vineet Gupta <vgupta@synopsys.com>
>  S:	Supported
> diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
> index 342561a..9b3e117 100644
> --- a/drivers/net/phy/Kconfig
> +++ b/drivers/net/phy/Kconfig
> @@ -12,6 +12,12 @@ menuconfig PHYLIB
>  
>  if PHYLIB
>  
> +config SWCONFIG
> +	tristate "Switch configuration API"
> +	---help---
> +	  Switch configuration API using netlink. This allows
> +	  you to configure the VLAN features of certain switches.
> +
>  comment "MII PHY device drivers"
>  
>  config AT803X_PHY
> diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
> index 23a2ab2..268c7de 100644
> --- a/drivers/net/phy/Makefile
> +++ b/drivers/net/phy/Makefile
> @@ -3,6 +3,7 @@
>  libphy-objs			:= phy.o phy_device.o mdio_bus.o
>  
>  obj-$(CONFIG_PHYLIB)		+= libphy.o
> +obj-$(CONFIG_SWCONFIG)		+= swconfig.o
>  obj-$(CONFIG_MARVELL_PHY)	+= marvell.o
>  obj-$(CONFIG_DAVICOM_PHY)	+= davicom.o
>  obj-$(CONFIG_CICADA_PHY)	+= cicada.o
> diff --git a/drivers/net/phy/swconfig.c b/drivers/net/phy/swconfig.c
> new file mode 100644
> index 0000000..9997c35
> --- /dev/null
> +++ b/drivers/net/phy/swconfig.c
> @@ -0,0 +1,1078 @@
> +/*
> + * Switch configuration API
> + *
> + * Copyright (C) 2008 Felix Fietkau <nbd@openwrt.org>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/list.h>
> +#include <linux/if.h>
> +#include <linux/if_ether.h>
> +#include <linux/capability.h>
> +#include <linux/skbuff.h>
> +#include <linux/swconfig.h>
> +
> +#define SWCONFIG_DEVNAME	"switch%d"
> +
> +MODULE_AUTHOR("Felix Fietkau <nbd@openwrt.org>");
> +MODULE_LICENSE("GPL");
> +
> +static int swdev_id;
> +static struct list_head swdevs;
> +static DEFINE_SPINLOCK(swdevs_lock);
> +struct swconfig_callback;
> +
> +struct swconfig_callback {
> +	struct sk_buff *msg;
> +	struct genlmsghdr *hdr;
> +	struct genl_info *info;
> +	int cmd;
> +
> +	/* callback for filling in the message data */
> +	int (*fill)(struct swconfig_callback *cb, void *arg);
> +
> +	/* callback for closing the message before sending it */
> +	int (*close)(struct swconfig_callback *cb, void *arg);
> +
> +	struct nlattr *nest[4];
> +	int args[4];
> +};
> +
> +/* defaults */
> +
> +static int
> +swconfig_get_vlan_ports(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	int ret;
> +	if (val->port_vlan >= dev->vlans)
> +		return -EINVAL;
> +
> +	if (!dev->ops->get_vlan_ports)
> +		return -EOPNOTSUPP;
> +
> +	ret = dev->ops->get_vlan_ports(dev, val);
> +	return ret;
> +}
> +
> +static int
> +swconfig_set_vlan_ports(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	struct switch_port *ports = val->value.ports;
> +	const struct switch_dev_ops *ops = dev->ops;
> +	int i;
> +
> +	if (val->port_vlan >= dev->vlans)
> +		return -EINVAL;
> +
> +	/* validate ports */
> +	if (val->len > dev->ports)
> +		return -EINVAL;
> +
> +	if (!ops->set_vlan_ports)
> +		return -EOPNOTSUPP;
> +
> +	for (i = 0; i < val->len; i++) {
> +		if (ports[i].id >= dev->ports)
> +			return -EINVAL;
> +
> +		if (ops->set_port_pvid &&
> +		    !(ports[i].flags & (1 << SWITCH_PORT_FLAG_TAGGED)))
> +			ops->set_port_pvid(dev, ports[i].id, val->port_vlan);
> +	}
> +
> +	return ops->set_vlan_ports(dev, val);
> +}
> +
> +static int
> +swconfig_set_pvid(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	if (val->port_vlan >= dev->ports)
> +		return -EINVAL;
> +
> +	if (!dev->ops->set_port_pvid)
> +		return -EOPNOTSUPP;
> +
> +	return dev->ops->set_port_pvid(dev, val->port_vlan, val->value.i);
> +}
> +
> +static int
> +swconfig_get_pvid(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	if (val->port_vlan >= dev->ports)
> +		return -EINVAL;
> +
> +	if (!dev->ops->get_port_pvid)
> +		return -EOPNOTSUPP;
> +
> +	return dev->ops->get_port_pvid(dev, val->port_vlan, &val->value.i);
> +}
> +
> +static const char *
> +swconfig_speed_str(enum switch_port_speed speed)
> +{
> +	switch (speed) {
> +	case SWITCH_PORT_SPEED_10:
> +		return "10baseT";
> +	case SWITCH_PORT_SPEED_100:
> +		return "100baseT";
> +	case SWITCH_PORT_SPEED_1000:
> +		return "1000baseT";
> +	default:
> +		break;
> +	}
> +
> +	return "unknown";
> +}
> +
> +static int
> +swconfig_get_link(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	struct switch_port_link link;
> +	int len;
> +	int ret;
> +
> +	if (val->port_vlan >= dev->ports)
> +		return -EINVAL;
> +
> +	if (!dev->ops->get_port_link)
> +		return -EOPNOTSUPP;
> +
> +	memset(&link, 0, sizeof(link));
> +	ret = dev->ops->get_port_link(dev, val->port_vlan, &link);
> +	if (ret)
> +		return ret;
> +
> +	memset(dev->buf, 0, sizeof(dev->buf));
> +
> +	if (link.link)
> +		len = snprintf(dev->buf, sizeof(dev->buf),
> +			       "port:%d link:up speed:%s %s-duplex %s%s%s",
> +			       val->port_vlan,
> +			       swconfig_speed_str(link.speed),
> +			       link.duplex ? "full" : "half",
> +			       link.tx_flow ? "txflow " : "",
> +			       link.rx_flow ? "rxflow " : "",
> +			       link.aneg ? "auto" : "");
> +	else
> +		len = snprintf(dev->buf, sizeof(dev->buf), "port:%d link:down",
> +			       val->port_vlan);
> +
> +	val->value.s = dev->buf;
> +	val->len = len;
> +
> +	return 0;
> +}
> +
> +static int
> +swconfig_apply_config(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	/* don't complain if not supported by the switch driver */
> +	if (!dev->ops->apply_config)
> +		return 0;
> +
> +	return dev->ops->apply_config(dev);
> +}
> +
> +static int
> +swconfig_reset_switch(struct switch_dev *dev,
> +			const struct switch_attr *attr, struct switch_val *val)
> +{
> +	/* don't complain if not supported by the switch driver */
> +	if (!dev->ops->reset_switch)
> +		return 0;
> +
> +	return dev->ops->reset_switch(dev);
> +}
> +
> +enum global_defaults {
> +	GLOBAL_APPLY,
> +	GLOBAL_RESET,
> +};
> +
> +enum vlan_defaults {
> +	VLAN_PORTS,
> +};
> +
> +enum port_defaults {
> +	PORT_PVID,
> +	PORT_LINK,
> +};
> +
> +static struct switch_attr default_global[] = {
> +	[GLOBAL_APPLY] = {
> +		.type = SWITCH_TYPE_NOVAL,
> +		.name = "apply",
> +		.description = "Activate changes in the hardware",
> +		.set = swconfig_apply_config,
> +	},
> +	[GLOBAL_RESET] = {
> +		.type = SWITCH_TYPE_NOVAL,
> +		.name = "reset",
> +		.description = "Reset the switch",
> +		.set = swconfig_reset_switch,
> +	}
> +};
> +
> +static struct switch_attr default_port[] = {
> +	[PORT_PVID] = {
> +		.type = SWITCH_TYPE_INT,
> +		.name = "pvid",
> +		.description = "Primary VLAN ID",
> +		.set = swconfig_set_pvid,
> +		.get = swconfig_get_pvid,
> +	},
> +	[PORT_LINK] = {
> +		.type = SWITCH_TYPE_STRING,
> +		.name = "link",
> +		.description = "Get port link information",
> +		.set = NULL,
> +		.get = swconfig_get_link,
> +	}
> +};
> +
> +static struct switch_attr default_vlan[] = {
> +	[VLAN_PORTS] = {
> +		.type = SWITCH_TYPE_PORTS,
> +		.name = "ports",
> +		.description = "VLAN port mapping",
> +		.set = swconfig_set_vlan_ports,
> +		.get = swconfig_get_vlan_ports,
> +	},
> +};
> +
> +static const struct switch_attr *
> +swconfig_find_attr_by_name(const struct switch_attrlist *alist,
> +				const char *name)
> +{
> +	int i;
> +
> +	for (i = 0; i < alist->n_attr; i++)
> +		if (strcmp(name, alist->attr[i].name) == 0)
> +			return &alist->attr[i];
> +
> +	return NULL;
> +}
> +
> +static void swconfig_defaults_init(struct switch_dev *dev)
> +{
> +	const struct switch_dev_ops *ops = dev->ops;
> +
> +	dev->def_global = 0;
> +	dev->def_vlan = 0;
> +	dev->def_port = 0;
> +
> +	if (ops->get_vlan_ports || ops->set_vlan_ports)
> +		set_bit(VLAN_PORTS, &dev->def_vlan);
> +
> +	if (ops->get_port_pvid || ops->set_port_pvid)
> +		set_bit(PORT_PVID, &dev->def_port);
> +
> +	if (ops->get_port_link &&
> +	    !swconfig_find_attr_by_name(&ops->attr_port, "link"))
> +		set_bit(PORT_LINK, &dev->def_port);
> +
> +	/* always present, can be no-op */
> +	set_bit(GLOBAL_APPLY, &dev->def_global);
> +	set_bit(GLOBAL_RESET, &dev->def_global);
> +}
> +
> +
> +static struct genl_family switch_fam = {
> +	.id = GENL_ID_GENERATE,
> +	.name = "switch",
> +	.hdrsize = 0,
> +	.version = 1,
> +	.maxattr = SWITCH_ATTR_MAX,
> +};
> +
> +static const struct nla_policy switch_policy[SWITCH_ATTR_MAX+1] = {
> +	[SWITCH_ATTR_ID] = { .type = NLA_U32 },
> +	[SWITCH_ATTR_OP_ID] = { .type = NLA_U32 },
> +	[SWITCH_ATTR_OP_PORT] = { .type = NLA_U32 },
> +	[SWITCH_ATTR_OP_VLAN] = { .type = NLA_U32 },
> +	[SWITCH_ATTR_OP_VALUE_INT] = { .type = NLA_U32 },
> +	[SWITCH_ATTR_OP_VALUE_STR] = { .type = NLA_NUL_STRING },
> +	[SWITCH_ATTR_OP_VALUE_PORTS] = { .type = NLA_NESTED },
> +	[SWITCH_ATTR_TYPE] = { .type = NLA_U32 },
> +};
> +
> +static const struct nla_policy port_policy[SWITCH_PORT_ATTR_MAX+1] = {
> +	[SWITCH_PORT_ID] = { .type = NLA_U32 },
> +	[SWITCH_PORT_FLAG_TAGGED] = { .type = NLA_FLAG },
> +};
> +
> +static inline void
> +swconfig_lock(void)
> +{
> +	spin_lock(&swdevs_lock);
> +}
> +
> +static inline void
> +swconfig_unlock(void)
> +{
> +	spin_unlock(&swdevs_lock);
> +}
> +
> +static struct switch_dev *
> +swconfig_get_dev(struct genl_info *info)
> +{
> +	struct switch_dev *dev = NULL;
> +	struct switch_dev *p;
> +	int id;
> +
> +	if (!info->attrs[SWITCH_ATTR_ID])
> +		goto done;
> +
> +	id = nla_get_u32(info->attrs[SWITCH_ATTR_ID]);
> +	swconfig_lock();
> +	list_for_each_entry(p, &swdevs, dev_list) {
> +		if (id != p->id)
> +			continue;
> +
> +		dev = p;
> +		break;
> +	}
> +	if (dev)
> +		mutex_lock(&dev->sw_mutex);
> +	else
> +		pr_debug("device %d not found\n", id);
> +	swconfig_unlock();
> +done:
> +	return dev;
> +}
> +
> +static inline void
> +swconfig_put_dev(struct switch_dev *dev)
> +{
> +	mutex_unlock(&dev->sw_mutex);
> +}
> +
> +static int
> +swconfig_dump_attr(struct swconfig_callback *cb, void *arg)
> +{
> +	struct switch_attr *op = arg;
> +	struct genl_info *info = cb->info;
> +	struct sk_buff *msg = cb->msg;
> +	int id = cb->args[0];
> +	void *hdr;
> +
> +	hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq, &switch_fam,
> +			NLM_F_MULTI, SWITCH_CMD_NEW_ATTR);
> +	if (!hdr)
> +		return -1;
> +
> +	if (nla_put_u32(msg, SWITCH_ATTR_OP_ID, id))
> +		goto nla_put_failure;
> +	if (nla_put_u32(msg, SWITCH_ATTR_OP_TYPE, op->type))
> +		goto nla_put_failure;
> +	if (nla_put_string(msg, SWITCH_ATTR_OP_NAME, op->name))
> +		goto nla_put_failure;
> +	if (op->description)
> +		if (nla_put_string(msg, SWITCH_ATTR_OP_DESCRIPTION,
> +			op->description))
> +			goto nla_put_failure;
> +
> +	return genlmsg_end(msg, hdr);
> +nla_put_failure:
> +	genlmsg_cancel(msg, hdr);
> +	return -EMSGSIZE;
> +}
> +
> +/* spread multipart messages across multiple message buffers */
> +static int
> +swconfig_send_multipart(struct swconfig_callback *cb, void *arg)
> +{
> +	struct genl_info *info = cb->info;
> +	int restart = 0;
> +	int err;
> +
> +	do {
> +		if (!cb->msg) {
> +			cb->msg = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
> +			if (cb->msg == NULL)
> +				goto error;
> +		}
> +
> +		if (!(cb->fill(cb, arg) < 0))
> +			break;
> +
> +		/* fill failed, check if this was already the second attempt */
> +		if (restart)
> +			goto error;
> +
> +		/* try again in a new message, send the current one */
> +		restart = 1;
> +		if (cb->close) {
> +			if (cb->close(cb, arg) < 0)
> +				goto error;
> +		}
> +		err = genlmsg_reply(cb->msg, info);
> +		cb->msg = NULL;
> +		if (err < 0)
> +			goto error;
> +
> +	} while (restart);
> +
> +	return 0;
> +
> +error:
> +	if (cb->msg)
> +		nlmsg_free(cb->msg);
> +	return -1;
> +}
> +
> +static int
> +swconfig_list_attrs(struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct genlmsghdr *hdr = nlmsg_data(info->nlhdr);
> +	const struct switch_attrlist *alist;
> +	struct switch_dev *dev;
> +	struct swconfig_callback cb;
> +	int err = -EINVAL;
> +	int i;
> +
> +	/* defaults */
> +	struct switch_attr *def_list;
> +	unsigned long *def_active;
> +	int n_def;
> +
> +	dev = swconfig_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	switch (hdr->cmd) {
> +	case SWITCH_CMD_LIST_GLOBAL:
> +		alist = &dev->ops->attr_global;
> +		def_list = default_global;
> +		def_active = &dev->def_global;
> +		n_def = ARRAY_SIZE(default_global);
> +		break;
> +	case SWITCH_CMD_LIST_VLAN:
> +		alist = &dev->ops->attr_vlan;
> +		def_list = default_vlan;
> +		def_active = &dev->def_vlan;
> +		n_def = ARRAY_SIZE(default_vlan);
> +		break;
> +	case SWITCH_CMD_LIST_PORT:
> +		alist = &dev->ops->attr_port;
> +		def_list = default_port;
> +		def_active = &dev->def_port;
> +		n_def = ARRAY_SIZE(default_port);
> +		break;
> +	default:
> +		WARN_ON(1);
> +		goto out;
> +	}
> +
> +	memset(&cb, 0, sizeof(cb));
> +	cb.info = info;
> +	cb.fill = swconfig_dump_attr;
> +	for (i = 0; i < alist->n_attr; i++) {
> +		if (alist->attr[i].disabled)
> +			continue;
> +		cb.args[0] = i;
> +		err = swconfig_send_multipart(&cb, (void *) &alist->attr[i]);
> +		if (err < 0)
> +			goto error;
> +	}
> +
> +	/* defaults */
> +	for (i = 0; i < n_def; i++) {
> +		if (!test_bit(i, def_active))
> +			continue;
> +		cb.args[0] = SWITCH_ATTR_DEFAULTS_OFFSET + i;
> +		err = swconfig_send_multipart(&cb, (void *) &def_list[i]);
> +		if (err < 0)
> +			goto error;
> +	}
> +	swconfig_put_dev(dev);
> +
> +	if (!cb.msg)
> +		return 0;
> +
> +	return genlmsg_reply(cb.msg, info);
> +
> +error:
> +	if (cb.msg)
> +		nlmsg_free(cb.msg);
> +out:
> +	swconfig_put_dev(dev);
> +	return err;
> +}
> +
> +static const struct switch_attr *
> +swconfig_lookup_attr(struct switch_dev *dev, struct genl_info *info,
> +		struct switch_val *val)
> +{
> +	struct genlmsghdr *hdr = nlmsg_data(info->nlhdr);
> +	const struct switch_attrlist *alist;
> +	const struct switch_attr *attr = NULL;
> +	int attr_id;
> +
> +	/* defaults */
> +	struct switch_attr *def_list;
> +	unsigned long *def_active;
> +	int n_def;
> +
> +	if (!info->attrs[SWITCH_ATTR_OP_ID])
> +		goto done;
> +
> +	switch (hdr->cmd) {
> +	case SWITCH_CMD_SET_GLOBAL:
> +	case SWITCH_CMD_GET_GLOBAL:
> +		alist = &dev->ops->attr_global;
> +		def_list = default_global;
> +		def_active = &dev->def_global;
> +		n_def = ARRAY_SIZE(default_global);
> +		break;
> +	case SWITCH_CMD_SET_VLAN:
> +	case SWITCH_CMD_GET_VLAN:
> +		alist = &dev->ops->attr_vlan;
> +		def_list = default_vlan;
> +		def_active = &dev->def_vlan;
> +		n_def = ARRAY_SIZE(default_vlan);
> +		if (!info->attrs[SWITCH_ATTR_OP_VLAN])
> +			goto done;
> +		val->port_vlan = nla_get_u32(info->attrs[SWITCH_ATTR_OP_VLAN]);
> +		if (val->port_vlan >= dev->vlans)
> +			goto done;
> +		break;
> +	case SWITCH_CMD_SET_PORT:
> +	case SWITCH_CMD_GET_PORT:
> +		alist = &dev->ops->attr_port;
> +		def_list = default_port;
> +		def_active = &dev->def_port;
> +		n_def = ARRAY_SIZE(default_port);
> +		if (!info->attrs[SWITCH_ATTR_OP_PORT])
> +			goto done;
> +		val->port_vlan = nla_get_u32(info->attrs[SWITCH_ATTR_OP_PORT]);
> +		if (val->port_vlan >= dev->ports)
> +			goto done;
> +		break;
> +	default:
> +		WARN_ON(1);
> +		goto done;
> +	}
> +
> +	if (!alist)
> +		goto done;
> +
> +	attr_id = nla_get_u32(info->attrs[SWITCH_ATTR_OP_ID]);
> +	if (attr_id >= SWITCH_ATTR_DEFAULTS_OFFSET) {
> +		attr_id -= SWITCH_ATTR_DEFAULTS_OFFSET;
> +		if (attr_id >= n_def)
> +			goto done;
> +		if (!test_bit(attr_id, def_active))
> +			goto done;
> +		attr = &def_list[attr_id];
> +	} else {
> +		if (attr_id >= alist->n_attr)
> +			goto done;
> +		attr = &alist->attr[attr_id];
> +	}
> +
> +	if (attr->disabled)
> +		attr = NULL;
> +
> +done:
> +	if (!attr)
> +		pr_debug("attribute lookup failed\n");
> +	val->attr = attr;
> +	return attr;
> +}
> +
> +static int
> +swconfig_parse_ports(struct sk_buff *msg, struct nlattr *head,
> +		struct switch_val *val, int max)
> +{
> +	struct nlattr *nla;
> +	int rem;
> +
> +	val->len = 0;
> +	nla_for_each_nested(nla, head, rem) {
> +		struct nlattr *tb[SWITCH_PORT_ATTR_MAX+1];
> +		struct switch_port *port = &val->value.ports[val->len];
> +
> +		if (val->len >= max)
> +			return -EINVAL;
> +
> +		if (nla_parse_nested(tb, SWITCH_PORT_ATTR_MAX, nla,
> +				port_policy))
> +			return -EINVAL;
> +
> +		if (!tb[SWITCH_PORT_ID])
> +			return -EINVAL;
> +
> +		port->id = nla_get_u32(tb[SWITCH_PORT_ID]);
> +		if (tb[SWITCH_PORT_FLAG_TAGGED])
> +			port->flags |= (1 << SWITCH_PORT_FLAG_TAGGED);
> +		val->len++;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +swconfig_set_attr(struct sk_buff *skb, struct genl_info *info)
> +{
> +	const struct switch_attr *attr;
> +	struct switch_dev *dev;
> +	struct switch_val val;
> +	int err = -EINVAL;
> +
> +	dev = swconfig_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	memset(&val, 0, sizeof(val));
> +	attr = swconfig_lookup_attr(dev, info, &val);
> +	if (!attr || !attr->set)
> +		goto error;
> +
> +	val.attr = attr;
> +	switch (attr->type) {
> +	case SWITCH_TYPE_NOVAL:
> +		break;
> +	case SWITCH_TYPE_INT:
> +		if (!info->attrs[SWITCH_ATTR_OP_VALUE_INT])
> +			goto error;
> +		val.value.i =
> +			nla_get_u32(info->attrs[SWITCH_ATTR_OP_VALUE_INT]);
> +		break;
> +	case SWITCH_TYPE_STRING:
> +		if (!info->attrs[SWITCH_ATTR_OP_VALUE_STR])
> +			goto error;
> +		val.value.s =
> +			nla_data(info->attrs[SWITCH_ATTR_OP_VALUE_STR]);
> +		break;
> +	case SWITCH_TYPE_PORTS:
> +		val.value.ports = dev->portbuf;
> +		memset(dev->portbuf, 0,
> +			sizeof(struct switch_port) * dev->ports);
> +
> +		/* TODO: implement multipart? */
> +		if (info->attrs[SWITCH_ATTR_OP_VALUE_PORTS]) {
> +			err = swconfig_parse_ports(skb,
> +				info->attrs[SWITCH_ATTR_OP_VALUE_PORTS],
> +				&val, dev->ports);
> +			if (err < 0)
> +				goto error;
> +		} else {
> +			val.len = 0;
> +			err = 0;
> +		}
> +		break;
> +	default:
> +		goto error;
> +	}
> +
> +	err = attr->set(dev, attr, &val);
> +error:
> +	swconfig_put_dev(dev);
> +	return err;
> +}
> +
> +static int
> +swconfig_close_portlist(struct swconfig_callback *cb, void *arg)
> +{
> +	if (cb->nest[0])
> +		nla_nest_end(cb->msg, cb->nest[0]);
> +	return 0;
> +}
> +
> +static int
> +swconfig_send_port(struct swconfig_callback *cb, void *arg)
> +{
> +	const struct switch_port *port = arg;
> +	struct nlattr *p = NULL;
> +
> +	if (!cb->nest[0]) {
> +		cb->nest[0] = nla_nest_start(cb->msg, cb->cmd);
> +		if (!cb->nest[0])
> +			return -1;
> +	}
> +
> +	p = nla_nest_start(cb->msg, SWITCH_ATTR_PORT);
> +	if (!p)
> +		goto error;
> +
> +	if (nla_put_u32(cb->msg, SWITCH_PORT_ID, port->id))
> +		goto nla_put_failure;
> +	if (port->flags & (1 << SWITCH_PORT_FLAG_TAGGED)) {
> +		if (nla_put_flag(cb->msg, SWITCH_PORT_FLAG_TAGGED))
> +			goto nla_put_failure;
> +	}
> +
> +	nla_nest_end(cb->msg, p);
> +	return 0;
> +
> +nla_put_failure:
> +	nla_nest_cancel(cb->msg, p);
> +error:
> +	nla_nest_cancel(cb->msg, cb->nest[0]);
> +	return -1;
> +}
> +
> +static int
> +swconfig_send_ports(struct sk_buff **msg, struct genl_info *info, int attr,
> +		const struct switch_val *val)
> +{
> +	struct swconfig_callback cb;
> +	int err = 0;
> +	int i;
> +
> +	if (!val->value.ports)
> +		return -EINVAL;
> +
> +	memset(&cb, 0, sizeof(cb));
> +	cb.cmd = attr;
> +	cb.msg = *msg;
> +	cb.info = info;
> +	cb.fill = swconfig_send_port;
> +	cb.close = swconfig_close_portlist;
> +
> +	cb.nest[0] = nla_nest_start(cb.msg, cb.cmd);
> +	for (i = 0; i < val->len; i++) {
> +		err = swconfig_send_multipart(&cb, &val->value.ports[i]);
> +		if (err)
> +			goto done;
> +	}
> +	err = val->len;
> +	swconfig_close_portlist(&cb, NULL);
> +	*msg = cb.msg;
> +
> +done:
> +	return err;
> +}
> +
> +static int
> +swconfig_get_attr(struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct genlmsghdr *hdr = nlmsg_data(info->nlhdr);
> +	const struct switch_attr *attr;
> +	struct switch_dev *dev;
> +	struct sk_buff *msg = NULL;
> +	struct switch_val val;
> +	int err = -EINVAL;
> +	int cmd = hdr->cmd;
> +
> +	dev = swconfig_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	memset(&val, 0, sizeof(val));
> +	attr = swconfig_lookup_attr(dev, info, &val);
> +	if (!attr || !attr->get)
> +		goto error;
> +
> +	if (attr->type == SWITCH_TYPE_PORTS) {
> +		val.value.ports = dev->portbuf;
> +		memset(dev->portbuf, 0,
> +			sizeof(struct switch_port) * dev->ports);
> +	}
> +
> +	err = attr->get(dev, attr, &val);
> +	if (err)
> +		goto error;
> +
> +	msg = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
> +	if (!msg)
> +		goto error;
> +
> +	hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq, &switch_fam,
> +			0, cmd);
> +	if (!hdr)
> +		goto nla_put_failure;
> +
> +	switch (attr->type) {
> +	case SWITCH_TYPE_INT:
> +		if (nla_put_u32(msg, SWITCH_ATTR_OP_VALUE_INT, val.value.i))
> +			goto nla_put_failure;
> +		break;
> +	case SWITCH_TYPE_STRING:
> +		if (nla_put_string(msg, SWITCH_ATTR_OP_VALUE_STR, val.value.s))
> +			goto nla_put_failure;
> +		break;
> +	case SWITCH_TYPE_PORTS:
> +		err = swconfig_send_ports(&msg, info,
> +				SWITCH_ATTR_OP_VALUE_PORTS, &val);
> +		if (err < 0)
> +			goto nla_put_failure;
> +		break;
> +	default:
> +		pr_debug("invalid type in attribute\n");
> +		err = -EINVAL;
> +		goto error;
> +	}
> +	err = genlmsg_end(msg, hdr);
> +	if (err < 0)
> +		goto nla_put_failure;
> +
> +	swconfig_put_dev(dev);
> +	return genlmsg_reply(msg, info);
> +
> +nla_put_failure:
> +	if (msg)
> +		nlmsg_free(msg);
> +error:
> +	swconfig_put_dev(dev);
> +	if (!err)
> +		err = -ENOMEM;
> +	return err;
> +}
> +
> +static int
> +swconfig_send_switch(struct sk_buff *msg, u32 pid, u32 seq, int flags,
> +		const struct switch_dev *dev)
> +{
> +	struct nlattr *p = NULL, *m = NULL;
> +	void *hdr;
> +	int i;
> +
> +	hdr = genlmsg_put(msg, pid, seq, &switch_fam, flags,
> +			SWITCH_CMD_NEW_ATTR);
> +	if (!hdr)
> +		return -1;
> +
> +	if (nla_put_u32(msg, SWITCH_ATTR_ID, dev->id))
> +		goto nla_put_failure;
> +	if (nla_put_string(msg, SWITCH_ATTR_DEV_NAME, dev->devname))
> +		goto nla_put_failure;
> +	if (nla_put_string(msg, SWITCH_ATTR_ALIAS, dev->alias))
> +		goto nla_put_failure;
> +	if (nla_put_string(msg, SWITCH_ATTR_NAME, dev->name))
> +		goto nla_put_failure;
> +	if (nla_put_u32(msg, SWITCH_ATTR_VLANS, dev->vlans))
> +		goto nla_put_failure;
> +	if (nla_put_u32(msg, SWITCH_ATTR_PORTS, dev->ports))
> +		goto nla_put_failure;
> +	if (nla_put_u32(msg, SWITCH_ATTR_CPU_PORT, dev->cpu_port))
> +		goto nla_put_failure;
> +
> +	m = nla_nest_start(msg, SWITCH_ATTR_PORTMAP);
> +	if (!m)
> +		goto nla_put_failure;
> +	for (i = 0; i < dev->ports; i++) {
> +		p = nla_nest_start(msg, SWITCH_ATTR_PORTS);
> +		if (!p)
> +			continue;
> +		if (dev->portmap[i].s) {
> +			if (nla_put_string(msg, SWITCH_PORTMAP_SEGMENT,
> +						dev->portmap[i].s))
> +				goto nla_put_failure;
> +			if (nla_put_u32(msg, SWITCH_PORTMAP_VIRT,
> +						dev->portmap[i].virt))
> +				goto nla_put_failure;
> +		}
> +		nla_nest_end(msg, p);
> +	}
> +	nla_nest_end(msg, m);
> +	return genlmsg_end(msg, hdr);
> +nla_put_failure:
> +	genlmsg_cancel(msg, hdr);
> +	return -EMSGSIZE;
> +}
> +
> +static int swconfig_dump_switches(struct sk_buff *skb,
> +		struct netlink_callback *cb)
> +{
> +	struct switch_dev *dev;
> +	int start = cb->args[0];
> +	int idx = 0;
> +
> +	swconfig_lock();
> +	list_for_each_entry(dev, &swdevs, dev_list) {
> +		if (++idx <= start)
> +			continue;
> +		if (swconfig_send_switch(skb, NETLINK_CB(cb->skb).portid,
> +				cb->nlh->nlmsg_seq, NLM_F_MULTI,
> +				dev) < 0)
> +			break;
> +	}
> +	swconfig_unlock();
> +	cb->args[0] = idx;
> +
> +	return skb->len;
> +}
> +
> +static int
> +swconfig_done(struct netlink_callback *cb)
> +{
> +	return 0;
> +}
> +
> +static struct genl_ops swconfig_ops[] = {
> +	{
> +		.cmd = SWITCH_CMD_LIST_GLOBAL,
> +		.doit = swconfig_list_attrs,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_LIST_VLAN,
> +		.doit = swconfig_list_attrs,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_LIST_PORT,
> +		.doit = swconfig_list_attrs,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_GET_GLOBAL,
> +		.doit = swconfig_get_attr,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_GET_VLAN,
> +		.doit = swconfig_get_attr,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_GET_PORT,
> +		.doit = swconfig_get_attr,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_SET_GLOBAL,
> +		.doit = swconfig_set_attr,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_SET_VLAN,
> +		.doit = swconfig_set_attr,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_SET_PORT,
> +		.doit = swconfig_set_attr,
> +		.policy = switch_policy,
> +	},
> +	{
> +		.cmd = SWITCH_CMD_GET_SWITCH,
> +		.dumpit = swconfig_dump_switches,
> +		.policy = switch_policy,
> +		.done = swconfig_done,
> +	}
> +};
> +
> +int
> +register_switch(struct switch_dev *dev, struct net_device *netdev)
> +{
> +	struct switch_dev *sdev;
> +	const int max_switches = 8 * sizeof(unsigned long);
> +	unsigned long in_use = 0;
> +	int i;
> +
> +	INIT_LIST_HEAD(&dev->dev_list);
> +	if (netdev) {
> +		dev->netdev = netdev;
> +		if (!dev->alias)
> +			dev->alias = netdev->name;
> +	}
> +	BUG_ON(!dev->alias);
> +
> +	if (dev->ports > 0) {
> +		dev->portbuf = kzalloc(sizeof(struct switch_port) *
> +				dev->ports, GFP_KERNEL);
> +		if (!dev->portbuf)
> +			return -ENOMEM;
> +		dev->portmap = kzalloc(sizeof(struct switch_portmap) *
> +				dev->ports, GFP_KERNEL);
> +		if (!dev->portmap) {
> +			kfree(dev->portbuf);
> +			return -ENOMEM;
> +		}
> +	}
> +	swconfig_defaults_init(dev);
> +	mutex_init(&dev->sw_mutex);
> +	swconfig_lock();
> +	dev->id = ++swdev_id;
> +
> +	list_for_each_entry(sdev, &swdevs, dev_list) {
> +		if (!sscanf(sdev->devname, SWCONFIG_DEVNAME, &i))
> +			continue;
> +		if (i < 0 || i >= max_switches)
> +			continue;
> +
> +		set_bit(i, &in_use);
> +	}
> +	i = find_first_zero_bit(&in_use, max_switches);
> +
> +	if (i == max_switches) {
> +		swconfig_unlock();
> +		return -ENFILE;
> +	}
> +
> +	/* fill device name */
> +	snprintf(dev->devname, IFNAMSIZ, SWCONFIG_DEVNAME, i);
> +
> +	list_add(&dev->dev_list, &swdevs);
> +	swconfig_unlock();
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(register_switch);
> +
> +void
> +unregister_switch(struct switch_dev *dev)
> +{
> +	kfree(dev->portbuf);
> +	mutex_lock(&dev->sw_mutex);
> +	swconfig_lock();
> +	list_del(&dev->dev_list);
> +	swconfig_unlock();
> +	mutex_unlock(&dev->sw_mutex);
> +}
> +EXPORT_SYMBOL_GPL(unregister_switch);
> +
> +
> +static int __init
> +swconfig_init(void)
> +{
> +	int i, err;
> +
> +	INIT_LIST_HEAD(&swdevs);
> +	err = genl_register_family(&switch_fam);
> +	if (err)
> +		return err;
> +
> +	for (i = 0; i < ARRAY_SIZE(swconfig_ops); i++) {
> +		err = genl_register_ops(&switch_fam, &swconfig_ops[i]);
> +		if (err)
> +			goto unregister;
> +	}
> +
> +	return 0;
> +
> +unregister:
> +	genl_unregister_family(&switch_fam);
> +	return err;
> +}
> +
> +static void __exit
> +swconfig_exit(void)
> +{
> +	genl_unregister_family(&switch_fam);
> +}
> +
> +module_init(swconfig_init);
> +module_exit(swconfig_exit);
> +
> diff --git a/include/linux/swconfig.h b/include/linux/swconfig.h
> new file mode 100644
> index 0000000..fd96eec
> --- /dev/null
> +++ b/include/linux/swconfig.h
> @@ -0,0 +1,180 @@
> +/*
> + * Switch configuration API
> + *
> + * Copyright (C) 2008 Felix Fietkau <nbd@openwrt.org>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +#ifndef _LINUX_SWITCH_H
> +#define _LINUX_SWITCH_H
> +
> +#include <net/genetlink.h>
> +#include <uapi/linux/swconfig.h>
> +
> +struct switch_dev;
> +struct switch_op;
> +struct switch_val;
> +struct switch_attr;
> +struct switch_attrlist;
> +struct switch_led_trigger;
> +
> +int register_switch(struct switch_dev *dev, struct net_device *netdev);
> +void unregister_switch(struct switch_dev *dev);
> +
> +/**
> + * struct switch_attrlist - attribute list
> + *
> + * @n_attr: number of attributes
> + * @attr: pointer to the attributes array
> + */
> +struct switch_attrlist {
> +	int n_attr;
> +	const struct switch_attr *attr;
> +};
> +
> +enum switch_port_speed {
> +	SWITCH_PORT_SPEED_UNKNOWN = 0,
> +	SWITCH_PORT_SPEED_10 = 10,
> +	SWITCH_PORT_SPEED_100 = 100,
> +	SWITCH_PORT_SPEED_1000 = 1000,
> +};
> +
> +struct switch_port_link {
> +	bool link;
> +	bool duplex;
> +	bool aneg;
> +	bool tx_flow;
> +	bool rx_flow;
> +	enum switch_port_speed speed;
> +};
> +
> +struct switch_port_stats {
> +	unsigned long tx_bytes;
> +	unsigned long rx_bytes;
> +};
> +
> +/**
> + * struct switch_dev_ops - switch driver operations
> + *
> + * @attr_global: global switch attribute list
> + * @attr_port: port attribute list
> + * @attr_vlan: vlan attribute list
> + *
> + * Callbacks:
> + *
> + * @get_vlan_ports: read the port list of a VLAN
> + * @set_vlan_ports: set the port list of a VLAN
> + *
> + * @get_port_pvid: get the primary VLAN ID of a port
> + * @set_port_pvid: set the primary VLAN ID of a port
> + *
> + * @apply_config: apply all changed settings to the switch
> + * @reset_switch: resetting the switch
> + *
> + * @get_port_link: read the port link status
> + * @get_port_stats: read the port statistics counters
> + */
> +struct switch_dev_ops {
> +	struct switch_attrlist attr_global, attr_port, attr_vlan;
> +
> +	int (*get_vlan_ports)(struct switch_dev *dev, struct switch_val *val);
> +	int (*set_vlan_ports)(struct switch_dev *dev, struct switch_val *val);
> +
> +	int (*get_port_pvid)(struct switch_dev *dev, int port, int *val);
> +	int (*set_port_pvid)(struct switch_dev *dev, int port, int val);
> +
> +	int (*apply_config)(struct switch_dev *dev);
> +	int (*reset_switch)(struct switch_dev *dev);
> +
> +	int (*get_port_link)(struct switch_dev *dev, int port,
> +			     struct switch_port_link *link);
> +	int (*get_port_stats)(struct switch_dev *dev, int port,
> +			      struct switch_port_stats *stats);
> +};
> +
> +/**
> + * struct switch_dev - switch device
> + *
> + * @ops: switch driver operations pointer
> + * @devname: switch device name (automatically filled)
> + * @name: switch driver name returned to user-space
> + * @alias: alias name for the switch (instead of ethX) returned to user-space
> + * @netdev: network device pointer if alias is not used
> + *
> + * @ports: number of physical switch ports
> + * @vlans: number of supported VLANs
> + * @cpu_port: identifier for the CPU port
> + */
> +struct switch_dev {
> +	const struct switch_dev_ops *ops;
> +	/* will be automatically filled */
> +	char devname[IFNAMSIZ];
> +
> +	const char *name;
> +	/* NB: either alias or netdev must be set */
> +	const char *alias;
> +	struct net_device *netdev;
> +
> +	int ports;
> +	int vlans;
> +	int cpu_port;
> +
> +	/* the following fields are internal for swconfig */
> +	int id;
> +	struct list_head dev_list;
> +	unsigned long def_global, def_port, def_vlan;
> +
> +	struct mutex sw_mutex;
> +	struct switch_port *portbuf;
> +	struct switch_portmap *portmap;
> +
> +	char buf[128];
> +};
> +
> +struct switch_port {
> +	u32 id;
> +	u32 flags;
> +};
> +
> +struct switch_portmap {
> +	u32 virt;
> +	const char *s;
> +};
> +
> +struct switch_val {
> +	const struct switch_attr *attr;
> +	int port_vlan;
> +	int len;
> +	union {
> +		const char *s;
> +		u32 i;
> +		struct switch_port *ports;
> +	} value;
> +};
> +
> +struct switch_attr {
> +	int disabled;
> +	int type;
> +	const char *name;
> +	const char *description;
> +
> +	int (*set)(struct switch_dev *dev, const struct switch_attr *attr,
> +			struct switch_val *val);
> +	int (*get)(struct switch_dev *dev, const struct switch_attr *attr,
> +			struct switch_val *val);
> +
> +	/* for driver internal use */
> +	int id;
> +	int ofs;
> +	int max;
> +};
> +
> +#endif /* _LINUX_SWITCH_H */
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 115add2..0a995be 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -363,6 +363,7 @@ header-y += stddef.h
>  header-y += string.h
>  header-y += suspend_ioctls.h
>  header-y += swab.h
> +header-y += swconfig.h
>  header-y += synclink.h
>  header-y += sysctl.h
>  header-y += sysinfo.h
> diff --git a/include/uapi/linux/swconfig.h b/include/uapi/linux/swconfig.h
> new file mode 100644
> index 0000000..17cf178
> --- /dev/null
> +++ b/include/uapi/linux/swconfig.h
> @@ -0,0 +1,103 @@
> +/*
> + * Switch configuration API
> + *
> + * Copyright (C) 2008 Felix Fietkau <nbd@openwrt.org>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef _UAPI_LINUX_SWITCH_H
> +#define _UAPI_LINUX_SWITCH_H
> +
> +#include <linux/types.h>
> +#include <linux/netdevice.h>
> +#include <linux/netlink.h>
> +#include <linux/genetlink.h>
> +#ifndef __KERNEL__
> +#include <netlink/netlink.h>
> +#include <netlink/genl/genl.h>
> +#include <netlink/genl/ctrl.h>
> +#endif
> +
> +/* main attributes */
> +enum {
> +	SWITCH_ATTR_UNSPEC,
> +	/* global */
> +	SWITCH_ATTR_TYPE,
> +	/* device */
> +	SWITCH_ATTR_ID,
> +	SWITCH_ATTR_DEV_NAME,
> +	SWITCH_ATTR_ALIAS,
> +	SWITCH_ATTR_NAME,
> +	SWITCH_ATTR_VLANS,
> +	SWITCH_ATTR_PORTS,
> +	SWITCH_ATTR_PORTMAP,
> +	SWITCH_ATTR_CPU_PORT,
> +	/* attributes */
> +	SWITCH_ATTR_OP_ID,
> +	SWITCH_ATTR_OP_TYPE,
> +	SWITCH_ATTR_OP_NAME,
> +	SWITCH_ATTR_OP_PORT,
> +	SWITCH_ATTR_OP_VLAN,
> +	SWITCH_ATTR_OP_VALUE_INT,
> +	SWITCH_ATTR_OP_VALUE_STR,
> +	SWITCH_ATTR_OP_VALUE_PORTS,
> +	SWITCH_ATTR_OP_DESCRIPTION,
> +	/* port lists */
> +	SWITCH_ATTR_PORT,
> +	SWITCH_ATTR_MAX
> +};
> +
> +enum {
> +	/* port map */
> +	SWITCH_PORTMAP_PORTS,
> +	SWITCH_PORTMAP_SEGMENT,
> +	SWITCH_PORTMAP_VIRT,
> +	SWITCH_PORTMAP_MAX
> +};
> +
> +/* commands */
> +enum {
> +	SWITCH_CMD_UNSPEC,
> +	SWITCH_CMD_GET_SWITCH,
> +	SWITCH_CMD_NEW_ATTR,
> +	SWITCH_CMD_LIST_GLOBAL,
> +	SWITCH_CMD_GET_GLOBAL,
> +	SWITCH_CMD_SET_GLOBAL,
> +	SWITCH_CMD_LIST_PORT,
> +	SWITCH_CMD_GET_PORT,
> +	SWITCH_CMD_SET_PORT,
> +	SWITCH_CMD_LIST_VLAN,
> +	SWITCH_CMD_GET_VLAN,
> +	SWITCH_CMD_SET_VLAN
> +};
> +
> +/* data types */
> +enum switch_val_type {
> +	SWITCH_TYPE_UNSPEC,
> +	SWITCH_TYPE_INT,
> +	SWITCH_TYPE_STRING,
> +	SWITCH_TYPE_PORTS,
> +	SWITCH_TYPE_NOVAL,
> +};
> +
> +/* port nested attributes */
> +enum {
> +	SWITCH_PORT_UNSPEC,
> +	SWITCH_PORT_ID,
> +	SWITCH_PORT_FLAG_TAGGED,
> +	SWITCH_PORT_ATTR_MAX
> +};
> +
> +#define SWITCH_ATTR_DEFAULTS_OFFSET	0x1000
> +
> +
> +#endif /* _UAPI_LINUX_SWITCH_H */
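
The device-name allocation in register_switch() above can be sketched in
userspace as follows. This is only a rough model, not the patch's code:
DEVNAME_FMT is a hypothetical stand-in for SWCONFIG_DEVNAME (its definition
is not in this part of the diff), and first_free_index() is an invented
helper name. The sketch rejects an index equal to the bitmap width, because
that bit would not fit in an unsigned long.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical stand-in for SWCONFIG_DEVNAME (assumption, not from the patch). */
#define DEVNAME_FMT "switch%d"

/*
 * Return the lowest index not used by any name in names[], or -1 if
 * every index below the bitmap width is already taken.
 */
static int first_free_index(const char *const names[], int n_names)
{
	const int max_switches = (int)(CHAR_BIT * sizeof(unsigned long));
	unsigned long in_use = 0;
	int i, idx;

	for (i = 0; i < n_names; i++) {
		/* sscanf() returns the number of conversions, or EOF */
		if (sscanf(names[i], DEVNAME_FMT, &idx) != 1)
			continue;
		/* idx == max_switches would address a bit past the bitmap */
		if (idx < 0 || idx >= max_switches)
			continue;
		in_use |= 1UL << idx;
	}

	/* Equivalent of find_first_zero_bit() over a single word */
	for (idx = 0; idx < max_switches; idx++)
		if (!(in_use & (1UL << idx)))
			return idx;

	return -1;
}
```

With the names "switch0", "switch1" and "switch2" taken, first_free_index()
returns 3; names that do not match the format are ignored.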

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: Florian Fainelli @ 2013-10-22 19:32 UTC (permalink / raw)
  To: Dan Williams
  Cc: netdev, David Miller, Sascha Hauer, Felix Fietkau, John Crispin,
	Jonas Gorski, Gary Thomas
In-Reply-To: <1382469746.19269.55.camel@dcbw.foobar.com>

2013/10/22 Dan Williams <dcbw@redhat.com>:
> On Tue, 2013-10-22 at 11:23 -0700, Florian Fainelli wrote:
>> This patch adds an Ethernet Switch generic netlink configuration API
>> which allows for doing the required configuration of managed Ethernet
>> switches commonly found in Wireless/Cable/DSL routers in the market.
>
> "swconfig" probably means "switch config", but is there any way to
> rename this away from the "sw" prefix, since "sw" typically means
> "software" and not "switch"?

Sure, how about something like "enetsw"? I would like to avoid using
"switch" too much since this is a C reserved keyword.
-- 
Florian

^ permalink raw reply

* Re: [PATCH net] net: sctp: fix ASCONF to allow non SCTP_ADDR_SRC addresses in ipv6
From: Michio Honda @ 2013-10-22 19:33 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: davem, netdev, linux-sctp
In-Reply-To: <1382459696-1732-1-git-send-email-dborkman@redhat.com>


On Oct 22, 2013, at 6:34 PM, Daniel Borkmann wrote:

> Commit 8a07eb0a50 ("sctp: Add ASCONF operation on the single-homed host")
> implemented possible use of IPv4 addresses with non SCTP_ADDR_SRC state
> as source address when sending ASCONF (ADD) packets, but IPv6 part for
> that was not implemented in 8a07eb0a50. Therefore, as this is not restricted
> to IPv4-only, fix this up to allow the same for IPv6 addresses in SCTP.
> 
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Michio Honda <micchie@sfc.wide.ad.jp>
> ---
> net/sctp/ipv6.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index e7b2d4f..96a5591 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -279,7 +279,9 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
> 		sctp_v6_to_addr(&dst_saddr, &fl6->saddr, htons(bp->port));
> 		rcu_read_lock();
> 		list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> -			if (!laddr->valid || (laddr->state != SCTP_ADDR_SRC))
> +			if (!laddr->valid || laddr->state == SCTP_ADDR_DEL ||
> +			    (laddr->state != SCTP_ADDR_SRC &&
> +			     !asoc->src_out_of_asoc_ok))
> 				continue;
> 
> 			/* Do not compare against v4 addrs */
> -- 
> 1.8.3.1
Acked-by: Michio Honda <micchie@sfc.wide.ad.jp>
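
For readers following the condition in the hunk above, here is a small
standalone model of the new skip test. It is a sketch only: the enum values
and the skip_laddr() helper are illustrative names, not the kernel's
definitions, and the three inputs stand in for laddr->valid, laddr->state
and asoc->src_out_of_asoc_ok.

```c
#include <stdbool.h>

/* Illustrative states only; the kernel has its own SCTP_ADDR_* values. */
enum addr_state { ADDR_NEW, ADDR_SRC, ADDR_DEL };

/* Return true when the local address must be skipped as a source candidate. */
static bool skip_laddr(bool valid, enum addr_state state,
		       bool src_out_of_asoc_ok)
{
	return !valid || state == ADDR_DEL ||
	       (state != ADDR_SRC && !src_out_of_asoc_ok);
}
```

In this model an address that is valid but not yet in the SRC state is
accepted only when src_out_of_asoc_ok is set, and a deleted address is
always skipped.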

^ permalink raw reply

* Re: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
From: David Miller @ 2013-10-22 19:40 UTC (permalink / raw)
  To: antonio; +Cc: David.Laight, netdev
In-Reply-To: <20131022171314.GL1544@neomailbox.net>

From: Antonio Quartulli <antonio@meshcoding.com>
Date: Tue, 22 Oct 2013 19:13:14 +0200

> If we go for the "no udp port" approach they will get an error anyway
> because of the mismatching arguments.

Antonio, I think it is fair enough to keep passing the port argument
as well as the length.

These precomputed values might as well be provided to the receiver.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/2] [PATCH] ax88179_178a: Correct the RX error definition in RX header
From: David Miller @ 2013-10-22 19:44 UTC (permalink / raw)
  To: freddy; +Cc: linux-usb, linux-kernel, netdev, allan, louis
In-Reply-To: <1382427131-2429-1-git-send-email-freddy@asix.com.tw>

From: freddy@asix.com.tw
Date: Tue, 22 Oct 2013 15:32:10 +0800

> From: Freddy Xin <freddy@asix.com.tw>
> 
> Correct the definition of AX_RXHDR_CRC_ERR and
> AX_RXHDR_DROP_ERR. They are BIT29 and BIT31 in pkt_hdr
> respectively.
> 
> Signed-off-by: Freddy Xin <freddy@asix.com.tw>

Applied.
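
As a rough illustration of the bit positions named in the commit message, a
minimal sketch. Only the two bit numbers come from the message above; the
rx_pkt_has_error() helper and the use of plain stdint types are assumptions
made for the example, not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the commit message: BIT29 and BIT31. */
#define AX_RXHDR_CRC_ERR	(UINT32_C(1) << 29)
#define AX_RXHDR_DROP_ERR	(UINT32_C(1) << 31)

/* Hypothetical helper: true if either error bit is set in pkt_hdr. */
static bool rx_pkt_has_error(uint32_t pkt_hdr)
{
	return (pkt_hdr & (AX_RXHDR_CRC_ERR | AX_RXHDR_DROP_ERR)) != 0;
}
```

Bit 30 is deliberately not part of the mask here, so a header with only
that bit set is not treated as an error in this sketch.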

^ permalink raw reply

* Re: [PATCH 2/2] [PATCH] ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
From: David Miller @ 2013-10-22 19:44 UTC (permalink / raw)
  To: freddy; +Cc: linux-usb, linux-kernel, netdev, allan, louis
In-Reply-To: <1382427131-2429-2-git-send-email-freddy@asix.com.tw>

From: freddy@asix.com.tw
Date: Tue, 22 Oct 2013 15:32:11 +0800

> From: Freddy Xin <freddy@asix.com.tw>
> 
> Add VID:DID for Samsung USB Ethernet Adapter.
> 
> Signed-off-by: Freddy Xin <freddy@asix.com.tw>

Applied.

^ permalink raw reply

* Re: Big performance loss from 3.4.63 to 3.10.13 when routing ipv4
From: David Miller @ 2013-10-22 19:46 UTC (permalink / raw)
  To: linux; +Cc: hannes, netdev, klassert
In-Reply-To: <3928724.bDaZagpRh6@h2o.as.studentenwerk.mhn.de>

From: Wolfgang Walter <linux@stwm.de>
Date: Tue, 22 Oct 2013 21:07:41 +0200

> On Wednesday, 2 October 2013, 00:20:02, Hannes Frederic Sowa wrote:
>> On Tue, Oct 01, 2013 at 06:39:32PM +0200, Wolfgang Walter wrote:
>> > All network traffic over the router becomes slow and sluggish. If one pings
>> > the router there is a packet loss. After about 2 minutes the traffic
>> > completely stalls for about 1 minute. Then it works again as in the
>> > beginning to then stall again. And so on.
>> 
>> Maybe dropwatch can give a first hint?
>> 
> 
> I finally found the problem:
> 
> In 3.10.x and 3.11.x the value of /proc/sys/net/ipv4/xfrm4_gc_thresh is 1024.
> 
> It is much higher in 3.4.x. If I increase this value in 3.10.x to the one I 
> see on 3.4.x all works fine with 3.10.x

Steffen, here is yet another report about this issue.

I think we should resolve this soon. Even bumping it to 2048 or 4096
and leaving it at that would, I think, be acceptable.
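
For anyone who wants to check or raise the threshold from a program rather
than the shell, a minimal sketch, assuming the sysctl file holds a single
decimal integer. read_sysctl_long() and write_sysctl_long() are made-up
helper names, and writing under /proc/sys needs root.

```c
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
static int read_sysctl_long(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Write a new integer value to a sysctl-style file.
 * Returns 0 on success, -1 on failure.
 */
static int write_sysctl_long(const char *path, long value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%ld\n", value) > 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Called with "/proc/sys/net/ipv4/xfrm4_gc_thresh" as the path, the first
helper reports the current value and the second one sets a new value such
as 4096.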

Thanks.

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: David Miller @ 2013-10-22 19:46 UTC (permalink / raw)
  To: dcbw; +Cc: f.fainelli, netdev, s.hauer, nbd, blogic, jogo, gary
In-Reply-To: <1382469746.19269.55.camel@dcbw.foobar.com>

From: Dan Williams <dcbw@redhat.com>
Date: Tue, 22 Oct 2013 14:22:26 -0500

> On Tue, 2013-10-22 at 11:23 -0700, Florian Fainelli wrote:
>> This patch adds an Ethernet Switch generic netlink configuration API
>> which allows for doing the required configuration of managed Ethernet
>> switches commonly found in Wireless/Cable/DSL routers in the market.
> 
> "swconfig" probably means "switch config", but is there any way to
> rename this away from the "sw" prefix, since "sw" typically means
> "software" and not "switch"?

Agreed.

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: David Miller @ 2013-10-22 19:47 UTC (permalink / raw)
  To: f.fainelli; +Cc: dcbw, netdev, s.hauer, nbd, blogic, jogo, gary
In-Reply-To: <CAGVrzcYCK3QBMgHnZ2T=wzeckX-enwJOu7DEcKHg3s=ABivLDQ@mail.gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Tue, 22 Oct 2013 12:32:29 -0700

> 2013/10/22 Dan Williams <dcbw@redhat.com>:
>> On Tue, 2013-10-22 at 11:23 -0700, Florian Fainelli wrote:
>>> This patch adds an Ethernet Switch generic netlink configuration API
>>> which allows for doing the required configuration of managed Ethernet
>>> switches commonly found in Wireless/Cable/DSL routers in the market.
>>
>> "swconfig" probably means "switch config", but is there any way to
>> rename this away from the "sw" prefix, since "sw" typically means
>> "software" and not "switch"?
> 
> Sure, how about something like "enetsw"? I would like to avoid using
> "switch" too much since this is a C reserved keyword.

"swtch"? :-)

^ permalink raw reply

* Re: [net-next v2 00/14][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2013-10-22 19:53 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1382451757-9817-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 22 Oct 2013 07:22:23 -0700

> This series contains updates to i40e only.
> 
> Jesse provides 6 patches against i40e.  First is a patch to reduce
> CPU utilization by reducing read-flush to read in the hot path.  Next
> couple of patches resolve coverity issues reported by Hannes Frederic
> Sowa <hannes@stressinduktion.org>.  Then Jesse refactored i40e to clean up
> functions which used cpu_to_xxx(foo) which caused a lot of line wrapping.
> 
> Mitch provides 2 i40e patches.  First fixes a panic when tx_rings[0]
> are not allocated, his second patch corrects a math error when
> assigning MSI-X vectors to VFs.  The vectors-per-vf value reported
> by the hardware already conveniently reports one less than the actual
> value.
> 
> Shannon provides 5 patches against i40e.  His first patch corrects a
> number of little bugs in the error handling of irq setup, most of
> which ended up panicking the kernel.  Next he fixes the overactive
> IRQ issue seen in testing and allows the use of the legacy interrupt.
> Shannon then provides a cleanup of the arguments declared at the
> beginning of each function.  Then he provides a patch to make sure
> that there are really rings and queues before trying to dump
> information in them.  Lastly he simplifies the code by using an
> already existing variable.
> 
> Catherine provides an i40e patch to bump the version.

Pulled, thanks Jeff.

Just a note for the future, and I decided not to push back this time when
I saw it in this series.  When you have a construct like:

	if (x)
		for( ... ) {
		}

Put the top-level condition in braces too as it's much easier to read
and audit:

	if (x) {
		for ( ... ) {
		}
	}

Thanks.

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: John Fastabend @ 2013-10-22 19:53 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, s.hauer, nbd, blogic, jogo, gary, Jamal Hadi Salim,
	Neil Horman
In-Reply-To: <1382466229-15123-2-git-send-email-f.fainelli@gmail.com>

On 10/22/2013 11:23 AM, Florian Fainelli wrote:
> This patch adds an Ethernet Switch generic netlink configuration API
> which allows for doing the required configuration of managed Ethernet
> switches commonly found in Wireless/Cable/DSL routers in the market.
>
> Since this API is based on the Generic Netlink infrastructure it is very
> easy to extend a particular switch driver to support additional features
> and to adapt it to specific switches.
>

> So far the API includes support for:
>
> - getting/setting a port VLAN id
> - getting/setting VLAN port membership
> - getting a port link status
> - getting a port statistics counters
> - resetting a switch device
> - applying a configuration to a switch device
>

Did you consider exposing each physical switch port as a netdevice on
the host? I would assume your switch driver could do this.

Then you can drop the port specific attributes (link status, stats, etc)
and use existing interfaces. The win being my tools work equally well on
your real switch as they do on my software switch. Also by exposing net
devices you provide a mechanism to send packets over the port and trap
control packets.

Next instead of creating a switch specific netlink API could you use
the existing FDB API? Again what I would like is for my existing
applications to run on the switch without having to rewrite them. For
example it would be great to have 'bridge fdb show dev myswitch' report
the correct tables for both the Sw bridge, a real switch bridge, and
for the embedded SR-IOV bridge case.

I added Jamal and Neil because I think I remember talking about similar
ideas with them before.

Thanks,
.John

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: Florian Fainelli @ 2013-10-22 19:59 UTC (permalink / raw)
  To: John Fastabend
  Cc: netdev, David Miller, Sascha Hauer, Felix Fietkau, John Crispin,
	Jonas Gorski, Gary Thomas, Jamal Hadi Salim, Neil Horman
In-Reply-To: <5266D7D6.9000309@intel.com>

2013/10/22 John Fastabend <john.r.fastabend@intel.com>:
> On 10/22/2013 11:23 AM, Florian Fainelli wrote:
>>
>> This patch adds an Ethernet Switch generic netlink configuration API
>> which allows for doing the required configuration of managed Ethernet
>> switches commonly found in Wireless/Cable/DSL routers in the market.
>>
>> Since this API is based on the Generic Netlink infrastructure it is very
>> easy to extend a particular switch driver to support additional features
>> and to adapt it to specific switches.
>>
>
>> So far the API includes support for:
>>
>> - getting/setting a port VLAN id
>> - getting/setting VLAN port membership
>> - getting a port link status
>> - getting a port statistics counters
>> - resetting a switch device
>> - applying a configuration to a switch device
>>
>
> Did you consider exposing each physical switch port as a netdevice on
> the host? I would assume your switch driver could do this.
>
> Then you can drop the port specific attributes (link status, stats, etc)
> and use existing interfaces. The win being my tools work equally well on
> your real switch as they do on my software switch. Also by exposing net
> devices you provide a mechanism to send packets over the port and trap
> control packets.

Well this is exactly what DSA does and which I do not like because it
is completely overkill for most switches out there which are using
802.1q tags and do not prepend/append proprietary tags for internal
traffic classification.

>
> Next instead of creating a switch specific netlink API could you use
> the existing FDB API? Again what I would like is for my existing
> applications to run on the switch without having to rewrite them. For
> example it would be great to have 'bridge fdb show dev myswitch' report
> the correct tables for both the Sw bridge, a real switch bridge, and
> for the embedded SR-IOV bridge case.

Ok, I know nothing about the FDB API, but will take a look and see if
that sounds suitable for the embedded use cases.

>
> I added Jamal and Neil because I think I remember talking about similar
> ideas with them before.

Thanks!
-- 
Florian

^ permalink raw reply

* Re: [net-next v2 00/14][pull request] Intel Wired LAN Driver Updates
From: Jeff Kirsher @ 2013-10-22 20:00 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, gospo, sassmann
In-Reply-To: <20131022.155336.2066350912394604583.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 2010 bytes --]

On Tue, 2013-10-22 at 15:53 -0400, David Miller wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Tue, 22 Oct 2013 07:22:23 -0700
> 
> > This series contains updates to i40e only.
> > 
> > Jesse provides 6 patches against i40e.  First is a patch to reduce
> > CPU utilization by reducing read-flush to read in the hot path.  Next
> > couple of patches resolve coverity issues reported by Hannes Frederic
> > Sowa <hannes@stressinduktion.org>.  Then Jesse refactored i40e to cleanup
> > functions which used cpu_to_xxx(foo) which caused a lot of line wrapping.
> > 
> > Mitch provides 2 i40e patches.  First fixes a panic when tx_rings[0]
> > are not allocated, his second patch corrects a math error when
> > assigning MSI-X vectors to VFs.  The vectors-per-vf value reported
> > by the hardware already conveniently reports one less than the actual
> > value.
> > 
> > Shannon provides 5 patches against i40e.  His first patch corrects a
> > number of little bugs in the error handling of irq setup, most of
> > which ended up panicing the kernel.  Next he fixes the overactive
> > IRQ issue seen in testing and allows the use of the legacy interrupt.
> > Shannon then provides a cleanup of the arguments declared at the
> > beginning of each function.  Then he provides a patch to make sure
> > that there are really rings and queues before trying to dump
> > information in them.  Lastly he simplifies the code by using an
> > already existing variable.
> > 
> > Catherine provides an i40e patch to bump the version.
> 
> Pulled, thanks Jeff.
> 
> Just a note for the future, and I decided not to push back this time when
> I saw it in this series.  When you have a construct like:
> 
> 	if (x)
> 		for( ... ) {
> 		}
> 
> Put the top-level condition in braces too as it's much easier to read
> and audit:
> 
> 	if (x) {
> 		for ( ... ) {
> 		}
> 	}
> 
> Thanks.

I agree, I will add that to my list of checks for the future.  Thanks
Dave.

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: Neil Horman @ 2013-10-22 20:25 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: John Fastabend, netdev, David Miller, Sascha Hauer, Felix Fietkau,
	John Crispin, Jonas Gorski, Gary Thomas, Jamal Hadi Salim
In-Reply-To: <CAGVrzcZS4SGiUFoCnhau1Fvv9F4ffS7tA+PT4r-qLCj8HV8BqQ@mail.gmail.com>

On Tue, Oct 22, 2013 at 12:59:12PM -0700, Florian Fainelli wrote:
> 2013/10/22 John Fastabend <john.r.fastabend@intel.com>:
> > On 10/22/2013 11:23 AM, Florian Fainelli wrote:
> >>
> >> This patch adds an Ethernet Switch generic netlink configuration API
> >> which allows for doing the required configuration of managed Ethernet
> >> switches commonly found in Wireless/Cable/DSL routers in the market.
> >>
> >> Since this API is based on the Generic Netlink infrastructure it is very
> >> easy to extend a particular switch driver to support additional features
> >> and to adapt it to specific switches.
> >>
> >
> >> So far the API includes support for:
> >>
> >> - getting/setting a port VLAN id
> >> - getting/setting VLAN port membership
> >> - getting a port link status
> >> - getting a port statistics counters
> >> - resetting a switch device
> >> - applying a configuration to a switch device
> >>
> >
> > Did you consider exposing each physical switch port as a netdevice on
> > the host? I would assume your switch driver could do this.
> >
> > Then you can drop the port specific attributes (link status, stats, etc)
> > and use existing interfaces. The win being my tools work equally well on
> > your real switch as they do on my software switch. Also by exposing net
> > devices you provide a mechanism to send packets over the port and trap
> > control packets.
> 
> Well this is exactly what DSA does and which I do not like because it
> is completely overkill for most switches out there which are using
> 802.1q tags and do not prepend/append proprietary tags for internal
> traffic classification.
> 
> >
> > Next instead of creating a switch specific netlink API could you use
> > the existing FDB API? Again what I would like is for my existing
> > applications to run on the switch without having to rewrite them. For
> > example it would be great to have 'bridge fdb show dev myswitch' report
> > the correct tables for both the Sw bridge, a real switch bridge, and
> > for the embedded SR-IOV bridge case.
> 
> Ok, I know nothing about the FDB API, but will take a look and see if
> that sounds suitable for the embedded use cases.
> 
Further to John's comments, why are you creating a new netlink protocol for this?
It seems that 90% of what you want to accomplish above is handled by rtnetlink.
As long as you write your driver properly, most of that should "just work".

Neil

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2013-10-22 20:36 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


Sorry I let so much accumulate, I was in Buffalo and wanted a few
things to cook in my tree for a while before sending to you.  Anyways,
it's a lot of little things as usual at this stage in the game.

1) Make bonding MAINTAINERS entry reflect reality, from Andy Gospodarek.

2) Fix accidental sock_put() on timewait mini sockets, from Eric
   Dumazet.

3) Fix crashes in l2tp due to mis-handling of ipv4 mapped ipv6 addresses,
   from François CACHEREUL.

4) Fix heap overflow in __audit_sockaddr(), from the eagle eyed Dan
   Carpenter.

5) tcp_shifted_skb() doesn't handle FINs properly, from Eric
   Dumazet.

6) SFC driver bug fixes from Ben Hutchings.

7) Fix TX packet scheduling wedge after channel change in ath9k driver,
   from Felix Fietkau.

8) Fix use after free in BPF JIT code, from Alexei Starovoitov.

9) Source address selection test is reversed in __ip_route_output_key(),
   fix from Jiri Benc.

10) VLAN and CAN layer mis-size netlink attributes, from Marc
    Kleine-Budde.

11) Fix permission checks in sysctls to use current_euid() instead of
    current_uid().  From Eric W. Biederman.

12) IPSEC policies can go away while a timer is still pending for them,
    add appropriate ref-counting to fix, from Steffen Klassert.

13) Fix mis-programming of FDR and RMCR registers on R8A7740 sh_eth
    chips, from Nguyen Hong Ky and Simon Horman.

14) MLX4 forgets to DMA unmap pages on RX, fix from Amir Vadai.

15) IPV6 GRE tunnel MTU upper limit is miscalculated, from Oussama
    Ghorbel.

16) Fix typo in fq_change(), we were assigning "initial quantum" to
    "quantum".  From Eric Dumazet.

17) Set a more appropriate sk_pacing_rate for non-TCP sockets, otherwise
    FQ packet scheduler does not pace those flows properly.  Also from
    Eric Dumazet.

18) rtlwifi miscalculates packet pointers, from Mark Cave-Ayland.

19) l2tp_xmit_skb() can be called from process context, not just
    softirq context, so we must always make sure to BH disable
    around it.  From Eric Dumazet.

20) On qdisc reset, we forget to purge the RB tree of SKBs in netem
    packet scheduler.  From Stephen Hemminger.

21) Fix info leak in farsync WAN driver ioctl() handler, from Dan
    Carpenter and Salva Peiró.

22) Fix PHY reset and other issues in dm9000 driver, from Nikita
    Kiryanov and Michael Abbott.

23) When hardware can do SCTP crc32 checksums, we accidentally don't
    disable the csum offload when IPSEC transformations have been
    applied.  From Fan Du and Vlad Yasevich.

24) Tail loss probing in TCP leaves the socket in the wrong congestion
    avoidance state.  From Yuchung Cheng.

25) In CPSW driver, enable NAPI before interrupts are turned on, from
    Markus Pargmann.

26) Integer underflow and dual-assignment in YAM hamradio driver, from
    Dan Carpenter.

27) If we are going to mangle a packet in tcp_set_skb_tso_segs() we must
    unclone it.  This fixes various hard to track down crashes in drivers
    where the SKB's ->gso_segs was changing right from underneath the
    driver during TX queueing.  From Eric Dumazet.

28) Fix the handling of VLAN IDs, and in particular the special IDs 0 and
    4095, in the bridging layer.  From Toshiaki Makita.

29) Another info leak, this time in wanxl WAN driver, from Salva Peiró.

30) Fix race in socket credential passing, from Daniel Borkmann.

31) When NETLABEL is disabled, we don't validate CIPSO packets properly,
    from Seif Mazareeb.

32) Fix identification of fragmented frames in ipv4/ipv6 UDP
    Fragmentation Offload output paths, from Jiri Pirko.

33) Virtual Function fixes in bnx2x driver from Yuval Mintz and Ariel
    Elior.

34) When we removed the explicit neighbour pointer from ipv6 routes
    a slight regression was introduced for users such as IPVS, xt_TEE,
    and raw sockets.  We mix up the user's requested destination address
    with the route's assigned nexthop/gateway.  From Julian Anastasov
    and Simon Horman.

35) Fix stack overruns in rt6_probe(), the issue is that we can end up
    doing two full packet xmit paths at the same time when emitting
    neighbour discovery messages.  From Hannes Frederic Sowa.

36) davinci_emac driver doesn't handle IFF_ALLMULTI correctly, from
    Mariusz Ceier.

37) Make sure to set TCP sk_pacing_rate after the first legitimate
    RTT sample, from Neal Cardwell.

38) Wrong netlink attribute passed to xfrm_replay_verify_len(), from
    Steffen Klassert.
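
As background for item 28: an 802.1Q VLAN ID is the low 12 bits of the tag
control field, with 0 meaning "priority-tagged, no VLAN" and 4095 (0xfff)
reserved. A minimal sketch of the kind of range check involved follows. The
macro and function names are made up for illustration and are not the ones
used in net/bridge/br_vlan.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define VID_MASK     0x0fffu	/* the VID is the low 12 bits of the TCI */
#define VID_NONE     0x000u	/* priority-tagged frame, no VLAN assigned */
#define VID_RESERVED 0xfffu	/* reserved by 802.1Q */

/* Extract the VLAN ID from a 16-bit tag control field. */
static uint16_t tci_to_vid(uint16_t tci)
{
	return tci & VID_MASK;
}

/* Only VIDs 1..4094 may be configured as filter entries. */
static bool vid_is_configurable(uint16_t vid)
{
	return vid != VID_NONE && vid < VID_RESERVED;
}
```

A frame carrying VID 0 would then be classified by the port's PVID rather
than matched against a filter entry for VLAN 0.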

Please pull, thanks a lot!

The following changes since commit c31eeaced22ce8bd61268a3c595d542bb38c0a4f:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2013-10-01 12:58:48 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net master

for you to fetch changes up to f11a5bc148a313ad37c361c87c9aff2331a8b149:

  ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter (2013-10-22 15:43:43 -0400)

----------------------------------------------------------------
Alan Ott (5):
      6lowpan: Only make 6lowpan links to IEEE802154 devices
      6lowpan: Sync default hardware address of lowpan links to their wpan
      mrf24j40: Move INIT_COMPLETION() to before packet transmission
      mrf24j40: Use threaded IRQ handler
      mrf24j40: Use level-triggered interrupts

Alexander Bondar (1):
      iwlwifi: mvm: Disable uAPSD for D3 image

Alexandre Belloni (1):
      mac802154: correct a typo in ieee802154_alloc_device() prototype

Alexei Starovoitov (1):
      net: fix unsafe set_memory_rw from softirq

Amir Vadai (2):
      net/mlx4_en: Rename name of mlx4_en_rx_alloc members
      net/mlx4_en: Fix pages never dma unmapped on rx

Amitkumar Karwar (1):
      mwifiex: fix SDIO interrupt lost issue

Andi Kleen (2):
      igb: Avoid uninitialized advertised variable in eee_set_cur
      tcp: Always set options to 0 before calling tcp_established_options

Andy Gospodarek (1):
      bonding: update MAINTAINERS

Ariel Elior (3):
      bnx2x: Unlock VF-PF channel on MAC/VLAN config error
      bnx2x: Fix config when SR-IOV and iSCSI are enabled
      bnx2x: Lock DMAE when used by statistic flow

Avinash Patil (2):
      mwifiex: inform cfg80211 about disconnect if device is removed
      mwifiex: inform cfg80211 about disconnect for P2P client interface

Ben Hutchings (1):
      sfc: Only bind to EF10 functions with the LinkCtrl and Trusted flags

Bruno Randolf (1):
      cfg80211: fix warning when using WEXT for IBSS

Christophe Gouault (1):
      vti: get rid of nf mark rule in prerouting

Chun-Yeow Yeoh (1):
      mac80211: fix the setting of extended supported rate IE

Claudiu Manoil (3):
      gianfar: Enable eTSEC-A002 erratum w/a for all parts
      gianfar: Use mpc85xx support for errata detection
      gianfar: Enable eTSEC-20 erratum w/a for P2020 Rev1

Dan Carpenter (3):
      net: heap overflow in __audit_sockaddr()
      yam: integer underflow in yam_ioctl()
      yam: remove a no-op in yam_ioctl()

Daniel Borkmann (1):
      net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race

David S. Miller (20):
      Merge branch 'connector'
      Merge branch 'calxedaxgmac'
      Merge branch 'mv643xx'
      Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
      Merge branch 'for-davem' of git://git.kernel.org/.../linville/wireless
      Merge branch '6lowpan'
      Merge branch 'mrf24j40'
      l2tp: Fix build warning with ipv6 disabled.
      Merge branch 'mlx4'
      Merge branch 'sfc-3.12' of git://git.kernel.org/.../bwh/sfc
      Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
      Merge branch 'gianfar'
      Merge branch 'for-davem' of git://git.kernel.org/.../linville/wireless
      Merge branch 'dm9000'
      Merge branch 'sctp_csum'
      Merge branch 'for-davem' of git://git.kernel.org/.../linville/wireless
      Merge branch 'bridge_pvid'
      Merge branch 'ufo_fixes'
      Merge branch 'bnx2x'
      Merge branch 'rt6i_gateway'

David Vrabel (1):
      xen-netback: transition to CLOSED when removing a VIF

Dmitry Kravkov (2):
      bnx2x: Fix Coalescing configuration
      bnx2x: Don't pretend during register dump

Edward Cree (3):
      sfc: Fix internal indices of ethtool stats for EF10
      sfc: Refactor EF10 stat mask code to allow for more conditional stats
      sfc: Add PM and RXDP drop counters to ethtool stats

Emmanuel Grumbach (6):
      iwlwifi: pcie: don't reset the TX queue counter
      iwlwifi: don't WARN on host commands sent when firmware is dead
      iwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series
      iwlwifi: mvm: call ieee80211_scan_completed when needed
      mac80211: correctly close cancelled scans
      cfg80211: don't add p2p device while in RFKILL

Enrico Mioso (1):
      net: qmi_wwan: Olivetti Olicard 200 support

Eric Dumazet (8):
      net: do not call sock_put() on TIMEWAIT sockets
      tcp: do not forget FIN in tcp_shifted_skb()
      pkt_sched: fq: fix typo for initial_quantum
      pkt_sched: fq: fix non TCP flows pacing
      l2tp: must disable bh before calling l2tp_xmit_skb()
      bnx2x: record rx queue for LRO packets
      tcp: must unclone packets before mangling them
      tcp: remove the sk_can_gso() check from tcp_set_skb_tso_segs()

Eric W. Biederman (1):
      net: Update the sysctl permissions handler to test effective uid/gid

Fabio Estevam (1):
      net: secure_seq: Fix warning when CONFIG_IPV6 and CONFIG_INET are not selected

Fan Du (2):
      xfrm: Guard IPsec anti replay window against replay bitmap
      sctp: Use software crc32 checksum when xfrm transform will happen.

Felix Fietkau (5):
      mac80211: drop spoofed packets in ad-hoc mode
      mac80211: use sta_info_get_bss() for nl80211 tx and client probing
      mac80211: update sta->last_rx on acked tx frames
      ath9k: fix powersave response handling for BA session packets
      ath9k: fix tx queue scheduling after channel changes

François Cachereul (1):
      l2tp: fix kernel panic when using IPv4-mapped IPv6 addresses

Freddy Xin (2):
      ax88179_178a: Correct the RX error definition in RX header
      ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter

Hannes Frederic Sowa (1):
      ipv6: probe routes asynchronous in rt6_probe

Himanshu Madhani (1):
      qlcnic: Validate Tx queue only for 82xx adapters.

Jason Wang (2):
      virtio-net: don't respond to cpu hotplug notifier if we're not ready
      virtio-net: refill only when device is up during setting queues

Jiri Benc (1):
      ipv4: fix ineffective source address selection

Jiri Pirko (3):
      udp6: respect IPV6_DONTFRAG sockopt in case there are pending frames
      ip6_output: do skb ufo init for peeked non ufo skb as well
      ip_output: do skb ufo init for peeked non ufo skb as well

Johannes Berg (4):
      cfg80211: fix sysfs registration race
      iwlwifi: pcie: fix merge damage
      wireless: radiotap: fix parsing buffer overrun
      mac80211: fix crash if bitrate calculation goes wrong

John W. Linville (7):
      Merge branch 'for-john' of git://git.kernel.org/.../jberg/mac80211
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem
      Merge branch 'for-john' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem
      Merge branch 'for-john' of git://git.kernel.org/.../jberg/mac80211
      Merge branch 'for-john' of git://git.kernel.org/.../jberg/mac80211
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem

Jon Cooper (1):
      sfc: Add rmb() between reading stats and generation count to ensure consistency

Jouni Malinen (1):
      mac80211: Run deferred scan if last roc_list item is not started

Julian Anastasov (3):
      ipv6: always prefer rt6i_gateway if present
      ipv6: fill rt6i_gateway with nexthop address
      netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper

Linus Lüssing (1):
      Revert "bridge: only expire the mdb entry when query is received"

Luciano Coelho (1):
      cfg80211: use the correct macro to check for active monitor support

Marc Kleine-Budde (5):
      can: dev: fix nlmsg size calculation in can_get_size()
      net: vlan: fix nlmsg size calculation in vlan_get_size()
      can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX
      can: flexcan: fix mx28 detection by rearanging OF match table
      can: at91-can: fix device to driver data mapping for platform devices

Mariusz Ceier (1):
      davinci_emac.c: Fix IFF_ALLMULTI setup

Mark Cave-Ayland (1):
      rtlwifi: rtl8192cu: Fix error in pointer arithmetic

Markus Pargmann (3):
      net: ethernet: cpsw: Search childs for slave nodes
      net/ethernet: cpsw: DT read bool dual_emac
      net/ethernet: cpsw: Bugfix interrupts before enabling napi

Mathias Krause (5):
      proc connector: fix info leaks
      connector: use nlmsg_len() to check message length
      connector: use 'size' everywhere in cn_netlink_send()
      connector - documentation: simplify netlink message length assignment
      unix_diag: fix info leak

Matthew Slattery (1):
      sfc: Add definitions for new stats counters and capability flag

Matthias Schiffer (1):
      batman-adv: set up network coding packet handlers during module init

Matti Gottlieb (1):
      iwlwifi: pcie: add new SKUs for 7000 & 3160 NIC series

Merav Sicron (1):
      bnx2x: Set NETIF_F_HIGHDMA unconditionally

Michael Abbott (1):
      dm9000: Implement full reset of DM9000 network device

Michael S. Tsirkin (2):
      netif_set_xps_queue: make cpu mask const
      tun: don't look at current when non-blocking

Mugunthan V N (1):
      drivers: net: cpsw: fix kernel warn during iperf test with interrupt pacing

Neal Cardwell (1):
      tcp: initialize passive-side sk_pacing_rate after 3WHS

Nguyen Hong Ky (1):
      net: sh_eth: Fix RX packets errors on R8A7740

Nikita Kiryanov (3):
      dm9000: during init reset phy only for dm9000b
      dm9000: take phy out of reset during init
      dm9000: report the correct LPA

Oussama Ghorbel (3):
      ipv6: Allow the MTU of ipip6 tunnel to be set below 1280
      ipv6: Fix the upper MTU limit in GRE tunnel
      ipv6: Initialize ip6_tnl.hlen in gre tunnel even if no route is found

Rob Herring (3):
      net: calxedaxgmac: fix clearing of old filter addresses
      net: calxedaxgmac: add uc and mc filter addresses in promiscuous mode
      net: calxedaxgmac: determine number of address filters at runtime

Salva Peiró (2):
      farsync: fix info leak in ioctl
      wanxl: fix info leak in ioctl

Sebastian Hesselbarth (3):
      net: mv643xx_eth: update statistics timer from timer context only
      net: mv643xx_eth: fix orphaned statistics timer crash
      net: mv643xx_eth: fix missing device_node for port devices

Seif Mazareeb (1):
      net: fix cipso packet validation when !NETLABEL

Simon Horman (1):
      net: sh_eth: Correct fix for RX packet errors on R8A7740

Solomon Peachy (1):
      wireless: cw1200: acquire hwbus lock around cw1200_irq_handler() call.

Stanislaw Gruszka (1):
      Revert "rt2x00pci: Use PCI MSIs whenever possible"

Steffen Klassert (5):
      xfrm: Fix replay size checking on async events
      xfrm: Decode sessions with output interface.
      ipsec: Don't update the pmtu on ICMPV6_DEST_UNREACH
      xfrm: Add refcount handling to queued policies
      xfrm: check for a vaild skb in xfrm_policy_queue_process

Thomas Egerer (1):
      xfrm: Fix aevent generation for each received packet

Toshiaki Makita (4):
      bridge: Don't use VID 0 and 4095 in vlan filtering
      bridge: Apply the PVID to priority-tagged frames
      bridge: Fix the way the PVID is referenced
      bridge: Fix updating FDB entries when the PVID is applied

Vasundhara Volam (1):
      be2net: pass if_id for v1 and V2 versions of TX_CREATE cmd

Vlad Yasevich (4):
      bridge: update mdb expiration timer upon reports.
      net: dst: provide accessor function to dst->xfrm
      sctp: Perform software checksum if packet has to be fragmented.
      bridge: Correctly clamp MAX forward_delay when enabling STP

Wei Yongjun (3):
      moxa: fix the error handling in moxart_mac_probe()
      qlcnic: add missing destroy_workqueue() on error path in qlcnic_probe()
      usbnet: fix error return code in usbnet_probe()

Will Deacon (1):
      net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data

Yuchung Cheng (1):
      tcp: fix incorrect ca_state in tail loss probe

Yuval Mintz (3):
      bnx2x: Fix Maximum CoS estimation for VFs
      bnx2x: Prevent an illegal pointer dereference during panic
      bnx2x: Prevent null pointer dereference on error flow

stephen hemminger (3):
      tc: export tc_defact.h to userspace
      netem: update backlog after drop
      netem: free skb's in tree on reset

 Documentation/connector/ucon.c                      |   2 +-
 MAINTAINERS                                         |   1 +
 arch/arm/net/bpf_jit_32.c                           |   1 +
 arch/powerpc/net/bpf_jit_comp.c                     |   1 +
 arch/s390/net/bpf_jit_comp.c                        |   4 +-
 arch/sparc/net/bpf_jit_comp.c                       |   1 +
 arch/x86/net/bpf_jit_comp.c                         |  18 ++-
 drivers/connector/cn_proc.c                         |  18 +++
 drivers/connector/connector.c                       |   9 +-
 drivers/net/can/at91_can.c                          |   4 +-
 drivers/net/can/dev.c                               |  10 +-
 drivers/net/can/flexcan.c                           |  14 ++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h         |  15 ++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c     |   1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c |  40 +-----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_init.h    |  38 ++++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c    | 388 ++++++++++++++++++++++++++++++++---------------------------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c   |  29 +++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c   |   2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c    |   2 +-
 drivers/net/ethernet/calxeda/xgmac.c                |  23 ++--
 drivers/net/ethernet/davicom/dm9000.c               |  56 ++++++---
 drivers/net/ethernet/emulex/benet/be_cmds.c         |   3 +-
 drivers/net/ethernet/freescale/gianfar.c            |  38 ++++--
 drivers/net/ethernet/intel/igb/igb_ethtool.c        |   2 +
 drivers/net/ethernet/marvell/mv643xx_eth.c          |   7 +-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c          |  41 ++++---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h        |   4 +-
 drivers/net/ethernet/moxa/moxart_ether.c            |  22 +++-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c |   2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c    |  13 +-
 drivers/net/ethernet/renesas/sh_eth.c               |   4 +
 drivers/net/ethernet/sfc/ef10.c                     |  87 ++++++++++----
 drivers/net/ethernet/sfc/mcdi.c                     |  18 ++-
 drivers/net/ethernet/sfc/mcdi_pcol.h                |  56 ++++++++-
 drivers/net/ethernet/sfc/nic.c                      |   9 +-
 drivers/net/ethernet/sfc/nic.h                      |  12 ++
 drivers/net/ethernet/smsc/smc91x.h                  |   6 +-
 drivers/net/ethernet/ti/cpsw.c                      |  19 ++-
 drivers/net/ethernet/ti/davinci_emac.c              |   3 +-
 drivers/net/hamradio/yam.c                          |   1 -
 drivers/net/ieee802154/mrf24j40.c                   |  31 ++---
 drivers/net/tun.c                                   |   8 +-
 drivers/net/usb/ax88179_178a.c                      |  23 +++-
 drivers/net/usb/qmi_wwan.c                          |   1 +
 drivers/net/usb/usbnet.c                            |   4 +-
 drivers/net/virtio_net.c                            |  14 ++-
 drivers/net/wan/farsync.c                           |   1 +
 drivers/net/wan/wanxl.c                             |   1 +
 drivers/net/wireless/ath/ath9k/main.c               |  23 ++--
 drivers/net/wireless/ath/ath9k/xmit.c               |   9 +-
 drivers/net/wireless/cw1200/cw1200_spi.c            |   2 +
 drivers/net/wireless/iwlwifi/iwl-6000.c             |   6 +
 drivers/net/wireless/iwlwifi/iwl-config.h           |   1 +
 drivers/net/wireless/iwlwifi/iwl-trans.h            |   6 +-
 drivers/net/wireless/iwlwifi/mvm/power.c            |   5 +-
 drivers/net/wireless/iwlwifi/mvm/scan.c             |  12 +-
 drivers/net/wireless/iwlwifi/pcie/drv.c             |  42 +++++++
 drivers/net/wireless/iwlwifi/pcie/trans.c           |   8 +-
 drivers/net/wireless/iwlwifi/pcie/tx.c              |   2 +
 drivers/net/wireless/mwifiex/join.c                 |  10 +-
 drivers/net/wireless/mwifiex/main.c                 |   6 +-
 drivers/net/wireless/mwifiex/sta_event.c            |   3 +-
 drivers/net/wireless/rt2x00/rt2x00pci.c             |   9 +-
 drivers/net/wireless/rtlwifi/rtl8192cu/trx.c        |   3 +-
 drivers/net/xen-netback/xenbus.c                    |   4 +
 include/linux/filter.h                              |  15 ++-
 include/linux/netdevice.h                           |   5 +-
 include/linux/yam.h                                 |   2 +-
 include/net/cipso_ipv4.h                            |   6 +-
 include/net/dst.h                                   |  12 ++
 include/net/ip6_route.h                             |   6 +-
 include/net/mac802154.h                             |   2 +-
 include/net/sock.h                                  |   6 +-
 include/uapi/linux/tc_act/Kbuild                    |   1 +
 include/{ => uapi}/linux/tc_act/tc_defact.h         |   2 +-
 net/8021q/vlan_netlink.c                            |   2 +-
 net/batman-adv/main.c                               |   5 +-
 net/batman-adv/network-coding.c                     |  28 +++--
 net/batman-adv/network-coding.h                     |  14 ++-
 net/bridge/br_fdb.c                                 |   4 +-
 net/bridge/br_mdb.c                                 |   2 +-
 net/bridge/br_multicast.c                           |  38 ++++--
 net/bridge/br_netlink.c                             |   2 +-
 net/bridge/br_private.h                             |   5 +-
 net/bridge/br_stp_if.c                              |   2 +-
 net/bridge/br_vlan.c                                | 125 ++++++++++---------
 net/compat.c                                        |   2 +
 net/core/dev.c                                      |   3 +-
 net/core/filter.c                                   |   8 +-
 net/core/secure_seq.c                               |   2 +
 net/core/sock.c                                     |   1 +
 net/ieee802154/6lowpan.c                            |   5 +
 net/ipv4/inet_hashtables.c                          |   2 +-
 net/ipv4/ip_output.c                                |  13 +-
 net/ipv4/ip_vti.c                                   |  14 ++-
 net/ipv4/route.c                                    |   2 +-
 net/ipv4/tcp_input.c                                |   9 +-
 net/ipv4/tcp_output.c                               |  14 ++-
 net/ipv4/xfrm4_policy.c                             |   1 +
 net/ipv6/ah6.c                                      |   3 +-
 net/ipv6/esp6.c                                     |   3 +-
 net/ipv6/inet6_hashtables.c                         |   2 +-
 net/ipv6/ip6_gre.c                                  |   6 +-
 net/ipv6/ip6_output.c                               |  29 +++--
 net/ipv6/ip6_tunnel.c                               |  12 +-
 net/ipv6/ipcomp6.c                                  |   3 +-
 net/ipv6/route.c                                    |  46 +++++--
 net/ipv6/udp.c                                      |   5 +-
 net/ipv6/xfrm6_policy.c                             |   1 +
 net/key/af_key.c                                    |   3 +-
 net/l2tp/l2tp_core.c                                |  36 ++++--
 net/l2tp/l2tp_core.h                                |   3 +
 net/l2tp/l2tp_ppp.c                                 |   4 +
 net/mac80211/cfg.c                                  |   2 +-
 net/mac80211/ieee80211_i.h                          |   3 +
 net/mac80211/offchannel.c                           |   2 +
 net/mac80211/rx.c                                   |   3 +
 net/mac80211/scan.c                                 |  19 +++
 net/mac80211/status.c                               |   3 +
 net/mac80211/tx.c                                   |   3 +-
 net/mac80211/util.c                                 |   9 +-
 net/netfilter/nf_conntrack_h323_main.c              |   4 +-
 net/sched/sch_fq.c                                  |  22 ++--
 net/sched/sch_netem.c                               |  17 +++
 net/sctp/output.c                                   |   3 +-
 net/socket.c                                        |  24 +++-
 net/sysctl_net.c                                    |   4 +-
 net/unix/af_unix.c                                  |  10 ++
 net/unix/diag.c                                     |   1 +
 net/wireless/core.c                                 |  23 ++--
 net/wireless/core.h                                 |   3 +
 net/wireless/ibss.c                                 |   3 +
 net/wireless/nl80211.c                              |   4 +-
 net/wireless/radiotap.c                             |   7 +-
 net/xfrm/xfrm_policy.c                              |  28 +++--
 net/xfrm/xfrm_replay.c                              |  54 +++++----
 net/xfrm/xfrm_user.c                                |   5 +-
 138 files changed, 1313 insertions(+), 717 deletions(-)
 rename include/{ => uapi}/linux/tc_act/tc_defact.h (75%)

^ permalink raw reply

* Re: [PATCH v2.44 1/5] odp: Allow VLAN actions after MPLS actions
From: Ben Pfaff @ 2013-10-22 20:55 UTC (permalink / raw)
  To: Joe Stringer
  Cc: Simon Horman, dev, netdev, Jesse Gross, Pravin B Shelar, Ravi K,
	Isaku Yamahata, Joe Stringer
In-Reply-To: <CAOftzPgSG7NO+hFEeHU0xGH5TjJRe-THGDVz7THrm26j5ua-bA@mail.gmail.com>

On Tue, Oct 22, 2013 at 11:30:26AM -0700, Joe Stringer wrote:
> You're quite right. I think for OF1.2, this is similar to existing
> behaviour, but for OF1.3 it's just incorrect. There is an additional issue
> with the LOAD action.
> 
> OF1.2:
> "push_vlan(A),push_mpls,push_vlan(B)"
> In OF1.2, this has the same result as:-
> "push_mpls,push_vlan(A),push_vlan(B)"
> 
> When translated, this boils down to "(pop_vlan,)push_mpls,push_vlan(B)".
> Correct me if I am wrong, but I think this is similar to the existing
> behaviour, as we currently only support one layer of VLAN. It doesn't
> matter if A == B.
> 
> OF1.3:
> "mod_vlan_vid:A,push_mpls:0x8847,mod_vlan_vid:A" should work, but the
> second mod_vlan is being dropped as it has the same VID as the first. This
> is incorrect, as you point out.
> 
> LOAD:
> "load:A->OXM_OF_VLAN_VID,push_mpls:0x8847,load:A->OXM_OF_VLAN_VID" doesn't
> result in the correct vlan_vid from the first load action. I'm not sure
> that vlan_tci_restore() is clear or correct---I believe its original
> purpose was to properly handle the case where MPLS changes are made from
> REG_LOAD and friends, in the way that the PUSH_MPLS case works. Instead, it
> is handling all cases where MPLS actions have been applied in the past,
> whether the current action modifies MPLS or not.
> 
> It's probably also worthwhile to make use of the ovs-ofctl monitor "-m"
> option in the tests to actually verify these, rather than the current tests
> where we just check the size of the resulting packet.

Thanks for the careful analysis.

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: David Miller @ 2013-10-22 21:22 UTC (permalink / raw)
  To: dcbw; +Cc: f.fainelli, netdev, s.hauer, nbd, blogic, jogo, gary
In-Reply-To: <1382477150.19269.69.camel@dcbw.foobar.com>

From: Dan Williams <dcbw@redhat.com>
Date: Tue, 22 Oct 2013 16:25:50 -0500

> On Tue, 2013-10-22 at 15:47 -0400, David Miller wrote:
>> From: Florian Fainelli <f.fainelli@gmail.com>
>> Date: Tue, 22 Oct 2013 12:32:29 -0700
>> 
>> > 2013/10/22 Dan Williams <dcbw@redhat.com>:
>> >> On Tue, 2013-10-22 at 11:23 -0700, Florian Fainelli wrote:
>> >>> This patch adds an Ethernet Switch generic netlink configuration API
>> >>> which allows for doing the required configuration of managed Ethernet
>> >>> switches commonly found in Wireless/Cable/DSL routers in the market.
>> >>
>> >> "swconfig" probably means "switch config", but is there any way to
>> >> rename this away from the "sw" prefix, since "sw" typically means
>> >> "software" and not "switch"?
>> > 
>> > Sure, how about something like "enetsw"? I would like to avoid using
>> > "switch" too much since this is a C reserved keyword.
>> 
>> "swtch"? :-)
> 
> haha...  seriously though, "enetsw" or even "esw" or "ensw" would be
> better than plain "sw".  Your choice, I have no horse in the race other
> than the "not sw" horse :)

"enetsw" is fine by me :-)

^ permalink raw reply

* [PATCH net-next] netem: markov loss model transition fix
From: Hagen Paul Pfeifer @ 2013-10-22 21:27 UTC (permalink / raw)
  To: netdev
  Cc: Hagen Paul Pfeifer, Stephen Hemminger, Eric Dumazet,
	Stefano Salsano, Fabio Ludovici

The transition from Markov state "3 => lost packets within a burst
period" to "1 => successfully transmitted packets within a gap period"
has no *additional* loss event. The loss already happened on the
transition from 1 -> 3; counting an additional loss here inflates the
overall loss rate.

E.g. transition probabilities:

p13:   10%
p31:  100%

Expected:

Ploss = p13 / (p13 + p31)
Ploss = ~9.09%

... but it isn't. Even worse: we get a double loss each time.
So simply don't return true to indicate loss; rather break and return
false.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Stefano Salsano <stefano.salsano@uniroma2.it>
Cc: Fabio Ludovici <fabio.ludovici@yahoo.it>
---
 net/sched/sch_netem.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index a6d788d..6bf9088 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -235,7 +235,6 @@ static bool loss_4state(struct netem_sched_data *q)
 			clg->state = 2;
 		else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
 			clg->state = 1;
-			return true;
 		} else if (clg->a2 + clg->a3 < rnd) {
 			clg->state = 3;
 			return true;
-- 
1.8.4.rc3
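
The double-loss effect described in the commit message can be checked
with a small calculation. The sketch below is not kernel code: it
reduces the model to states 1 and 3 only (assuming the transitions into
states 2 and 4 have probability zero), and the function name and the
count_loss_on_31 flag are hypothetical, introduced for illustration.

```c
/*
 * Two-state reduction of the netem 4-state loss model (illustrative only).
 *   state 1: gap period, packets are delivered
 *   state 3: burst period, packets are lost
 * p13 and p31 are the transition probabilities from the commit message.
 * count_loss_on_31 != 0 models the old behaviour (the packet that causes
 * the 3 -> 1 transition is also reported as lost); 0 models the patch.
 * Returns the long-run per-packet loss probability.
 */
double netem_two_state_loss(double p13, double p31, int count_loss_on_31)
{
	double sum = p13 + p31;
	double pi1, pi3, loss;

	if (sum <= 0.0)
		return 0.0;	/* chain never leaves state 1: no loss */

	/* Stationary distribution of the two-state Markov chain. */
	pi3 = p13 / sum;
	pi1 = p31 / sum;

	/* Packet that triggers 1 -> 3 is lost; packet that stays in 3 is lost. */
	loss = pi1 * p13 + pi3 * (1.0 - p31);

	/* Old code: the packet that triggers 3 -> 1 was reported lost too. */
	if (count_loss_on_31)
		loss += pi3 * p31;

	return loss;
}
```

With p13 = 0.10 and p31 = 1.00, passing count_loss_on_31 = 0 gives about
0.0909, matching Ploss = p13 / (p13 + p31); passing 1 gives about 0.1818,
i.e. twice the expected loss.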

^ permalink raw reply related

* -27% netperf TCP_STREAM regression by "tcp_memcontrol: Kill struct tcp_memcontrol"
From: fengguang.wu @ 2013-10-22 21:41 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, LKML

Hi Eric,

We noticed big netperf throughput regressions

    a4fe34bf902b8f709c63      2e685cad57906e19add7  
------------------------  ------------------------  
                  707.40       -40.7%       419.60  lkp-nex04/micro/netperf/120s-200%-TCP_STREAM
                 2775.60       -23.7%      2116.40  lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
                 3483.00       -27.2%      2536.00  TOTAL netperf.Throughput_Mbps

and bisected it to

commit 2e685cad57906e19add7189b5ff49dfb6aaa21d3
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Sat Oct 19 16:26:19 2013 -0700

    tcp_memcontrol: Kill struct tcp_memcontrol
    
    Replace the pointers in struct cg_proto with actual data fields and kill
    struct tcp_memcontrol as it is not fully redundant.
    
    This removes a confusing, unnecessary layer of abstraction.
    
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 a1896af98145c8ae2765a787845c43c9700c7dc0 02c93b50f66f1d1b34983bf3cc7e9a0dcc7105dc M	include
:040000 040000 ebe5d0619b54ddf730224f6581f595491eb36989 cd560b4a6e56cecac931814ba16420e167eb68f6 M	mm
:040000 040000 5df01f70484e07fbf98a7d5b8e0a53270777ac3d 1f8d1b340d8810a79691777f4e3ee529027b3c9b M	net
bisect run success

# bad: [aec2994e1799312822a30fefc27205e7360fe5af] Merge 'pwm/for-next' into devel-hourly-2013102222
# good: [31d141e3a666269a3b6fcccddb0351caf7454240] Linux 3.12-rc6
git bisect start 'aec2994e1799312822a30fefc27205e7360fe5af' '31d141e3a666269a3b6fcccddb0351caf7454240' '--'
# good: [ef26157747d42254453f6b3ac2bd8bd3c53339c3] batman-adv: tvlv - basic infrastructure
git bisect good ef26157747d42254453f6b3ac2bd8bd3c53339c3
# bad: [cc6a88faebab06b0323818cd102a6aae443cf34a] Merge 'netdev-next/master' into devel-hourly-2013102222
git bisect bad cc6a88faebab06b0323818cd102a6aae443cf34a
# good: [5cda73b68ebf7e08586d61e6777e64e12df23f07] Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
git bisect good 5cda73b68ebf7e08586d61e6777e64e12df23f07
# good: [a8fab0744585c1ab61009bfc1a1958f28e1c864f] x86/jump_label: expect default_nop if static_key gets enabled on boot-up
git bisect good a8fab0744585c1ab61009bfc1a1958f28e1c864f
# good: [21d35d212469c3138f8916f7e47b779313d79751] net: sky2: remove unnecessary pci_set_drvdata()
git bisect good 21d35d212469c3138f8916f7e47b779313d79751
# bad: [61c1db7fae21ed33c614356a43bf6580c5e53118] ipv6: sit: add GSO/TSO support
git bisect bad 61c1db7fae21ed33c614356a43bf6580c5e53118
# good: [a4fe34bf902b8f709c635ab37f1f39de0b86cff2] tcp_memcontrol: Remove the per netns control.
git bisect good a4fe34bf902b8f709c635ab37f1f39de0b86cff2
# bad: [1b66917d6b76db0abe1a1bbf86b2517ba8b91d98] cgxb4: remove duplicate include in cxgb4.h
git bisect bad 1b66917d6b76db0abe1a1bbf86b2517ba8b91d98
# bad: [0a6fa23dcb10eeb21adfd9955f7030f952a8122d] ipv4: Use math to point per net sysctls into the appropriate struct net.
git bisect bad 0a6fa23dcb10eeb21adfd9955f7030f952a8122d
# bad: [2e685cad57906e19add7189b5ff49dfb6aaa21d3] tcp_memcontrol: Kill struct tcp_memcontrol
git bisect bad 2e685cad57906e19add7189b5ff49dfb6aaa21d3
# first bad commit: [2e685cad57906e19add7189b5ff49dfb6aaa21d3] tcp_memcontrol: Kill struct tcp_memcontrol
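
The change column in the throughput table at the top of this report can
be recomputed from the two measurement columns. The sketch below does
that; the struct and function names are hypothetical, and only the
numbers are taken from the table.

```c
#include <stddef.h>

/* One row of the netperf comparison: throughput in Mbps before/after. */
struct netperf_row {
	const char *host;
	double before;	/* a4fe34bf902b (good) */
	double after;	/* 2e685cad5790 (bad)  */
};

/* Values copied from the report's table. */
const struct netperf_row netperf_rows[2] = {
	{ "lkp-nex04",  707.40,  419.60 },
	{ "lkp-sb03",  2775.60, 2116.40 },
};

/* Relative change from base to value, in percent. */
double percent_change(double base, double value)
{
	return (value - base) / base * 100.0;
}

/* Change of the summed throughput, as in the TOTAL row. */
double total_percent_change(const struct netperf_row *rows, size_t n)
{
	double before = 0.0, after = 0.0;
	size_t i;

	for (i = 0; i < n; i++) {
		before += rows[i].before;
		after += rows[i].after;
	}
	return percent_change(before, after);
}
```

For the two rows above, percent_change() yields roughly -40.7 and -23.7,
and total_percent_change() roughly -27.2, which agrees with the table.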


                              netperf.Throughput_Mbps

   750 ++-------------------------------------------------------------------+
       *...*...                             .*...                         ..*
   700 ++      *...*..*...*...*...*...*...*.     *...*...*...*...*..*...*.  |
       |                                                                    |
   650 ++                                                                   |
       |                                                                    |
   600 ++                                                                   |
       |                                                                    |
   550 ++                                                                   |
       |                                                                    |
   500 ++                                                                   |
       |                                                                    |
   450 ++                                                                   |
       O   O       O  O   O       O       O  O   O   O   O   O              |
   400 ++------O--------------O-------O-------------------------------------+


                                  vmstat.system.in

   17200 ++-----------------------------------------------------------------+
   17100 ++               ..*..*.      *...*..                              |
         |          *...*.       ..  ..       *                        .*...|
   17000 ++       ..                .          +                     *.     *
   16900 ++..*.. .                 *            +                  ..       |
   16800 *+     *                                +       *        .         |
   16700 ++                                       *..   + +      *          |
         |                                           . +   +   ..           |
   16600 ++                                           *     + .             |
   16500 ++                                                  *              |
   16400 ++                                                                 |
   16300 ++                                                                 |
         O          O   O      O       O              O                     |
   16200 ++  O  O           O      O       O  O   O      O   O              |
   16100 ++-----------------------------------------------------------------+


                                   vmstat.system.cs

   550000 ++----------------------------------------------------------------+
   500000 *+..*..*..        *...*..    *...  ..*..*..    .*..*...*..    *...*
          |         .     ..       . ..    *.        . ..           . ..    |
   450000 ++         *...*          *                 *              *      |
   400000 ++                                                                |
   350000 ++                                                                |
   300000 ++                                                                |
          |                                                                 |
   250000 ++                                                                |
   200000 ++                                                                |
   150000 ++                                                                |
   100000 ++                                                                |
          |                                                                 |
    50000 O+  O  O   O   O  O   O   O  O   O   O  O   O   O  O              |
        0 ++----------------------------------------------------------------+


                          lock_stat.slock-AF_INET.contentions

   150000 ++----------------------------------------------------------------+
          |                 *...         ..*...                             |
   140000 ++               +    *      *.      *                            |
   130000 ++              +      :    :         :                           |
          |            ..*        :   :         :                       *...*
   120000 ++ .*..  ..*.           :  :         O :                    ..    |
   110000 ++.    *.                : :           :                  .*      |
          *                         *             :       *       ..        |
   100000 ++                                      *.     + :     *          |
    90000 ++                                        ..  +   :  ..           |
          |                                            +    : .             |
    80000 ++                           O              *      *              |
    70000 O+         O   O      O                     O                     |
          |   O             O       O             O          O              |
    60000 ++-----O-------------------------O--------------O-----------------+


                 lock_stat.slock-AF_INET.contentions.lock_sock_nested

   130000 ++----------------------------------------------------------------+
          |                                                                 |
   120000 ++                *...         ..*...*                            |
   110000 ++               +    *      *.       :                           |
          |               +      +    :        O:                       *...*
   100000 ++ .*..   .*...*        +   :          :                    ..    |
          |..     ..               + :           :                  .*      |
    90000 *+     *                  *             :       *       ..        |
          |                                       *.     + :     *          |
    80000 ++                                        ..  +   :  ..           |
    70000 ++                                           +    : .             |
          |              O      O      O              *      *              |
    60000 O+         O                            O   O                     |
          |   O  O          O       O      O              O  O              |
    50000 ++----------------------------------------------------------------+


                    lock_stat.slock-AF_INET.contentions.tcp_v4_rcv

   150000 ++----------------------------------------------------------------+
          |                 *...                                            |
   140000 ++               +    *      *...*...*                            |
   130000 ++              +      :    :         :                           |
          |            ..*        :   :         :                       *...*
   120000 ++ .*..  ..*.           :  :         O :                    ..    |
   110000 ++.    *.                : :           :                  .*      |
          *                         *             :               ..        |
   100000 ++                                      *.      *      *          |
    90000 ++                                        ..  .. +   ..           |
          |                                            .    + .             |
    80000 ++                           O              *      *              |
    70000 ++         O   O      O                     O                     |
          O   O             O       O             O          O              |
    60000 ++-----O-------------------------O--------------O-----------------+


                        lock_stat.slock-AF_INET/1.contentions

   50000 ++-----------------------------------------------------------------+
         |                  *..                                           ..*
   45000 ++                +           *...*..                          *.  |
   40000 ++               +    *.     +       *..                     ..    |
         |  .*..         +       ..  +           .                 ..*      |
   35000 ++.    *...  ..*           +             *      *.      *.         |
         *          *.             *               :    :  ..  ..           |
   30000 ++                                         :   :     .             |
         |                                          :  :     *              |
   25000 ++                                          : :                    |
   20000 ++                                           *                     |
         |                                    O                             |
   15000 ++                            O              O                     |
         O   O  O   O   O   O  O   O       O      O      O   O              |
   10000 ++-----------------------------------------------------------------+


                  lock_stat.slock-AF_INET/1.contentions.tcp_v4_rcv

   50000 ++-----------------------------------------------------------------+
         |                  *..                                           ..*
   45000 ++                +           *...*..                          *.  |
   40000 ++               +    *.     +       *..                     ..    |
         |  .*..         +       ..  +           .                 ..*      |
   35000 ++.    *...  ..*           +             *      *.      *.         |
         *          *.             *               :    :  ..  ..           |
   30000 ++                                         :   :     .             |
         |                                          :  :     *              |
   25000 ++                                          : :                    |
   20000 ++                                           *                     |
         |                                    O                             |
   15000 ++                            O              O                     |
         O   O  O   O   O   O  O   O       O      O      O   O              |
   10000 ++-----------------------------------------------------------------+


                 lock_stat.slock-AF_INET/1.contentions.release_sock

   35000 ++-----------------------------------------------------------------*
         |                  *..                                         *.  |
   30000 ++                +           *...*..*.                      ..    |
         |                +    *..   ..         ..                 ..*      |
         |  .*..         +        . .                    *.      *.         |
   25000 ++.    *...*...*          *              *     :  ..  ..           |
         *                                         +    :     .             |
   20000 ++                                         +  :     *              |
         |                                           + :                    |
   15000 ++                                           *                     |
         |                                                                  |
         |                                    O                             |
   10000 O+         O   O      O       O          O   O                     |
         |   O  O           O      O       O             O   O              |
    5000 ++-----------------------------------------------------------------+


                           lock_stat.&rq->lock.contentions

   45000 ++-----------------------------------------------------------------+
         |                 .*..        *...                                 |
   40000 ++              ..    *..   ..    *..                              |
   35000 ++ .*..      ..*         . .         *.                       .*...*
         |..    *...*.             *            ..                  .*.     |
   30000 *+                                                       ..        |
   25000 ++                                       *..    *...  ..*          |
         |                                           . ..    *.             |
   20000 ++                                           *                     |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
    5000 O+  O  O   O   O   O  O   O   O   O  O   O   O      O              |
         |                                               O                  |
       0 ++-----------------------------------------------------------------+


                   lock_stat.&rq->lock.contentions.try_to_wake_up

   35000 ++-----------------------------------------------------------------+
         |                                                                  |
   30000 ++                .*..            *..                              |
         |   *           ..    *..        :                                 |
   25000 ++.. :         *         .       :   *.                        *   |
         |.   :        :           *.    :      ..                      ::  |
   20000 *+    :       :             .. :                       .*     :  : |
         |     :      :                 :         *...        ..  +    :  : |
   15000 ++     :    :                 *              *      *     +  :    :|
         |      *... :                                 +   ..       + :     *
   10000 ++         *                                   + .          *      |
         |                                               *                  |
    5000 ++                                   O                             |
         O   O  O   O   O   O  O       O   O          O  O   O              |
       0 ++------------------------O--------------O-------------------------+


                     lock_stat.&rq->lock.contentions.__schedule

   35000 ++-----------------------------------------------------------------+
         |                                                                  |
   30000 ++                 *..                                             |
         |                ..              .*..                              |
   25000 ++ .*           .     *...     ..                              *.  |
         |..  +        .*          *...*      *                        :  ..|
   20000 *+    +     ..                        +                       :    |
         |      *...*                           +              ..*... :     *
   15000 ++                                      +         ..*.      *      |
         |                                        *...*..*.                 |
   10000 ++                                                                 |
         |                                                                  |
    5000 ++                    O                                            |
         O   O  O   O   O   O      O   O   O  O   O   O  O   O              |
       0 ++-----------------------------------------------------------------+


             lock_stat.&(&base->lock)->rlock.contentions.lock_timer_base

   18000 ++----------------------------*------------------------------------+
         |                ..*..       +    *..                         .*...*
   16000 ++           ..*.     *..   +        *...                  .*.     |
   14000 ++ .*..*...*.            . +             *      *.       ..        |
         |..                       *               +    :  ..   .*          |
   12000 *+                                         +   :     ..            |
   10000 ++                                          + :     *              |
         |                                            *                     |
    8000 ++                                                                 |
    6000 ++                                                                 |
         |                                                                  |
    4000 ++                                                                 |
    2000 ++                    O                                            |
         O   O  O   O   O   O      O   O   O  O   O   O  O   O              |
       0 ++-----------------------------------------------------------------+


                     lock_stat.&(&n->list_lock)->rlock.contentions

   140000 ++----------------------------------------------------------------+
          |                                       O                         |
   120000 O+  O  O   O   O  O   O                     O                     |
          |                         O  O   O   O          O  O              |
   100000 ++                                                                |
          |                                                                 |
    80000 ++                                                                |
          |                                                                 |
    60000 ++                                                                |
          |                                                                 |
    40000 ++                                                                |
          |                                                                 |
    20000 ++                                                                |
          |                                                                 |
        0 *+--*--*---*---*--*---*---*--*---*---*--*---*---*--*---*---*--*---*


            lock_stat.&(&n->list_lock)->rlock.contentions.get_partial_node

   250000 ++----------------------------------------------------------------+
          |                                                                 |
          O                                       O                         |
   200000 ++  O  O   O   O  O   O   O  O   O   O      O      O              |
          |                                               O                 |
          |                                                                 |
   150000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
          |                                                                 |
          |                                                                 |
    50000 ++                                                                |
          |                                                                 |
          |                                                                 |
        0 *+--*--*---*---*--*---*---*--*---*---*--*---*---*--*---*---*--*---*


           lock_stat.&(&n->list_lock)->rlock.contentions.unfreeze_partials

   45000 ++-----------------------------------------------------------------+
         O                                        O                         |
   40000 ++         O   O   O                         O                     |
   35000 ++  O  O              O   O       O  O          O   O              |
         |                             O                                    |
   30000 ++                                                                 |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
    5000 ++                                                                 |
         |                                                                  |
       0 *+--*--*---*---*---*--*---*---*---*--*---*---*--*---*---*---*--*---*


                                  iostat.cpu.user

   1.8 ++-------------------------------------------------------------------+
       *...*...*.         *...*.      *...*..                               |
   1.6 ++        ..     ..      ..  ..       *...*..    .*...*...*      *...*
       |               .           .                . ..          +   ..    |
       |           *..*           *                  *             + .      |
   1.4 ++                                                           *       |
       |                                                                    |
   1.2 ++                                                                   |
       |                                                                    |
     1 ++                                                                   |
       |                                                                    |
       |                                                                    |
   0.8 ++                         O       O      O   O                      |
       O   O   O   O  O   O   O       O      O           O   O              |
   0.6 ++-------------------------------------------------------------------+


                                  iostat.cpu.system

   96.6 ++------------------------------------------------------------------+
        |   O          O              O   O       O  O   O                  |
   96.4 O+      O  O       O   O  O                          O              |
        |                                                                   |
        |                                                                   |
   96.2 ++                                                                  |
        |                                                                   |
     96 ++                                                                  |
        |                                                                   |
   95.8 ++                                    O                             |
        |            ..*.         *.                 *..            *..     |
        |          *.    ..      +  ..             ..   .         ..   .  ..*
   95.6 ++       ..             +           ..*...*      *...  ..*      *.  |
        |     ..*          *...*      *...*.                 *.             |
   95.4 *+--*---------------------------------------------------------------+

^ permalink raw reply

* Re: -27% netperf TCP_STREAM regression by "tcp_memcontrol: Kill struct tcp_memcontrol"
From: David Miller @ 2013-10-22 22:00 UTC (permalink / raw)
  To: fengguang.wu; +Cc: ebiederm, netdev, linux-kernel
In-Reply-To: <20131022214129.GB2715@localhost>

From: fengguang.wu@intel.com
Date: Tue, 22 Oct 2013 22:41:29 +0100

> We noticed big netperf throughput regressions
> 
>     a4fe34bf902b8f709c63      2e685cad57906e19add7  
> ------------------------  ------------------------  
>                   707.40       -40.7%       419.60  lkp-nex04/micro/netperf/120s-200%-TCP_STREAM
>                  2775.60       -23.7%      2116.40  lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
>                  3483.00       -27.2%      2536.00  TOTAL netperf.Throughput_Mbps
> 
> and bisected it to
> 
> commit 2e685cad57906e19add7189b5ff49dfb6aaa21d3
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Sat Oct 19 16:26:19 2013 -0700
> 
>     tcp_memcontrol: Kill struct tcp_memcontrol

Eric, please look into this. I'd rather have a fix to apply than revert
your work.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/4 net-next] net: phy: add Generic Netlink Ethernet switch configuration API
From: Florian Fainelli @ 2013-10-22 22:09 UTC (permalink / raw)
  To: Neil Horman
  Cc: John Fastabend, netdev, David Miller, Sascha Hauer, Felix Fietkau,
	John Crispin, Jonas Gorski, Gary Thomas, Jamal Hadi Salim
In-Reply-To: <20131022202537.GA16336@hmsreliant.think-freely.org>

2013/10/22 Neil Horman <nhorman@tuxdriver.com>:
> On Tue, Oct 22, 2013 at 12:59:12PM -0700, Florian Fainelli wrote:
>> 2013/10/22 John Fastabend <john.r.fastabend@intel.com>:
>> > On 10/22/2013 11:23 AM, Florian Fainelli wrote:
>> >>
>> >> This patch adds an Ethernet Switch generic netlink configuration API
>> >> which allows for doing the required configuration of managed Ethernet
>> >> switches commonly found in Wireless/Cable/DSL routers in the market.
>> >>
>> >> Since this API is based on the Generic Netlink infrastructure it is very
>> >> easy to extend a particular switch driver to support additional features
>> >> and to adapt it to specific switches.
>> >>
>> >
>> >> So far the API includes support for:
>> >>
>> >> - getting/setting a port VLAN id
>> >> - getting/setting VLAN port membership
>> >> - getting a port's link status
>> >> - getting a port's statistics counters
>> >> - resetting a switch device
>> >> - applying a configuration to a switch device
>> >>
>> >
>> > Did you consider exposing each physical switch port as a netdevice on
>> > the host? I would assume your switch driver could do this.
>> >
>> > Then you can drop the port specific attributes (link status, stats, etc)
>> > and use existing interfaces. The win being my tools work equally well on
>> > your real switch as they do on my software switch. Also by exposing net
>> > devices you provide a mechanism to send packets over the port and trap
>> > control packets.
>>
>> Well this is exactly what DSA does and which I do not like because it
>> is completely overkill for most switches out there which are using
>> 802.1q tags and do not prepend/append proprietary tags for internal
>> traffic classification.
>>
>> >
>> > Next instead of creating a switch specific netlink API could you use
>> > the existing FDB API? Again what I would like is for my existing
>> > applications to run on the switch without having to rewrite them. For
>> > example it would be great to have 'bridge fdb show dev myswitch' report
>> > the correct tables for both the Sw bridge, a real switch bridge, and
>> > for the embedded SR-IOV bridge case.
>>
>> Ok, I know nothing about the FDB API, but will take a look and see if
>> that sounds suitable for the embedded use cases.
>>
> Further to John's comments, why are you creating a new netlink protocol for this?
> It seems that 90% of what you want to accomplish above is handled by rtnetlink.
> As long as you write your driver properly, most of that should "just work".

This is not a new netlink protocol, but a generic netlink family. Why
would I extend rtnetlink to cover the remaining 10%, which is not
going to be used on desktops and servers, when a new generic netlink
family is cheap and can be selectively disabled in the kernel?
-- 
Florian
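
To illustrate the distinction drawn above between a netlink protocol and a generic netlink family: all generic netlink families share the single NETLINK_GENERIC protocol, and a family is selected by the message type in the netlink header. The sketch below only builds the bytes of a family lookup request to the "nlctrl" controller and does not open a socket. The family name "example_switch" is hypothetical, since the name used by the patch is not given in this thread, and the helper names (nl_align, pack_attr, pack_genl) are made up for illustration.

```python
import struct

NETLINK_GENERIC = 16        # one protocol number shared by every genl family
NLM_F_REQUEST = 0x01        # netlink header flag: this is a request
GENL_ID_CTRL = 0x10         # fixed id of the "nlctrl" controller family
CTRL_CMD_GETFAMILY = 3      # controller command: look up a family
CTRL_ATTR_FAMILY_NAME = 2   # attribute carrying the NUL-terminated name

NLMSG_HDRLEN = 16           # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLA_HDRLEN = 4              # struct nlattr: u16 len, u16 type


def nl_align(n):
    """Round up to the 4-byte alignment netlink uses."""
    return (n + 3) & ~3


def pack_attr(attr_type, payload):
    """Encode one attribute: header, payload, then zero padding."""
    length = NLA_HDRLEN + len(payload)
    padding = b"\0" * (nl_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + padding


def pack_genl(family_id, cmd, attrs, seq=1, version=1):
    """Encode nlmsghdr + genlmsghdr + attributes for one request."""
    genl_header = struct.pack("=BBH", cmd, version, 0)  # cmd, version, reserved
    body = genl_header + attrs
    nl_header = struct.pack("=IHHII", NLMSG_HDRLEN + len(body),
                            family_id, NLM_F_REQUEST, seq, 0)
    return nl_header + body


# Hypothetical family name, for illustration only.
request = pack_genl(GENL_ID_CTRL, CTRL_CMD_GETFAMILY,
                    pack_attr(CTRL_ATTR_FAMILY_NAME, b"example_switch\0"))
```

A family-specific command would reuse pack_genl with the dynamically assigned family id returned by the controller in place of GENL_ID_CTRL; the command and attribute numbers would then come from that family's own header.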

^ permalink raw reply

