Netdev List
 help / color / mirror / Atom feed
* Re: Off-by-one error in net/8021q/vlan.c
From: Brent Cook @ 2011-02-21 19:26 UTC (permalink / raw)
  To: Michał Mirosław
  Cc: Eric Dumazet, Phil Karn, richard -rw- weinberger, kaber, netdev
In-Reply-To: <AANLkTimrQy6gU8d86m2jM4arMS0rOvEQmn2m-KTE4sx9@mail.gmail.com>

On Wednesday 16 February 2011 12:41:34 Michał Mirosław wrote:
> 2011/2/16 Eric Dumazet <eric.dumazet@gmail.com>:
> > Le mercredi 16 février 2011 à 08:28 -0800, Phil Karn a écrit :
> >> On 2/16/11 8:10 AM, richard -rw- weinberger wrote:
> >> > On Wed, Feb 16, 2011 at 4:58 PM, Phil Karn <karn@ka9q.net> wrote:
> >> >> On 2/16/11 4:51 AM, richard -rw- weinberger wrote:
> >> >>> On Wed, Feb 16, 2011 at 11:58 AM, Phil Karn <karn@ka9q.net> wrote:
> >> >>>> The range check on vlan_id in register_vlan_device is off by one, and it
> >> >>>> prevents the creation of a vlan interface for vlan ID 4095. (OSX allows
> >> >>>> this, I checked.)
> >> >>>
> >> >>> Then OSX should fix their code. 4095 is reserved.
> >> >> If it's reserved, then it's up to the user to reserve it.
> >> > No.
> >> > See:
> >> > http://standards.ieee.org/getieee802/download/802.1Q-2005.pdf
> >> Well, then I guess we all know better than the user. That's the Windows
> >> Way...no, wait, I thought this is Linux.
> >>
> >> The fact is that I did encounter a misconfigured switch using vlan 4095,
> >> and because of this off-by-one error I was unable to talk to it and fix it.
> >>
> >> I was hoping I wouldn't have to patch every new kernel I install.
> > You can use an OSX gateway ;)
> >
> > If we allow ID 4095, then some users will complain we violate rules.
> >
> > Really you cannot push this patch in official kernel only to ease your
> > life ;)
> 
> The idea is that you don't have to use ID 4095 and if you don't -
> nothing's broken by just allowing it. The same goes with ID 0 - it's
> defined to be 802.1p packet, but people do use it as normal VLAN
> (especially with hardware that can cope with only small number of
> VLANs at once).
> 
> Allowing it but with a big fat warning in logs is even better: "You
> want your network broken? Sure, can do, but you have been warned."
> 

On the other end of the spectrum, vconfig warns for vlan 1:

bcook@bcook-box:~$ sudo vconfig add eth0 1
Added VLAN with VID == 1 to IF -:eth0:-
WARNING:  VLAN 1 does not work with many switches,
consider another number if you have problems.
bcook@bcook-box:~$ sudo vconfig add eth0 4095
ERROR: trying to add VLAN #4095 to IF -:eth0:-  error: Numerical result out of range

^ permalink raw reply

* Re: [PATCH] tcp: undo_retrans counter fixes
From: David Miller @ 2011-02-21 19:31 UTC (permalink / raw)
  To: ycheng; +Cc: netdev, ilpo.jarvinen
In-Reply-To: <1297119424-19956-1-git-send-email-ycheng@google.com>

From: Yuchung Cheng <ycheng@google.com>
Date: Mon,  7 Feb 2011 14:57:04 -0800

> Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
> not set or undo_retrans is already 0. This happens when sender receives
> more DSACK ACKs than packets retransmitted during the current
> undo phase. This may also happen when sender receives DSACK after
> the undo operation is completed or cancelled.
> 
> Fix another bug that undo_retrans is incorrectly incremented when
> sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
> is rare but not impossible.
> 
> Signed-off-by: Yuchung Cheng <ycheng@google.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* [PATCH v2 4/4] m68k/atari: ARAnyM - Add support for network access
From: Geert Uytterhoeven @ 2011-02-21 19:30 UTC (permalink / raw)
  To: linux-m68k
  Cc: linux-kernel, aranym, Milan Jurik, Petr Stehlik, Michael Schmitz,
	Geert Uytterhoeven, netdev
In-Reply-To: <1298316600-23094-1-git-send-email-geert@linux-m68k.org>

From: Milan Jurik <milan.jurik@xylab.cz>

[petr: Second author]
[michael, geert: Cleanups and updates]

Signed-off-by: Milan Jurik <milan.jurik@xylab.cz>
Signed-off-by: Petr Stehlik <pstehlik@sophics.cz>
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: netdev@vger.kernel.org

---
v2:
  - Review comments from David Miller:
      - Remove "dev->trans_start = jiffies;",
      - Set pr_fmt(),
      - Use netdev_*() for logging.
  - Review comments from Petr Stehlik:
     - Update ARAnyM URL.
v1:
  - Convert to net_device_ops,
  - nfeth doesn't need obsolete <net/ieee80211.h>,
  - Convert print_mac to %pM,
  - Break too long lines,
  - Make needlessly global functions static,
  - Make version[] const,
  - Use pr_*(),
  - Use net_device_stats from struct net_device instead of our own,
  - Propagate error code from request_irq(),
  - Remove unused variable "handled".

 arch/m68k/Kconfig      |    8 ++
 arch/m68k/emu/Makefile |    1 +
 arch/m68k/emu/nfeth.c  |  270 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 279 insertions(+), 0 deletions(-)
 create mode 100644 arch/m68k/emu/nfeth.c

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 6719c56..80df6ee 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -263,6 +263,14 @@ config NFCON
 	  which allows the console output to be redirected to the stderr
 	  output of ARAnyM.
 
+config NFETH
+	tristate "NatFeat Ethernet support"
+	depends on NET_ETHERNET && NATFEAT
+	help
+	  Say Y to include support for the ARAnyM NatFeat network device
+	  which will emulate a regular ethernet device while presenting an
+	  ethertap device to the host system.
+
 comment "Processor type"
 
 config M68020
diff --git a/arch/m68k/emu/Makefile b/arch/m68k/emu/Makefile
index a83ef1e..7dc2010 100644
--- a/arch/m68k/emu/Makefile
+++ b/arch/m68k/emu/Makefile
@@ -6,3 +6,4 @@ obj-y			+= natfeat.o
 
 obj-$(CONFIG_NFBLOCK)	+= nfblock.o
 obj-$(CONFIG_NFCON)	+= nfcon.o
+obj-$(CONFIG_NFETH)	+= nfeth.o
diff --git a/arch/m68k/emu/nfeth.c b/arch/m68k/emu/nfeth.c
new file mode 100644
index 0000000..8b6e201
--- /dev/null
+++ b/arch/m68k/emu/nfeth.c
@@ -0,0 +1,270 @@
+/*
+ * atari_nfeth.c - ARAnyM ethernet card driver for GNU/Linux
+ *
+ * Copyright (c) 2005 Milan Jurik, Petr Stehlik of ARAnyM dev team
+ *
+ * Based on ARAnyM driver for FreeMiNT written by Standa Opichal
+ *
+ * This software may be used and distributed according to the terms of
+ * the GNU General Public License (GPL), incorporated herein by reference.
+ */
+
+#define DRV_VERSION	"0.3"
+#define DRV_RELDATE	"10/12/2005"
+
+#define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt
+
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/module.h>
+#include <asm/natfeat.h>
+#include <asm/virtconvert.h>
+
+enum {
+	GET_VERSION = 0,/* no parameters, return NFAPI_VERSION in d0 */
+	XIF_INTLEVEL,	/* no parameters, return Interrupt Level in d0 */
+	XIF_IRQ,	/* acknowledge interrupt from host */
+	XIF_START,	/* (ethX), called on 'ifup', start receiver thread */
+	XIF_STOP,	/* (ethX), called on 'ifdown', stop the thread */
+	XIF_READLENGTH,	/* (ethX), return size of network data block to read */
+	XIF_READBLOCK,	/* (ethX, buffer, size), read block of network data */
+	XIF_WRITEBLOCK,	/* (ethX, buffer, size), write block of network data */
+	XIF_GET_MAC,	/* (ethX, buffer, size), return MAC HW addr in buffer */
+	XIF_GET_IPHOST,	/* (ethX, buffer, size), return IP address of host */
+	XIF_GET_IPATARI,/* (ethX, buffer, size), return IP address of atari */
+	XIF_GET_NETMASK	/* (ethX, buffer, size), return IP netmask */
+};
+
+#define MAX_UNIT	8
+
+/* These identify the driver base version and may not be removed. */
+static const char version[] __devinitdata =
+	KERN_INFO KBUILD_MODNAME ".c:v" DRV_VERSION " " DRV_RELDATE
+	" S.Opichal, M.Jurik, P.Stehlik\n"
+	KERN_INFO " http://aranym.org/\n";
+
+MODULE_AUTHOR("Milan Jurik");
+MODULE_DESCRIPTION("Atari NFeth driver");
+MODULE_LICENSE("GPL");
+/*
+MODULE_PARM(nfeth_debug, "i");
+MODULE_PARM_DESC(nfeth_debug, "nfeth_debug level (1-2)");
+*/
+
+
+static long nfEtherID;
+static int nfEtherIRQ;
+
+struct nfeth_private {
+	int ethX;
+};
+
+static struct net_device *nfeth_dev[MAX_UNIT];
+
+static int nfeth_open(struct net_device *dev)
+{
+	struct nfeth_private *priv = netdev_priv(dev);
+	int res;
+
+	res = nf_call(nfEtherID + XIF_START, priv->ethX);
+	netdev_dbg(dev, "%s: %d\n", __func__, res);
+
+	/* Ready for data */
+	netif_start_queue(dev);
+
+	return 0;
+}
+
+static int nfeth_stop(struct net_device *dev)
+{
+	struct nfeth_private *priv = netdev_priv(dev);
+
+	/* No more data */
+	netif_stop_queue(dev);
+
+	nf_call(nfEtherID + XIF_STOP, priv->ethX);
+
+	return 0;
+}
+
+/*
+ * Read a packet out of the adapter and pass it to the upper layers
+ */
+static inline void recv_packet(struct net_device *dev)
+{
+	struct nfeth_private *priv = netdev_priv(dev);
+	unsigned short pktlen;
+	struct sk_buff *skb;
+
+	/* read packet length (excluding 32 bit crc) */
+	pktlen = nf_call(nfEtherID + XIF_READLENGTH, priv->ethX);
+
+	netdev_dbg(dev, "%s: %u\n", __func__, pktlen);
+
+	if (!pktlen) {
+		netdev_dbg(dev, "%s: pktlen == 0\n", __func__);
+		dev->stats.rx_errors++;
+		return;
+	}
+
+	skb = dev_alloc_skb(pktlen + 2);
+	if (!skb) {
+		netdev_dbg(dev, "%s: out of mem (buf_alloc failed)\n",
+			   __func__);
+		dev->stats.rx_dropped++;
+		return;
+	}
+
+	skb->dev = dev;
+	skb_reserve(skb, 2);		/* 16 Byte align  */
+	skb_put(skb, pktlen);		/* make room */
+	nf_call(nfEtherID + XIF_READBLOCK, priv->ethX, virt_to_phys(skb->data),
+		pktlen);
+
+	skb->protocol = eth_type_trans(skb, dev);
+	netif_rx(skb);
+	dev->last_rx = jiffies;
+	dev->stats.rx_packets++;
+	dev->stats.rx_bytes += pktlen;
+
+	/* and enqueue packet */
+	return;
+}
+
+static irqreturn_t nfeth_interrupt(int irq, void *dev_id)
+{
+	int i, m, mask;
+
+	mask = nf_call(nfEtherID + XIF_IRQ, 0);
+	for (i = 0, m = 1; i < MAX_UNIT; m <<= 1, i++) {
+		if (mask & m && nfeth_dev[i]) {
+			recv_packet(nfeth_dev[i]);
+			nf_call(nfEtherID + XIF_IRQ, m);
+		}
+	}
+	return IRQ_HANDLED;
+}
+
+static int nfeth_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	unsigned int len;
+	char *data, shortpkt[ETH_ZLEN];
+	struct nfeth_private *priv = netdev_priv(dev);
+
+	data = skb->data;
+	len = skb->len;
+	if (len < ETH_ZLEN) {
+		memset(shortpkt, 0, ETH_ZLEN);
+		memcpy(shortpkt, data, len);
+		data = shortpkt;
+		len = ETH_ZLEN;
+	}
+
+	netdev_dbg(dev, "%s: send %u bytes\n", __func__, len);
+	nf_call(nfEtherID + XIF_WRITEBLOCK, priv->ethX, virt_to_phys(data),
+		len);
+
+	dev->stats.tx_packets++;
+	dev->stats.tx_bytes += len;
+
+	dev_kfree_skb(skb);
+	return 0;
+}
+
+static void nfeth_tx_timeout(struct net_device *dev)
+{
+	dev->stats.tx_errors++;
+	netif_wake_queue(dev);
+}
+
+static const struct net_device_ops nfeth_netdev_ops = {
+	.ndo_open		= nfeth_open,
+	.ndo_stop		= nfeth_stop,
+	.ndo_start_xmit		= nfeth_xmit,
+	.ndo_tx_timeout		= nfeth_tx_timeout,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_change_mtu		= eth_change_mtu,
+	.ndo_set_mac_address	= eth_mac_addr,
+};
+
+static struct net_device * __init nfeth_probe(int unit)
+{
+	struct net_device *dev;
+	struct nfeth_private *priv;
+	char mac[ETH_ALEN], host_ip[32], local_ip[32];
+	int err;
+
+	if (!nf_call(nfEtherID + XIF_GET_MAC, unit, mac, ETH_ALEN))
+		return NULL;
+
+	dev = alloc_etherdev(sizeof(struct nfeth_private));
+	if (!dev)
+		return NULL;
+
+	dev->irq = nfEtherIRQ;
+	dev->netdev_ops = &nfeth_netdev_ops;
+
+	dev->flags |= NETIF_F_NO_CSUM;
+	memcpy(dev->dev_addr, mac, ETH_ALEN);
+
+	priv = netdev_priv(dev);
+	priv->ethX = unit;
+
+	err = register_netdev(dev);
+	if (err) {
+		free_netdev(dev);
+		return NULL;
+	}
+
+	nf_call(nfEtherID + XIF_GET_IPHOST, unit,
+		host_ip, sizeof(host_ip));
+	nf_call(nfEtherID + XIF_GET_IPATARI, unit,
+		local_ip, sizeof(local_ip));
+
+	netdev_info(dev, KBUILD_MODNAME " addr:%s (%s) HWaddr:%pM\n", host_ip,
+		    local_ip, mac);
+
+	return dev;
+}
+
+static int __init nfeth_init(void)
+{
+	long ver;
+	int error, i;
+
+	nfEtherID = nf_get_id("ETHERNET");
+	if (!nfEtherID)
+		return -ENODEV;
+
+	ver = nf_call(nfEtherID + GET_VERSION);
+	pr_info("API %lu\n", ver);
+
+	nfEtherIRQ = nf_call(nfEtherID + XIF_INTLEVEL);
+	error = request_irq(nfEtherIRQ, nfeth_interrupt, IRQF_SHARED,
+			    "eth emu", nfeth_interrupt);
+	if (error) {
+		pr_err("request for irq %d failed %d", nfEtherIRQ, error);
+		return error;
+	}
+
+	for (i = 0; i < MAX_UNIT; i++)
+		nfeth_dev[i] = nfeth_probe(i);
+
+	return 0;
+}
+
+static void __exit nfeth_cleanup(void)
+{
+	int i;
+
+	for (i = 0; i < MAX_UNIT; i++) {
+		if (nfeth_dev[i]) {
+			unregister_netdev(nfeth_dev[0]);
+			free_netdev(nfeth_dev[0]);
+		}
+	}
+	free_irq(nfEtherIRQ, nfeth_interrupt);
+}
+
+module_init(nfeth_init);
+module_exit(nfeth_cleanup);
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH ethtool 5/5] ethtool: Add --version option
From: Ben Hutchings @ 2011-02-21 19:19 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1298315809.2608.68.camel@bwh-desktop>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.8.in |    5 +++++
 ethtool.c    |    6 ++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/ethtool.8.in b/ethtool.8.in
index 133825b..8b04335 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -100,6 +100,8 @@ ethtool \- query or control network driver and hardware settings
 
 .B ethtool \-h|\-\-help
 
+.B ethtool \-\-version
+
 .B ethtool \-a|\-\-show\-pause
 .I ethX
 
@@ -310,6 +312,9 @@ settings of the specified device.
 .B \-h \-\-help
 Shows a short help message.
 .TP
+.B \-\-version
+Shows the ethtool version number.
+.TP
 .B \-a \-\-show\-pause
 Queries the specified Ethernet device for pause parameter information.
 .TP
diff --git a/ethtool.c b/ethtool.c
index 32a97f6..e9cb2c9 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -115,6 +115,7 @@ static int do_permaddr(int fd, struct ifreq *ifr);
 static int send_ioctl(int fd, struct ifreq *ifr);
 
 static enum {
+	MODE_VERSION = -2,
 	MODE_HELP = -1,
 	MODE_GSET=0,
 	MODE_SSET,
@@ -264,6 +265,7 @@ static struct option {
     { "-P", "--show-permaddr", MODE_PERMADDR,
 		"Show permanent hardware address" },
     { "-h", "--help", MODE_HELP, "Show this help" },
+    { NULL, "--version", MODE_VERSION, "Show version number" },
     {}
 };
 
@@ -816,6 +818,10 @@ static void parse_cmdline(int argc, char **argp)
 			if (mode == MODE_HELP) {
 				show_usage();
 				exit(0);
+			} else if (mode == MODE_VERSION) {
+				fprintf(stdout,
+					PACKAGE " version " VERSION "\n");
+				exit(0);
 			} else if (!args[k].lng && argp[i][0] == '-') {
 				exit_bad_args();
 			} else {
-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 4/5] ethtool: Indent the no-options usage line consistently with the others
From: Ben Hutchings @ 2011-02-21 19:18 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1298315809.2608.68.camel@bwh-desktop>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 8246bda..32a97f6 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -286,7 +286,8 @@ static void show_usage(void)
 	fprintf(stdout, PACKAGE " version " VERSION "\n");
 	fprintf(stdout,
 		"Usage:\n"
-		"ethtool DEVNAME\tDisplay standard information about device\n");
+		"        ethtool DEVNAME\t"
+		"Display standard information about device\n");
 	for (i = 0; args[i].lng; i++) {
 		fputs("        ethtool ", stdout);
 		if (args[i].srt)
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 3/5] ethtool: Allow for long options with no short option and without a device name
From: Ben Hutchings @ 2011-02-21 19:18 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1298315809.2608.68.camel@bwh-desktop>

Change loop conditions to check for a long option string.

Generalise check for whether option requires a device name.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.c |   23 ++++++++++++++---------
 1 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index d28f1b2..8246bda 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -287,12 +287,16 @@ static void show_usage(void)
 	fprintf(stdout,
 		"Usage:\n"
 		"ethtool DEVNAME\tDisplay standard information about device\n");
-	for (i = 0; args[i].srt; i++) {
-		fprintf(stdout, "        ethtool %s|%s %s\t%s\n%s",
-			args[i].srt, args[i].lng,
-			strstr(args[i].srt, "-h") ? "\t" : "DEVNAME",
-			args[i].help,
-			args[i].opthelp ? args[i].opthelp : "");
+	for (i = 0; args[i].lng; i++) {
+		fputs("        ethtool ", stdout);
+		if (args[i].srt)
+			fprintf(stdout, "%s|", args[i].srt);
+		fprintf(stdout, "%s %s\t%s\n",
+			args[i].lng,
+			args[i].Mode < 0 ? "\t" : "DEVNAME",
+			args[i].help);
+		if (args[i].opthelp)
+			fputs(args[i].opthelp, stdout);
 	}
 }
 
@@ -801,8 +805,9 @@ static void parse_cmdline(int argc, char **argp)
 	for (i = 1; i < argc; i++) {
 		switch (i) {
 		case 1:
-			for (k = 0; args[k].srt; k++)
-				if (!strcmp(argp[i], args[k].srt) ||
+			for (k = 0; args[k].lng; k++)
+				if ((args[k].srt &&
+				     !strcmp(argp[i], args[k].srt)) ||
 				    !strcmp(argp[i], args[k].lng)) {
 					mode = args[k].Mode;
 					break;
@@ -810,7 +815,7 @@ static void parse_cmdline(int argc, char **argp)
 			if (mode == MODE_HELP) {
 				show_usage();
 				exit(0);
-			} else if (!args[k].srt && argp[i][0] == '-') {
+			} else if (!args[k].lng && argp[i][0] == '-') {
 				exit_bad_args();
 			} else {
 				devname = argp[i];
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 2/5] ethtool: Report an error if given an unrecognised option
From: Ben Hutchings @ 2011-02-21 19:18 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1298315809.2608.68.camel@bwh-desktop>

Previously we would print full usage information and return 0.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index b9422d3..d28f1b2 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -807,10 +807,11 @@ static void parse_cmdline(int argc, char **argp)
 					mode = args[k].Mode;
 					break;
 				}
-			if (mode == MODE_HELP ||
-			    (!args[k].srt && argp[i][0] == '-')) {
+			if (mode == MODE_HELP) {
 				show_usage();
 				exit(0);
+			} else if (!args[k].srt && argp[i][0] == '-') {
+				exit_bad_args();
 			} else {
 				devname = argp[i];
 			}
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 1/5] ethtool: Split show_usage() into two functions
From: Ben Hutchings @ 2011-02-21 19:17 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1298315809.2608.68.camel@bwh-desktop>

show_usage(0) and show_usage(1) now do unrelated things; split it
into show_usage() and exit_bad_args().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.c |  144 +++++++++++++++++++++++++++++++------------------------------
 1 files changed, 73 insertions(+), 71 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index f680b6d..b9422d3 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -268,32 +268,32 @@ static struct option {
 };
 

-static void show_usage(int badarg) __attribute__((noreturn));
+static void exit_bad_args(void) __attribute__((noreturn));
 
-static void show_usage(int badarg)
+static void exit_bad_args(void)
+{
+	fprintf(stderr,
+		"ethtool: bad command line argument(s)\n"
+		"For more information run ethtool -h\n");
+	exit(1);
+}
+
+static void show_usage(void)
 {
 	int i;
-	if (badarg != 0) {
-		fprintf(stderr,
-			"ethtool: bad command line argument(s)\n"
-			"For more information run ethtool -h\n"
-		);
-	}
-	else {
-		/* ethtool -h */
-		fprintf(stdout, PACKAGE " version " VERSION "\n");
-		fprintf(stdout,
+
+	/* ethtool -h */
+	fprintf(stdout, PACKAGE " version " VERSION "\n");
+	fprintf(stdout,
 		"Usage:\n"
 		"ethtool DEVNAME\tDisplay standard information about device\n");
-		for (i = 0; args[i].srt; i++) {
-			fprintf(stdout, "        ethtool %s|%s %s\t%s\n%s",
-				args[i].srt, args[i].lng,
-				strstr(args[i].srt, "-h") ? "\t" : "DEVNAME",
-				args[i].help,
-				args[i].opthelp ? args[i].opthelp : "");
-		}
+	for (i = 0; args[i].srt; i++) {
+		fprintf(stdout, "        ethtool %s|%s %s\t%s\n%s",
+			args[i].srt, args[i].lng,
+			strstr(args[i].srt, "-h") ? "\t" : "DEVNAME",
+			args[i].help,
+			args[i].opthelp ? args[i].opthelp : "");
 	}
-	exit(badarg);
 }
 
 static char *devname = NULL;
@@ -612,11 +612,11 @@ get_int_range(char *str, int base, long long min, long long max)
 	char *endp;
 
 	if (!str)
-		show_usage(1);
+		exit_bad_args();
 	errno = 0;
 	v = strtoll(str, &endp, base);
 	if (errno || *endp || v < min || v > max)
-		show_usage(1);
+		exit_bad_args();
 	return v;
 }
 
@@ -627,11 +627,11 @@ get_uint_range(char *str, int base, unsigned long long max)
 	char *endp;
 
 	if (!str)
-		show_usage(1);
+		exit_bad_args();
 	errno = 0;
 	v = strtoull(str, &endp, base);
 	if ( errno || *endp || v > max)
-		show_usage(1);
+		exit_bad_args();
 	return v;
 }
 
@@ -664,7 +664,7 @@ static void parse_generic_cmdline(int argc, char **argp,
 					*(int *)info[idx].seen_val = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				switch (info[idx].type) {
 				case CMDL_BOOL: {
 					int *p = info[idx].wanted_val;
@@ -673,7 +673,7 @@ static void parse_generic_cmdline(int argc, char **argp,
 					else if (!strcmp(argp[i], "off"))
 						*p = 0;
 					else
-						show_usage(1);
+						exit_bad_args();
 					break;
 				}
 				case CMDL_S32: {
@@ -712,7 +712,7 @@ static void parse_generic_cmdline(int argc, char **argp,
 					u32 *p = info[idx].wanted_val;
 					struct in_addr in;
 					if (!inet_aton(argp[i], &in))
-						show_usage(1);
+						exit_bad_args();
 					*p = in.s_addr;
 					break;
 				}
@@ -728,7 +728,7 @@ static void parse_generic_cmdline(int argc, char **argp,
 						p = info[idx].wanted_val;
 						*p |= info[idx].flag_val;
 					} else if (strcmp(argp[i], "off")) {
-						show_usage(1);
+						exit_bad_args();
 					}
 					break;
 				}
@@ -738,13 +738,13 @@ static void parse_generic_cmdline(int argc, char **argp,
 					break;
 				}
 				default:
-					show_usage(1);
+					exit_bad_args();
 				}
 				break;
 			}
 		}
 		if( !found)
-			show_usage(1);
+			exit_bad_args();
 	}
 }
 
@@ -808,10 +808,12 @@ static void parse_cmdline(int argc, char **argp)
 					break;
 				}
 			if (mode == MODE_HELP ||
-			    (!args[k].srt && argp[i][0] == '-'))
-				show_usage(0);
-			else
+			    (!args[k].srt && argp[i][0] == '-')) {
+				show_usage();
+				exit(0);
+			} else {
 				devname = argp[i];
+			}
 			break;
 		case 2:
 			if ((mode == MODE_SSET) ||
@@ -850,7 +852,7 @@ static void parse_cmdline(int argc, char **argp)
 				} else if (!strcmp(argp[i], "offline")) {
 					test_type = OFFLINE;
 				} else {
-					show_usage(1);
+					exit_bad_args();
 				}
 				break;
 			} else if (mode == MODE_PHYS_ID) {
@@ -923,14 +925,14 @@ static void parse_cmdline(int argc, char **argp)
 				if (!strcmp(argp[i], "flow-type")) {
 					i += 1;
 					if (i >= argc) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					parse_rxntupleopts(argc, argp, i);
 					i = argc;
 					break;
 				} else {
-					show_usage(1);
+					exit_bad_args();
 				}
 				break;
 			}
@@ -938,54 +940,54 @@ static void parse_cmdline(int argc, char **argp)
 				if (!strcmp(argp[i], "rx-flow-hash")) {
 					i += 1;
 					if (i >= argc) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					rx_fhash_get =
 						rxflow_str_to_type(argp[i]);
 					if (!rx_fhash_get)
-						show_usage(1);
+						exit_bad_args();
 				} else
-					show_usage(1);
+					exit_bad_args();
 				break;
 			}
 			if (mode == MODE_FLASHDEV) {
 				flash_region = strtol(argp[i], NULL, 0);
 				if ((flash_region < 0))
-					show_usage(1);
+					exit_bad_args();
 				break;
 			}
 			if (mode == MODE_SNFC) {
 				if (!strcmp(argp[i], "rx-flow-hash")) {
 					i += 1;
 					if (i >= argc) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					rx_fhash_set =
 						rxflow_str_to_type(argp[i]);
 					if (!rx_fhash_set) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					i += 1;
 					if (i >= argc) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					if (parse_rxfhashopts(argp[i],
 						&rx_fhash_val) < 0)
-						show_usage(1);
+						exit_bad_args();
 					else
 						rx_fhash_changed = 1;
 				} else
-					show_usage(1);
+					exit_bad_args();
 				break;
 			}
 			if (mode == MODE_SRXFHINDIR) {
 				if (!strcmp(argp[i], "equal")) {
 					if (argc != i + 2) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					i += 1;
@@ -996,42 +998,42 @@ static void parse_cmdline(int argc, char **argp)
 				} else if (!strcmp(argp[i], "weight")) {
 					i += 1;
 					if (i >= argc) {
-						show_usage(1);
+						exit_bad_args();
 						break;
 					}
 					rxfhindir_weight = argp + i;
 					i = argc;
 				} else {
-					show_usage(1);
+					exit_bad_args();
 				}
 				break;
 			}
 			if (mode != MODE_SSET)
-				show_usage(1);
+				exit_bad_args();
 			if (!strcmp(argp[i], "speed")) {
 				gset_changed = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				speed_wanted = get_int(argp[i],10);
 				break;
 			} else if (!strcmp(argp[i], "duplex")) {
 				gset_changed = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				if (!strcmp(argp[i], "half"))
 					duplex_wanted = DUPLEX_HALF;
 				else if (!strcmp(argp[i], "full"))
 					duplex_wanted = DUPLEX_FULL;
 				else
-					show_usage(1);
+					exit_bad_args();
 				break;
 			} else if (!strcmp(argp[i], "port")) {
 				gset_changed = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				if (!strcmp(argp[i], "tp"))
 					port_wanted = PORT_TP;
 				else if (!strcmp(argp[i], "aui"))
@@ -1043,12 +1045,12 @@ static void parse_cmdline(int argc, char **argp)
 				else if (!strcmp(argp[i], "fibre"))
 					port_wanted = PORT_FIBRE;
 				else
-					show_usage(1);
+					exit_bad_args();
 				break;
 			} else if (!strcmp(argp[i], "autoneg")) {
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				if (!strcmp(argp[i], "on")) {
 					gset_changed = 1;
 					autoneg_wanted = AUTONEG_ENABLE;
@@ -1056,56 +1058,56 @@ static void parse_cmdline(int argc, char **argp)
 					gset_changed = 1;
 					autoneg_wanted = AUTONEG_DISABLE;
 				} else {
-					show_usage(1);
+					exit_bad_args();
 				}
 				break;
 			} else if (!strcmp(argp[i], "advertise")) {
 				gset_changed = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				advertising_wanted = get_int(argp[i], 16);
 				break;
 			} else if (!strcmp(argp[i], "phyad")) {
 				gset_changed = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				phyad_wanted = get_int(argp[i], 0);
 				break;
 			} else if (!strcmp(argp[i], "xcvr")) {
 				gset_changed = 1;
 				i += 1;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				if (!strcmp(argp[i], "internal"))
 					xcvr_wanted = XCVR_INTERNAL;
 				else if (!strcmp(argp[i], "external"))
 					xcvr_wanted = XCVR_EXTERNAL;
 				else
-					show_usage(1);
+					exit_bad_args();
 				break;
 			} else if (!strcmp(argp[i], "wol")) {
 				gwol_changed = 1;
 				i++;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				if (parse_wolopts(argp[i], &wol_wanted) < 0)
-					show_usage(1);
+					exit_bad_args();
 				wol_change = 1;
 				break;
 			} else if (!strcmp(argp[i], "sopass")) {
 				gwol_changed = 1;
 				i++;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				get_mac_addr(argp[i], sopass_wanted);
 				sopass_change = 1;
 				break;
 			} else if (!strcmp(argp[i], "msglvl")) {
 				i++;
 				if (i >= argc)
-					show_usage(1);
+					exit_bad_args();
 				if (isdigit((unsigned char)argp[i][0])) {
 					msglvl_changed = 1;
 					msglvl_mask = ~0;
@@ -1122,7 +1124,7 @@ static void parse_cmdline(int argc, char **argp)
 				}
 				break;
 			}
-			show_usage(1);
+			exit_bad_args();
 		}
 	}
 
@@ -1159,9 +1161,9 @@ static void parse_cmdline(int argc, char **argp)
 	}
 
 	if (devname == NULL)
-		show_usage(1);
+		exit_bad_args();
 	if (strlen(devname) >= IFNAMSIZ)
-		show_usage(1);
+		exit_bad_args();
 }
 
 static void dump_supported(struct ethtool_cmd *ep)
@@ -1512,7 +1514,7 @@ static void get_mac_addr(char *src, unsigned char *dest)
 	count = sscanf(src, "%2x:%2x:%2x:%2x:%2x:%2x",
 		&buf[0], &buf[1], &buf[2], &buf[3], &buf[4], &buf[5]);
 	if (count != ETH_ALEN)
-		show_usage(1);
+		exit_bad_args();
 
 	for (i = 0; i < count; i++) {
 		dest[i] = buf[i];
@@ -3026,7 +3028,7 @@ static int do_srxfhindir(int fd, struct ifreq *ifr)
 	int err;
 
 	if (!rxfhindir_equal && !rxfhindir_weight)
-		show_usage(1);
+		exit_bad_args();
 
 	indir_head.cmd = ETHTOOL_GRXFHINDIR;
 	indir_head.size = 0;
@@ -3094,7 +3096,7 @@ static int do_flash(int fd, struct ifreq *ifr)
 
 	if (flash < 0) {
 		fprintf(stdout, "Missing filename argument\n");
-		show_usage(1);
+		exit_bad_args();
 		return 98;
 	}
 
@@ -3160,7 +3162,7 @@ static int do_srxntuple(int fd, struct ifreq *ifr)
 		if (err < 0)
 			perror("Cannot add new RX n-tuple filter");
 	} else {
-		show_usage(1);
+		exit_bad_args();
 	}
 
 	return 0;
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 0/5] Add --version option and fix some bugs in show_usage()
From: Ben Hutchings @ 2011-02-21 19:16 UTC (permalink / raw)
  To: netdev

Every command-line tool needs a --version option, right?  And while
looking at the existing option parsing and usage messages I found and
fixed some bugs.

Ben.

Ben Hutchings (5):
  ethtool: Split show_usage() into two functions
  ethtool: Report an error if given an unrecognised option
  ethtool: Allow for long options with no short option and without a
    device name
  ethtool: Indent the no-options usage line consistently with the
    others
  ethtool: Add --version option

 ethtool.8.in |    5 ++
 ethtool.c    |  165 +++++++++++++++++++++++++++++++--------------------------
 2 files changed, 95 insertions(+), 75 deletions(-)

-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] xps-mq: Transmit Packet Steering for multiqueue
From: Ben Hutchings @ 2011-02-21 18:19 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, eric.dumazet, shemminger, netdev
In-Reply-To: <20100901.183251.106803238.davem@davemloft.net>

On Wed, 2010-09-01 at 18:32 -0700, David Miller wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Wed, 1 Sep 2010 09:24:18 -0700
> 
> > On Wed, Sep 1, 2010 at 8:54 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> 3) Eventually have a user selectable selection (socket option, or system
> >> wide, but one sysctl, not many bitmasks ;) ).
> >>
> > Right, but it would also be nice if a single sysctl could optimally
> > set up multiqueue, RSS, RPS, and all my interrupt affinities for me
> > ;-)
> 
> It's becomming increasingly obvious to me that we need (somewhere,
> not necessarily the kernel) a complete datastructure representing
> the NUMA, cache, cpu, device hierarchy.
> 
> And that can be used to tweak all of this stuff.
> 
> The policy should probably be in userspace, we just need to provide
> the knobs in the kernel to tweak it however userspace wants.
> 
> Userspace should be able to, for example, move a TX queue into a
> NUMA domain and have this invoke several side effects:

I think most of the pieces are now ready:

> 1) IRQs for that TX queue get rerouted to a cpu in the NUMA
>    domain.

There is a longstanding procfs interface for IRQ affinity, and userland
infrastructure built on it.  Adding a new interface would be contentious
and I have tried to build on it instead.

> 2) TX queue datastructures in the driver get reallocated using
>    memory in that NUMA domain.

I've previously sent patches to add an ethtool API for NUMA control,
which include the option to allocate on the same node where IRQs are
handled.  However, there is currently no function to allocate
DMA-coherent memory on a specified NUMA node (rather than the device's
node).  This is likely to be beneficial for event rings and might be
good for descriptor rings for some devices.  (The implementation I sent
for sfc mistakenly switched it to allocating non-coherent memory, for
which it *is* possible to specify the node.)

> 3) TX hashing is configured to use the set of cpus in the NUMA
>    domain.

I posted patches for automatic XPS configuration at the end of last
week.  And RFS acceleration covers the other direction.

Ben.

> It's alot of tedious work and involves some delicate tasks figuring
> out where each of these things go, but really then we'd solve all
> of this crap one and for all.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] bonding: bond_select_queue off by one
From: Andy Gospodarek @ 2011-02-21 18:06 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Andy Gospodarek, Jay Vosburgh, Phil Oester, netdev
In-Reply-To: <1298070372.2211.59.camel@localhost>

On Fri, Feb 18, 2011 at 11:06:12PM +0000, Ben Hutchings wrote:
> On Fri, 2011-02-18 at 17:49 -0500, Andy Gospodarek wrote:
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -2194,6 +2194,21 @@ static inline bool skb_rx_queue_recorded(const struct sk_buff *skb)
> >         return skb->queue_mapping != 0;
> >  }
> >  
> > +static inline void skb_record_tx_queue(struct sk_buff *skb, u16 tx_queue)
> > +{
> > +       skb->queue_mapping = tx_queue + 1;
> > +}
> > +
> > +static inline u16 skb_get_tx_queue(const struct sk_buff *skb)
> > +{
> > +       return skb->queue_mapping - 1;
> > +}
> > +
> > +static inline bool skb_tx_queue_recorded(const struct sk_buff *skb)
> > +{
> > +       return skb->queue_mapping != 0;
> > +}
> > +
> [...]
> 
> This is nonsense.  After the TX queue has been selected, it's recorded
> in queue_mapping *without* the offset (skb_set_queue_mapping()).
> 

I see that now.  Yay for symmetry! :)

I'm actually looking over this now and will post a tested patch to
address the original reporter's problem.



^ permalink raw reply

* Re: Huge ammount of invalid checksum packets on macvlan
From: Andrian Nord @ 2011-02-21 17:39 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Linux Netdev List, lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	Patrick McHardy
In-Reply-To: <4D628DC3.9000400-GANU6spQydw@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 2419 bytes --]

On Mon, Feb 21, 2011 at 05:07:31PM +0100, Daniel Lezcano wrote:
> IMO, the checksum is not needed for the virtual macvlan devices, hence 
Well, maybe then I've made a horrible mistake of asking the wrong
question. It's not a bad checksums that are wondering me, but very poor
network traffic performance: I'm getting about ~25kbps of
/dev/zero-to-/dev/null copy via netcat between hosts and mysql queries
from container other than mysql's are horribly laggy.

Strangely enough - while there is such bug on both systems, they are
behaving differently.

First system (router):
   eth1          eth0       dummy0
|----|------------|-----------|-----|   macvlans        |------------|
|  macvlan "lan"  |           |-------------------------| containers |
|   /  \          |           |     |                   | -----------|
|--|---|----------|-----------|-----|                    
   |   |          |           | macvlan                   
   |   |     |----------------|------|
  ---  |     |  eth0  eth1   eth2    | Router-container
  LAN  |     |         |             |
  ---  \---------------/             |
             |-----------------------|


And bug appears much less when copying from container to container, or
from container to HN (but still noticeable, especially in mysql queries),
but it's seen very well when copying from LAN to container. netcat copy
transfer rate shows enormous ~300 Mbps.
/proc/net/dev show many transfer errors but 0 receive errors.

Second system (server):
   eth0         eth1       dummy0
|----------------------------|---------------|
|             no carrier     |-macvlan "lxc" |
|----------------------------|---------------|
                             |
		|---------------------------|
		|        containers         |
		|---------------------------|

We can't link macvlans on eth1, as it has no carrier and macvlans are
not working in this case. Here bug is seen very well, even in
transferring packets from container to container (netcat copy transfer
rate is ~40kbps on this system).
/proc/net/dev show may errors in both directions.

First system is slightly more powerful, but difference in
between-container performance is just too big. Also, tcpdump on both
system reports kernel-dropped packets in great amount.

P.S. netcat copy is:
lxc1) nc6 -l -p 12345 > /dev/null
lxc2) dd if=/dev/zero | nc6 lxc1 12345

[-- Attachment #1.2: Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 387 bytes --]

------------------------------------------------------------------------------
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* OT: Re: [PATCH] CAN: add controller hardware name for Softing cards
From: Oliver Hartkopp @ 2011-02-21 17:26 UTC (permalink / raw)
  To: Marc Kleine-Budde, Wolfgang Grandegger; +Cc: Kurt Van Dijck, netdev
In-Reply-To: <4D622CC1.5050402@pengutronix.de>

On 21.02.2011 10:13, Marc Kleine-Budde wrote:
> On 02/21/2011 10:04 AM, Kurt Van Dijck wrote:
>> I just found that the controller hardware name is not set for the Softing
>> driver. After this patch, "$ ip -d link show" looks nicer.
>>
>> Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
> 

Hello Marc,

i really appreciate your work regarding the CAN driver maintenance together
with Wolfgang!

Would it probably make sense to add your name to the drivers/net/can section
in the MAINTAINERS file so that Dave and the other listeners on netdev-ML can
appraise your Acked-by more easily?

Regards,
Oliver

^ permalink raw reply

* [PATCH ethtool 3/3] ethtool: Report additional offload flag changes made automatically
From: Ben Hutchings @ 2011-02-21 16:59 UTC (permalink / raw)
  To: netdev, Michał Mirosław
In-Reply-To: <1298307282.2608.47.camel@bwh-desktop>

Turning off checksum offloads or SG will cause the kernel to turn off
other offloads that depend on them.  On some hardware, other offload
features may have to be turned on or off together.  Therefore, check
the offload flags before and after making changes and report any
additional changes beyond those requested.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.c |  114 +++++++++++++++++++++++++++++++++++-------------------------
 1 files changed, 66 insertions(+), 48 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 25601a5..4e6bd41 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1890,13 +1890,16 @@ static const struct {
 	{ "receive-hashing",		  0,		   NETIF_F_RXHASH },
 };
 
-static int dump_offload(const struct ethtool_get_features_block *features)
+static int
+dump_offload(const struct ethtool_get_features_block *features, u32 mask)
 {
 	u32 value;
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
 		value = off_feature_def[i].value;
+		if (!(mask & value))
+			continue;
 		printf("%s: %s%s%s\n",
 		       off_feature_def[i].long_name,
 		       (features->active & value) ? "on" : "off",
@@ -2229,7 +2232,8 @@ static const u32 flags_dup_features =
 	(ETH_FLAG_LRO | ETH_FLAG_RXVLAN | ETH_FLAG_TXVLAN | ETH_FLAG_NTUPLE |
 	 ETH_FLAG_RXHASH);
 
-static int do_goffload(int fd, struct ifreq *ifr)
+static int get_offload(int fd, struct ifreq *ifr,
+		       struct ethtool_get_features_block *pfeatures)
 {
 	struct {
 		struct ethtool_gfeatures cmd;
@@ -2240,24 +2244,23 @@ static int do_goffload(int fd, struct ifreq *ifr)
 	u32 value;
 	int i;
 
-	fprintf(stdout, "Offload parameters for %s:\n", devname);
-
 	features.cmd.cmd = ETHTOOL_GFEATURES;
 	features.cmd.size = ARRAY_SIZE(features.data);
 	ifr->ifr_data = (caddr_t)&features;
 	err = ioctl(fd, SIOCETHTOOL, ifr);
 	if (err == 0) {
 		allfail = 0;
+		pfeatures[0] = features.data[0];
 	} else if (errno != EOPNOTSUPP && errno != EPERM) {
 		perror("Cannot get device offload settings");
 	} else {
-		memset(&features.data, 0, sizeof(features.data));
+		memset(pfeatures, 0, sizeof(*pfeatures));
 
 		for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
 			value = off_feature_def[i].value;
 
 			/* Assume that anything we can get is changeable */
-			features.data[0].available |= value;
+			pfeatures[0].available |= value;
 
 			if (!off_feature_def[i].cmd)
 				continue;
@@ -2271,7 +2274,7 @@ static int do_goffload(int fd, struct ifreq *ifr)
 					off_feature_def[i].long_name);
 			} else {
 				if (eval.data)
-					features.data[0].active |= value;
+					pfeatures[0].active |= value;
 				allfail = 0;
 			}
 		}
@@ -2282,28 +2285,34 @@ static int do_goffload(int fd, struct ifreq *ifr)
 		if (err) {
 			perror("Cannot get device flags");
 		} else {
-			features.data[0].active |=
+			pfeatures[0].active |=
 				eval.data & flags_dup_features;
 			allfail = 0;
 		}
 
-		features.data[0].requested = features.data[0].active;
+		pfeatures[0].requested = pfeatures[0].active;
 	}
 
-	if (allfail) {
+	return allfail;
+}
+
+static int do_goffload(int fd, struct ifreq *ifr)
+{
+	struct ethtool_get_features_block features;
+
+	fprintf(stdout, "Offload parameters for %s:\n", devname);
+
+	if (get_offload(fd, ifr, &features)) {
 		fprintf(stdout, "no offload info available\n");
 		return 83;
 	}
 
-	return dump_offload(features.data);
+	return dump_offload(&features, ~(u32)0);
 }
 
 static int do_soffload(int fd, struct ifreq *ifr)
 {
-	struct {
-		struct ethtool_gfeatures cmd;
-		struct ethtool_get_features_block data[1];
-	} get_features;
+	struct ethtool_get_features_block old_features, new_features;
 	struct {
 		struct ethtool_sfeatures cmd;
 		struct ethtool_set_features_block data[1];
@@ -2313,42 +2322,37 @@ static int do_soffload(int fd, struct ifreq *ifr)
 	u32 value;
 	int i;
 
-	get_features.cmd.cmd = ETHTOOL_GFEATURES;
-	get_features.cmd.size = ARRAY_SIZE(get_features.data);
-	ifr->ifr_data = (caddr_t)&get_features;
-	err = ioctl(fd, SIOCETHTOOL, ifr);
-	if (err == 0) {
-		set_features.cmd.cmd = ETHTOOL_SFEATURES;
-		set_features.cmd.size = ARRAY_SIZE(set_features.data);
-		set_features.data[0] = off_features;
+	if (get_offload(fd, ifr, &old_features)) {
+		fprintf(stderr, "no offload info available\n");
+		return 1;
+	}
 
-		for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
-			value = off_feature_def[i].value;
-			if (!(off_features.valid & value))
-				continue;
-			if (!(get_features.data[0].available & value)) {
-				/* None of these features can be changed */
-				fprintf(stderr,
-					"Cannot set device %s settings: "
-					"Operation not supported\n",
-					off_feature_def[i].long_name);
-			} else if (off_features.requested & value) {
-				/* Some of these features can be
-				 * enabled; mask out any that cannot
-				 */
-				set_features.data[0].requested &=
-					~(value &
-					  ~get_features.data[0].available);
-			}
-		}
+	set_features.cmd.cmd = ETHTOOL_SFEATURES;
+	set_features.cmd.size = ARRAY_SIZE(set_features.data);
+	set_features.data[0] = off_features;
 
-		ifr->ifr_data = (caddr_t)&set_features;
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err < 0) {
-			perror("Cannot set device offload settings");
-			return 1;
+	for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
+		value = off_feature_def[i].value;
+		if (!(off_features.valid & value))
+			continue;
+		if (!(old_features.available & value)) {
+			/* None of these features can be changed */
+			fprintf(stderr,
+				"Cannot set device %s settings: "
+				"Operation not supported\n",
+				off_feature_def[i].long_name);
+		} else if (off_features.requested & value) {
+			/* Some of these features can be
+			 * enabled; mask out any that cannot
+			 */
+			set_features.data[0].requested &=
+				~(value & ~old_features.available);
 		}
+	}
 
+	ifr->ifr_data = (caddr_t)&set_features;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err >= 0) {
 		changed = !!set_features.data[0].valid;
 
 		if (err & ETHTOOL_F_WISH)
@@ -2367,7 +2371,7 @@ static int do_soffload(int fd, struct ifreq *ifr)
 				"warning flags %#x",
 				err & ~ETHTOOL_F_WISH);
 	} else if (errno != EOPNOTSUPP && errno != EPERM) {
-		perror("Cannot get device offload settings");
+		perror("Cannot set device offload settings");
 		return 1;
 	} else {
 		for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
@@ -2415,6 +2419,20 @@ static int do_soffload(int fd, struct ifreq *ifr)
 
 	if (!changed) {
 		fprintf(stdout, "no offload settings changed\n");
+		return 0;
+	}
+
+	/* Were any additional changes made automatically? */
+	if (get_offload(fd, ifr, &new_features)) {
+		fprintf(stderr, "no offload info available\n");
+		return 1;
+	}
+	value = ((old_features.active & ~off_features.valid) |
+		 off_features.requested) ^
+		new_features.active;
+	if (value) {
+		printf("Additional changes:\n");
+		dump_offload(&new_features, value);
 	}
 
 	return 0;
-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 2/3] ethtool: Regularise handling of offload flags
From: Ben Hutchings @ 2011-02-21 16:59 UTC (permalink / raw)
  To: netdev, Michał Mirosław
In-Reply-To: <1298307282.2608.47.camel@bwh-desktop>

Use the new ETHTOOL_{G,S}FEATURES operations where available, and
use the new structure and netif feature flags in any case.

Replace repetitive code for getting/setting offload flags with data-
driven loops.

This changes error messages to use the same long names for offload
flags as in dump_offload(), and changes various exit codes to 1.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
NEITF_F_* flags are copied into ethtool-util.h for now.  I think in
future they should be exposed from <linux/netdevice.h> (hence the
#ifndef).

Ben.

 ethtool-util.h |   30 ++++
 ethtool.c      |  441 +++++++++++++++++++++++++++-----------------------------
 2 files changed, 243 insertions(+), 228 deletions(-)

diff --git a/ethtool-util.h b/ethtool-util.h
index f053028..a118202 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -40,6 +40,36 @@ static inline u32 cpu_to_be32(u32 value)
 }
 #endif
 
+#ifndef NETIF_F_SG
+#define NETIF_F_SG		(1 << 0)
+#define NETIF_F_IP_CSUM		(1 << 1)
+#define NETIF_F_NO_CSUM		(1 << 2)
+#define NETIF_F_HW_CSUM		(1 << 3)
+#define NETIF_F_IPV6_CSUM	(1 << 4)
+#define NETIF_F_FRAGLIST	(1 << 6)
+#define NETIF_F_HW_VLAN_TX	(1 << 7)
+#define NETIF_F_HW_VLAN_RX	(1 << 8)
+#define NETIF_F_HW_VLAN_FILTER	(1 << 9)
+#define NETIF_F_GSO		(1 << 11)
+#define NETIF_F_GRO		(1 << 14)
+#define NETIF_F_LRO		(1 << 15)
+#define NETIF_F_TSO		(1 << 16)
+#define NETIF_F_UFO		(1 << 17)
+#define NETIF_F_GSO_ROBUST	(1 << 18)
+#define NETIF_F_TSO_ECN		(1 << 19)
+#define NETIF_F_TSO6		(1 << 20)
+#define NETIF_F_FSO		(1 << 21)
+#define NETIF_F_FCOE_CRC	(1 << 24)
+#define NETIF_F_SCTP_CSUM	(1 << 25)
+#define NETIF_F_FCOE_MTU	(1 << 26)
+#define NETIF_F_NTUPLE		(1 << 27)
+#define NETIF_F_RXHASH		(1 << 28)
+#define NETIF_F_RXCSUM		(1 << 29)
+#define NETIF_F_ALL_CSUM	(NETIF_F_NO_CSUM | NETIF_F_HW_CSUM | \
+				 NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM)
+#define NETIF_F_ALL_TSO 	(NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
+#endif
+
 /* National Semiconductor DP83815, DP83816 */
 int natsemi_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 int natsemi_dump_eeprom(struct ethtool_drvinfo *info,
diff --git a/ethtool.c b/ethtool.c
index f680b6d..25601a5 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -299,15 +299,7 @@ static void show_usage(int badarg)
 static char *devname = NULL;
 
 static int goffload_changed = 0;
-static int off_csum_rx_wanted = -1;
-static int off_csum_tx_wanted = -1;
-static int off_sg_wanted = -1;
-static int off_tso_wanted = -1;
-static int off_ufo_wanted = -1;
-static int off_gso_wanted = -1;
-static u32 off_flags_wanted = 0;
-static u32 off_flags_mask = 0;
-static int off_gro_wanted = -1;
+struct ethtool_set_features_block off_features;
 
 static struct ethtool_pauseparam epause;
 static int gpause_changed = 0;
@@ -463,23 +455,30 @@ static struct cmdline_info cmdline_seeprom[] = {
 };
 
 static struct cmdline_info cmdline_offload[] = {
-	{ "rx", CMDL_BOOL, &off_csum_rx_wanted, NULL },
-	{ "tx", CMDL_BOOL, &off_csum_tx_wanted, NULL },
-	{ "sg", CMDL_BOOL, &off_sg_wanted, NULL },
-	{ "tso", CMDL_BOOL, &off_tso_wanted, NULL },
-	{ "ufo", CMDL_BOOL, &off_ufo_wanted, NULL },
-	{ "gso", CMDL_BOOL, &off_gso_wanted, NULL },
-	{ "lro", CMDL_FLAG, &off_flags_wanted, NULL,
-	  ETH_FLAG_LRO, &off_flags_mask },
-	{ "gro", CMDL_BOOL, &off_gro_wanted, NULL },
-	{ "rxvlan", CMDL_FLAG, &off_flags_wanted, NULL,
-	  ETH_FLAG_RXVLAN, &off_flags_mask },
-	{ "txvlan", CMDL_FLAG, &off_flags_wanted, NULL,
-	  ETH_FLAG_TXVLAN, &off_flags_mask },
-	{ "ntuple", CMDL_FLAG, &off_flags_wanted, NULL,
-	  ETH_FLAG_NTUPLE, &off_flags_mask },
-	{ "rxhash", CMDL_FLAG, &off_flags_wanted, NULL,
-	  ETH_FLAG_RXHASH, &off_flags_mask },
+	{ "rx", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_RXCSUM, &off_features.valid },
+	{ "tx", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_ALL_CSUM, &off_features.valid },
+	{ "sg", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_SG, &off_features.valid },
+	{ "tso", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_ALL_TSO, &off_features.valid },
+	{ "ufo", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_UFO, &off_features.valid },
+	{ "gso", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_GSO, &off_features.valid },
+	{ "lro", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_LRO, &off_features.valid },
+	{ "gro", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_GRO, &off_features.valid },
+	{ "rxvlan", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_HW_VLAN_TX, &off_features.valid },
+	{ "txvlan", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_HW_VLAN_TX, &off_features.valid },
+	{ "ntuple", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_NTUPLE, &off_features.valid },
+	{ "rxhash", CMDL_FLAG, &off_features.requested, NULL,
+	  NETIF_F_RXHASH, &off_features.valid },
 };
 
 static struct cmdline_info cmdline_pause[] = {
@@ -1872,35 +1871,39 @@ static int dump_coalesce(void)
 	return 0;
 }
 
-static int dump_offload(int rx, int tx, int sg, int tso, int ufo, int gso,
-			int gro, int lro, int rxvlan, int txvlan, int ntuple,
-			int rxhash)
+static const struct {
+	const char *long_name;
+	u32 cmd;
+	u32 value;
+} off_feature_def[] = {
+	{ "rx-checksumming",		  ETHTOOL_GRXCSUM, NETIF_F_RXCSUM },
+	{ "tx-checksumming",		  ETHTOOL_GTXCSUM, NETIF_F_ALL_CSUM },
+	{ "scatter-gather",		  ETHTOOL_GSG,	   NETIF_F_SG },
+	{ "tcp-segmentation-offload",	  ETHTOOL_GTSO,	   NETIF_F_ALL_TSO },
+	{ "udp-fragmentation-offload",	  ETHTOOL_GUFO,	   NETIF_F_UFO },
+	{ "generic-segmentation-offload", ETHTOOL_GGSO,	   NETIF_F_GSO },
+	{ "generic-receive-offload",	  ETHTOOL_GGRO,	   NETIF_F_GRO },
+	{ "large-receive-offload",	  0,		   NETIF_F_LRO },
+	{ "rx-vlan-offload",		  0,		   NETIF_F_HW_VLAN_RX },
+	{ "tx-vlan-offload",		  0,		   NETIF_F_HW_VLAN_TX },
+	{ "ntuple-filters",		  0,		   NETIF_F_NTUPLE },
+	{ "receive-hashing",		  0,		   NETIF_F_RXHASH },
+};
+
+static int dump_offload(const struct ethtool_get_features_block *features)
 {
-	fprintf(stdout,
-		"rx-checksumming: %s\n"
-		"tx-checksumming: %s\n"
-		"scatter-gather: %s\n"
-		"tcp-segmentation-offload: %s\n"
-		"udp-fragmentation-offload: %s\n"
-		"generic-segmentation-offload: %s\n"
-		"generic-receive-offload: %s\n"
-		"large-receive-offload: %s\n"
-		"rx-vlan-offload: %s\n"
-		"tx-vlan-offload: %s\n"
-		"ntuple-filters: %s\n"
-		"receive-hashing: %s\n",
-		rx ? "on" : "off",
-		tx ? "on" : "off",
-		sg ? "on" : "off",
-		tso ? "on" : "off",
-		ufo ? "on" : "off",
-		gso ? "on" : "off",
-		gro ? "on" : "off",
-		lro ? "on" : "off",
-		rxvlan ? "on" : "off",
-		txvlan ? "on" : "off",
-		ntuple ? "on" : "off",
-		rxhash ? "on" : "off");
+	u32 value;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
+		value = off_feature_def[i].value;
+		printf("%s: %s%s%s\n",
+		       off_feature_def[i].long_name,
+		       (features->active & value) ? "on" : "off",
+		       (features->requested & ~features->active & value) ?
+		       " [requested on]" : "",
+		       (~features->available & value) ? " [unchangeable]" : "");
+	}
 
 	return 0;
 }
@@ -2219,97 +2222,72 @@ static int do_scoalesce(int fd, struct ifreq *ifr)
 	return 0;
 }
 
+/* the following list of flags are the same as their associated
+ * NETIF_F_xxx values in include/linux/netdevice.h
+ */
+static const u32 flags_dup_features =
+	(ETH_FLAG_LRO | ETH_FLAG_RXVLAN | ETH_FLAG_TXVLAN | ETH_FLAG_NTUPLE |
+	 ETH_FLAG_RXHASH);
+
 static int do_goffload(int fd, struct ifreq *ifr)
 {
+	struct {
+		struct ethtool_gfeatures cmd;
+		struct ethtool_get_features_block data[1];
+	} features;
 	struct ethtool_value eval;
-	int err, allfail = 1, rx = 0, tx = 0, sg = 0;
-	int tso = 0, ufo = 0, gso = 0, gro = 0, lro = 0, rxvlan = 0, txvlan = 0,
-	    ntuple = 0, rxhash = 0;
+	int err, allfail = 1;
+	u32 value;
+	int i;
 
 	fprintf(stdout, "Offload parameters for %s:\n", devname);
 
-	eval.cmd = ETHTOOL_GRXCSUM;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = send_ioctl(fd, ifr);
-	if (err)
-		perror("Cannot get device rx csum settings");
-	else {
-		rx = eval.data;
+	features.cmd.cmd = ETHTOOL_GFEATURES;
+	features.cmd.size = ARRAY_SIZE(features.data);
+	ifr->ifr_data = (caddr_t)&features;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err == 0) {
 		allfail = 0;
-	}
+	} else if (errno != EOPNOTSUPP && errno != EPERM) {
+		perror("Cannot get device offload settings");
+	} else {
+		memset(&features.data, 0, sizeof(features.data));
 
-	eval.cmd = ETHTOOL_GTXCSUM;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = send_ioctl(fd, ifr);
-	if (err)
-		perror("Cannot get device tx csum settings");
-	else {
-		tx = eval.data;
-		allfail = 0;
-	}
+		for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
+			value = off_feature_def[i].value;
 
-	eval.cmd = ETHTOOL_GSG;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = send_ioctl(fd, ifr);
-	if (err)
-		perror("Cannot get device scatter-gather settings");
-	else {
-		sg = eval.data;
-		allfail = 0;
-	}
+			/* Assume that anything we can get is changeable */
+			features.data[0].available |= value;
 
-	eval.cmd = ETHTOOL_GTSO;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = send_ioctl(fd, ifr);
-	if (err)
-		perror("Cannot get device tcp segmentation offload settings");
-	else {
-		tso = eval.data;
-		allfail = 0;
-	}
+			if (!off_feature_def[i].cmd)
+				continue;
 
-	eval.cmd = ETHTOOL_GUFO;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = ioctl(fd, SIOCETHTOOL, ifr);
-	if (err)
-		perror("Cannot get device udp large send offload settings");
-	else {
-		ufo = eval.data;
-		allfail = 0;
-	}
-
-	eval.cmd = ETHTOOL_GGSO;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = ioctl(fd, SIOCETHTOOL, ifr);
-	if (err)
-		perror("Cannot get device generic segmentation offload settings");
-	else {
-		gso = eval.data;
-		allfail = 0;
-	}
+			eval.cmd = off_feature_def[i].cmd;
+			ifr->ifr_data = (caddr_t)&eval;
+			err = send_ioctl(fd, ifr);
+			if (err) {
+				fprintf(stderr,
+					"Cannot get device %s settings: %m\n",
+					off_feature_def[i].long_name);
+			} else {
+				if (eval.data)
+					features.data[0].active |= value;
+				allfail = 0;
+			}
+		}
 
-	eval.cmd = ETHTOOL_GFLAGS;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = ioctl(fd, SIOCETHTOOL, ifr);
-	if (err) {
-		perror("Cannot get device flags");
-	} else {
-		lro = (eval.data & ETH_FLAG_LRO) != 0;
-		rxvlan = (eval.data & ETH_FLAG_RXVLAN) != 0;
-		txvlan = (eval.data & ETH_FLAG_TXVLAN) != 0;
-		ntuple = (eval.data & ETH_FLAG_NTUPLE) != 0;
-		rxhash = (eval.data & ETH_FLAG_RXHASH) != 0;
-		allfail = 0;
-	}
+		eval.cmd = ETHTOOL_GFLAGS;
+		ifr->ifr_data = (caddr_t)&eval;
+		err = ioctl(fd, SIOCETHTOOL, ifr);
+		if (err) {
+			perror("Cannot get device flags");
+		} else {
+			features.data[0].active |=
+				eval.data & flags_dup_features;
+			allfail = 0;
+		}
 
-	eval.cmd = ETHTOOL_GGRO;
-	ifr->ifr_data = (caddr_t)&eval;
-	err = ioctl(fd, SIOCETHTOOL, ifr);
-	if (err)
-		perror("Cannot get device GRO settings");
-	else {
-		gro = eval.data;
-		allfail = 0;
+		features.data[0].requested = features.data[0].active;
 	}
 
 	if (allfail) {
@@ -2317,114 +2295,121 @@ static int do_goffload(int fd, struct ifreq *ifr)
 		return 83;
 	}
 
-	return dump_offload(rx, tx, sg, tso, ufo, gso, gro, lro, rxvlan, txvlan,
-			    ntuple, rxhash);
+	return dump_offload(features.data);
 }
 
 static int do_soffload(int fd, struct ifreq *ifr)
 {
+	struct {
+		struct ethtool_gfeatures cmd;
+		struct ethtool_get_features_block data[1];
+	} get_features;
+	struct {
+		struct ethtool_sfeatures cmd;
+		struct ethtool_set_features_block data[1];
+	} set_features;
 	struct ethtool_value eval;
 	int err, changed = 0;
+	u32 value;
+	int i;
 
-	if (off_csum_rx_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_SRXCSUM;
-		eval.data = (off_csum_rx_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = send_ioctl(fd, ifr);
-		if (err) {
-			perror("Cannot set device rx csum settings");
-			return 84;
+	get_features.cmd.cmd = ETHTOOL_GFEATURES;
+	get_features.cmd.size = ARRAY_SIZE(get_features.data);
+	ifr->ifr_data = (caddr_t)&get_features;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err == 0) {
+		set_features.cmd.cmd = ETHTOOL_SFEATURES;
+		set_features.cmd.size = ARRAY_SIZE(set_features.data);
+		set_features.data[0] = off_features;
+
+		for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
+			value = off_feature_def[i].value;
+			if (!(off_features.valid & value))
+				continue;
+			if (!(get_features.data[0].available & value)) {
+				/* None of these features can be changed */
+				fprintf(stderr,
+					"Cannot set device %s settings: "
+					"Operation not supported\n",
+					off_feature_def[i].long_name);
+			} else if (off_features.requested & value) {
+				/* Some of these features can be
+				 * enabled; mask out any that cannot
+				 */
+				set_features.data[0].requested &=
+					~(value &
+					  ~get_features.data[0].available);
+			}
 		}
-	}
 
-	if (off_csum_tx_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_STXCSUM;
-		eval.data = (off_csum_tx_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = send_ioctl(fd, ifr);
-		if (err) {
-			perror("Cannot set device tx csum settings");
-			return 85;
+		ifr->ifr_data = (caddr_t)&set_features;
+		err = ioctl(fd, SIOCETHTOOL, ifr);
+		if (err < 0) {
+			perror("Cannot set device offload settings");
+			return 1;
 		}
-	}
 
-	if (off_sg_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_SSG;
-		eval.data = (off_sg_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = send_ioctl(fd, ifr);
-		if (err) {
-			perror("Cannot set device scatter-gather settings");
-			return 86;
-		}
-	}
+		changed = !!set_features.data[0].valid;
 
-	if (off_tso_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_STSO;
-		eval.data = (off_tso_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = send_ioctl(fd, ifr);
-		if (err) {
-			perror("Cannot set device tcp segmentation offload settings");
-			return 88;
-		}
-	}
-	if (off_ufo_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_SUFO;
-		eval.data = (off_ufo_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err) {
-			perror("Cannot set device udp large send offload settings");
-			return 89;
-		}
-	}
-	if (off_gso_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_SGSO;
-		eval.data = (off_gso_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err) {
-			perror("Cannot set device generic segmentation offload settings");
-			return 90;
-		}
-	}
-	if (off_flags_mask) {
-		changed = 1;
-		eval.cmd = ETHTOOL_GFLAGS;
-		eval.data = 0;
-		ifr->ifr_data = (caddr_t)&eval;
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err) {
-			perror("Cannot get device flag settings");
-			return 91;
+		if (err & ETHTOOL_F_WISH)
+			fprintf(stderr,
+				"Cannot set device offload settings: "
+				"Some requested features depend on others "
+				"that are not currently enabled\n");
+
+		/* ETHTOOL_F_UNSUPPORTED should never be set as we
+		 * checked for unsupported flags above.  Treat any
+		 * other warning flags as unknown.
+		 */
+		if (err & ~ETHTOOL_F_WISH)
+			fprintf(stderr,
+				"Cannot set device offload settings: "
+				"warning flags %#x",
+				err & ~ETHTOOL_F_WISH);
+	} else if (errno != EOPNOTSUPP && errno != EPERM) {
+		perror("Cannot get device offload settings");
+		return 1;
+	} else {
+		for (i = 0; i < ARRAY_SIZE(off_feature_def); i++) {
+			if (!off_feature_def[i].cmd)
+				continue;
+			if (off_features.valid & off_feature_def[i].value) {
+				changed = 1;
+				eval.cmd = off_feature_def[i].cmd + 1;
+				eval.data = !!(off_features.requested &
+					       off_feature_def[i].value);
+				ifr->ifr_data = (caddr_t)&eval;
+				err = send_ioctl(fd, ifr);
+				if (err) {
+					fprintf(stderr,
+						"Cannot set device %s settings: %m\n",
+						off_feature_def[i].long_name);
+					return 1;
+				}
+			}
 		}
 
-		eval.cmd = ETHTOOL_SFLAGS;
-		eval.data = ((eval.data & ~off_flags_mask) |
-			     off_flags_wanted);
+		if (off_features.valid & flags_dup_features) {
+			changed = 1;
+			eval.cmd = ETHTOOL_GFLAGS;
+			eval.data = 0;
+			ifr->ifr_data = (caddr_t)&eval;
+			err = ioctl(fd, SIOCETHTOOL, ifr);
+			if (err) {
+				perror("Cannot get device flag settings");
+				return 91;
+			}
 
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err) {
-			perror("Cannot set device flag settings");
-			return 92;
-		}
-	}
-	if (off_gro_wanted >= 0) {
-		changed = 1;
-		eval.cmd = ETHTOOL_SGRO;
-		eval.data = (off_gro_wanted == 1);
-		ifr->ifr_data = (caddr_t)&eval;
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err) {
-			perror("Cannot set device GRO settings");
-			return 93;
+			eval.cmd = ETHTOOL_SFLAGS;
+			eval.data &= ~(off_features.valid & flags_dup_features);
+			eval.data |= (off_features.requested &
+				      flags_dup_features);
+
+			err = ioctl(fd, SIOCETHTOOL, ifr);
+			if (err) {
+				perror("Cannot set device flag settings");
+				return 92;
+			}
 		}
 	}
 
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 1/3] ethtool-copy.h: sync with net-next-2.6
From: Ben Hutchings @ 2011-02-21 16:57 UTC (permalink / raw)
  To: netdev, Michał Mirosław
In-Reply-To: <1298307282.2608.47.camel@bwh-desktop>

This covers kernel changes up to:

commit e83d360d9a7e5d71d55c13e96b19109a2ea23bf0
Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date:   Tue Feb 15 16:59:18 2011 +0000

    net: introduce NETIF_F_RXCSUM

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool-copy.h |   89 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 88 insertions(+), 1 deletions(-)

diff --git a/ethtool-copy.h b/ethtool-copy.h
index 75c3ae7..6430dbd 100644
--- a/ethtool-copy.h
+++ b/ethtool-copy.h
@@ -251,6 +251,7 @@ enum ethtool_stringset {
 	ETH_SS_STATS,
 	ETH_SS_PRIV_FLAGS,
 	ETH_SS_NTUPLE_FILTERS,
+	ETH_SS_FEATURES,
 };
 
 /* for passing string sets for data tagging */
@@ -523,6 +524,87 @@ struct ethtool_flash {
 	char	data[ETHTOOL_FLASH_MAX_FILENAME];
 };
 
+/* for returning and changing feature sets */
+
+/**
+ * struct ethtool_get_features_block - block with state of 32 features
+ * @available: mask of changeable features
+ * @requested: mask of features requested to be enabled if possible
+ * @active: mask of currently enabled features
+ * @never_changed: mask of features not changeable for any device
+ */
+struct ethtool_get_features_block {
+	__u32	available;
+	__u32	requested;
+	__u32	active;
+	__u32	never_changed;
+};
+
+/**
+ * struct ethtool_gfeatures - command to get state of device's features
+ * @cmd: command number = %ETHTOOL_GFEATURES
+ * @size: in: number of elements in the features[] array;
+ *       out: number of elements in features[] needed to hold all features
+ * @features: state of features
+ */
+struct ethtool_gfeatures {
+	__u32	cmd;
+	__u32	size;
+	struct ethtool_get_features_block features[0];
+};
+
+/**
+ * struct ethtool_set_features_block - block with request for 32 features
+ * @valid: mask of features to be changed
+ * @requested: values of features to be changed
+ */
+struct ethtool_set_features_block {
+	__u32	valid;
+	__u32	requested;
+};
+
+/**
+ * struct ethtool_sfeatures - command to request change in device's features
+ * @cmd: command number = %ETHTOOL_SFEATURES
+ * @size: array size of the features[] array
+ * @features: feature change masks
+ */
+struct ethtool_sfeatures {
+	__u32	cmd;
+	__u32	size;
+	struct ethtool_set_features_block features[0];
+};
+
+/*
+ * %ETHTOOL_SFEATURES changes features present in features[].valid to the
+ * values of corresponding bits in features[].requested. Bits in .requested
+ * not set in .valid or not changeable are ignored.
+ *
+ * Returns %EINVAL when .valid contains undefined or never-changable bits
+ * or size is not equal to required number of features words (32-bit blocks).
+ * Returns >= 0 if request was completed; bits set in the value mean:
+ *   %ETHTOOL_F_UNSUPPORTED - there were bits set in .valid that are not
+ *	changeable (not present in %ETHTOOL_GFEATURES' features[].available)
+ *	those bits were ignored.
+ *   %ETHTOOL_F_WISH - some or all changes requested were recorded but the
+ *      resulting state of bits masked by .valid is not equal to .requested.
+ *      Probably there are other device-specific constraints on some features
+ *      in the set. When %ETHTOOL_F_UNSUPPORTED is set, .valid is considered
+ *      here as though ignored bits were cleared.
+ *
+ * Meaning of bits in the masks are obtained by %ETHTOOL_GSSET_INFO (number of
+ * bits in the arrays - always multiple of 32) and %ETHTOOL_GSTRINGS commands
+ * for ETH_SS_FEATURES string set. First entry in the table corresponds to least
+ * significant bit in features[0] fields. Empty strings mark undefined features.
+ */
+enum ethtool_sfeatures_retval_bits {
+	ETHTOOL_F_UNSUPPORTED__BIT,
+	ETHTOOL_F_WISH__BIT,
+};
+
+#define ETHTOOL_F_UNSUPPORTED   (1 << ETHTOOL_F_UNSUPPORTED__BIT)
+#define ETHTOOL_F_WISH          (1 << ETHTOOL_F_WISH__BIT)
+
 
 /* CMDs currently supported */
 #define ETHTOOL_GSET		0x00000001 /* Get settings. */
@@ -534,7 +616,9 @@ struct ethtool_flash {
 #define ETHTOOL_GMSGLVL		0x00000007 /* Get driver message level */
 #define ETHTOOL_SMSGLVL		0x00000008 /* Set driver msg level. */
 #define ETHTOOL_NWAY_RST	0x00000009 /* Restart autonegotiation. */
-#define ETHTOOL_GLINK		0x0000000a /* Get link status (ethtool_value) */
+/* Get link status for host, i.e. whether the interface *and* the
+ * physical port (if there is one) are up (ethtool_value). */
+#define ETHTOOL_GLINK		0x0000000a
 #define ETHTOOL_GEEPROM		0x0000000b /* Get EEPROM data */
 #define ETHTOOL_SEEPROM		0x0000000c /* Set EEPROM data. */
 #define ETHTOOL_GCOALESCE	0x0000000e /* Get coalesce config */
@@ -585,6 +669,9 @@ struct ethtool_flash {
 #define ETHTOOL_GRXFHINDIR	0x00000038 /* Get RX flow hash indir'n table */
 #define ETHTOOL_SRXFHINDIR	0x00000039 /* Set RX flow hash indir'n table */
 
+#define ETHTOOL_GFEATURES	0x0000003a /* Get device offload settings */
+#define ETHTOOL_SFEATURES	0x0000003b /* Change device offload settings */
+
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
 #define SPARC_ETH_SSET		ETHTOOL_SSET
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 0/3] Regularise handling of offload flags
From: Ben Hutchings @ 2011-02-21 16:54 UTC (permalink / raw)
  To: netdev, Michał Mirosław

This patch series is partly inspired by the recent changes to regularise
the relationship between ethtool offload control and net_device feature
flags, though I started on it earlier.

I don't intend to apply these changes until after the kernel and ethtool
v2.6.38 releases.  In the mean time, I would appreciate testing and
feedback.

Ben.

Ben Hutchings (3):
  ethtool-copy.h: sync with net-next-2.6
  ethtool: Regularise handling of offload flags
  ethtool: Report additional offload flag changes made automatically

 ethtool-copy.h |   89 +++++++++++-
 ethtool-util.h |   30 ++++
 ethtool.c      |  465 ++++++++++++++++++++++++++++----------------------------
 3 files changed, 352 insertions(+), 232 deletions(-)

-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* [PATCH net-next] bnx2x: use dcb_setapp to manage negotiated application tlvs
From: Shmulik Ravid @ 2011-02-21 18:38 UTC (permalink / raw)
  To: davem; +Cc: eilong, netdev

With this patch the bnx2x uses the generic dcbnl application tlv list
instead of implementing its own get-app handler. When the driver is
alerted to a change in the DCB negotiated parameters, it calls
dcb_setapp to update the dcbnl application tlvs list making it available
to user mode applications and registered notifiers.   

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x.h      |    2 +-
 drivers/net/bnx2x/bnx2x_dcb.c  |  137 ++++++++++++++++++++++++----------------
 drivers/net/bnx2x/bnx2x_dcb.h  |    5 +-
 drivers/net/bnx2x/bnx2x_main.c |    7 ++-
 4 files changed, 92 insertions(+), 59 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index c0dd30d..1914026 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -31,7 +31,7 @@
 #define BNX2X_NEW_NAPI
 
 #if defined(CONFIG_DCB)
-#define BCM_DCB
+#define BCM_DCBNL
 #endif
 #if defined(CONFIG_CNIC) || defined(CONFIG_CNIC_MODULE)
 #define BCM_CNIC 1
diff --git a/drivers/net/bnx2x/bnx2x_dcb.c b/drivers/net/bnx2x/bnx2x_dcb.c
index fb60021..9a24d79 100644
--- a/drivers/net/bnx2x/bnx2x_dcb.c
+++ b/drivers/net/bnx2x/bnx2x_dcb.c
@@ -19,6 +19,9 @@
 #include <linux/netdevice.h>
 #include <linux/types.h>
 #include <linux/errno.h>
+#ifdef BCM_DCBNL
+#include <linux/dcbnl.h>
+#endif
 
 #include "bnx2x.h"
 #include "bnx2x_cmn.h"
@@ -508,13 +511,75 @@ static int bnx2x_dcbx_read_shmem_neg_results(struct bnx2x *bp)
 	return 0;
 }
 
+
+#ifdef BCM_DCBNL
+static inline
+u8 bnx2x_dcbx_dcbnl_app_up(struct dcbx_app_priority_entry *ent)
+{
+	u8 pri;
+
+	/* Choose the highest priority */
+	for (pri = MAX_PFC_PRIORITIES - 1; pri > 0; pri--)
+		if (ent->pri_bitmap & (1 << pri))
+			break;
+	return pri;
+}
+
+static inline
+u8 bnx2x_dcbx_dcbnl_app_idtype(struct dcbx_app_priority_entry *ent)
+{
+	return ((ent->appBitfield & DCBX_APP_ENTRY_SF_MASK) ==
+		DCBX_APP_SF_PORT) ? DCB_APP_IDTYPE_PORTNUM :
+		DCB_APP_IDTYPE_ETHTYPE;
+}
+
+static inline
+void bnx2x_dcbx_invalidate_local_apps(struct bnx2x *bp)
+{
+	int i;
+	for (i = 0; i < DCBX_MAX_APP_PROTOCOL; i++)
+		bp->dcbx_local_feat.app.app_pri_tbl[i].appBitfield &=
+							~DCBX_APP_ENTRY_VALID;
+}
+
+int bnx2x_dcbnl_update_applist(struct bnx2x *bp, bool delall)
+{
+	int i, err = 0;
+
+	for (i = 0; i < DCBX_MAX_APP_PROTOCOL && err == 0; i++) {
+		struct dcbx_app_priority_entry *ent =
+			&bp->dcbx_local_feat.app.app_pri_tbl[i];
+
+		if (ent->appBitfield & DCBX_APP_ENTRY_VALID) {
+			u8 up = bnx2x_dcbx_dcbnl_app_up(ent);
+
+			/* avoid invalid user-priority */
+			if (up) {
+				struct dcb_app app;
+				app.selector = bnx2x_dcbx_dcbnl_app_idtype(ent);
+				app.protocol = ent->app_id;
+				app.priority = delall ? 0 : up;
+				err = dcb_setapp(bp->dev, &app);
+			}
+		}
+	}
+	return err;
+}
+#endif
+
 void bnx2x_dcbx_set_params(struct bnx2x *bp, u32 state)
 {
 	switch (state) {
 	case BNX2X_DCBX_STATE_NEG_RECEIVED:
 		{
 			DP(NETIF_MSG_LINK, "BNX2X_DCBX_STATE_NEG_RECEIVED\n");
-
+#ifdef BCM_DCBNL
+			/**
+			 * Delete app tlvs from dcbnl before reading new
+			 * negotiation results
+			 */
+			bnx2x_dcbnl_update_applist(bp, true);
+#endif
 			/* Read neg results if dcbx is in the FW */
 			if (bnx2x_dcbx_read_shmem_neg_results(bp))
 				return;
@@ -526,10 +591,24 @@ void bnx2x_dcbx_set_params(struct bnx2x *bp, u32 state)
 						 bp->dcbx_error);
 
 			if (bp->state != BNX2X_STATE_OPENING_WAIT4_LOAD) {
+#ifdef BCM_DCBNL
+				/**
+				 * Add new app tlvs to dcbnl
+				 */
+				bnx2x_dcbnl_update_applist(bp, false);
+#endif
 				bnx2x_dcbx_stop_hw_tx(bp);
 				return;
 			}
 			/* fall through */
+#ifdef BCM_DCBNL
+			/**
+			 * Invalidate the local app tlvs if they are not added
+			 * to the dcbnl app list to avoid deleting them from
+			 * the list later on
+			 */
+			bnx2x_dcbx_invalidate_local_apps(bp);
+#endif
 		}
 	case BNX2X_DCBX_STATE_TX_PAUSED:
 		DP(NETIF_MSG_LINK, "BNX2X_DCBX_STATE_TX_PAUSED\n");
@@ -1505,8 +1584,7 @@ static void bnx2x_pfc_fw_struct_e2(struct bnx2x *bp)
 	bnx2x_dcbx_print_cos_params(bp,	pfc_fw_cfg);
 }
 /* DCB netlink */
-#ifdef BCM_DCB
-#include <linux/dcbnl.h>
+#ifdef BCM_DCBNL
 
 #define BNX2X_DCBX_CAPS		(DCB_CAP_DCBX_LLD_MANAGED | \
 				DCB_CAP_DCBX_VER_CEE | DCB_CAP_DCBX_STATIC)
@@ -1816,32 +1894,6 @@ static void bnx2x_dcbnl_set_pfc_state(struct net_device *netdev, u8 state)
 	bp->dcbx_config_params.admin_pfc_enable = (state ? 1 : 0);
 }
 
-static bool bnx2x_app_is_equal(struct dcbx_app_priority_entry *app_ent,
-			       u8 idtype, u16 idval)
-{
-	if (!(app_ent->appBitfield & DCBX_APP_ENTRY_VALID))
-		return false;
-
-	switch (idtype) {
-	case DCB_APP_IDTYPE_ETHTYPE:
-		if ((app_ent->appBitfield & DCBX_APP_ENTRY_SF_MASK) !=
-			DCBX_APP_SF_ETH_TYPE)
-			return false;
-		break;
-	case DCB_APP_IDTYPE_PORTNUM:
-		if ((app_ent->appBitfield & DCBX_APP_ENTRY_SF_MASK) !=
-			DCBX_APP_SF_PORT)
-			return false;
-		break;
-	default:
-		return false;
-	}
-	if (app_ent->app_id != idval)
-		return false;
-
-	return true;
-}
-
 static void bnx2x_admin_app_set_ent(
 	struct bnx2x_admin_priority_app_table *app_ent,
 	u8 idtype, u16 idval, u8 up)
@@ -1943,30 +1995,6 @@ static u8 bnx2x_dcbnl_set_app_up(struct net_device *netdev, u8 idtype,
 	return bnx2x_set_admin_app_up(bp, idtype, idval, up);
 }
 
-static u8 bnx2x_dcbnl_get_app_up(struct net_device *netdev, u8 idtype,
-				 u16 idval)
-{
-	int i;
-	u8 up = 0;
-
-	struct bnx2x *bp = netdev_priv(netdev);
-	DP(NETIF_MSG_LINK, "app_type %d, app_id 0x%x\n", idtype, idval);
-
-	/* iterate over the app entries looking for idtype and idval */
-	for (i = 0; i < DCBX_MAX_APP_PROTOCOL; i++)
-		if (bnx2x_app_is_equal(&bp->dcbx_local_feat.app.app_pri_tbl[i],
-				       idtype, idval))
-			break;
-
-	if (i < DCBX_MAX_APP_PROTOCOL)
-		/* if found return up */
-		up = bp->dcbx_local_feat.app.app_pri_tbl[i].pri_bitmap;
-	else
-		DP(NETIF_MSG_LINK, "app not found\n");
-
-	return up;
-}
-
 static u8 bnx2x_dcbnl_get_dcbx(struct net_device *netdev)
 {
 	struct bnx2x *bp = netdev_priv(netdev);
@@ -2107,7 +2135,6 @@ const struct dcbnl_rtnl_ops bnx2x_dcbnl_ops = {
 	.setnumtcs      = bnx2x_dcbnl_set_numtcs,
 	.getpfcstate    = bnx2x_dcbnl_get_pfc_state,
 	.setpfcstate    = bnx2x_dcbnl_set_pfc_state,
-	.getapp         = bnx2x_dcbnl_get_app_up,
 	.setapp         = bnx2x_dcbnl_set_app_up,
 	.getdcbx        = bnx2x_dcbnl_get_dcbx,
 	.setdcbx        = bnx2x_dcbnl_set_dcbx,
@@ -2115,4 +2142,4 @@ const struct dcbnl_rtnl_ops bnx2x_dcbnl_ops = {
 	.setfeatcfg     = bnx2x_dcbnl_set_featcfg,
 };
 
-#endif /* BCM_DCB */
+#endif /* BCM_DCBNL */
diff --git a/drivers/net/bnx2x/bnx2x_dcb.h b/drivers/net/bnx2x/bnx2x_dcb.h
index f650f98..71b8eda 100644
--- a/drivers/net/bnx2x/bnx2x_dcb.h
+++ b/drivers/net/bnx2x/bnx2x_dcb.h
@@ -189,8 +189,9 @@ enum {
 void bnx2x_dcbx_set_params(struct bnx2x *bp, u32 state);
 
 /* DCB netlink */
-#ifdef BCM_DCB
+#ifdef BCM_DCBNL
 extern const struct dcbnl_rtnl_ops bnx2x_dcbnl_ops;
-#endif /* BCM_DCB */
+int bnx2x_dcbnl_update_applist(struct bnx2x *bp, bool delall);
+#endif /* BCM_DCBNL */
 
 #endif /* BNX2X_DCB_H */
diff --git a/drivers/net/bnx2x/bnx2x_main.c b/drivers/net/bnx2x/bnx2x_main.c
index 6c7745e..061733e 100644
--- a/drivers/net/bnx2x/bnx2x_main.c
+++ b/drivers/net/bnx2x/bnx2x_main.c
@@ -9441,7 +9441,7 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 	dev->vlan_features |= (NETIF_F_TSO | NETIF_F_TSO_ECN);
 	dev->vlan_features |= NETIF_F_TSO6;
 
-#ifdef BCM_DCB
+#ifdef BCM_DCBNL
 	dev->dcbnl_ops = &bnx2x_dcbnl_ops;
 #endif
 
@@ -9848,6 +9848,11 @@ static void __devexit bnx2x_remove_one(struct pci_dev *pdev)
 	}
 #endif
 
+#ifdef BCM_DCBNL
+	/* Delete app tlvs from dcbnl */
+	bnx2x_dcbnl_update_applist(bp, true);
+#endif
+
 	unregister_netdev(dev);
 
 	/* Delete all NAPI objects */
-- 
1.7.3.5





^ permalink raw reply related

* Re: Advice on network driver design
From: Felix Radensky @ 2011-02-21 16:25 UTC (permalink / raw)
  To: arnd; +Cc: netdev@vger.kernel.org
In-Reply-To: <201102202013.34994.arnd@arndb.de>

Hi Anrd,

On 02/20/2011 09:13 PM, arnd@arndb.de wrote:
> On Saturday 19 February 2011 14:37:43 Felix Radensky wrote:
>> Hi,
>>
>> I'm in the process of designing a network driver for a custom
>> hardware and would like to get some advice from linux network
>> gurus.
>>
>> The host platform is Freescale P2020. The custom hardware is
>> FPGA with several TX FIFOs, single RX FIFO and a set of registers.
>> FPGA is connected to CPU via PCI-E. Host CPU DMA controller is used
>> to get packets to/from FIFOs. Each FIFO has its set of events,
>> generating interrupts, which can be enabled and disabled. Status
>> register reflects the current state of events, the bit in status
>> register is cleared by FPGA when event is handled. Reads or writes to
>> status register have no impact on its contents.
>>
>> The device driver should support 80Mbit/sec of traffic in each direction.
>>
>> So far I have TX side working. I'm using Linux dmaengine APIs to
>> transfer packets to FIFOs. The DMA completion interrupt is handled
>> by per-fifo work queue.
>>
>> My question is about RX. Would such design benefit from NAPI ?
>> If my understanding of NAPI is correct, it runs in softirq context,
>> so I cannot do any DMA work in dev->poll(). If I were to use NAPI,
>> I should probably disable RX interrupts, do all DMA work in some
>> work queue, keep RX packets in a list and only then call dev->poll().
>> Is that correct ?
>>
>> Any other advice and how to write an efficient driver for this
>> hardware is most welcome. I can influence FPGA design to some degree,
>> so if you think FPGA should be changed to improve things, please let
>> me know.
> There are currently discussions ongoing about using virtio for this
> kind of connection. See http://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg49294.html
> for an archive.
>
> When you use virtio as the base, you can use the regular virtio-net
> driver or any other virtio high-level driver on top.
>
> 	Arnd
>
Thanks, I'll take a look at virtio. Some people seem to think it's an 
overkill for this task.

Felix.

^ permalink raw reply

* Re: [PATCH] sfc: lower stack usage in efx_ethtool_self_test
From: Ben Hutchings @ 2011-02-21 16:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1297864118.3201.354.camel@edumazet-laptop>

On Wed, 2011-02-16 at 14:48 +0100, Eric Dumazet wrote:
> drivers/net/sfc/ethtool.c: In function ‘efx_ethtool_self_test’:
> drivers/net/sfc/ethtool.c:613: warning: the frame size of 1200 bytes
> is larger than 1024 bytes

This also fixes a whopping information leak in case the device is in a
broken state.

> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [Lxc-users] Huge ammount of invalid checksum packets on macvlan
From: Daniel Lezcano @ 2011-02-21 16:07 UTC (permalink / raw)
  To: Andrian Nord; +Cc: lxc-users, Patrick McHardy, Linux Netdev List
In-Reply-To: <20110221153421.GA6602@nord.niifaq.ru>

On 02/21/2011 04:34 PM, Andrian Nord wrote:
> Greetings, Daniel.
>
> On Mon, Feb 21, 2011 at 04:20:59PM +0100, Daniel Lezcano wrote:
>
> 2.6.37 kernel with gentoo linux patches (doesn't affect any low-system
> stuff, AFAIK).
> lxc-0.7.2 is used.
>
> Reproducable on two different machines.
> I'm using tcpdump -vvv for bad checksum detection. This also affects
> traffic from container to hardware node (as it's using macvlan to
> communicate with containers by itself).
>
> Also, I've got same problem on UDP packets coming from lan on other
> server and it was worked around by disabeling tx and rx checksum offload
> via ethtool. But dummy devices doesn't allow this.

I am not sure it is a bug. If we go outside of the container context and 
we do the following:

ssh 127.0.0.1
tcpdump -vvv -i lo

We will get the same errors AFAICS.

There is also in the man page the following option:

-K Don't attempt to verify IP, TCP, or UDP checksums. This is useful for 
inter‐
faces that perform some or all of those checksum calculation in 
hardware; other‐
wise, all outgoing TCP checksums will be flagged as bad.

IMO, the checksum is not needed for the virtual macvlan devices, hence 
the checksum is not computed and the checksum tcp packet is not filled. 
As the skb's are flagged as 'checksum not necessary' the packets are not 
dropped by the kernel and are delivered to the network stack. tcpdump 
intercept the raw packet and analyse the header. It will see a bad value 
as this one is a default value.

I Cc'ed the netdev mailing list and Patrick in case my analysis is wrong 
or incomplete.

Thanks
-- Daniel


^ permalink raw reply

* Re: IGMPMSG_WRONGVIF only happens when true VIF is an OIF of correct IIF?
From: Dan Langlois @ 2011-02-21 16:03 UTC (permalink / raw)
  To: David Lamparter; +Cc: netdev
In-Reply-To: <20110218171225.GC1739639@jupiter.n2.diac24.net>

On Fri, 2011-02-18 at 18:12 +0100, David Lamparter wrote:
> On Fri, Feb 18, 2011 at 04:46:56PM +0100, Dan Langlois wrote:
> > Hi, I have a question about the IGMPMSG_WRONGVIF message.
> > I have MRT_ASSERT enabled but do not receive notifications.
> > However, if I enable MRT_PIM, I do get them.  I'm curious
> > why MRT_ASSERT alone isn't sufficient.
> > 
> > This if-statement appears in ip_mr_forward() in net/ipv4/ipmr.c
> > 
> >        if (true_vifi >= 0 && net->ipv4.mroute_do_assert &&
> >             /* pimsm uses asserts, when switching from RPT to SPT, 
> >                so that we cannot check that packet arrived on an oif. 
> >                It is bad, but otherwise we would need to move pretty
> >                large chunk of pimd to kernel. Ough... --ANK
> >              */      
> >             (net->ipv4.mroute_do_pim ||
> >              cache->mfc_un.res.ttls[true_vifi] < 255) && 
> >             time_after(jiffies,
> >                        cache->mfc_un.res.last_assert +
> > MFC_ASSERT_THRESH)) {
> >                 cache->mfc_un.res.last_assert = jiffies;
> >                 ipmr_cache_report(net, skb, true_vifi,
> > IGMPMSG_WRONGVIF);
> >         }       
> > 
> > 
> > By the time this statement is reached, it is known that the packet
> > did not arrive on the expected incoming interface (IIF) and thus a
> > "WRONGVIF" condition.  This if-statement decides whether a PIM-assert
> > notification needs to be sent to the multicast routing daemon.
> > 
> > The part of this if-statement I'm questioning is:
> >     (net->ipv4.mroute_do_pim ||
> >      cache->mfc_un.res.ttls[true_vifi] < 255) &&
> > 
> > I read this as:
> > "send WRONGVIF if PIM is enabled -OR-
> > the packet arrived on an outgoing interface OIF of the correct IIF"
> > 
> > So this statement is always true when PIM is enabled.
> > However, if only ASSERT is enabled, this statement is only true when
> > a packet is reflected back on an OIF.
> > 
> > Why not always send the assert for any WRONGVIF condition regardless
> > of the interface it came in on?  I mean, isn't that the point of
> > enabling MRT_ASSERT in the first place?  If the assertions weren't
> > wanted, just don't enable that feature, right?
> 
> The point of MRT_ASSERT is to aid in resolving duplication issues where
> you end up with multiple mrouters thinking they're responsible for the
> same subnet. This condition is indicated by more than 1 router on the
> subnet having the subnet as OIF on the (*,G) or (S,G). Therefore, your
> daemon receives the assertion if it is one of the forwarding mrouters,
> which it will only be if it's an OIF.
> 
> If your daemon doesn't have the target subnet listed as oif, the
> assumption is that a different mrouter has been elected to forward
> packets for this G/S,G to this subnet. The kernel knows that your daemon
> decided to not forward things to this subnet, so you are not involved in
> duplicates, so you don't get an assert.


Hmmm, well thanks for the explanation.  At least that explains why the
notification only applies to OIFs.  I suppose it would actually be wrong
to generate an ASSERT under any other condition?  


> The "PIM or" is a kludge to make PIM work, as indicated by the comment
> above the if. I'd have to refreshen my PIM knowledge to fully understand
> it or explain it ;)
> 
> > In our system, multicast streams may get rerouted through a different
> > VIF than what was first installed in the MFC cache.  I would very much
> > like to get these WRONGVIF assertions in this case, which is NOT the
> > case that a packet is seen on an OIF.
> > 
> > Of course I can simply enable MRT_PIM to get the messages, but in that
> > case I don't understand the reason for having two separate toggles
> > (MRT_ASSERT versus MRT_PIM).
> 
> I don't really understand you use case -- is this a case of another
> router showing up on the subnet and directing traffic to it? If so, why
> wasn't the first router directing traffic to it? Do you have a primary
> multicast forwarder election system in place?
> 

It is a bit of an odd setup:  We have multicast traffic aboard various
vehicles and we need to tunnel that back to a central server, to then be
forwarded on its various backend interfaces.  There is also traffic
coming in from those interfaces which needs to be directed back out to
one or more of these vehicles, depending on group membership.  We use
IGMP-based multicast forwarding daemons on the vehicles and on the
server (something like igmpproxyd, but bidirectional) and IP-IP tunnels
between the server and each vehicle.

These vehicles can be coupled together.  When this happens, we designate
one of the coupled vehicles as "master" and all traffic is sent through
the master tunnel, rather than each of the individual tunnels.  The
vehicles can of course be decoupled again, such that the multicast
streams need to be separated to individual tunnels again.  

Since the onboard equipment maintains its IP address despite coupling,
the (S,G) pairing remains the same, but now it is being forwarded
through the "master" rather than its own IP-IP tunnel.  This corresponds
to a different VIF on the server.  So this (de)coupling activity is why
traffic can shift between VIFs.  I noticed the "Wrong" statistic
incrementing, but didn't get an ASSERT.  Once I started digging I found
this if-statement and decided to ask about it.

Now I'm not too sure how to proceed.  In our system, the IIF for a (S,G)
can change due to these couplings.  Receiving the IGMPMSG_WRONGVIF
notification seems to be the most ideal fix, but maybe it isn't.  Can
you suggest something else?  Is it "dangerous" to enable PIM in an
IGMP-based multicast forwarding daemon? (e.g. a daemon that doesn't
implement PIM-SM or -DM, but is still protocol independent)  Perhaps
enabling MRT_PIM is actually the right solution?  Advice appreciated. 

cheers,
Dan


^ permalink raw reply

* Re: [PATCH v2] ethtool : Add option -L | --set-common to set common flags.
From: Ben Hutchings @ 2011-02-21 16:00 UTC (permalink / raw)
  To: Mahesh Bandewar; +Cc: David Miller, Tom Herbert, Laurent Chavey, netdev
In-Reply-To: <1295390237-27381-1-git-send-email-maheshb@google.com>

On Tue, 2011-01-18 at 14:37 -0800, Mahesh Bandewar wrote:
> This patch adds -L | --set-common option to add / remove common flags which
> includes loopback flag. The -l | --show-common displays the current values
> for these common flags.
[...]

I haven't forgotten this, but I won't apply it until the corresponding
kernel change is applied.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [ethtool PATCH 2/2] Add RX packet classification interface
From: Ben Hutchings @ 2011-02-21 15:40 UTC (permalink / raw)
  To: Alexander Duyck, Santwona Behera; +Cc: netdev
In-Reply-To: <20110211011838.23554.3735.stgit@gitlad.jf.intel.com>

On Thu, 2011-02-10 at 17:18 -0800, Alexander Duyck wrote:
> From: Santwona Behera <santwona.behera@sun.com>
> 
> This patch was originally introduced as:
>   [PATCH 1/3] [ethtool] Add rx pkt classification interface
>   Signed-off-by: Santwona Behera <santwona.behera@sun.com>
>   http://patchwork.ozlabs.org/patch/23223/
> 
> I have updated it to address a number of issues.  As a result I removed the
> local caching of rules due to the fact that there were memory leaks in this
> code and the rule manager would consume over 1Mb of space for an 8K table
> when all that was needed was 1K in order to store which rules were active
> and which were not.
> 
> In addition I dropped the use of regions as there were multiple issue found
> including the fact that the regions were not properly expanding beyond 2
> and the fact that the regions required reading all of the rules in order to
> correctly expand beyond 2.  By dropping the regions from the rule manager
> it is possible to write a much cleaner interface leaving region management
> to be done by either the driver or by external management scripts.
> 
> I also added an ethtool bitops interface to allow for simple bit set and
> test activities since the rule manager can most efficiently store the list
> of active rules via a bitmap.
[...]
> diff --git a/ethtool-bitops.h b/ethtool-bitops.h
> new file mode 100644
> index 0000000..7101056
> --- /dev/null
> +++ b/ethtool-bitops.h
> @@ -0,0 +1,25 @@
> +#ifndef ETHTOOL_BITOPS_H__
> +#define ETHTOOL_BITOPS_H__
> +
> +#define BITS_PER_LONG		__WORDSIZE

This seems to be a glibc-internal macro and I don't think we should use
it.

> +#define BITS_PER_BYTE		8
> +#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
> +#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))

In fact I notice you don't use it here...

> +static inline void set_bit(int nr, unsigned long *addr)
> +{
> +	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
> +}
> +
> +static inline void clear_bit(int nr, unsigned long *addr)
> +{
> +	addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
> +}
> +
> +static __always_inline int test_bit(unsigned int nr, const unsigned long *addr)
> +{
> +	return ((1UL << (nr % BITS_PER_LONG)) &
> +		(((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0UL;
> +}
> +
> +#endif
> diff --git a/ethtool-util.h b/ethtool-util.h
> index f053028..e9300e2 100644
> --- a/ethtool-util.h
> +++ b/ethtool-util.h
> @@ -103,4 +103,17 @@ int sfc_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
>  int st_mac100_dump_regs(struct ethtool_drvinfo *info,
>  			struct ethtool_regs *regs);
>  int st_gmac_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
> +
> +/* Rx flow classification */
> +#include <sys/ioctl.h>
> +#include <net/if.h>

Put #includes at the top of the file, please.

> +int rxclass_parse_ruleopts(char **optstr, int opt_cnt,
> +			   struct ethtool_rx_flow_spec *fsp, __u8 *loc_valid);
> +int rxclass_rule_getall(int fd, struct ifreq *ifr);
> +int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc);
> +int rxclass_rule_ins(int fd, struct ifreq *ifr,
> +		     struct ethtool_rx_flow_spec *fsp, __u8 loc_valid);
> +int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc);
> +
>  #endif
> diff --git a/ethtool.8.in b/ethtool.8.in
> index 133825b..c183a3d 100644
> --- a/ethtool.8.in
> +++ b/ethtool.8.in
> @@ -40,21 +40,36 @@
>  [\\fB\\$1\\fP\ \\fIN\\fP]
>  ..
>  .\"
> +.\"	.BM - same as above but has a mask field for format "[value N [m N]]"
> +.\"
> +.de BM
> +[\\fB\\$1\\fP\ \\fIN\\fP\ [\\fBm\\fP\ \\fIN\\fP]]
> +..
> +.\"
>  .\"	\(*MA - mac address
>  .\"
>  .ds MA \fIxx\fP\fB:\fP\fIyy\fP\fB:\fP\fIzz\fP\fB:\fP\fIaa\fP\fB:\fP\fIbb\fP\fB:\fP\fIcc\fP
>  .\"
> +.\"	\(*PA - IP address
> +.\"
> +.ds PA \fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP
> +.\"
>  .\"	\(*WO - wol flags
>  .\"
>  .ds WO \fBp\fP|\fBu\fP|\fBm\fP|\fBb\fP|\fBa\fP|\fBg\fP|\fBs\fP|\fBd\fP...
>  .\"
>  .\"	\(*FL - flow type values
>  .\"
> -.ds FL \fBtcp4\fP|\fBudp4\fP|\fBah4\fP|\fBsctp4\fP|\fBtcp6\fP|\fBudp6\fP|\fBah6\fP|\fBsctp6\fP
> +.ds FL \fBtcp4\fP|\fBudp4\fP|\fBah4\fP||\fBesp4\fP|\fBsctp4\fP|\fBtcp6\fP|\fBudp6\fP|\fBah6\fP|\fBesp6\fP|\fBsctp6\fP

This adds two '|' characters between 'ah4' and 'esp4'.

>  .\"
>  .\"	\(*HO - hash options
>  .\"
>  .ds HO \fBm\fP|\fBv\fP|\fBt\fP|\fBs\fP|\fBd\fP|\fBf\fP|\fBn\fP|\fBr\fP...
> +.\"
> +.\"	\(*L4 - L4 proto options
> +.\"
> +.ds L4 \fBtcp\fP|\fBudp\fP|\fBsctp\fP|\fBah\fP|\fBesp\fP|\fIN\fP
> +.\"
>  .\" Start URL.
>  .de UR
>  .  ds m1 \\$1\"
> @@ -224,11 +239,27 @@ ethtool \- query or control network driver and hardware settings
>  .B ethtool \-n
>  .I ethX
>  .RB [ rx-flow-hash \ \*(FL]
> +.RB [ rx-rings ]
> +.RB [ rx-class-rule-all ]
> +.RB [ rx-class-rule
> +.IR N ]

Should use '.BN'.
 
[...]
> diff --git a/ethtool.c b/ethtool.c
> index 1afdfe4..b624980 100644
> --- a/ethtool.c
> +++ b/ethtool.c
> @@ -6,6 +6,7 @@
>   * Kernel 2.4 update Copyright 2001 Jeff Garzik <jgarzik@mandrakesoft.com>
>   * Wake-on-LAN,natsemi,misc support by Tim Hockin <thockin@sun.com>
>   * Portions Copyright 2002 Intel
> + * Portions Copyright (C) Sun Microsystems 2008
>   * do_test support by Eli Kupermann <eli.kupermann@intel.com>
>   * ETHTOOL_PHYS_ID support by Chris Leech <christopher.leech@intel.com>
>   * e1000 support by Scott Feldman <scott.feldman@intel.com>
> @@ -14,6 +15,7 @@
>   * amd8111e support by Reeja John <reeja.john@amd.com>
>   * long arguments by Andi Kleen.
>   * SMSC LAN911x support by Steve Glendinning <steve.glendinning@smsc.com>
> + * Rx Network Flow Control configuration support <santwona.behera@sun.com>
>   * Various features by Ben Hutchings <bhutchings@solarflare.com>;
>   *	Copyright 2009, 2010 Solarflare Communications
>   *
> @@ -32,7 +34,7 @@
>  #include <sys/ioctl.h>
>  #include <sys/stat.h>
>  #include <stdio.h>
> -#include <string.h>
> +#include <strings.h>

No, <string.h> is standard.

[...]
> @@ -408,6 +425,14 @@ static int msglvl_changed;
>  static u32 msglvl_wanted = 0;
>  static u32 msglvl_mask = 0;
>  
> +static int rx_rings_get = 0;
> +static int rx_class_rule_get = -1;
> +static int rx_class_rule_getall = 0;
> +static int rx_class_rule_del = -1;
> +static int rx_class_rule_added = 0;
> +static struct ethtool_rx_flow_spec rx_rule_fs;
> +static u8 rxclass_loc_valid = 0;
> +
>  static enum {
>  	ONLINE=0,
>  	OFFLINE,
[...]
> @@ -945,6 +974,23 @@ static void parse_cmdline(int argc, char **argp)
>  						rxflow_str_to_type(argp[i]);
>  					if (!rx_fhash_get)
>  						show_usage(1);
> +				} else if (!strcmp(argp[i], "rx-rings")) {
> +					i += 1;
> +					rx_rings_get = 1;

I'm not convinced of the value of a separate rx-rings option/keyword.
However it's probably worth displaying the number of rings/queues when
showing other flow hashing and steering/filtering information (the -x
option does this).

> +				} else if (!strcmp(argp[i],
> +						   "rx-class-rule-all")) {
> +					i += 1;
> +					rx_class_rule_getall = 1;
> +				} else if (!strcmp(argp[i], "rx-class-rule")) {
> +					i += 1;
> +					if (i >= argc) {
> +						show_usage(1);
> +						break;
> +					}
> +					rx_class_rule_get =
> +						strtol(argp[i], NULL, 0);
> +					if (rx_class_rule_get < 0)
> +						show_usage(1);

Use get_uint_range(argp[i], 0, INT_MAX).

>  				} else
>  					show_usage(1);
>  				break;

I don't think the same options (-n, -N) should be used both for flow
hashing and n-tuple flow steering/filtering.  This command-line
interface and the structure used in the ethtool API just seem to reflect
the implementation in the niu driver.

(In fact I would much prefer it if the -u and -U options could be used
for both the rxnfc and rxntuple interfaces.  But I haven't thought about
how the differences in functionality would be exposed to or hidden from
the user.)

> @@ -978,8 +1024,37 @@ static void parse_cmdline(int argc, char **argp)
>  						show_usage(1);
>  					else
>  						rx_fhash_changed = 1;
> -				} else
> +				} else if (!strcmp(argp[i],
> +						   "rx-class-rule-del")) {
> +					i += 1;
> +					if (i >= argc) {
> +						show_usage(1);
> +						break;
> +					}
> +					rx_class_rule_del =
> +						strtol(argp[i], NULL, 0);
> +					if (rx_class_rule_del < 0)
> +						show_usage(1);

Use get_uint_range(argp[i], 0, INT_MAX).

> +				} else if (!strcmp(argp[i],
> +						   "rx-class-rule-add")) {
> +					i += 1;
> +					if (i >= argc) {
> +						show_usage(1);
> +						break;
> +					}
> +					if (rxclass_parse_ruleopts(&argp[i],
> +								   argc - i,
> +								   &rx_rule_fs,
> +								   &rxclass_loc_valid) < 0) {
> +						show_usage(1);
> +					} else {
> +						i = argc;
> +						rx_class_rule_added = 1;
> +					}
> +				} else {
>  					show_usage(1);
> +				}
> +
>  				break;
>  			}
>  			if (mode == MODE_SRXFHINDIR) {
> @@ -1917,9 +1992,12 @@ static int dump_rxfhash(int fhash, u64 val)
>  	case SCTP_V4_FLOW:
>  		fprintf(stdout, "SCTP over IPV4 flows");
>  		break;
> -	case AH_ESP_V4_FLOW:

I believe this is still a valid type for flow hashing.

> +	case AH_V4_FLOW:
>  		fprintf(stdout, "IPSEC AH over IPV4 flows");
>  		break;
> +	case ESP_V4_FLOW:
> +		fprintf(stdout, "IPSEC ESP over IPV4 flows");
> +		break;
>  	case TCP_V6_FLOW:
>  		fprintf(stdout, "TCP over IPV6 flows");
>  		break;
> @@ -1929,9 +2007,12 @@ static int dump_rxfhash(int fhash, u64 val)
>  	case SCTP_V6_FLOW:
>  		fprintf(stdout, "SCTP over IPV6 flows");
>  		break;
> -	case AH_ESP_V6_FLOW:

Same as for AH_ESP_V4_FLOW.

> +	case AH_V6_FLOW:
>  		fprintf(stdout, "IPSEC AH over IPV6 flows");
>  		break;
> +	case ESP_V6_FLOW:
> +		fprintf(stdout, "IPSEC ESP over IPV6 flows");
> +		break;
>  	default:
>  		break;
>  	}
> @@ -2911,14 +2992,12 @@ static int do_gstats(int fd, struct ifreq *ifr)
>  	return 0;
>  }
>  
> -
>  static int do_srxclass(int fd, struct ifreq *ifr)
>  {
>  	int err;
> +	struct ethtool_rxnfc nfccmd;
>  
>  	if (rx_fhash_changed) {
> -		struct ethtool_rxnfc nfccmd;
> -
>  		nfccmd.cmd = ETHTOOL_SRXFH;
>  		nfccmd.flow_type = rx_fhash_set;
>  		nfccmd.data = rx_fhash_val;
> @@ -2930,6 +3009,20 @@ static int do_srxclass(int fd, struct ifreq *ifr)
>  
>  	}
>  
> +	if (rx_class_rule_added) {
> +		err = rxclass_rule_ins(fd, ifr, &rx_rule_fs,
> +				       rxclass_loc_valid);
> +		if (err < 0)
> +			fprintf(stderr, "Cannot insert RX classification rule\n");
> +	}
> +
> +	if (rx_class_rule_del >= 0) {
> +		err = rxclass_rule_del(fd, ifr, rx_class_rule_del);
> +
> +		if (err < 0)
> +			fprintf(stderr, "Cannot delete RX classification rule\n");
> +	}
> +
>  	return 0;
>  }

This needs to return 1 on error (I know that's an existing bug, but
don't compound it).

> @@ -2950,6 +3043,31 @@ static int do_grxclass(int fd, struct ifreq *ifr)
>  			dump_rxfhash(rx_fhash_get, nfccmd.data);
>  	}
>  
> +	if (rx_rings_get) {
> +		struct ethtool_rxnfc nfccmd;
> +
> +		nfccmd.cmd = ETHTOOL_GRXRINGS;
> +		ifr->ifr_data = (caddr_t)&nfccmd;
> +		err = ioctl(fd, SIOCETHTOOL, ifr);
> +		if (err < 0)
> +			perror("Cannot get RX rings");
> +		else
> +			fprintf(stdout, "%d RX rings available\n",
> +				(int)nfccmd.data);
> +	}
> +
> +	if (rx_class_rule_get >= 0) {
> +		err = rxclass_rule_get(fd, ifr, rx_class_rule_get);
> +		if (err < 0)
> +			fprintf(stderr, "Cannot get RX classification rule\n");
> +	}
> +
> +	if (rx_class_rule_getall) {
> +		err = rxclass_rule_getall(fd, ifr);
> +		if (err < 0)
> +			fprintf(stderr, "RX classification rule retrieval failed\n");
> +	}
> +
>  	return 0;
>  }

Ditto for this.

> diff --git a/rxclass.c b/rxclass.c
> new file mode 100644
> index 0000000..fd01a32
> --- /dev/null
> +++ b/rxclass.c
> @@ -0,0 +1,809 @@
> +/*
> + * Copyright (C) 2008 Sun Microsystems, Inc. All rights reserved.
> + */
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <stddef.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <strings.h>
> +
> +#include <linux/sockios.h>
> +#include <arpa/inet.h>
> +#include "ethtool-util.h"
> +#include "ethtool-bitops.h"
> +
> +/*
> + * This is a rule manager implementation for ordering rx flow
> + * classification rules in a longest prefix first match order.
> + * The assumption is that this rule manager is the only one adding rules to
> + * the device's hardware classifier.
> + */
> +
> +struct rmgr_ctrl {
> +	/* slot contains a bitmap indicating which filters are valid */
> +	unsigned long		*slot;
> +	__u32			n_rules;
> +	__u32			size;
> +};
> +
> +static struct rmgr_ctrl rmgr;
> +static int rmgr_init_done = 0;
> +
> +#ifndef SIOCETHTOOL
> +#define SIOCETHTOOL     0x8946
> +#endif

This definition ought to be moved to ethtool-util.h rather than
duplicated.

> +static void rmgr_print_ipv4_rule(struct ethtool_rx_flow_spec *fsp)
> +{
> +	char		chan[16];
> +	char		l4_proto[16];
> +	__u32		sip, dip, sipm, dipm;
> +
> +	sip = ntohl(fsp->h_u.tcp_ip4_spec.ip4src);
> +	dip = ntohl(fsp->h_u.tcp_ip4_spec.ip4dst);
> +	sipm = ntohl(fsp->m_u.tcp_ip4_spec.ip4src);
> +	dipm = ntohl(fsp->m_u.tcp_ip4_spec.ip4dst);
> +
> +	if (fsp->ring_cookie != RX_CLS_FLOW_DISC)
> +		sprintf(chan, "Rx Ring [%d]", (int)fsp->ring_cookie);
> +	else
> +		sprintf(chan, "Discard");
> +
> +	switch (fsp->flow_type) {
> +	case TCP_V4_FLOW:
> +	case UDP_V4_FLOW:
> +	case SCTP_V4_FLOW:
> +	case AH_V4_FLOW:
> +	case ESP_V4_FLOW:
> +	case IP_USER_FLOW:
> +		fprintf(stdout,
> +			"      IPv4 Rule:  ID[%d] Target[%s]\n"
> +			"      IP src addr[%d.%d.%d.%d] mask[%d.%d.%d.%d]\n"
> +			"      IP dst addr[%d.%d.%d.%d] mask[%d.%d.%d.%d]\n"
> +			"      IP TOS[0x%x] mask[0x%x]\n",

To be consistent with other ethtool output, this should use colons
rather than square brackets to separate field names and values.

[...]
> +int rxclass_parse_ruleopts(char **optstr, int opt_cnt,
> +			   struct ethtool_rx_flow_spec *fsp,
> +			   u_int8_t *loc_valid)
> +{
> +	int i = 0;
> +
> +	u_int8_t discard, ring_set;
> +	u_int32_t ipsa, ipsm, ipda, ipdm, spi, spim;
> +	u_int16_t sp, spm, dp, dpm;
> +	u_int8_t ip_ver, proto, tos, tm;
> +	struct in_addr in_addr;
> +
> +	if (*optstr == NULL || **optstr == '\0' || opt_cnt < 2) {
> +		fprintf(stdout, "Add rule, invalid syntax\n");
> +		return -1;
> +	}
> +
> +	bzero(fsp, sizeof(struct ethtool_rx_flow_spec));
> +	ipsa = ipda = ipsm = ipdm = spi = spim = 0x0;
> +	sp = dp = spm = dpm = 0x0;
> +	ip_ver = proto = tos = tm = 0x0;
> +	discard = ring_set = 0;
> +
> +	if (!strcmp(optstr[i], "ip4")) {
> +		ip_ver = ETH_RX_NFC_IP4;
> +	} else if (!strcmp(optstr[i], "ip6")) {
> +		fprintf(stdout, "IPv6 not yet implemented\n");
> +		return -1;
> +	} else {
> +		fprintf(stdout, "Add rule, invalid syntax for IP version\n");
> +		return -1;
> +	}
> +
> +	i++;
> +
> +	switch (ip_ver) {
> +	case ETH_RX_NFC_IP4:
> +		if (!strcmp(optstr[i], "tcp"))
> +			fsp->flow_type = TCP_V4_FLOW;
> +		else if (!strcmp(optstr[i], "udp"))
> +			fsp->flow_type = UDP_V4_FLOW;
> +		else if (!strcmp(optstr[i], "sctp"))
> +			fsp->flow_type = SCTP_V4_FLOW;
> +		else if (!strcmp(optstr[i], "ah"))
> +			fsp->flow_type = AH_V4_FLOW;
> +		else if (!strcmp(optstr[i], "esp"))
> +			fsp->flow_type = ESP_V4_FLOW;
> +		break;
> +	default:
> +		fprintf(stdout, "Add rule, Invalid IP version %d\n", ip_ver);
> +			return -1;
> +	}
> +
> +	if (fsp->flow_type == 0) {
> +		proto = (u_int8_t)strtoul(optstr[i], (char **)NULL, 0);
> +		if (proto != 0) {
> +			fprintf(stdout, "Add rule, user defined proto %d\n",
> +				proto);
> +			fsp->flow_type = IP_USER_FLOW;
> +			fsp->h_u.usr_ip4_spec.proto = proto;
> +			fsp->h_u.usr_ip4_spec.ip_ver = ip_ver;
> +		} else {
> +			fprintf(stdout, "Add rule, invalid IP proto %s\n",
> +				optstr[i]);
> +			return -1;
> +		}
> +	}
> +
> +	for (i = 2; i < opt_cnt;) {
> +		if (!strcmp(optstr[i], "tos")) {
> +			tos = (u_int8_t)strtoul(optstr[i+1], (char **)NULL,
> +						 0);
> +			tm = 0xff;
> +			fsp->h_u.tcp_ip4_spec.tos = tos;
> +
> +			i += 2;
> +			if (opt_cnt > (i+1)) {
> +				if (!strcmp(optstr[i], "m")) {
> +					tm = (u_int8_t)strtoul(optstr[i+1],
> +							       (char **)NULL,
> +							       16);
> +					i += 2;
> +				}
> +			}
> +			fsp->m_u.tcp_ip4_spec.tos = tm;
> +		} else if (!strcmp(optstr[i], "sip")) {
[...]

These keyword names must be made consistent with those used for the -U
(--config-ntuple) option.

Also, they can be parsed much more concisely using the new option types
I defined a while back for struct cmdline_info.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: Ethernet over GRE and vlans
From: Jonathan Thibault @ 2011-02-21 15:01 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev
In-Reply-To: <20110221053854.GA26462@gondor.apana.org.au>

On 21/02/11 12:38 AM, Herbert Xu wrote:
> On Sat, Jan 29, 2011 at 12:16:06AM -0500, Jonathan Thibault wrote:
>> Is it wrong on my part to expect such behaviour from gretap devices
>> or is this simply not possible/implemented yet?
> I don't see why this shouldn't work, so it might be a bug or
> misconfiguration.  How did you setup gre1.1 and gre2.2?
>
> Cheers,
I simply ran:

vconfig add rcg0 1
ifconfig rcg0.1 up

(the gretap interface is called rcg0, obviously)

Now that I think of it I did not try to add the vlan using the 'ip link
add' command though I'm not entirely sure it would make much of a
difference.  We sort of bypassed the problem using more hardware so the
test rig is dismantled now but if you feel there really is something to
it, I can set it back up.

Thanks again,

Jonathan

P.S.:  Gretap flies...  The little atom 1.6 boards I tested this on gave
me 877Mbit/sec without breaking a sweat.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox