Netdev List

* Re: TX VLAN acceleration on bridges broken in 2.6.37?
From: Jan Niehusmann @ 2011-02-26  0:19 UTC
  To: Jesse Gross; +Cc: linux-kernel, netdev
In-Reply-To: <AANLkTinoJqWA6ffnUx2KW_83srNbo+k6r24hnNpPTGvW@mail.gmail.com>

On Fri, Feb 25, 2011 at 02:53:21PM -0800, Jesse Gross wrote:
> is specific to the e1000e driver.  I know that some other Intel NICs
> require vlan stripping on receive to be enabled for vlan insertion on
> transmit to work.  Since this driver has not been converted over to
> use the new vlan model yet, it only enables these things if a vlan is
> directly configured on it.  To confirm this can you try a few things:

My observations confirm your theory:

> * Directly configure the vlan on the device instead of going through the bridge.

- does work, but only if eth0 is not part of bridge (expected behaviour,
  afaik)

> * Use the bridge but also configure an unused vlan device on the
> physical interface.

- does work

> * Double check that tcpdump with the settings that you are using shows
> vlan tags in other situations.  In some cases you need to use the 'e'
> flag with tcpdump in order for it to show vlan tags.  If it is the
> driver/NIC that is dropping the tags, tcpdump should still show them.

- indeed, -e is necessary to show the vlan tags. So my prior observation
  regarding tag visibility in tcpdump was wrong. The packets still
  have a vlan tag in the non-working case.

  (What the txvlan flag actually affects is the ability to filter
  for vlan tags with tcpdump: 'tcpdump -e -i eth0' shows the packets,
  but 'tcpdump -e -i eth0 vlan' only shows them with txvlan off. However,
  filtering for the vlan tag also doesn't work with the vlan interface
  on eth0.1, even though the tagging actually works there, as verified above.)
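
For reference, a minimal C sketch of what an in-line 802.1Q tag looks like
in the frame bytes. It assumes a plain Ethernet II frame, where the tag
protocol identifier 0x8100 sits at byte offset 12 and the tag control
information follows it. The function name frame_vlan_id is hypothetical and
is not taken from tcpdump, libpcap or the kernel. Whether the 'vlan' filter
fails above because an accelerated tag is carried outside the frame bytes is
an assumption, not something this thread confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_TPID      0x8100  /* 802.1Q Tag Protocol Identifier */
#define VLAN_VID_MASK  0x0fff  /* low 12 bits of the TCI are the VLAN ID */

/*
 * Hypothetical helper: return the VLAN ID (0..4095) if the frame carries
 * an in-line 802.1Q tag, or -1 if it is untagged or too short to tell.
 */
static inline int frame_vlan_id(const uint8_t *frame, size_t len)
{
	uint16_t tpid, tci;

	assert(frame != NULL);

	/* 6 bytes dst MAC + 6 bytes src MAC + 2 bytes TPID + 2 bytes TCI */
	if (len < 16)
		return -1;

	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != VLAN_TPID)
		return -1;

	tci = (uint16_t)((frame[14] << 8) | frame[15]);
	return tci & VLAN_VID_MASK;
}
```

A filter that only inspects these bytes cannot match a tag that is handed to
the capture path separately from the frame data.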

Jan

* Re: 2.6.37 regression: adding main interface to a bridge breaks vlan interface RX
From: chriss @ 2011-02-26  0:16 UTC
  To: netdev
In-Reply-To: <AANLkTimFC5drwt44C7Hpn4YJU-Njksr8Zkiv_RXK_oSL@mail.gmail.com>

Jesse Gross <jesse <at> nicira.com> writes:

> 
> What driver is in use with the NIC you are seeing this on?
>

Hi there,

the device in question is (according to lspci)
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit
Ethernet (rev 10)

handled by kernel module r8169.

I also tried to set up the vlan on the bridge, as br0.3, but that did not work.

regards//chriss

* [ethtool PATCH 3/4] v2 Add RX packet classification interface
From: Alexander Duyck @ 2011-02-25 23:48 UTC
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225233902.8409.74474.stgit@gitlad.jf.intel.com>

From: Santwona Behera <santwona.behera@sun.com>

This patch was originally introduced as:
  [PATCH 1/3] [ethtool] Add rx pkt classification interface
  Signed-off-by: Santwona Behera <santwona.behera@sun.com>
  http://patchwork.ozlabs.org/patch/23223/

I have updated it to address a number of issues.  As a result I removed the
local caching of rules due to the fact that there were memory leaks in this
code and the rule manager would consume over 1Mb of space for an 8K table
when all that was needed was 1K in order to store which rules were active
and which were not.

In addition I dropped the use of regions as there were multiple issues found
including the fact that the regions were not properly expanding beyond 2
and the fact that the regions required reading all of the rules in order to
correctly expand beyond 2.  By dropping the regions from the rule manager
it is possible to write a much cleaner interface leaving region management
to be done by either the driver or by external management scripts.

I also added an ethtool bitops interface to allow for simple bit set and
test activities since the rule manager can most efficiently store the list
of active rules via a bitmap.
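
As a rough illustration of the bitmap idea (a sketch only, not the code in
this patch), the following C fragment tracks active rule locations with one
bit each. All rule_bitmap_* names and the SLOT_BITS/SLOTS_FOR macros are
hypothetical. With one bit per location, an 8K-entry table needs 8192 / 8 =
1024 bytes, which matches the 1K figure mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not taken from the patch. */
#define SLOT_BITS         (CHAR_BIT * sizeof(unsigned long))
#define SLOTS_FOR(nbits)  (((nbits) + SLOT_BITS - 1) / SLOT_BITS)

struct rule_bitmap {
	unsigned long *slot;	/* one bit per rule location */
	size_t size;		/* number of rule locations tracked */
};

/* Bytes needed to track 'size' rule locations. */
static inline size_t rule_bitmap_bytes(size_t size)
{
	return SLOTS_FOR(size) * sizeof(unsigned long);
}

/* Allocate a zeroed bitmap; returns 0 on success, -1 on allocation failure. */
static inline int rule_bitmap_init(struct rule_bitmap *bm, size_t size)
{
	bm->slot = calloc(SLOTS_FOR(size), sizeof(unsigned long));
	if (!bm->slot)
		return -1;
	bm->size = size;
	return 0;
}

/* Mark location 'loc' as holding an active rule. */
static inline void rule_bitmap_set(struct rule_bitmap *bm, size_t loc)
{
	assert(loc < bm->size);
	bm->slot[loc / SLOT_BITS] |= 1UL << (loc % SLOT_BITS);
}

/* Mark location 'loc' as free. */
static inline void rule_bitmap_clear(struct rule_bitmap *bm, size_t loc)
{
	assert(loc < bm->size);
	bm->slot[loc / SLOT_BITS] &= ~(1UL << (loc % SLOT_BITS));
}

/* Return 1 if location 'loc' holds an active rule, 0 otherwise. */
static inline int rule_bitmap_test(const struct rule_bitmap *bm, size_t loc)
{
	assert(loc < bm->size);
	return (int)((bm->slot[loc / SLOT_BITS] >> (loc % SLOT_BITS)) & 1UL);
}
```

Note that the macros are fully parenthesized so that expressions such as
loc / SLOT_BITS divide by the whole product rather than by its first factor.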

This patch now also merges the functionality of the packet classification
into the ntuple interface.  This is done by using the keyword flow-type to
indicate the addition of an ntuple, class-rule-add to indicate the addition
of a network flow classifier, and class-rule-del to indicate the deletion
of a network flow classifier. Since ntuple display functionality was
already removed I have made the default for the -u option to display the
number of rings and all network flow classification filters.  If a single
rule is requested via class-rule %d then only that rule will be displayed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 Makefile.am      |    3 
 ethtool-bitops.h |   25 +
 ethtool-util.h   |   42 ++
 ethtool.8.in     |  198 +++++----
 ethtool.c        |  284 ++++++-------
 rxclass.c        | 1169 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1459 insertions(+), 262 deletions(-)
 create mode 100644 ethtool-bitops.h
 create mode 100644 rxclass.c

diff --git a/Makefile.am b/Makefile.am
index a0d2116..0262c31 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -8,7 +8,8 @@ ethtool_SOURCES = ethtool.c ethtool-copy.h ethtool-util.h	\
 		  amd8111e.c de2104x.c e100.c e1000.c igb.c	\
 		  fec_8xx.c ibm_emac.c ixgb.c ixgbe.c natsemi.c	\
 		  pcnet32.c realtek.c tg3.c marvell.c vioc.c	\
-		  smsc911x.c at76c50x-usb.c sfc.c stmmac.c
+		  smsc911x.c at76c50x-usb.c sfc.c stmmac.c	\
+		  rxclass.c
 
 dist-hook:
 	cp $(top_srcdir)/ethtool.spec $(distdir)
diff --git a/ethtool-bitops.h b/ethtool-bitops.h
new file mode 100644
index 0000000..93d32a4
--- /dev/null
+++ b/ethtool-bitops.h
@@ -0,0 +1,25 @@
+#ifndef ETHTOOL_BITOPS_H__
+#define ETHTOOL_BITOPS_H__
+
+#define BITS_PER_BYTE		8
+#define BITS_PER_LONG		(BITS_PER_BYTE * sizeof(long))
+#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_LONG)
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
+}
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+	addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
+}
+
+static __always_inline int test_bit(unsigned int nr, const unsigned long *addr)
+{
+	return ((1UL << (nr % BITS_PER_LONG)) &
+		(((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0UL;
+}
+
+#endif
diff --git a/ethtool-util.h b/ethtool-util.h
index f053028..182658b 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -5,15 +5,18 @@
 
 #include <sys/types.h>
 #include <endian.h>
+#include <sys/ioctl.h>
+#include <net/if.h>
+#include "ethtool-config.h"
+#include "ethtool-copy.h"
 
 /* ethtool.h expects these to be defined by <linux/types.h> */
 #ifndef HAVE_BE_TYPES
 typedef __uint16_t __be16;
 typedef __uint32_t __be32;
+typedef unsigned long long __be64;
 #endif
 
-#include "ethtool-copy.h"
-
 typedef unsigned long long u64;
 typedef __uint32_t u32;
 typedef __uint16_t u16;
@@ -23,11 +26,15 @@ typedef __int32_t s32;
 #if __BYTE_ORDER == __BIG_ENDIAN
 static inline u16 cpu_to_be16(u16 value)
 {
-    return value;
+	return value;
 }
 static inline u32 cpu_to_be32(u32 value)
 {
-    return value;
+	return value;
+}
+static inline u64 cpu_to_be64(u64 value)
+{
+	return value;
 }
 #else
 static inline u16 cpu_to_be16(u16 value)
@@ -38,6 +45,21 @@ static inline u32 cpu_to_be32(u32 value)
 {
 	return cpu_to_be16(value >> 16) | (cpu_to_be16(value) << 16);
 }
+static inline u64 cpu_to_be64(u64 value)
+{
+	return cpu_to_be32(value >> 32) | ((u64)cpu_to_be32(value) << 32);
+}
+#endif
+
+#define ntohll cpu_to_be64
+#define htonll cpu_to_be64
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
+#ifndef SIOCETHTOOL
+#define SIOCETHTOOL     0x8946
 #endif
 
 /* National Semiconductor DP83815, DP83816 */
@@ -103,4 +125,14 @@ int sfc_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 int st_mac100_dump_regs(struct ethtool_drvinfo *info,
 			struct ethtool_regs *regs);
 int st_gmac_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
-#endif
+
+/* Rx flow classification */
+int rxclass_parse_ruleopts(char **optstr, int opt_cnt,
+			   void *fsp, __u8 *loc_valid);
+int rxclass_rule_getall(int fd, struct ifreq *ifr);
+int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc);
+int rxclass_rule_ins(int fd, struct ifreq *ifr,
+		     struct ethtool_rx_flow_spec *fsp, __u8 loc_valid);
+int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc);
+
+#endif /* ETHTOOL_UTIL_H__ */
diff --git a/ethtool.8.in b/ethtool.8.in
index 7dec259..b68b010 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -40,10 +40,20 @@
 [\\fB\\$1\\fP\ \\fIN\\fP]
 ..
 .\"
+.\"	.BM - same as above but has a mask field for format "[value N [value-mask N]]"
+.\"
+.de BM
+[\\fB\\$1\\fP\ \\fIN\\fP\ [\\fB\\$1\-mask\\fP\ \\fIN\\fP]]
+..
+.\"
 .\"	\(*MA - mac address
 .\"
 .ds MA \fIxx\fP\fB:\fP\fIyy\fP\fB:\fP\fIzz\fP\fB:\fP\fIaa\fP\fB:\fP\fIbb\fP\fB:\fP\fIcc\fP
 .\"
+.\"	\(*PA - IP address
+.\"
+.ds PA \fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP
+.\"
 .\"	\(*WO - wol flags
 .\"
 .ds WO \fBp\fP|\fBu\fP|\fBm\fP|\fBb\fP|\fBa\fP|\fBg\fP|\fBs\fP|\fBd\fP...
@@ -55,6 +65,12 @@
 .\"	\(*HO - hash options
 .\"
 .ds HO \fBm\fP|\fBv\fP|\fBt\fP|\fBs\fP|\fBd\fP|\fBf\fP|\fBn\fP|\fBr\fP...
+.\"
+.\"	\(*NC - Network Classifier type values
+.\"
+.ds NC \fBether\fP|\fBip4\fP|\fBtcp4\fP|\fBudp4\fP|\fBsctp4\fP|\fBah4\fP|\fBesp4\fP
+
+.\"
 .\" Start URL.
 .de UR
 .  ds m1 \\$1\"
@@ -227,8 +243,7 @@ ethtool \- query or control network driver and hardware settings
 
 .B ethtool \-N
 .I ethX
-.RB [ rx-flow-hash \ \*(FL
-.RB \ \*(HO]
+.RB [ rx-flow-hash \ \*(FL \  \*(HO]
 
 .B ethtool \-x|\-\-show\-rxfh\-indir
 .I ethX
@@ -248,51 +263,38 @@ ethtool \- query or control network driver and hardware settings
 
 .B ethtool \-u|\-\-show\-ntuple
 .I ethX
+.BN class-rule
 
-.TP
 .BI ethtool\ \-U|\-\-config\-ntuple \ ethX
-.RB {
-.A3 flow-type tcp4 udp4 sctp4
-.RB [ src-ip
-.IR addr
-.RB [ src-ip-mask
-.IR mask ]]
-.RB [ dst-ip
-.IR addr
-.RB [ dst-ip-mask
-.IR mask ]]
-.RB [ src-port
-.IR port
-.RB [ src-port-mask
-.IR mask ]]
-.RB [ dst-port
-.IR port
-.RB [ dst-port-mask
-.IR mask ]]
-.br
-.RB | \ flow-type\ ether
-.RB [ src
-.IR mac-addr
-.RB [ src-mask
-.IR mask ]]
-.RB [ dst
-.IR mac-addr
-.RB [ dst-mask
-.IR mask ]]
-.RB [ proto
-.IR N
-.RB [ proto-mask
-.IR mask ]]\ }
-.br
-.RB [ vlan
-.IR VLAN-tag
-.RB [ vlan-mask
-.IR mask ]]
-.RB [ user-def
-.IR data
-.RB [ user-def-mask
-.IR mask ]]
-.RI action \ N
+.BN class-rule-del
+.RB [\  class-rule-add \ \*(NC
+.RB [ src \ \*(MA\ [ src-mask \ \*(MA]]
+.RB [ dst \ \*(MA\ [ dst-mask \ \*(MA]]
+.BM proto
+.RB [ src-ip \ \*(PA\ [ src-ip-mask \ \*(PA]]
+.RB [ dst-ip \ \*(PA\ [ dst-ip-mask \ \*(PA]]
+.BM tos
+.BM l4proto
+.BM src-port
+.BM dst-port
+.BM spi
+.BN action
+.BN loc
+.RB \ |\  flow-type \ \*(NC
+.RB [ src \ \*(MA\ [ src-mask \ \*(MA]]
+.RB [ dst \ \*(MA\ [ dst-mask \ \*(MA]]
+.BM proto
+.RB [ src-ip \ \*(PA\ [ src-ip-mask \ \*(PA]]
+.RB [ dst-ip \ \*(PA\ [ dst-ip-mask \ \*(PA]]
+.BM tos
+.BM l4proto
+.BM src-port
+.BM dst-port
+.BM spi
+.BM vlan
+.BM user-def
+.BN action
+.RB ]
 
 .SH DESCRIPTION
 .BI ethtool
@@ -654,7 +656,8 @@ Hash on bytes 0 and 1 of the Layer 4 header of the rx packet.
 Hash on bytes 2 and 3 of the Layer 4 header of the rx packet.
 .TP 3
 .B r
-Discard all packets of this flow type. When this option is set, all other options are ignored.
+Discard all packets of this flow type. When this option is set, all
+other options are ignored.
 .PD
 .RE
 .TP
@@ -685,14 +688,23 @@ Default region is 0 which denotes all regions in the flash.
 .TP
 .B \-u \-\-show-ntuple
 Get Rx ntuple filters and actions, then display them to the user.
+.TP
+.BI class-rule \ N
+Retrieves the RX classification rule with the given ID.
 .PD
 .RE
 .TP
 .B \-U \-\-config-ntuple
 Configure Rx ntuple filters and actions
 .TP
-.B flow-type tcp4|udp4|sctp4|ether
+.BI class-rule-del \ N
+Deletes the RX classification rule with the given ID.
+.PP
+.BR class-rule-add \ \*(NC
+.br
+.BR flow-type \ \*(NC
 .RS
+Adds an RX packet classification rule.
 .PD 0
 .TP 3
 .BR "tcp4" "    TCP over IPv4"
@@ -701,78 +713,80 @@ Configure Rx ntuple filters and actions
 .TP 3
 .BR "sctp4" "   SCTP over IPv4"
 .TP 3
+.BR "ah4" "     IPSEC AH over IPv4"
+.TP 3
+.BR "esp4" "    IPSEC ESP over IPv4"
+.TP 3
+.BR "ip4" "     Raw IPv4"
+.TP 3
 .BR "ether" "   Ethernet"
 .PD
 .RE
 .TP
-.BI src-ip \ addr
-Includes the source IP address, specified using dotted-quad notation
-or as a single 32-bit number.
-.TP
-.BI src-ip-mask \ mask
-Specify a mask for the source IP address.
-.TP
-.BI dst-ip \ addr
-Includes the destination IP address.
-.TP
-.BI dst-ip-mask \ mask
-Specify a mask for the destination IP address.
-.TP
-.BI src-port \ port
-Includes the source port.
-.TP
-.BI src-port-mask \ mask
-Specify a mask for the source port.
-.TP
-.BI dst-port \ port
-Includes the destination port.
+.BR src \ \*(MA\ [ src-mask \ \*(MA]
+Includes the source MAC address, specified as 6 bytes in hexadecimal
+separated by colons, along with an optional mask.
 .TP
-.BI dst-port-mask \ mask
-Specify a mask for the destination port.
+.BR dst \ \*(MA\ [ dst-mask \ \*(MA]
+Includes the destination MAC address, specified as 6 bytes in hexadecimal
+separated by colons, along with an optional mask.
 .TP
-.BI src \ mac-addr
-Includes the source MAC address, specified as 6 bytes in hexadecimal
-separated by colons.
+.BI proto \ N \\fR\ [\\fPproto-mask \ N \\fR]\\fP
+Includes the Ethernet protocol number (ethertype) and an optional mask.
 .TP
-.BI src-mask \ mask
-Specify a mask for the source MAC address.
+.BR src-ip \ \*(PA\ [ src-ip-mask \ \*(PA]
+Specify the source IP address of the incoming packet to
+match along with an optional mask.
 .TP
-.BI dst \ mac-addr
-Includes the destination MAC address.
+.BR dst-ip \ \*(PA\ [ dst-ip-mask \ \*(PA]
+Specify the destination IP address of the incoming packet to
+match along with an optional mask.
 .TP
-.BI dst-mask \ mask
-Specify a mask for the destination MAC address.
+.BI tos \ N \\fR\ [\\fPtos-mask \ N \\fR]\\fP
+Specify the value of the Type of Service field in the incoming packet to
+match along with an optional mask.
 .TP
-.BI proto \ N
-Includes the Ethernet protocol number (ethertype).
+.BI l4proto \ N \\fR\ [\\fPl4proto-mask \ N \\fR]\\fP
+Includes the layer 4 protocol number and optional mask.
 .TP
-.BI proto-mask \ mask
-Specify a mask for the Ethernet protocol number.
+.BI src-port \ N \\fR\ [\\fPsrc-port-mask \ N \\fR]\\fP
+Specify the value of the source port field (applicable to
+TCP/UDP packets) in the incoming packet to match along with an
+optional mask.
 .TP
-.BI vlan \ VLAN-tag
-Includes the VLAN tag.
+.BI dst-port \ N \\fR\ [\\fPdst-port-mask \ N \\fR]\\fP
+Specify the value of the destination port field (applicable to
+TCP/UDP packets) in the incoming packet to match along with an
+optional mask.
 .TP
-.BI vlan-mask \ mask
-Specify a mask for the VLAN tag.
+.BI spi \ N \\fR\ [\\fPspi-mask \ N \\fR]\\fP
+Specify the value of the security parameter index field (applicable to
+AH/ESP packets) in the incoming packet to match along with an
+optional mask.
 .TP
-.BI user-def \ data
-Includes 64-bits of user-specific data.
+.BI vlan \ N \\fR\ [\\fPvlan-mask \ N \\fR]\\fP
+Includes the VLAN tag and an optional mask.
 .TP
-.BI user-def-mask \ mask
-Specify a mask for the user-specific data.
+.BI user-def \ N \\fR\ [\\fPuser-def-mask \ N \\fR]\\fP
+Includes 64-bits of user-specific data and an optional mask.
 .TP
 .BI action \ N
 Specifies the Rx queue to send packets to, or some other action.
 .RS
 .PD 0
 .TP 3
-.BR "-2" "             Clear the filter"
+.BR "-2" "             Clear the filter (ntuple only)"
 .TP 3
 .BR "-1" "             Drop the matched flow"
 .TP 3
 .BR "0 or higher" "    Rx queue to route the flow"
 .PD
 .RE
+.TP
+.BI loc \ N
+Specify the location/ID to insert the rule. This will overwrite
+any rule present in that location and will not go through any
+of the rule ordering process.
 .SH BUGS
 Not supported (in part or whole) on all network drivers.
 .SH AUTHOR
diff --git a/ethtool.c b/ethtool.c
index 2a084db..f4dfc39 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -6,6 +6,7 @@
  * Kernel 2.4 update Copyright 2001 Jeff Garzik <jgarzik@mandrakesoft.com>
  * Wake-on-LAN,natsemi,misc support by Tim Hockin <thockin@sun.com>
  * Portions Copyright 2002 Intel
+ * Portions Copyright (C) Sun Microsystems 2008
  * do_test support by Eli Kupermann <eli.kupermann@intel.com>
  * ETHTOOL_PHYS_ID support by Chris Leech <christopher.leech@intel.com>
  * e1000 support by Scott Feldman <scott.feldman@intel.com>
@@ -14,6 +15,7 @@
  * amd8111e support by Reeja John <reeja.john@amd.com>
  * long arguments by Andi Kleen.
  * SMSC LAN911x support by Steve Glendinning <steve.glendinning@smsc.com>
+ * Rx Network Flow Control configuration support <santwona.behera@sun.com>
  * Various features by Ben Hutchings <bhutchings@solarflare.com>;
  *	Copyright 2009, 2010 Solarflare Communications
  *
@@ -43,18 +45,13 @@
 #include <arpa/inet.h>
 
 #include <linux/sockios.h>
+#include <sys/socket.h>
+#include <arpa/inet.h>
 #include "ethtool-util.h"
 
-
-#ifndef SIOCETHTOOL
-#define SIOCETHTOOL     0x8946
-#endif
 #ifndef MAX_ADDR_LEN
 #define MAX_ADDR_LEN	32
 #endif
-#ifndef ARRAY_SIZE
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
-#endif
 
 #ifndef HAVE_NETIF_MSG
 enum {
@@ -100,7 +97,6 @@ static int do_gstats(int fd, struct ifreq *ifr);
 static int rxflow_str_to_type(const char *str);
 static int parse_rxfhashopts(char *optstr, u32 *data);
 static char *unparse_rxfhashopts(u64 opts);
-static void parse_rxntupleopts(int argc, char **argp, int first_arg);
 static int dump_rxfhash(int fhash, u64 val);
 static int do_srxclass(int fd, struct ifreq *ifr);
 static int do_grxclass(int fd, struct ifreq *ifr);
@@ -246,20 +242,37 @@ static struct option {
 		"		equal N | weight W0 W1 ...\n" },
     { "-U", "--config-ntuple", MODE_SNTUPLE, "Configure Rx ntuple filters "
 		"and actions",
-		"		{ flow-type tcp4|udp4|sctp4\n"
-		"		  [ src-ip ADDR [src-ip-mask MASK] ]\n"
-		"		  [ dst-ip ADDR [dst-ip-mask MASK] ]\n"
-		"		  [ src-port PORT [src-port-mask MASK] ]\n"
-		"		  [ dst-port PORT [dst-port-mask MASK] ]\n"
-		"		| flow-type ether\n"
-		"		  [ src MAC-ADDR [src-mask MASK] ]\n"
-		"		  [ dst MAC-ADDR [dst-mask MASK] ]\n"
-		"		  [ proto N [proto-mask MASK] ] }\n"
-		"		[ vlan VLAN-TAG [vlan-mask MASK] ]\n"
-		"		[ user-def DATA [user-def-mask MASK] ]\n"
-		"		action N\n" },
+		"		[ class-rule-del %d ]\n"
+		"		[ class-rule-add ether|ip4|tcp4|udp4|sctp4|ah4|esp4\n"
+		"			[ src %x:%x:%x:%x:%x:%x [src-mask %x:%x:%x:%x:%x:%x] ]\n"
+		"			[ dst %x:%x:%x:%x:%x:%x [dst-mask %x:%x:%x:%x:%x:%x] ]\n"
+		"			[ proto %d [proto-mask MASK] ]\n"
+		"			[ src-ip %d.%d.%d.%d [src-ip-mask %d.%d.%d.%d] ]\n"
+		"			[ dst-ip %d.%d.%d.%d [dst-ip-mask %d.%d.%d.%d] ]\n"
+		"			[ tos %d [tos-mask %x] ]\n"
+		"			[ l4proto %d [l4proto-mask MASK] ]\n"
+		"			[ src-port %d [src-port-mask %x] ]\n"
+		"			[ dst-port %d [dst-port-mask %x] ]\n"
+		"			[ spi %d [spi-mask %x] ]\n"
+		"			[ action %d ]\n"
+		"			[ loc %d] |\n"
+		"		  flow-type ether|ip4|tcp4|udp4|sctp4|ah4|esp4\n"
+		"			[ src %x:%x:%x:%x:%x:%x [src-mask %x:%x:%x:%x:%x:%x] ]\n"
+		"			[ dst %x:%x:%x:%x:%x:%x [dst-mask %x:%x:%x:%x:%x:%x] ]\n"
+		"			[ proto %d [proto-mask MASK] ]\n"
+		"			[ src-ip %d.%d.%d.%d [src-ip-mask %d.%d.%d.%d] ]\n"
+		"			[ dst-ip %d.%d.%d.%d [dst-ip-mask %d.%d.%d.%d] ]\n"
+		"			[ tos %d [tos-mask %x] ]\n"
+		"			[ l4proto %d [l4proto-mask MASK] ]\n"
+		"			[ src-port %d [src-port-mask %x] ]\n"
+		"			[ dst-port %d [dst-port-mask %x] ]\n"
+		"			[ spi %d [spi-mask %x] ]\n"
+		"			[ vlan %x [vlan-mask %x] ]\n"
+		"			[ user-def %x [user-def-mask %x] ]\n"
+		"			[ action %d ] ]\n" },
     { "-u", "--show-ntuple", MODE_GNTUPLE,
-		"Get Rx ntuple filters and actions\n" },
+		"Get Rx ntuple filters and actions",
+		"		[ class-rule %d ]\n"},
     { "-P", "--show-permaddr", MODE_PERMADDR,
 		"Show permanent hardware address" },
     { "-h", "--help", MODE_HELP, "Show this help" },
@@ -381,24 +394,6 @@ static int rxfhindir_equal = 0;
 static char **rxfhindir_weight = NULL;
 static int sntuple_changed = 0;
 static struct ethtool_rx_ntuple_flow_spec ntuple_fs;
-static int ntuple_ip4src_seen = 0;
-static int ntuple_ip4src_mask_seen = 0;
-static int ntuple_ip4dst_seen = 0;
-static int ntuple_ip4dst_mask_seen = 0;
-static int ntuple_psrc_seen = 0;
-static int ntuple_psrc_mask_seen = 0;
-static int ntuple_pdst_seen = 0;
-static int ntuple_pdst_mask_seen = 0;
-static int ntuple_ether_dst_seen = 0;
-static int ntuple_ether_dst_mask_seen = 0;
-static int ntuple_ether_src_seen = 0;
-static int ntuple_ether_src_mask_seen = 0;
-static int ntuple_ether_proto_seen = 0;
-static int ntuple_ether_proto_mask_seen = 0;
-static int ntuple_vlan_tag_seen = 0;
-static int ntuple_vlan_tag_mask_seen = 0;
-static int ntuple_user_def_seen = 0;
-static int ntuple_user_def_mask_seen = 0;
 static char *flash_file = NULL;
 static int flash = -1;
 static int flash_region = -1;
@@ -407,6 +402,12 @@ static int msglvl_changed;
 static u32 msglvl_wanted = 0;
 static u32 msglvl_mask = 0;
 
+static int rx_class_rule_get = -1;
+static int rx_class_rule_del = -1;
+static int rx_class_rule_added = 0;
+static struct ethtool_rx_flow_spec rx_rule_fs;
+static u8 rxclass_loc_valid = 0;
+
 static enum {
 	ONLINE=0,
 	OFFLINE,
@@ -519,58 +520,6 @@ static struct cmdline_info cmdline_coalesce[] = {
 	{ "tx-frames-high", CMDL_S32, &coal_tx_frames_high_wanted, &ecoal.tx_max_coalesced_frames_high },
 };
 
-static struct cmdline_info cmdline_ntuple_tcp_ip4[] = {
-	{ "src-ip", CMDL_IP4, &ntuple_fs.h_u.tcp_ip4_spec.ip4src, NULL,
-	  0, &ntuple_ip4src_seen },
-	{ "src-ip-mask", CMDL_IP4, &ntuple_fs.m_u.tcp_ip4_spec.ip4src, NULL,
-	  0, &ntuple_ip4src_mask_seen },
-	{ "dst-ip", CMDL_IP4, &ntuple_fs.h_u.tcp_ip4_spec.ip4dst, NULL,
-	  0, &ntuple_ip4dst_seen },
-	{ "dst-ip-mask", CMDL_IP4, &ntuple_fs.m_u.tcp_ip4_spec.ip4dst, NULL,
-	  0, &ntuple_ip4dst_mask_seen },
-	{ "src-port", CMDL_BE16, &ntuple_fs.h_u.tcp_ip4_spec.psrc, NULL,
-	  0, &ntuple_psrc_seen },
-	{ "src-port-mask", CMDL_BE16, &ntuple_fs.m_u.tcp_ip4_spec.psrc, NULL,
-	  0, &ntuple_psrc_mask_seen },
-	{ "dst-port", CMDL_BE16, &ntuple_fs.h_u.tcp_ip4_spec.pdst, NULL,
-	  0, &ntuple_pdst_seen },
-	{ "dst-port-mask", CMDL_BE16, &ntuple_fs.m_u.tcp_ip4_spec.pdst, NULL,
-	  0, &ntuple_pdst_mask_seen },
-	{ "vlan", CMDL_U16, &ntuple_fs.vlan_tag, NULL,
-	  0, &ntuple_vlan_tag_seen },
-	{ "vlan-mask", CMDL_U16, &ntuple_fs.vlan_tag_mask, NULL,
-	  0, &ntuple_vlan_tag_mask_seen },
-	{ "user-def", CMDL_U64, &ntuple_fs.data, NULL,
-	  0, &ntuple_user_def_seen },
-	{ "user-def-mask", CMDL_U64, &ntuple_fs.data_mask, NULL,
-	  0, &ntuple_user_def_mask_seen },
-	{ "action", CMDL_S32, &ntuple_fs.action, NULL },
-};
-
-static struct cmdline_info cmdline_ntuple_ether[] = {
-	{ "dst", CMDL_MAC, ntuple_fs.h_u.ether_spec.h_dest, NULL,
-	  0, &ntuple_ether_dst_seen },
-	{ "dst-mask", CMDL_MAC, ntuple_fs.m_u.ether_spec.h_dest, NULL,
-	  0, &ntuple_ether_dst_mask_seen },
-	{ "src", CMDL_MAC, ntuple_fs.h_u.ether_spec.h_source, NULL,
-	  0, &ntuple_ether_src_seen },
-	{ "src-mask", CMDL_MAC, ntuple_fs.m_u.ether_spec.h_source, NULL,
-	  0, &ntuple_ether_src_mask_seen },
-	{ "proto", CMDL_BE16, &ntuple_fs.h_u.ether_spec.h_proto, NULL,
-	  0, &ntuple_ether_proto_seen },
-	{ "proto-mask", CMDL_BE16, &ntuple_fs.m_u.ether_spec.h_proto, NULL,
-	  0, &ntuple_ether_proto_mask_seen },
-	{ "vlan", CMDL_U16, &ntuple_fs.vlan_tag, NULL,
-	  0, &ntuple_vlan_tag_seen },
-	{ "vlan-mask", CMDL_U16, &ntuple_fs.vlan_tag_mask, NULL,
-	  0, &ntuple_vlan_tag_mask_seen },
-	{ "user-def", CMDL_U64, &ntuple_fs.data, NULL,
-	  0, &ntuple_user_def_seen },
-	{ "user-def-mask", CMDL_U64, &ntuple_fs.data_mask, NULL,
-	  0, &ntuple_user_def_mask_seen },
-	{ "action", CMDL_S32, &ntuple_fs.action, NULL },
-};
-
 static struct cmdline_info cmdline_msglvl[] = {
 	{ "drv", CMDL_FLAG, &msglvl_wanted, NULL,
 	  NETIF_MSG_DRV, &msglvl_mask },
@@ -924,14 +873,49 @@ static void parse_cmdline(int argc, char **argp)
 			}
 			if (mode == MODE_SNTUPLE) {
 				if (!strcmp(argp[i], "flow-type")) {
+					if (rxclass_parse_ruleopts(&argp[i],
+								   argc - i,
+								   &ntuple_fs,
+								   NULL) < 0) {
+						show_usage(1);
+					} else {
+						i = argc;
+						sntuple_changed = 1;
+					}
+				} else if (!strcmp(argp[i], "class-rule-del")) {
 					i += 1;
 					if (i >= argc) {
 						show_usage(1);
 						break;
 					}
-					parse_rxntupleopts(argc, argp, i);
-					i = argc;
-					break;
+					rx_class_rule_del =
+						get_uint_range(argp[i], 0,
+							       INT_MAX);
+				} else if (!strcmp(argp[i], "class-rule-add")) {
+					if (rxclass_parse_ruleopts(&argp[i],
+								   argc - i,
+								   &rx_rule_fs,
+								   &rxclass_loc_valid) < 0) {
+						show_usage(1);
+					} else {
+						i = argc;
+						rx_class_rule_added = 1;
+					}
+				} else {
+					show_usage(1);
+				}
+				break;
+			}
+			if (mode == MODE_GNTUPLE) {
+				if (!strcmp(argp[i], "class-rule")) {
+					i += 1;
+					if (i >= argc) {
+						show_usage(1);
+						break;
+					}
+					rx_class_rule_get =
+						get_uint_range(argp[i], 0,
+							       INT_MAX);
 				} else {
 					show_usage(1);
 				}
@@ -981,8 +965,10 @@ static void parse_cmdline(int argc, char **argp)
 						show_usage(1);
 					else
 						rx_fhash_changed = 1;
-				} else
+				} else {
 					show_usage(1);
+				}
+
 				break;
 			}
 			if (mode == MODE_SRXFHINDIR) {
@@ -1594,66 +1580,6 @@ static char *unparse_rxfhashopts(u64 opts)
 	return buf;
 }
 
-static void parse_rxntupleopts(int argc, char **argp, int i)
-{
-	ntuple_fs.flow_type = rxflow_str_to_type(argp[i]);
-
-	switch (ntuple_fs.flow_type) {
-	case TCP_V4_FLOW:
-	case UDP_V4_FLOW:
-	case SCTP_V4_FLOW:
-		parse_generic_cmdline(argc, argp, i + 1,
-				      &sntuple_changed,
-				      cmdline_ntuple_tcp_ip4,
-				      ARRAY_SIZE(cmdline_ntuple_tcp_ip4));
-		if (!ntuple_ip4src_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.ip4src = 0xffffffff;
-		if (!ntuple_ip4dst_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.ip4dst = 0xffffffff;
-		if (!ntuple_psrc_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.psrc = 0xffff;
-		if (!ntuple_pdst_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.pdst = 0xffff;
-		ntuple_fs.m_u.tcp_ip4_spec.tos = 0xff;
-		break;
-	case ETHER_FLOW:
-		parse_generic_cmdline(argc, argp, i + 1,
-				      &sntuple_changed,
-				      cmdline_ntuple_ether,
-				      ARRAY_SIZE(cmdline_ntuple_ether));
-		if (!ntuple_ether_dst_seen)
-			memset(ntuple_fs.m_u.ether_spec.h_dest, 0xff, ETH_ALEN);
-		if (!ntuple_ether_src_seen)
-			memset(ntuple_fs.m_u.ether_spec.h_source, 0xff,
-			       ETH_ALEN);
-		if (!ntuple_ether_proto_seen)
-			ntuple_fs.m_u.ether_spec.h_proto = 0xffff;
-		break;
-	default:
-		fprintf(stderr, "Unsupported flow type \"%s\"\n", argp[i]);
-		exit(106);
-		break;
-	}
-
-	if (!ntuple_vlan_tag_seen)
-		ntuple_fs.vlan_tag_mask = 0xffff;
-	if (!ntuple_user_def_seen)
-		ntuple_fs.data_mask = 0xffffffffffffffffULL;
-
-	if ((ntuple_ip4src_mask_seen && !ntuple_ip4src_seen) ||
-	    (ntuple_ip4dst_mask_seen && !ntuple_ip4dst_seen) ||
-	    (ntuple_psrc_mask_seen && !ntuple_psrc_seen) ||
-	    (ntuple_pdst_mask_seen && !ntuple_pdst_seen) ||
-	    (ntuple_ether_dst_mask_seen && !ntuple_ether_dst_seen) ||
-	    (ntuple_ether_src_mask_seen && !ntuple_ether_src_seen) ||
-	    (ntuple_ether_proto_mask_seen && !ntuple_ether_proto_seen) ||
-	    (ntuple_vlan_tag_mask_seen && !ntuple_vlan_tag_seen) ||
-	    (ntuple_user_def_mask_seen && !ntuple_user_def_seen)) {
-		fprintf(stderr, "Cannot specify mask without value\n");
-		exit(107);
-	}
-}
-
 static struct {
 	const char *name;
 	int (*func)(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
@@ -2922,14 +2848,12 @@ static int do_gstats(int fd, struct ifreq *ifr)
 	return 0;
 }
 
-
 static int do_srxclass(int fd, struct ifreq *ifr)
 {
-	int err;
+	int err = 0;
+	struct ethtool_rxnfc nfccmd;
 
 	if (rx_fhash_changed) {
-		struct ethtool_rxnfc nfccmd;
-
 		nfccmd.cmd = ETHTOOL_SRXFH;
 		nfccmd.flow_type = rx_fhash_set;
 		nfccmd.data = rx_fhash_val;
@@ -2941,12 +2865,12 @@ static int do_srxclass(int fd, struct ifreq *ifr)
 
 	}
 
-	return 0;
+	return err ? 1 : 0;
 }
 
 static int do_grxclass(int fd, struct ifreq *ifr)
 {
-	int err;
+	int err = 0;
 
 	if (rx_fhash_get) {
 		struct ethtool_rxnfc nfccmd;
@@ -2961,7 +2885,7 @@ static int do_grxclass(int fd, struct ifreq *ifr)
 			dump_rxfhash(rx_fhash_get, nfccmd.data);
 	}
 
-	return 0;
+	return err ? 1 : 0;
 }
 
 static int do_grxfhindir(int fd, struct ifreq *ifr)
@@ -3142,7 +3066,17 @@ static int do_srxntuple(int fd, struct ifreq *ifr)
 {
 	int err;
 
-	if (sntuple_changed) {
+	if (rx_class_rule_added) {
+		err = rxclass_rule_ins(fd, ifr, &rx_rule_fs,
+				       rxclass_loc_valid);
+		if (err < 0)
+			fprintf(stderr, "Cannot insert RX classification rule\n");
+	} else if (rx_class_rule_del >= 0) {
+		err = rxclass_rule_del(fd, ifr, rx_class_rule_del);
+
+		if (err < 0)
+			fprintf(stderr, "Cannot delete RX classification rule\n");
+	} else if (sntuple_changed) {
 		struct ethtool_rx_ntuple ntuplecmd;
 
 		ntuplecmd.cmd = ETHTOOL_SRXNTUPLE;
@@ -3162,7 +3096,29 @@ static int do_srxntuple(int fd, struct ifreq *ifr)
 
 static int do_grxntuple(int fd, struct ifreq *ifr)
 {
-	return 0;
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	if (rx_class_rule_get >= 0) {
+		err = rxclass_rule_get(fd, ifr, rx_class_rule_get);
+		if (err < 0)
+			fprintf(stderr, "Cannot get RX classification rule\n");
+		return err ? 1 : 0;
+	}
+
+	nfccmd.cmd = ETHTOOL_GRXRINGS;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0)
+		perror("Cannot get RX rings");
+	else
+		fprintf(stdout, "%d RX rings available\n",
+			(int)nfccmd.data);
+
+	err = rxclass_rule_getall(fd, ifr);
+	if (err < 0)
+		fprintf(stderr, "RX classification rule retrieval failed\n");
+	return err ? 1 : 0;
 }
 
 static int send_ioctl(int fd, struct ifreq *ifr)
diff --git a/rxclass.c b/rxclass.c
new file mode 100644
index 0000000..f2a8c96
--- /dev/null
+++ b/rxclass.c
@@ -0,0 +1,1169 @@
+/*
+ * Copyright (C) 2008 Sun Microsystems, Inc. All rights reserved.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stddef.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include <linux/sockios.h>
+#include <arpa/inet.h>
+#include "ethtool-util.h"
+#include "ethtool-bitops.h"
+
+/*
+ * This is a rule manager implementation for ordering rx flow
+ * classification rules in a longest prefix first match order.
+ * The assumption is that this rule manager is the only one adding rules to
+ * the device's hardware classifier.
+ */
+
+struct rmgr_ctrl {
+	/* slot contains a bitmap indicating which filters are valid */
+	unsigned long		*slot;
+	__u32			n_rules;
+	__u32			size;
+};
+
+static struct rmgr_ctrl rmgr;
+static int rmgr_init_done = 0;
+
+static void rmgr_print_nfc_rule(struct ethtool_rx_flow_spec *fsp)
+{
+	unsigned char	*smac, *smacm, *dmac, *dmacm;
+	__u32		sip, dip, sipm, dipm;
+	__u16		proto, protom;
+
+	fprintf(stdout,	"Filter: %d\n", fsp->location);
+
+	switch (fsp->flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case IP_USER_FLOW:
+		sip = ntohl(fsp->h_u.tcp_ip4_spec.ip4src);
+		dip = ntohl(fsp->h_u.tcp_ip4_spec.ip4dst);
+		sipm = ntohl(fsp->m_u.tcp_ip4_spec.ip4src);
+		dipm = ntohl(fsp->m_u.tcp_ip4_spec.ip4dst);
+
+		switch (fsp->flow_type) {
+		case TCP_V4_FLOW:
+			fprintf(stdout, "\tRule Type: TCP over IPv4\n");
+			break;
+		case UDP_V4_FLOW:
+			fprintf(stdout, "\tRule Type: UDP over IPv4\n");
+			break;
+		case SCTP_V4_FLOW:
+			fprintf(stdout, "\tRule Type: SCTP over IPv4\n");
+			break;
+		case AH_V4_FLOW:
+			fprintf(stdout, "\tRule Type: IPSEC AH over IPv4\n");
+			break;
+		case ESP_V4_FLOW:
+			fprintf(stdout, "\tRule Type: IPSEC ESP over IPv4\n");
+			break;
+		case IP_USER_FLOW:
+			fprintf(stdout, "\tRule Type: Raw IPv4\n");
+			break;
+		default:
+			break;
+		}
+
+		fprintf(stdout,
+			"\tSrc IP addr: %d.%d.%d.%d mask: %d.%d.%d.%d\n"
+			"\tDest IP addr: %d.%d.%d.%d mask: %d.%d.%d.%d\n"
+			"\tTOS: 0x%x mask: 0x%x\n",
+			(sip & 0xff000000) >> 24,
+			(sip & 0xff0000) >> 16,
+			(sip & 0xff00) >> 8,
+			sip & 0xff,
+			(sipm & 0xff000000) >> 24,
+			(sipm & 0xff0000) >> 16,
+			(sipm & 0xff00) >> 8,
+			sipm & 0xff,
+			(dip & 0xff000000) >> 24,
+			(dip & 0xff0000) >> 16,
+			(dip & 0xff00) >> 8,
+			dip & 0xff,
+			(dipm & 0xff000000) >> 24,
+			(dipm & 0xff0000) >> 16,
+			(dipm & 0xff00) >> 8,
+			dipm & 0xff,
+			fsp->h_u.tcp_ip4_spec.tos,
+			fsp->m_u.tcp_ip4_spec.tos);
+
+		switch (fsp->flow_type) {
+		case TCP_V4_FLOW:
+		case UDP_V4_FLOW:
+		case SCTP_V4_FLOW:
+			fprintf(stdout,
+				"\tSrc port: %d mask: 0x%x\n"
+				"\tDest port: %d mask: 0x%x\n",
+				ntohs(fsp->h_u.tcp_ip4_spec.psrc),
+				ntohs(fsp->m_u.tcp_ip4_spec.psrc),
+				ntohs(fsp->h_u.tcp_ip4_spec.pdst),
+				ntohs(fsp->m_u.tcp_ip4_spec.pdst));
+			break;
+		case AH_V4_FLOW:
+		case ESP_V4_FLOW:
+			fprintf(stdout,
+				"\tSPI: %u mask: 0x%x\n",
+				ntohl(fsp->h_u.esp_ip4_spec.spi),
+				ntohl(fsp->m_u.esp_ip4_spec.spi));
+			break;
+		case IP_USER_FLOW:
+			fprintf(stdout,
+				"\tProtocol: %d mask: 0x%x\n"
+				"\tL4 bytes: 0x%x mask: 0x%x\n",
+				fsp->h_u.usr_ip4_spec.proto,
+				fsp->m_u.usr_ip4_spec.proto,
+				ntohl(fsp->h_u.usr_ip4_spec.l4_4_bytes),
+				ntohl(fsp->m_u.usr_ip4_spec.l4_4_bytes));
+			break;
+		default:
+			break;
+		}
+		break;
+	case ETHER_FLOW:
+		dmac = fsp->h_u.ether_spec.h_dest;
+		dmacm = fsp->m_u.ether_spec.h_dest;
+		smac = fsp->h_u.ether_spec.h_source;
+		smacm = fsp->m_u.ether_spec.h_source;
+		proto = ntohs(fsp->h_u.ether_spec.h_proto);
+		protom = ntohs(fsp->m_u.ether_spec.h_proto);
+
+		fprintf(stdout,
+			"\tFlow Type: Raw Ethernet\n"
+			"\tSrc MAC addr: %02X:%02X:%02X:%02X:%02X:%02X"
+			" mask: %02X:%02X:%02X:%02X:%02X:%02X\n"
+			"\tDest MAC addr: %02X:%02X:%02X:%02X:%02X:%02X"
+			" mask: %02X:%02X:%02X:%02X:%02X:%02X\n"
+			"\tEthertype: 0x%X mask: 0x%X\n",
+			smac[0], smac[1], smac[2], smac[3], smac[4], smac[5],
+			smacm[0], smacm[1], smacm[2], smacm[3], smacm[4], smacm[5],
+			dmac[0], dmac[1], dmac[2], dmac[3], dmac[4], dmac[5],
+			dmacm[0], dmacm[1], dmacm[2], dmacm[3], dmacm[4], dmacm[5],
+			proto, protom);
+		break;
+	default:
+		fprintf(stdout,
+			"\tUnknown Flow type: %d\n", fsp->flow_type);
+		break;
+	}
+
+	if (fsp->ring_cookie != RX_CLS_FLOW_DISC)
+		fprintf(stdout, "\tAction: Direct to queue %llu\n",
+			fsp->ring_cookie);
+	else
+		fprintf(stdout, "\tAction: Drop\n");
+
+	fprintf(stdout, "\n\n");
+}
+
+static void rmgr_print_rule(struct ethtool_rx_flow_spec *fsp)
+{
+	/* print the rule in this location */
+	switch (fsp->flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case ETHER_FLOW:
+		rmgr_print_nfc_rule(fsp);
+		break;
+	case IP_USER_FLOW:
+		if (fsp->h_u.usr_ip4_spec.ip_ver == ETH_RX_NFC_IP4) {
+			rmgr_print_nfc_rule(fsp);
+			break;
+		}
+		/* IPv6 User Flow falls through to the case below */
+	case TCP_V6_FLOW:
+	case UDP_V6_FLOW:
+	case SCTP_V6_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V6_FLOW:
+		fprintf(stderr, "IPv6 flows not implemented\n");
+		break;
+	default:
+		fprintf(stderr, "rmgr: Unknown flow type\n");
+		break;
+	}
+}
+
+static int rmgr_ins(__u32 loc)
+{
+	/* verify location is in rule manager range */
+	if (loc >= rmgr.size) {
+		fprintf(stderr, "rmgr: Location out of range\n");
+		return -1;
+	}
+
+	/* set bit for the rule */
+	set_bit(loc, rmgr.slot);
+
+	return 0;
+}
+
+static int rmgr_find(__u32 loc)
+{
+	/* verify location is in rule manager range */
+	if (loc >= rmgr.size) {
+		fprintf(stderr, "rmgr: Location out of range\n");
+		return -1;
+	}
+
+	/* if slot is found return 0 indicating success */
+	if (test_bit(loc, rmgr.slot))
+		return 0;
+
+	/* rule not found */
+	fprintf(stderr, "rmgr: No such rule\n");
+	return -1;
+}
+
+static int rmgr_del(__u32 loc)
+{
+	/* verify rule exists before attempting to delete */
+	int err = rmgr_find(loc);
+	if (err)
+		return err;
+
+	/* clear bit for the rule */
+	clear_bit(loc, rmgr.slot);
+
+	return 0;
+}
+
+static int rmgr_add(struct ethtool_rx_flow_spec *fsp, __u8 loc_valid)
+{
+	__u32 loc = fsp->location;
+
+	/* location provided, insert rule and update regions to match rule */
+	if (loc_valid)
+		return rmgr_ins(loc);
+
+	/* find an open slot */
+	for (loc = 0; loc < rmgr.size; loc += BITS_PER_LONG) {
+		if ((rmgr.slot[loc / BITS_PER_LONG]) != ~0UL)
+			break;
+	}
+
+	/* find and use available location in slot */
+	for (; loc < rmgr.size; loc++) {
+		if (!test_bit(loc, rmgr.slot)) {
+			fsp->location = loc;
+			return rmgr_ins(loc);
+		}
+	}
+
+	/* No space to add this rule */
+	fprintf(stderr, "rmgr: Cannot find appropriate slot to insert rule\n");
+
+	return -1;
+}
+
+static int rmgr_init(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxnfc *nfccmd;
+	int err, i;
+	__u32 *rule_locs;
+
+	if (rmgr_init_done)
+		return 0;
+
+	/* clear rule manager settings */
+	memset(&rmgr, 0, sizeof(struct rmgr_ctrl));
+
+	/* allocate memory for count request */
+	nfccmd = calloc(1, sizeof(*nfccmd));
+	if (!nfccmd) {
+		perror("rmgr: Cannot allocate memory for RX class rule data");
+		return -1;
+	}
+
+	/* request count and store in rmgr.n_rules */
+	nfccmd->cmd = ETHTOOL_GRXCLSRLCNT;
+	ifr->ifr_data = (caddr_t)nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	rmgr.n_rules = nfccmd->rule_cnt;
+	free(nfccmd);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rule count");
+		return -1;
+	}
+
+	/* alloc memory for request of location list */
+	nfccmd = calloc(1, sizeof(*nfccmd) + (rmgr.n_rules * sizeof(__u32)));
+	if (!nfccmd) {
+		perror("rmgr: Cannot allocate memory for RX class rule locations");
+		return -1;
+	}
+
+	/* request location list */
+	nfccmd->cmd = ETHTOOL_GRXCLSRLALL;
+	nfccmd->rule_cnt = rmgr.n_rules;
+	ifr->ifr_data = (caddr_t)nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rules");
+		free(nfccmd);
+		return -1;
+	}
+
+	/* initialize bitmap for storage of valid locations */
+	rmgr.size = nfccmd->data;
+	rmgr.slot = calloc(1, BITS_TO_LONGS(rmgr.size) * sizeof(long));
+	if (!rmgr.slot) {
+		perror("rmgr: Cannot allocate memory for RX class rules");
+		free(nfccmd);
+		return -1;
+	}
+
+	/* write locations to bitmap */
+	rule_locs = nfccmd->rule_locs;
+	for (i = 0; i < rmgr.n_rules; i++) {
+		err = rmgr_ins(rule_locs[i]);
+		if (err < 0)
+			break;
+	}
+
+	/* free memory and set flag to avoid reinit */
+	free(nfccmd);
+	rmgr_init_done = 1;
+
+	return err;
+}
+
+static void rmgr_cleanup(void)
+{
+	if (!rmgr_init_done)
+		return;
+
+	rmgr_init_done = 0;
+
+	free(rmgr.slot);
+	rmgr.slot = NULL;
+	rmgr.size = 0;
+}
+
+int rxclass_rule_getall(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err, i, j;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	fprintf(stdout, "Total %d rules\n\n", rmgr.n_rules);
+
+	/* fetch and display all available rules */
+	for (i = 0; i < rmgr.size; i += BITS_PER_LONG) {
+		if (rmgr.slot[i / BITS_PER_LONG] == 0UL)
+			continue;
+		for (j = 0; j < BITS_PER_LONG; j++) {
+			if (!test_bit(i + j, rmgr.slot))
+				continue;
+			nfccmd.cmd = ETHTOOL_GRXCLSRULE;
+			memset(&nfccmd.fs, 0,
+			       sizeof(struct ethtool_rx_flow_spec));
+			nfccmd.fs.location = i + j;
+			ifr->ifr_data = (caddr_t)&nfccmd;
+			err = ioctl(fd, SIOCETHTOOL, ifr);
+			if (err < 0) {
+				perror("rmgr: Cannot get RX class rule");
+				return -1;
+			}
+			rmgr_print_rule(&nfccmd.fs);
+		}
+	}
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	/* verify rule exists before attempting to display */
+	err = rmgr_find(loc);
+	if (err < 0)
+		return err;
+
+	/* fetch rule from netdev and display */
+	nfccmd.cmd = ETHTOOL_GRXCLSRULE;
+	memset(&nfccmd.fs, 0, sizeof(struct ethtool_rx_flow_spec));
+	nfccmd.fs.location = loc;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rule");
+		return -1;
+	}
+	rmgr_print_rule(&nfccmd.fs);
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_rule_ins(int fd, struct ifreq *ifr,
+		     struct ethtool_rx_flow_spec *fsp, __u8 loc_valid)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	/* verify rule location */
+	err = rmgr_add(fsp, loc_valid);
+	if (err < 0)
+		return err;
+
+	/* notify netdev of new rule */
+	nfccmd.cmd = ETHTOOL_SRXCLSRLINS;
+	nfccmd.fs = *fsp;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot insert RX class rule");
+		return -1;
+	}
+	rmgr.n_rules++;
+
+	printf("Added rule with ID %d\n", fsp->location);
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	/* verify rule exists */
+	err = rmgr_del(loc);
+	if (err < 0)
+		return err;
+
+	/* notify netdev of rule removal */
+	nfccmd.cmd = ETHTOOL_SRXCLSRLDEL;
+	nfccmd.fs.location = loc;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot delete RX class rule");
+		return -1;
+	}
+	rmgr.n_rules--;
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+typedef enum {
+	OPT_NONE,
+	OPT_S32,
+	OPT_U8,
+	OPT_U16,
+	OPT_U32,
+	OPT_U64,
+	OPT_BE16,
+	OPT_BE32,
+	OPT_BE64,
+	OPT_IP4,
+	OPT_MAC,
+} rule_opt_type_t;
+
+typedef enum {
+	ETH_SPEC_NONE,
+	ETH_SPEC_NFC,
+	ETH_SPEC_NTUPLE,
+} rule_spec_type_t;
+
+#define NFC_FLAG_RING		0x001
+#define NFC_FLAG_LOC		0x002
+#define NFC_FLAG_SADDR		0x004
+#define NFC_FLAG_DADDR		0x008
+#define NFC_FLAG_SPORT		0x010
+#define NFC_FLAG_DPORT		0x020
+#define NFC_FLAG_SPI		0x030	/* SPORT | DPORT: spi excludes both ports */
+#define NFC_FLAG_TOS		0x040
+#define NFC_FLAG_PROTO		0x080
+#define NTUPLE_FLAG_VLAN	0x100
+#define NTUPLE_FLAG_UDEF	0x200
+
+struct rule_opts {
+	const char	*name;
+	rule_opt_type_t	type;
+	u32		flag;
+	int		offset;
+	int		moffset;
+};
+
+static struct rule_opts rule_nfc_tcp_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.tos) },
+	{ "src-port", OPT_BE16, NFC_FLAG_SPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.psrc),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.psrc) },
+	{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.pdst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.pdst) },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+};
+
+static struct rule_opts rule_nfc_esp_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.tos) },
+	{ "spi", OPT_BE32, NFC_FLAG_SPI,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.spi),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.spi) },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+};
+
+static struct rule_opts rule_nfc_usr_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.tos) },
+	{ "l4proto", OPT_U8, NFC_FLAG_PROTO,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.proto),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.proto) },
+	{ "spi", OPT_BE32, NFC_FLAG_SPI,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) },
+	{ "src-port", OPT_BE16, NFC_FLAG_SPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) },
+	{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes) + 2,
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) + 2 },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+};
+
+static struct rule_opts rule_nfc_ether[] = {
+	{ "src", OPT_MAC, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_source),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_source) },
+	{ "dst", OPT_MAC, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_dest),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_dest) },
+	{ "proto", OPT_BE16, NFC_FLAG_PROTO,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_proto),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_proto) },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+};
+
+static struct rule_opts rule_ntuple_tcp_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.tcp_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.tcp_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.tcp_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.tcp_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.tcp_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.tcp_ip4_spec.tos) },
+	{ "src-port", OPT_BE16, NFC_FLAG_SPORT,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.tcp_ip4_spec.psrc),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.tcp_ip4_spec.psrc) },
+	{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.tcp_ip4_spec.pdst),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.tcp_ip4_spec.pdst) },
+	{ "vlan", OPT_U16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag_mask) },
+	{ "user-def", OPT_U64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data_mask) },
+	{ "action", OPT_S32, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, action), -1 },
+};
+
+static struct rule_opts rule_ntuple_esp_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.esp_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.esp_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.esp_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.esp_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.esp_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.esp_ip4_spec.tos) },
+	{ "spi", OPT_BE32, NFC_FLAG_SPI,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.esp_ip4_spec.spi),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.esp_ip4_spec.spi) },
+	{ "vlan", OPT_U16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag_mask) },
+	{ "user-def", OPT_U64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data_mask) },
+	{ "action", OPT_S32, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, action), -1 },
+};
+
+static struct rule_opts rule_ntuple_usr_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.tos) },
+	{ "l4proto", OPT_U8, NFC_FLAG_PROTO,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.proto),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.proto) },
+	{ "spi", OPT_BE32, NFC_FLAG_SPI,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.l4_4_bytes),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) },
+	{ "src-port", OPT_BE16, NFC_FLAG_SPORT,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.l4_4_bytes),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) },
+	{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.usr_ip4_spec.l4_4_bytes) + 2,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) + 2 },
+	{ "vlan", OPT_U16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag_mask) },
+	{ "user-def", OPT_U64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data_mask) },
+	{ "action", OPT_S32, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, action), -1 },
+};
+
+static struct rule_opts rule_ntuple_ether[] = {
+	{ "src", OPT_MAC, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.ether_spec.h_source),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.ether_spec.h_source) },
+	{ "dst", OPT_MAC, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.ether_spec.h_dest),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.ether_spec.h_dest) },
+	{ "proto", OPT_BE16, NFC_FLAG_PROTO,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, h_u.ether_spec.h_proto),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, m_u.ether_spec.h_proto) },
+	{ "vlan", OPT_U16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, vlan_tag_mask) },
+	{ "user-def", OPT_U64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data),
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, data_mask) },
+	{ "action", OPT_S32, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_ntuple_flow_spec, action), -1 },
+};
+
+static int rxclass_get_long(char *str, long long *val, int size)
+{
+	long long max = ~0ULL >> (65 - size);
+	char *endp;
+
+	errno = 0;
+
+	*val = strtoll(str, &endp, 0);
+
+	if (*endp || errno || (*val > max) || (*val < ~max))
+		return -1;
+
+	return 0;
+}
+
+static int rxclass_get_ulong(char *str, unsigned long long *val, int size)
+{
+	unsigned long long max = ~0ULL >> (64 - size);
+	char *endp;
+
+	errno = 0;
+
+	*val = strtoull(str, &endp, 0);
+
+	if (*endp || errno || (*val > max))
+		return -1;
+
+	return 0;
+}
+
+static int rxclass_get_ipv4(char *str, __be32 *val)
+{
+	if (!strchr(str, '.')) {
+		unsigned long long v;
+		int err;
+
+		err = rxclass_get_ulong(str, &v, 32);
+		if (err)
+			return -1;
+
+		*val = htonl((u32)v);
+
+		return 0;
+	}
+
+	if (!inet_pton(AF_INET, str, val))
+		return -1;
+
+	return 0;
+}
+
+static int rxclass_get_ether(char *str, unsigned char *val)
+{
+	unsigned int buf[ETH_ALEN];
+	int count;
+
+	if (!strchr(str, ':'))
+		return -1;
+
+	count = sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x",
+		       &buf[0], &buf[1], &buf[2],
+		       &buf[3], &buf[4], &buf[5]);
+
+	if (count != ETH_ALEN)
+		return -1;
+
+	do {
+		count--;
+		val[count] = buf[count];
+	} while (count);
+
+	return 0;
+}
+
+static int rxclass_get_val(char *str, unsigned char *p, u32 *flags,
+			   const struct rule_opts *opt, rule_spec_type_t spec)
+{
+	unsigned long long mask = (spec == ETH_SPEC_NFC) ? ~0ULL : 0ULL;
+	int err = 0;
+
+	if (*flags & opt->flag)
+		return -1;
+
+	*flags |= opt->flag;
+
+	switch (opt->type) {
+	case OPT_S32: {
+		long long val;
+		err = rxclass_get_long(str, &val, 32);
+		if (err)
+			return -1;
+		*(int *)&p[opt->offset] = (int)val;
+		if (opt->moffset >= 0)
+			*(int *)&p[opt->moffset] = (int)mask;
+		break;
+	}
+	case OPT_U8: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 8);
+		if (err)
+			return -1;
+		*(u8 *)&p[opt->offset] = (u8)val;
+		if (opt->moffset >= 0)
+			*(u8 *)&p[opt->moffset] = (u8)mask;
+		break;
+	}
+	case OPT_U16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(u16 *)&p[opt->offset] = (u16)val;
+		if (opt->moffset >= 0)
+			*(u16 *)&p[opt->moffset] = (u16)mask;
+		break;
+	}
+	case OPT_U32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(u32 *)&p[opt->offset] = (u32)val;
+		if (opt->moffset >= 0)
+			*(u32 *)&p[opt->moffset] = (u32)mask;
+		break;
+	}
+	case OPT_U64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(u64 *)&p[opt->offset] = (u64)val;
+		if (opt->moffset >= 0)
+			*(u64 *)&p[opt->moffset] = (u64)mask;
+		break;
+	}
+	case OPT_BE16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(__be16 *)&p[opt->offset] = htons((u16)val);
+		if (opt->moffset >= 0)
+			*(__be16 *)&p[opt->moffset] = (__be16)mask;
+		break;
+	}
+	case OPT_BE32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->offset] = htonl((u32)val);
+		if (opt->moffset >= 0)
+			*(__be32 *)&p[opt->moffset] = (__be32)mask;
+		break;
+	}
+	case OPT_BE64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(__be64 *)&p[opt->offset] = htonll((u64)val);
+		if (opt->moffset >= 0)
+			*(__be64 *)&p[opt->moffset] = (__be64)mask;
+		break;
+	}
+	case OPT_IP4: {
+		__be32 val;
+		err = rxclass_get_ipv4(str, &val);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->offset] = val;
+		if (opt->moffset >= 0)
+			*(__be32 *)&p[opt->moffset] = (__be32)mask;
+		break;
+	}
+	case OPT_MAC: {
+		unsigned char val[ETH_ALEN];
+		err = rxclass_get_ether(str, val);
+		if (err)
+			return -1;
+		memcpy(&p[opt->offset], val, ETH_ALEN);
+		if (opt->moffset >= 0)
+			memcpy(&p[opt->moffset], &mask, ETH_ALEN);
+		break;
+	}
+	case OPT_NONE:
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
+static int rxclass_get_mask(char *str, unsigned char *p,
+			    const struct rule_opts *opt)
+{
+	int err = 0;
+
+	if (opt->moffset < 0)
+		return -1;
+
+	switch (opt->type) {
+	case OPT_S32: {
+		long long val;
+		err = rxclass_get_long(str, &val, 32);
+		if (err)
+			return -1;
+		*(int *)&p[opt->moffset] = (int)val;
+		break;
+	}
+	case OPT_U8: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 8);
+		if (err)
+			return -1;
+		*(u8 *)&p[opt->moffset] = (u8)val;
+		break;
+	}
+	case OPT_U16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(u16 *)&p[opt->moffset] = (u16)val;
+		break;
+	}
+	case OPT_U32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(u32 *)&p[opt->moffset] = (u32)val;
+		break;
+	}
+	case OPT_U64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(u64 *)&p[opt->moffset] = (u64)val;
+		break;
+	}
+	case OPT_BE16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(__be16 *)&p[opt->moffset] = htons((u16)val);
+		break;
+	}
+	case OPT_BE32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->moffset] = htonl((u32)val);
+		break;
+	}
+	case OPT_BE64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(__be64 *)&p[opt->moffset] = htonll((u64)val);
+		break;
+	}
+	case OPT_IP4: {
+		__be32 val;
+		err = rxclass_get_ipv4(str, &val);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->moffset] = val;
+		break;
+	}
+	case OPT_MAC: {
+		unsigned char val[ETH_ALEN];
+		err = rxclass_get_ether(str, val);
+		if (err)
+			return -1;
+		memcpy(&p[opt->moffset], val, ETH_ALEN);
+		break;
+	}
+	case OPT_NONE:
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
+int rxclass_parse_ruleopts(char **argp, int argc, void *fsp, u8 *loc_valid)
+{
+	const struct rule_opts *options;
+	unsigned char *p = (unsigned char *)fsp;
+	int i = 0, n_opts, err;
+	u32 flags = 0;
+	rule_spec_type_t spec_type;
+	int flow_type;
+
+	if (*argp == NULL || **argp == '\0' || argc < 2)
+		goto syntax_err;
+
+	if (!strcmp(argp[0], "class-rule-add"))
+		spec_type = ETH_SPEC_NFC;
+	else if (!strcmp(argp[0], "flow-type"))
+		spec_type = ETH_SPEC_NTUPLE;
+	else
+		goto syntax_err;
+
+	if (!strcmp(argp[1], "tcp4"))
+		flow_type = TCP_V4_FLOW;
+	else if (!strcmp(argp[1], "udp4"))
+		flow_type = UDP_V4_FLOW;
+	else if (!strcmp(argp[1], "sctp4"))
+		flow_type = SCTP_V4_FLOW;
+	else if (!strcmp(argp[1], "ah4"))
+		flow_type = AH_V4_FLOW;
+	else if (!strcmp(argp[1], "esp4"))
+		flow_type = ESP_V4_FLOW;
+	else if (!strcmp(argp[1], "ip4"))
+		flow_type = IP_USER_FLOW;
+	else if (!strcmp(argp[1], "ether"))
+		flow_type = ETHER_FLOW;
+	else
+		goto syntax_err;
+
+	switch (spec_type) {
+	case ETH_SPEC_NFC:
+		memset(p, 0, sizeof(struct ethtool_rx_flow_spec));
+		*(u32 *)p = flow_type;
+
+		switch (flow_type) {
+		case TCP_V4_FLOW:
+		case UDP_V4_FLOW:
+		case SCTP_V4_FLOW:
+			options = rule_nfc_tcp_ip4;
+			n_opts = ARRAY_SIZE(rule_nfc_tcp_ip4);
+			break;
+		case AH_V4_FLOW:
+		case ESP_V4_FLOW:
+			options = rule_nfc_esp_ip4;
+			n_opts = ARRAY_SIZE(rule_nfc_esp_ip4);
+			break;
+		case IP_USER_FLOW:
+			options = rule_nfc_usr_ip4;
+			n_opts = ARRAY_SIZE(rule_nfc_usr_ip4);
+			break;
+		case ETHER_FLOW:
+			options = rule_nfc_ether;
+			n_opts = ARRAY_SIZE(rule_nfc_ether);
+			break;
+		default:
+			fprintf(stdout, "Add rule, invalid rule type[%s]\n",
+				argp[1]);
+			return -1;
+		}
+		break;
+	case ETH_SPEC_NTUPLE:
+		memset(p, 0xff,
+		       offsetof(struct ethtool_rx_ntuple_flow_spec, action));
+		memset(p + offsetof(struct ethtool_rx_ntuple_flow_spec, action),
+		       0, 4);
+		*(u32 *)fsp = flow_type;
+
+		switch (flow_type) {
+		case TCP_V4_FLOW:
+		case UDP_V4_FLOW:
+		case SCTP_V4_FLOW:
+			options = rule_ntuple_tcp_ip4;
+			n_opts = ARRAY_SIZE(rule_ntuple_tcp_ip4);
+			break;
+		case AH_V4_FLOW:
+		case ESP_V4_FLOW:
+			options = rule_ntuple_esp_ip4;
+			n_opts = ARRAY_SIZE(rule_ntuple_esp_ip4);
+			break;
+		case IP_USER_FLOW:
+			options = rule_ntuple_usr_ip4;
+			n_opts = ARRAY_SIZE(rule_ntuple_usr_ip4);
+			break;
+		case ETHER_FLOW:
+			options = rule_ntuple_ether;
+			n_opts = ARRAY_SIZE(rule_ntuple_ether);
+			break;
+		default:
+			fprintf(stdout, "Add rule, invalid flow type[%s]\n",
+				argp[1]);
+			return -1;
+		}
+		break;
+	default:
+		fprintf(stdout, "Add rule, invalid command[%s]\n",
+			argp[0]);
+		return -1;
+	}
+
+	for (i = 2; i < argc;) {
+		const struct rule_opts *opt;
+		int idx;
+		for (opt = options, idx = 0; idx < n_opts; idx++, opt++) {
+			char mask_name[16];
+
+			if (strcmp(argp[i], opt->name))
+				continue;
+
+			i++;
+			if (i >= argc)
+				break;
+
+			err = rxclass_get_val(argp[i], p, &flags, opt, spec_type);
+			if (err) {
+				fprintf(stderr, "Invalid %s value[%s]\n",
+					opt->name, argp[i]);
+				return -1;
+			}
+
+			i++;
+			if (i >= argc)
+				break;
+
+			snprintf(mask_name, sizeof(mask_name), "%s-mask", opt->name);
+			if (strcmp(argp[i], mask_name))
+				break;
+
+			i++;
+			if (i >= argc)
+				goto syntax_err;
+
+			err = rxclass_get_mask(argp[i], p, opt);
+			if (err) {
+				fprintf(stderr, "Invalid %s mask[%s]\n",
+					opt->name, argp[i]);
+				return -1;
+			}
+
+			i++;
+
+			break;
+		}
+		if (idx == n_opts) {
+			fprintf(stdout, "Add rule, unrecognized option[%s]\n", argp[i]);
+			return -1;
+		}
+	}
+
+	if (spec_type == ETH_SPEC_NFC) {
+		if (loc_valid && (flags & NFC_FLAG_LOC))
+			*loc_valid = 1;
+	}
+
+	return 0;
+
+syntax_err:
+	fprintf(stdout, "Add rule, invalid syntax\n");
+	return -1;
+}

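A note on the slot search in rmgr_add(): it first skips bitmap words that are
completely full, then scans the remaining locations bit by bit. The
stand-alone sketch below restates that idea outside ethtool. The names
slot_map, slot_alloc and the other helpers are made up for illustration and
are not part of the patch, which relies on set_bit(), test_bit() and
clear_bit() instead.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOT_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define SLOT_WORDS(n)	(((n) + SLOT_BITS - 1) / SLOT_BITS)

/* Hypothetical stand-in for struct rmgr_ctrl: one bit per filter location. */
struct slot_map {
	unsigned long *slot;
	unsigned int size;	/* number of usable locations */
};

static int slot_map_init(struct slot_map *m, unsigned int size)
{
	size_t words = SLOT_WORDS(size);

	/* allocate at least one word so the map is always valid */
	m->slot = calloc(words ? words : 1, sizeof(unsigned long));
	if (!m->slot)
		return -1;
	m->size = size;
	return 0;
}

static void slot_map_free(struct slot_map *m)
{
	free(m->slot);
	m->slot = NULL;
	m->size = 0;
}

static int slot_test(const struct slot_map *m, unsigned int loc)
{
	return (m->slot[loc / SLOT_BITS] >> (loc % SLOT_BITS)) & 1UL;
}

static void slot_set(struct slot_map *m, unsigned int loc)
{
	m->slot[loc / SLOT_BITS] |= 1UL << (loc % SLOT_BITS);
}

/* Claim the lowest free location; return it, or -1 if the map is full. */
static long slot_alloc(struct slot_map *m)
{
	unsigned int loc;

	/* skip whole words that have no free bit left */
	for (loc = 0; loc < m->size; loc += SLOT_BITS)
		if (m->slot[loc / SLOT_BITS] != ~0UL)
			break;

	/* scan bit by bit from the first word that is not full */
	for (; loc < m->size; loc++) {
		if (!slot_test(m, loc)) {
			slot_set(m, loc);
			return (long)loc;
		}
	}

	return -1;
}
```

When the table size is not a multiple of the word size, the last word is never
equal to ~0UL, so the word loop stops there and the bit loop decides whether
anything below size is still free.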

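A second note, on rxclass_get_ulong(): it bounds every parsed value by the
width of the field it will be stored in. The sketch below shows the same
check in isolation. parse_uint_bits is a hypothetical name, and the sketch
additionally rejects an empty string, which the patch as posted appears to
accept as 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse str as an unsigned integer that must fit in `bits` bits (1..64).
 * Base 0 lets strtoull() accept decimal, 0x-prefixed hex and 0-prefixed octal.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_uint_bits(const char *str, unsigned long long *val, int bits)
{
	unsigned long long max = ~0ULL >> (64 - bits);
	char *endp;

	errno = 0;
	*val = strtoull(str, &endp, 0);

	/* no digits consumed, trailing junk, overflow, or too wide */
	if (endp == str || *endp || errno || *val > max)
		return -1;

	return 0;
}
```

Keeping max unsigned avoids relying on the implicit conversion that takes
place when a signed bound is compared against an unsigned value.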

* [ethtool PATCH 4/4] Add support for displaying a ntuple contained in an rx_flow_spec
From: Alexander Duyck @ 2011-02-25 23:49 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225233902.8409.74474.stgit@gitlad.jf.intel.com>

This change adds support for displaying an ntuple filter by using the unused
space within a standard rx_flow_spec.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool-copy.h |   20 ++++++++++++++++
 ethtool.8.in   |    6 +++++
 ethtool.c      |    3 ++
 rxclass.c      |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 101 insertions(+), 0 deletions(-)

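Layout note: struct ethtool_ntuple_spec_ext, as added to ethtool-copy.h in
this patch, pads itself with unused[15] so that its fields land in the tail
of the 72-byte h_u/m_u union. The sketch below checks that arithmetic with
<stdint.h> types standing in for __be32 and __be16 (same widths). The struct
and helper names are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDATA_LEN 72	/* size of hdata[] in the h_u/m_u union */

/* Stand-in for struct ethtool_ntuple_spec_ext with fixed-width types. */
struct ntuple_spec_ext_sketch {
	uint32_t unused[15];	/* 60 bytes left to the per-flow-type spec */
	uint16_t vlan_etype;
	uint16_t vlan_tci;
	uint32_t data[2];
};

/* The extension must fill the union exactly, neither more nor less. */
_Static_assert(sizeof(struct ntuple_spec_ext_sketch) == HDATA_LEN,
	       "extension does not match the size of hdata[]");

/* Offset at which the extension fields begin inside the union. */
static size_t ntuple_ext_start(void)
{
	return offsetof(struct ntuple_spec_ext_sketch, vlan_etype);
}

/* Number of bytes occupied by the extension fields themselves. */
static size_t ntuple_ext_len(void)
{
	return sizeof(struct ntuple_spec_ext_sketch) - ntuple_ext_start();
}
```

With this layout the extension fields start at byte 60 and take the last
12 bytes, so any flow spec that fits in the first 60 bytes can carry them.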
diff --git a/ethtool-copy.h b/ethtool-copy.h
index dd1080b..4401238 100644
--- a/ethtool-copy.h
+++ b/ethtool-copy.h
@@ -376,10 +376,25 @@ struct ethtool_usrip4_spec {
 };
 
 /**
+ * struct ethtool_ntuple_spec_ext - flow spec extension for ntuple in nfc
+ * @unused: space unused by extension
+ * @vlan_etype: EtherType for vlan tagged packet to match
+ * @vlan_tci: VLAN tag to match
+ * @data: Driver-dependent data to match
+ */
+struct ethtool_ntuple_spec_ext {
+	__be32	unused[15];
+	__be16	vlan_etype;
+	__be16	vlan_tci;
+	__be32	data[2];
+};
+
+/**
  * struct ethtool_rx_flow_spec - specification for RX flow filter
  * @flow_type: Type of match to perform, e.g. %TCP_V4_FLOW
  * @h_u: Flow fields to match (dependent on @flow_type)
  * @m_u: Masks for flow field bits to be ignored
+ * @flow_type_ext: Type of extended match to perform, e.g. %NTUPLE_FLOW_EXT
  * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
  *	if packets should be discarded
  * @location: Index of filter in hardware table
@@ -394,8 +409,10 @@ struct ethtool_rx_flow_spec {
 		struct ethtool_ah_espip4_spec		esp_ip4_spec;
 		struct ethtool_usrip4_spec		usr_ip4_spec;
 		struct ethhdr				ether_spec;
+		struct ethtool_ntuple_spec_ext		ntuple_spec;
 		__u8					hdata[72];
 	} h_u, m_u;
+	__u32		flow_type_ext;
 	__u64		ring_cookie;
 	__u32		location;
 };
@@ -706,6 +723,9 @@ struct ethtool_flash {
 #define	IPV6_FLOW	0x11	/* hash only */
 #define	ETHER_FLOW	0x12	/* spec only (ether_spec) */
 
+/* Flow extension types for network flow classifier */
+#define NTUPLE_FLOW_EXT	0x01 /* indicates ntuple in nfc */
+
 /* L3-L4 network traffic flow hash options */
 #define	RXH_L2DA	(1 << 1)
 #define	RXH_VLAN	(1 << 2)
diff --git a/ethtool.8.in b/ethtool.8.in
index b68b010..9c768fb 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -278,6 +278,9 @@ ethtool \- query or control network driver and hardware settings
 .BM src-port
 .BM dst-port
 .BM spi
+.BM vlan-etype
+.BM vlan
+.BM user-def
 .BN action
 .BN loc
 .RB \ |\  flow-type \ \*(NC
@@ -764,6 +767,9 @@ Specify the value of the security parameter index field (applicable to
 AH/ESP packets)in the incoming packet to match along with an
 optional mask.
 .TP
+.BI vlan-etype \ N \\fR\ [\\fPvlan-etype-mask \ N \\fR]\\fP
+Includes the VLAN tag Ethertype and an optional mask.
+.TP
 .BI vlan \ N \\fR\ [\\fPvlan-mask \ N \\fR]\\fP
 Includes the VLAN tag and an optional mask.
 .TP
diff --git a/ethtool.c b/ethtool.c
index f4dfc39..d931353 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -254,6 +254,9 @@ static struct option {
 		"			[ src-port %d [src-port-mask %x] ]\n"
 		"			[ dst-port %d [dst-port-mask %x] ]\n"
 		"			[ spi %d [spi-mask %x] ]\n"
+		"			[ vlan-etype %x [vlan-etype-mask %x] ]\n"
+		"			[ vlan %x [vlan-mask %x] ]\n"
+		"			[ user-def %x [user-def-mask %x] ]\n"
 		"			[ action %d ]\n"
 		"			[ loc %d] |\n"
 		"		  flow-type ether|ip4|tcp4|udp4|sctp4|ah4|esp4\n"
diff --git a/rxclass.c b/rxclass.c
index f2a8c96..99071a4 100644
--- a/rxclass.c
+++ b/rxclass.c
@@ -30,6 +30,33 @@ struct rmgr_ctrl {
 static struct rmgr_ctrl rmgr;
 static int rmgr_init_done = 0;
 
+static void rmgr_print_nfc_spec_ext(struct ethtool_rx_flow_spec *fsp)
+{
+	u64 data, datam;
+	__u16 etype, etypem, tci, tcim;
+	
+	switch(fsp->flow_type_ext) {
+	case NTUPLE_FLOW_EXT:
+		etype = ntohs(fsp->h_u.ntuple_spec.vlan_etype);
+		etypem = ntohs(fsp->m_u.ntuple_spec.vlan_etype);
+		tci = ntohs(fsp->h_u.ntuple_spec.vlan_tci);
+		tcim = ntohs(fsp->m_u.ntuple_spec.vlan_tci);
+		data = (u64)ntohl(fsp->h_u.ntuple_spec.data[0]);
+		data |= (u64)ntohl(fsp->h_u.ntuple_spec.data[1]) << 32;
+		datam = (u64)ntohl(fsp->m_u.ntuple_spec.data[0]);
+		datam |= (u64)ntohl(fsp->m_u.ntuple_spec.data[1]) << 32;
+
+		fprintf(stdout,
+			"\tVLAN EtherType: 0x%x mask: 0x%x\n"
+			"\tVLAN: 0x%x mask: 0x%x\n"
+			"\tUser-defined: 0x%Lx mask: 0x%Lx\n",
+			etype, etypem, tci, tcim, data, datam);
+		break;
+	default:
+		break;
+	}
+}
+
 static void rmgr_print_nfc_rule(struct ethtool_rx_flow_spec *fsp)
 {
 	unsigned char	*smac, *smacm, *dmac, *dmacm;
@@ -127,6 +154,7 @@ static void rmgr_print_nfc_rule(struct ethtool_rx_flow_spec *fsp)
 		default:
 			break;
 		}
+		rmgr_print_nfc_spec_ext(fsp);
 		break;
 	case ETHER_FLOW:
 		dmac = fsp->h_u.ether_spec.h_dest;
@@ -148,6 +176,7 @@ static void rmgr_print_nfc_rule(struct ethtool_rx_flow_spec *fsp)
 			dmac[0], dmac[1], dmac[2], dmac[3], dmac[4], dmac[5],
 			dmacm[0], dmacm[1], dmacm[2], dmacm[3], dmacm[4], dmacm[5],
 			proto, protom);
+		rmgr_print_nfc_spec_ext(fsp);
 		break;
 	default:
 		fprintf(stdout,
@@ -516,6 +545,7 @@ typedef enum {
 #define NFC_FLAG_PROTO		0x080
 #define NTUPLE_FLAG_VLAN	0x100
 #define NTUPLE_FLAG_UDEF	0x200
+#define NTUPLE_FLAG_VETH	0x400
 
 struct rule_opts {
 	const char	*name;
@@ -545,6 +575,15 @@ static struct rule_opts rule_nfc_tcp_ip4[] = {
 	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
 	{ "loc", OPT_U32, NFC_FLAG_LOC,
 	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.data) },
 };
 
 static struct rule_opts rule_nfc_esp_ip4[] = {
@@ -564,6 +603,15 @@ static struct rule_opts rule_nfc_esp_ip4[] = {
 	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
 	{ "loc", OPT_U32, NFC_FLAG_LOC,
 	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.data) },
 };
 
 static struct rule_opts rule_nfc_usr_ip4[] = {
@@ -592,6 +640,15 @@ static struct rule_opts rule_nfc_usr_ip4[] = {
 	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
 	{ "loc", OPT_U32, NFC_FLAG_LOC,
 	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.data) },
 };
 
 static struct rule_opts rule_nfc_ether[] = {
@@ -608,6 +665,15 @@ static struct rule_opts rule_nfc_ether[] = {
 	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
 	{ "loc", OPT_U32, NFC_FLAG_LOC,
 	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ntuple_spec.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ntuple_spec.data) },
 };
 
 static struct rule_opts rule_ntuple_tcp_ip4[] = {
@@ -1159,6 +1225,12 @@ int rxclass_parse_ruleopts(char **argp, int argc, void *fsp, u8 *loc_valid)
 	if (spec_type == ETH_SPEC_NFC) {
 		if (loc_valid && (flags & NFC_FLAG_LOC))
 			*loc_valid = 1;
+		if (flags & (NTUPLE_FLAG_VLAN |
+			     NTUPLE_FLAG_UDEF |
+			     NTUPLE_FLAG_VETH)) {
+			struct ethtool_rx_flow_spec *fs = fsp;
+			fs->flow_type_ext = NTUPLE_FLOW_EXT;
+		}
 	}
 
 	return 0;

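For readers following the layout, here is a minimal standalone sketch of
how the extension sits in the tail of the 72-byte flow spec union, and of
how the two 32-bit user-defined words combine into one 64-bit value for
display.  The struct is a local stand-in using fixed-width types instead of
__be16/__be32, and join_user_def() is a hypothetical helper that is not
part of the patch; the word order (data[0] low, data[1] high) simply
follows the display code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct ethtool_ntuple_spec_ext from this patch. */
struct ntuple_spec_ext {
	uint32_t unused[15];	/* 60 bytes left to the base flow spec */
	uint16_t vlan_etype;
	uint16_t vlan_tci;
	uint32_t data[2];
};

/* The extension must cover exactly the 72 bytes of hdata[] in the union,
 * so its fields land in the last 12 bytes. */
_Static_assert(sizeof(struct ntuple_spec_ext) == 72,
	       "extension must match the 72-byte union");
_Static_assert(offsetof(struct ntuple_spec_ext, vlan_etype) == 60,
	       "vlan_etype starts right after the unused words");

/* Hypothetical helper: combine two host-order words into one value,
 * data[0] as the low half and data[1] as the high half. */
static uint64_t join_user_def(const uint32_t words[2])
{
	return (uint64_t)words[0] | ((uint64_t)words[1] << 32);
}
```

The second word has to be OR-ed into the first rather than assigned;
otherwise the low 32 bits are lost before the value is printed.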

^ permalink raw reply related

* [ethtool PATCH 2/4] Remove strings based approach for displaying ntuple
From: Alexander Duyck @ 2011-02-25 23:48 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225233902.8409.74474.stgit@gitlad.jf.intel.com>

This change removes the string-based approach for displaying ntuple
filters.  A follow-on patch will replace that functionality with a
network flow classification based approach that gets the number of
filters, gets their locations, and then requests and displays each
filter individually.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool-copy.h |    3 +--
 ethtool.c      |   44 --------------------------------------------
 2 files changed, 1 insertions(+), 46 deletions(-)

diff --git a/ethtool-copy.h b/ethtool-copy.h
index 75c3ae7..dd1080b 100644
--- a/ethtool-copy.h
+++ b/ethtool-copy.h
@@ -250,7 +250,6 @@ enum ethtool_stringset {
 	ETH_SS_TEST		= 0,
 	ETH_SS_STATS,
 	ETH_SS_PRIV_FLAGS,
-	ETH_SS_NTUPLE_FILTERS,
 };
 
 /* for passing string sets for data tagging */
@@ -580,7 +579,7 @@ struct ethtool_flash {
 #define ETHTOOL_FLASHDEV	0x00000033 /* Flash firmware to device */
 #define ETHTOOL_RESET		0x00000034 /* Reset hardware */
 #define ETHTOOL_SRXNTUPLE	0x00000035 /* Add an n-tuple filter to device */
-#define ETHTOOL_GRXNTUPLE	0x00000036 /* Get n-tuple filters from device */
+/* ETHTOOL_GRXNTUPLE		0x00000036 disabled due to multiple issues */
 #define ETHTOOL_GSSET_INFO	0x00000037 /* Get string set info */
 #define ETHTOOL_GRXFHINDIR	0x00000038 /* Get RX flow hash indir'n table */
 #define ETHTOOL_SRXFHINDIR	0x00000039 /* Set RX flow hash indir'n table */
diff --git a/ethtool.c b/ethtool.c
index 14740d5..2a084db 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -3162,50 +3162,6 @@ static int do_srxntuple(int fd, struct ifreq *ifr)
 
 static int do_grxntuple(int fd, struct ifreq *ifr)
 {
-	struct ethtool_sset_info *sset_info;
-	struct ethtool_gstrings *strings;
-	int sz_str, n_strings, err, i;
-
-	sset_info = malloc(sizeof(struct ethtool_sset_info) + sizeof(u32));
-	sset_info->cmd = ETHTOOL_GSSET_INFO;
-	sset_info->sset_mask = (1ULL << ETH_SS_NTUPLE_FILTERS);
-	ifr->ifr_data = (caddr_t)sset_info;
-	err = send_ioctl(fd, ifr);
-
-	if ((err < 0) ||
-	    (!(sset_info->sset_mask & (1ULL << ETH_SS_NTUPLE_FILTERS)))) {
-		perror("Cannot get driver strings info");
-		return 100;
-	}
-
-	n_strings = sset_info->data[0];
-	free(sset_info);
-	sz_str = n_strings * ETH_GSTRING_LEN;
-
-	strings = calloc(1, sz_str + sizeof(struct ethtool_gstrings));
-	if (!strings) {
-		fprintf(stderr, "no memory available\n");
-		return 95;
-	}
-
-	strings->cmd = ETHTOOL_GRXNTUPLE;
-	strings->string_set = ETH_SS_NTUPLE_FILTERS;
-	strings->len = n_strings;
-	ifr->ifr_data = (caddr_t) strings;
-	err = send_ioctl(fd, ifr);
-	if (err < 0) {
-		perror("Cannot get Rx n-tuple information");
-		free(strings);
-		return 101;
-	}
-
-	n_strings = strings->len;
-	fprintf(stdout, "Rx n-tuple filters:\n");
-	for (i = 0; i < n_strings; i++)
-		fprintf(stdout, "%s", &strings->data[i * ETH_GSTRING_LEN]);
-
-	free(strings);
-
 	return 0;
 }
 

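The count, locations, fetch sequence described in the changelog could be
sketched roughly as below.  This is only an illustration under stated
assumptions: struct rule, struct rule_source and the fake_* functions are
hypothetical and stand in for the driver, and each callback would
presumably map to one ioctl (ETHTOOL_GRXCLSRLCNT, ETHTOOL_GRXCLSRLALL and
ETHTOOL_GRXCLSRULE, if I read linux/ethtool.h correctly) in the real tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_RULES 64

/* Hypothetical, simplified rule record (not struct ethtool_rx_flow_spec). */
struct rule {
	uint32_t location;
	uint64_t ring_cookie;
};

/* Hypothetical query interface; in the real tool each callback would be
 * one ioctl round trip to the driver. */
struct rule_source {
	uint32_t (*count)(void *ctx);
	uint32_t (*locations)(void *ctx, uint32_t *locs, uint32_t max);
	int (*fetch)(void *ctx, uint32_t loc, struct rule *out);
	void *ctx;
};

/* Count the rules, list their locations, then fetch and print each one.
 * Returns the number of rules printed. */
static int show_rules(const struct rule_source *src, FILE *out)
{
	uint32_t locs[MAX_RULES];
	uint32_t n = src->count(src->ctx);
	uint32_t i;
	int shown = 0;

	if (n > MAX_RULES)
		n = MAX_RULES;
	n = src->locations(src->ctx, locs, n);

	for (i = 0; i < n; i++) {
		struct rule r;

		if (src->fetch(src->ctx, locs[i], &r) < 0)
			continue;	/* rule vanished between the two calls */
		fprintf(out, "Filter: %u\n\tAction: queue %llu\n",
			(unsigned)r.location,
			(unsigned long long)r.ring_cookie);
		shown++;
	}
	return shown;
}

/* In-memory fake driver table, present only to exercise the loop. */
static const struct rule fake_table[] = { { 3, 0 }, { 7, 5 } };
#define FAKE_COUNT (sizeof(fake_table) / sizeof(fake_table[0]))

static uint32_t fake_count(void *ctx)
{
	(void)ctx;
	return FAKE_COUNT;
}

static uint32_t fake_locations(void *ctx, uint32_t *locs, uint32_t max)
{
	uint32_t i;

	(void)ctx;
	for (i = 0; i < FAKE_COUNT && i < max; i++)
		locs[i] = fake_table[i].location;
	return i;
}

static int fake_fetch(void *ctx, uint32_t loc, struct rule *out)
{
	uint32_t i;

	(void)ctx;
	for (i = 0; i < FAKE_COUNT; i++) {
		if (fake_table[i].location == loc) {
			*out = fake_table[i];
			return 0;
		}
	}
	return -1;
}
```

Fetching rules one at a time by location means a driver that keeps its
rules internally only has to answer three simple queries, rather than
format strings for user space.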

^ permalink raw reply related

* [ethtool PATCH 1/4] Add support for ESP as a separate protocol from AH
From: Alexander Duyck @ 2011-02-25 23:48 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225233902.8409.74474.stgit@gitlad.jf.intel.com>

This change splits ESP out from AH.  Currently the two are present both
as a combined value and as two separate values.  To prepare for
eventually splitting the two into separate values, this change lets ESP
be called out separately in ethtool.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool.8.in |    2 +-
 ethtool.c    |   21 ++++++++++++++++-----
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/ethtool.8.in b/ethtool.8.in
index 133825b..7dec259 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -50,7 +50,7 @@
 .\"
 .\"	\(*FL - flow type values
 .\"
-.ds FL \fBtcp4\fP|\fBudp4\fP|\fBah4\fP|\fBsctp4\fP|\fBtcp6\fP|\fBudp6\fP|\fBah6\fP|\fBsctp6\fP
+.ds FL \fBtcp4\fP|\fBudp4\fP|\fBah4\fP|\fBesp4\fP|\fBsctp4\fP|\fBtcp6\fP|\fBudp6\fP|\fBah6\fP|\fBesp6\fP|\fBsctp6\fP
 .\"
 .\"	\(*HO - hash options
 .\"
diff --git a/ethtool.c b/ethtool.c
index 1afdfe4..14740d5 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -32,7 +32,6 @@
 #include <sys/ioctl.h>
 #include <sys/stat.h>
 #include <stdio.h>
-#include <string.h>
 #include <errno.h>
 #include <net/if.h>
 #include <sys/utsname.h>
@@ -232,15 +231,15 @@ static struct option {
     { "-S", "--statistics", MODE_GSTATS, "Show adapter statistics" },
     { "-n", "--show-nfc", MODE_GNFC, "Show Rx network flow classification "
 		"options",
-		"		[ rx-flow-hash tcp4|udp4|ah4|sctp4|"
-		"tcp6|udp6|ah6|sctp6 ]\n" },
+		"		[ rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|"
+		"tcp6|udp6|ah6|esp6|sctp6 ]\n" },
     { "-f", "--flash", MODE_FLASHDEV, "FILENAME " "Flash firmware image "
     		"from the specified file to a region on the device",
 		"               [ REGION-NUMBER-TO-FLASH ]\n" },
     { "-N", "--config-nfc", MODE_SNFC, "Configure Rx network flow "
 		"classification options",
-		"		[ rx-flow-hash tcp4|udp4|ah4|sctp4|"
-		"tcp6|udp6|ah6|sctp6 m|v|t|s|d|f|n|r... ]\n" },
+		"		[ rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|"
+		"tcp6|udp6|ah6|esp6|sctp6 m|v|t|s|d|f|n|r... ]\n" },
     { "-x", "--show-rxfh-indir", MODE_GRXFHINDIR, "Show Rx flow hash "
 		"indirection" },
     { "-X", "--set-rxfh-indir", MODE_SRXFHINDIR, "Set Rx flow hash indirection",
@@ -778,6 +777,8 @@ static int rxflow_str_to_type(const char *str)
 		flow_type = UDP_V4_FLOW;
 	else if (!strcmp(str, "ah4"))
 		flow_type = AH_ESP_V4_FLOW;
+	else if (!strcmp(str, "esp4"))
+		flow_type = ESP_V4_FLOW;
 	else if (!strcmp(str, "sctp4"))
 		flow_type = SCTP_V4_FLOW;
 	else if (!strcmp(str, "tcp6"))
@@ -786,6 +787,8 @@ static int rxflow_str_to_type(const char *str)
 		flow_type = UDP_V6_FLOW;
 	else if (!strcmp(str, "ah6"))
 		flow_type = AH_ESP_V6_FLOW;
+	else if (!strcmp(str, "esp6"))
+		flow_type = ESP_V6_FLOW;
 	else if (!strcmp(str, "sctp6"))
 		flow_type = SCTP_V6_FLOW;
 	else if (!strcmp(str, "ether"))
@@ -1918,8 +1921,12 @@ static int dump_rxfhash(int fhash, u64 val)
 		fprintf(stdout, "SCTP over IPV4 flows");
 		break;
 	case AH_ESP_V4_FLOW:
+	case AH_V4_FLOW:
 		fprintf(stdout, "IPSEC AH over IPV4 flows");
 		break;
+	case ESP_V4_FLOW:
+		fprintf(stdout, "IPSEC ESP over IPV4 flows");
+		break;
 	case TCP_V6_FLOW:
 		fprintf(stdout, "TCP over IPV6 flows");
 		break;
@@ -1930,8 +1937,12 @@ static int dump_rxfhash(int fhash, u64 val)
 		fprintf(stdout, "SCTP over IPV6 flows");
 		break;
 	case AH_ESP_V6_FLOW:
+	case AH_V6_FLOW:
 		fprintf(stdout, "IPSEC AH over IPV6 flows");
 		break;
+	case ESP_V6_FLOW:
+		fprintf(stdout, "IPSEC ESP over IPV6 flows");
+		break;
 	default:
 		break;
 	}

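As an aside, the growing strcmp() chain in rxflow_str_to_type() could also
be written as a lookup table.  The sketch below is not what the patch
does; the enum and its values are hypothetical placeholders rather than
the real *_FLOW constants, and it only mirrors the name mapping shown
above, where "ah4"/"ah6" keep the combined AH/ESP type and "esp4"/"esp6"
get their own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical placeholder values, not the constants from ethtool-copy.h. */
enum flow_type {
	FT_NONE = 0,
	FT_TCP4, FT_UDP4, FT_AH_ESP4, FT_ESP4, FT_SCTP4,
	FT_TCP6, FT_UDP6, FT_AH_ESP6, FT_ESP6, FT_SCTP6,
	FT_ETHER,
};

static const struct {
	const char *name;
	enum flow_type type;
} flow_names[] = {
	{ "tcp4",  FT_TCP4 },
	{ "udp4",  FT_UDP4 },
	{ "ah4",   FT_AH_ESP4 },	/* still the combined AH/ESP type */
	{ "esp4",  FT_ESP4 },		/* new: ESP on its own */
	{ "sctp4", FT_SCTP4 },
	{ "tcp6",  FT_TCP6 },
	{ "udp6",  FT_UDP6 },
	{ "ah6",   FT_AH_ESP6 },
	{ "esp6",  FT_ESP6 },
	{ "sctp6", FT_SCTP6 },
	{ "ether", FT_ETHER },
};

/* Return the flow type for a command-line name, or FT_NONE if unknown. */
static enum flow_type flow_type_from_str(const char *str)
{
	size_t i;

	for (i = 0; i < sizeof(flow_names) / sizeof(flow_names[0]); i++)
		if (!strcmp(str, flow_names[i].name))
			return flow_names[i].type;
	return FT_NONE;
}
```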

^ permalink raw reply related

* [ethtool PATCH 0/4] Add support for network flow classifier rules
From: Alexander Duyck @ 2011-02-25 23:48 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

This patch series implements the user-space portion of network flow
classifier rules.  The original patches were applied to the kernel a couple
of years ago, but the user-space side was never applied.  As such I have
gone through and updated the original patch from Santwona Behera to make
it applicable to the current ethtool git tree.

In addition, this updates the network flow classification rules to
display the same fields as ntuple filters, because the ntuple display
functionality had some serious issues and provided no means for a driver
to display rules that are stored internally.

The formatting of the man page and the help text may still need some work.
I had difficulty determining the best way to lay out the class-rule-add
and flow-type help instructions, since they are mutually exclusive but
both contain a large number of possible options.

Finally there was one minor change adding ESP hashing as a separate option
to Rx-hashing that was contained in the original patch that I have moved
out into a separate patch.

---

Alexander Duyck (3):
      Add support for displaying a ntuple contained in an rx_flow_spec
      Remove strings based approach for displaying ntuple
      Add support for ESP as a separate protocol from AH

Santwona Behera (1):
      v2 Add RX packet classification interface

 Makefile.am      |    3 
 ethtool-bitops.h |   25 +
 ethtool-copy.h   |   23 +
 ethtool-util.h   |   42 ++
 ethtool.8.in     |  204 +++++----
 ethtool.c        |  344 ++++++---------
 rxclass.c        | 1241 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 1573 insertions(+), 309 deletions(-)
 create mode 100644 ethtool-bitops.h
 create mode 100644 rxclass.c

-- 

^ permalink raw reply

* Re: [patch net-next-2.6 V3] net: convert bonding to use rx_handler
From: Nicolas de Pesloüan @ 2011-02-25 23:46 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David Miller, kaber, eric.dumazet, netdev, shemminger, fubar,
	andy
In-Reply-To: <20110223190541.GB2783@psychotron.redhat.com>

On 23/02/2011 20:05, Jiri Pirko wrote:
> This patch converts bonding to use rx_handler. Results in cleaner
> __netif_receive_skb() with much less exceptions needed. Also
> bond-specific work is moved into bond code.
>
> Did performance test using pktgen and counting incoming packets by
> iptables. No regression noted.
>
> Signed-off-by: Jiri Pirko<jpirko@redhat.com>
>
> v1->v2:
>          using skb_iif instead of new input_dev to remember original
> 	device
>
> v2->v3:
> 	do another loop in case skb->dev is changed. That way orig_dev
> 	core can be left untouched.

Hi Jiri,

I finally found enough time for a review.

I think we should split this change:

1/ Change __netif_receive_skb() to call rx_handler for diverted net_device, until rx_handler is NULL.

2/ Convert currently existing rx_handlers (bridge and macvlan) to use this new "loop" feature, 
removing the need to call netif_rx() inside their respective rx_handler and also removing the 
associated overhead.

3/ Convert bonding to use rx_handlers.

Also, on step 1, we definitely need to clarify what orig_dev should be.

I now think that orig_dev should be "the device one level below the current one" or NULL if current 
device was not diverted from another one. It means that we should keep an array of crossed 
(diverted) devices and the associated orig_dev. This array would be used to pass the right orig_dev 
to protocol handlers, depending on the device they register on:

eth0 -> bond0 -> br0

A protocol handler registered on bond0 would receive eth0 as orig_dev.
A protocol handler registered on br0 would receive bond0 as orig_dev.
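
As a rough userspace sketch of this idea (not kernel code; struct dev,
struct rx_path and both functions are hypothetical names introduced only
for illustration), the receive path records every device it crosses, and
the orig_dev handed to a handler is the entry just below the device the
handler registered on:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8

/* Hypothetical device: 'master' is where its rx handler diverts packets,
 * or NULL when no handler is attached. */
struct dev {
	const char *name;
	struct dev *master;
};

/* Devices crossed by one packet, lowest first. */
struct rx_path {
	const struct dev *crossed[MAX_DEPTH];
	size_t depth;
};

/* Follow the diversion chain until no device redirects any further. */
static void walk_rx_path(const struct dev *first, struct rx_path *path)
{
	const struct dev *d;

	path->depth = 0;
	for (d = first; d && path->depth < MAX_DEPTH; d = d->master)
		path->crossed[path->depth++] = d;
}

/* orig_dev for a handler registered on 'on': the device one level below
 * it, or NULL if 'on' received the packet directly or was not crossed. */
static const struct dev *orig_dev_for(const struct rx_path *path,
				      const struct dev *on)
{
	size_t i;

	for (i = 0; i < path->depth; i++)
		if (path->crossed[i] == on)
			return i ? path->crossed[i - 1] : NULL;
	return NULL;
}
```

With eth0 -> bond0 -> br0, a lookup for bond0 yields eth0 and a lookup for
br0 yields bond0, matching the two cases above.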

[snip]

> @@ -3167,32 +3135,8 @@ static int __netif_receive_skb(struct sk_buff *skb)

[snip]

> +another_round:
> +
> +	__this_cpu_inc(softnet_data.processed);
> +
>   #ifdef CONFIG_NET_CLS_ACT
>   	if (skb->tc_verd&  TC_NCLS) {
>   		skb->tc_verd = CLR_TC_NCLS(skb->tc_verd);
> @@ -3209,8 +3157,7 @@ static int __netif_receive_skb(struct sk_buff *skb)
>   #endif
>
>   	list_for_each_entry_rcu(ptype,&ptype_all, list) {
> -		if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> -		    ptype->dev == orig_dev) {
> +		if (!ptype->dev || ptype->dev == skb->dev) {
>   			if (pt_prev)
>   				ret = deliver_skb(skb, pt_prev, orig_dev);
>   			pt_prev = ptype;
> @@ -3224,16 +3171,20 @@ static int __netif_receive_skb(struct sk_buff *skb)
>   ncls:
>   #endif
>

Why do you loop over ptype_all before calling rx_handler?

I don't understand why ptype_all and ptype_base are not handled at the same place in the current 
__netif_receive_skb(), but I think we should take the opportunity to change that, unless someone knows 
of a good reason not to do so.

> -	/* Handle special case of bridge or macvlan */
>   	rx_handler = rcu_dereference(skb->dev->rx_handler);
>   	if (rx_handler) {

	Nicolas.

^ permalink raw reply

* [net-next-2.6 PATCH 07/10] [RFC] ixgbe: add basic support for setting and getting nfc controls
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change adds basic support for obtaining RSS ring counts and
setting RSS hash options.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe_ethtool.c |   60 +++++++++++++++++++++++++++++++++++++
 1 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 1b40c02..6c17e45 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2299,6 +2299,65 @@ static int ixgbe_set_flags(struct net_device *netdev, u32 data)
 	return 0;
 }
 
+static int ixgbe_get_rss_hash_opts(struct ixgbe_adapter *adapter,
+				   struct ethtool_rxnfc *cmd)
+{
+	cmd->data = 0;
+
+	/* if RSS is disabled then report no hashing */
+	if (!(adapter->flags & IXGBE_FLAG_RSS_ENABLED))
+		return 0;
+
+	/* Report default options for RSS on ixgbe */
+	switch (cmd->flow_type) {
+	case TCP_V4_FLOW:
+		cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_ESP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case IPV4_FLOW:
+		cmd->data |= RXH_IP_SRC | RXH_IP_DST;
+		break;
+	case TCP_V6_FLOW:
+		cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+	case UDP_V6_FLOW:
+	case SCTP_V6_FLOW:
+	case AH_ESP_V6_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V6_FLOW:
+	case IPV6_FLOW:
+		cmd->data |= RXH_IP_SRC | RXH_IP_DST;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ixgbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
+			   void *rule_locs)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	int ret = -EOPNOTSUPP;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_GRXFH:
+		ret = ixgbe_get_rss_hash_opts(adapter, cmd);
+		break;
+	case ETHTOOL_GRXRINGS:
+		cmd->data = adapter->num_rx_queues;
+		ret = 0;
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
 static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_settings           = ixgbe_get_settings,
 	.set_settings           = ixgbe_set_settings,
@@ -2334,6 +2393,7 @@ static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.set_coalesce           = ixgbe_set_coalesce,
 	.get_flags              = ethtool_op_get_flags,
 	.set_flags              = ixgbe_set_flags,
+	.get_rxnfc		= ixgbe_get_rxnfc,
 };
 
 void ixgbe_set_ethtool_ops(struct net_device *netdev)

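To relate the reported mask to what ethtool's rx-flow-hash option accepts,
the following standalone sketch turns a hash-field mask into the
m|v|t|s|d|f|n|r letters.  The SK_RXH_* macros are local copies: RXH_L2DA
and RXH_VLAN match the values quoted elsewhere in this thread, while the
remaining bit positions are assumed from linux/ethtool.h.  rxh_to_letters()
is a hypothetical helper, not an ethtool function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the RXH_* bits (positions beyond VLAN are assumed). */
#define SK_RXH_L2DA	(1u << 1)
#define SK_RXH_VLAN	(1u << 2)
#define SK_RXH_L3_PROTO	(1u << 3)
#define SK_RXH_IP_SRC	(1u << 4)
#define SK_RXH_IP_DST	(1u << 5)
#define SK_RXH_L4_B_0_1	(1u << 6)
#define SK_RXH_L4_B_2_3	(1u << 7)
#define SK_RXH_DISCARD	(1u << 31)

static const struct {
	uint32_t bit;
	char letter;
} rxh_letters[] = {
	{ SK_RXH_L2DA, 'm' },     { SK_RXH_VLAN, 'v' },
	{ SK_RXH_L3_PROTO, 't' }, { SK_RXH_IP_SRC, 's' },
	{ SK_RXH_IP_DST, 'd' },   { SK_RXH_L4_B_0_1, 'f' },
	{ SK_RXH_L4_B_2_3, 'n' }, { SK_RXH_DISCARD, 'r' },
};

/* Write the option letters for 'data' into buf, which must hold at
 * least 9 bytes (8 letters plus the terminating NUL). */
static void rxh_to_letters(uint64_t data, char buf[9])
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(rxh_letters) / sizeof(rxh_letters[0]); i++)
		if (data & rxh_letters[i].bit)
			buf[n++] = rxh_letters[i].letter;
	buf[n] = '\0';
}
```

Under these assumptions the TCP cases in ixgbe_get_rss_hash_opts() would
decode as "sdfn", and the remaining flow types as "sd".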

^ permalink raw reply related

* [net-next-2.6 PATCH 10/10] [RFC] ixgbe: Add support for using the same fields as ntuple in nfc
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change uses NTUPLE_FLOW_EXT so that the network flow classifier
interface supports the same VLAN and user-defined options as the ntuple
interface.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe_ethtool.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index a91e8db..494e1bf 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2388,6 +2388,13 @@ static int ixgbe_get_ethtool_fdir_entry(struct ixgbe_adapter *adapter,
 	fsp->m_u.tcp_ip4_spec.ip4src = mask->formatted.src_ip[0];
 	fsp->h_u.tcp_ip4_spec.ip4dst = rule->filter.formatted.dst_ip[0];
 	fsp->m_u.tcp_ip4_spec.ip4dst = mask->formatted.dst_ip[0];
+	fsp->h_u.ntuple_spec.vlan_tci = rule->filter.formatted.vlan_id;
+	fsp->m_u.ntuple_spec.vlan_tci = mask->formatted.vlan_id;
+	fsp->h_u.ntuple_spec.vlan_etype = rule->filter.formatted.flex_bytes;
+	fsp->m_u.ntuple_spec.vlan_etype = mask->formatted.flex_bytes;
+	*(u8 *)fsp->h_u.ntuple_spec.data = rule->filter.formatted.vm_pool;
+	*(u8 *)fsp->m_u.ntuple_spec.data = mask->formatted.vm_pool;
+	fsp->flow_type_ext = NTUPLE_FLOW_EXT;
 
 	/* record action */
 	if (rule->action == IXGBE_FDIR_DROP_QUEUE)
@@ -2603,6 +2610,18 @@ static int ixgbe_add_ethtool_fdir_entry(struct ixgbe_adapter *adapter,
 	input->filter.formatted.dst_port = fsp->h_u.tcp_ip4_spec.pdst;
 	mask.formatted.dst_port = fsp->m_u.tcp_ip4_spec.pdst;
 
+	if (fsp->flow_type_ext == NTUPLE_FLOW_EXT) {
+		input->filter.formatted.vm_pool =
+				*(unsigned char *)fsp->h_u.ntuple_spec.data;
+		mask.formatted.vm_pool =
+				*(unsigned char *)fsp->m_u.ntuple_spec.data;
+		input->filter.formatted.vlan_id = fsp->h_u.ntuple_spec.vlan_tci;
+		mask.formatted.vlan_id = fsp->m_u.ntuple_spec.vlan_tci;
+		input->filter.formatted.flex_bytes =
+						fsp->h_u.ntuple_spec.vlan_etype;
+		mask.formatted.flex_bytes = fsp->m_u.ntuple_spec.vlan_etype;
+	}
+
 	/* determine if we need to drop or route the packet */
 	if (fsp->ring_cookie == RX_CLS_FLOW_DISC)
 		input->action = IXGBE_FDIR_DROP_QUEUE;


^ permalink raw reply related

* Re: SO_REUSEPORT - can it be done in kernel?
From: Bill Sommerfeld @ 2011-02-25 23:33 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Daniel Baluta, netdev, Thomas Graf
In-Reply-To: <AANLkTinNost8Swh2fhQh8UXVdPTW_bToS7LXmyuwqNNQ@mail.gmail.com>

On Fri, Feb 25, 2011 at 11:51, Tom Herbert <therbert@google.com> wrote:
>> Tom, Bill: do you have a timeline for merging this? Especially the
>> UDP bits?
> Bill has been working on the TCP implementation which is requiring
> some fairly major surgery on the listener connections in syn-rcvd
> state, this is ongoing.

Yup.  The broad approach I settled on is to delay binding of new
connections to listener sockets by moving receive_sock's from a
per-listen_sock hash table to new hash chains in the global hash
table.

This is very much a work-in-progress.  I'm part way through the
conversion and have running code with most of the new structures in
place in parallel with the old; I'm about to start relying exclusively
on the new, and then will tear down the old; once that's done I'll be
in a position to hook that up to SO_REUSEPORT and start actually
measuring the difference.  In short: it will be a while.

So splitting SO_REUSEPORT for UDP from SO_REUSEPORT for TCP makes a
lot of sense to me.

^ permalink raw reply

* [net-next-2.6 PATCH 09/10] [RFC] ixgbe: add support for nfc addition and removal of filters
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change allows nfc to insert and remove filters in order to test the
ethtool interface, which includes its own rules manager.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe_ethtool.c |  232 +++++++++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_main.c    |   45 +++++++
 2 files changed, 277 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 145e018..a91e8db 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2453,6 +2453,237 @@ static int ixgbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
 	return ret;
 }
 
+static int ixgbe_update_ethtool_fdir_entry(struct ixgbe_adapter *adapter,
+					   struct ixgbe_fdir_filter *input,
+					   u16 sw_idx)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	struct hlist_node *node, *node2, *parent;
+	struct ixgbe_fdir_filter *rule;
+	int err = -EINVAL;
+
+	parent = NULL;
+	rule = NULL;
+
+	hlist_for_each_entry_safe(rule, node, node2,
+				  &adapter->fdir_filter_list, fdir_node) {
+		/* hash found, or no matching entry */
+		if (rule->sw_idx >= sw_idx)
+			break;
+		parent = node;
+	}
+
+	/* if there is an old rule occupying our place remove it */
+	if (rule && (rule->sw_idx == sw_idx)) {
+		if (!input || (rule->filter.formatted.bkt_hash !=
+			       input->filter.formatted.bkt_hash)) {
+			err = ixgbe_fdir_erase_perfect_filter_82599(hw,
+								    &rule->filter,
+								    sw_idx);
+		}
+
+		hlist_del(&rule->fdir_node);
+		kfree(rule);
+		adapter->fdir_filter_count--;
+	}
+
+	/* stop here if there was not an input filter provided */
+	if (!input)
+		return err;
+
+	/* reset error since we are now adding a filter */
+	err = 0;
+
+	/* initialize node and set software index */
+	INIT_HLIST_NODE(&input->fdir_node);
+
+	/* add filter to the list */
+	if (parent)
+		hlist_add_after(parent, &input->fdir_node);
+	else
+		hlist_add_head(&input->fdir_node,
+			       &adapter->fdir_filter_list);
+
+	/* update counts */
+	adapter->fdir_filter_count++;
+
+	return err;
+}
+
+static int ixgbe_flowspec_to_flow_type(struct ethtool_rx_flow_spec *fsp,
+				       u8 *flow_type)
+{
+	switch (fsp->flow_type) {
+	case TCP_V4_FLOW:
+		*flow_type = IXGBE_ATR_FLOW_TYPE_TCPV4;
+		break;
+	case UDP_V4_FLOW:
+		*flow_type = IXGBE_ATR_FLOW_TYPE_UDPV4;
+		break;
+	case SCTP_V4_FLOW:
+		*flow_type = IXGBE_ATR_FLOW_TYPE_SCTPV4;
+		break;
+	case IP_USER_FLOW:
+		switch (fsp->h_u.usr_ip4_spec.proto) {
+		case IPPROTO_TCP:
+			*flow_type = IXGBE_ATR_FLOW_TYPE_TCPV4;
+			break;
+		case IPPROTO_UDP:
+			*flow_type = IXGBE_ATR_FLOW_TYPE_UDPV4;
+			break;
+		case IPPROTO_SCTP:
+			*flow_type = IXGBE_ATR_FLOW_TYPE_SCTPV4;
+			break;
+		case 0:
+			if (!fsp->m_u.usr_ip4_spec.proto) {
+				*flow_type = IXGBE_ATR_FLOW_TYPE_IPV4;
+				break;
+			}
+		default:
+			return 0;
+		}
+		break;
+	default:
+		return 0;
+	}
+
+	return 1;
+}
+
+static int ixgbe_add_ethtool_fdir_entry(struct ixgbe_adapter *adapter,
+					struct ethtool_rxnfc *cmd)
+{
+	struct ethtool_rx_flow_spec *fsp =
+		(struct ethtool_rx_flow_spec *)&cmd->fs;
+	struct ixgbe_hw *hw = &adapter->hw;
+	struct ixgbe_fdir_filter *input;
+	union ixgbe_atr_input mask;
+	int err;
+
+	if (!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))
+		return -EOPNOTSUPP;
+
+	/*
+	 * Don't allow programming if the action is a queue greater than
+	 * the number of online Rx queues.
+	 */
+	if ((fsp->ring_cookie != RX_CLS_FLOW_DISC) &&
+	    (fsp->ring_cookie >= adapter->num_rx_queues))
+		return -EINVAL;
+
+	input = kzalloc(sizeof(*input), GFP_ATOMIC);
+	if (!input)
+		return -ENOMEM;
+
+	memset(&mask, 0, sizeof(union ixgbe_atr_input));
+
+	/* set SW index */
+	input->sw_idx = fsp->location;
+
+	/* record flow type */
+	if (!ixgbe_flowspec_to_flow_type(fsp,
+					 &input->filter.formatted.flow_type)) {
+		e_err(drv, "Unrecognized flow type\n");
+		goto err_out;
+	}
+
+	mask.formatted.flow_type = IXGBE_ATR_L4TYPE_IPV6_MASK |
+				   IXGBE_ATR_L4TYPE_MASK;
+
+	if (input->filter.formatted.flow_type == IXGBE_ATR_FLOW_TYPE_IPV4)
+		mask.formatted.flow_type &= IXGBE_ATR_L4TYPE_IPV6_MASK;
+
+	/* Copy input into formatted structures */
+	input->filter.formatted.src_ip[0] = fsp->h_u.tcp_ip4_spec.ip4src;
+	mask.formatted.src_ip[0] = fsp->m_u.tcp_ip4_spec.ip4src;
+	input->filter.formatted.dst_ip[0] = fsp->h_u.tcp_ip4_spec.ip4dst;
+	mask.formatted.dst_ip[0] = fsp->m_u.tcp_ip4_spec.ip4dst;
+	input->filter.formatted.src_port = fsp->h_u.tcp_ip4_spec.psrc;
+	mask.formatted.src_port = fsp->m_u.tcp_ip4_spec.psrc;
+	input->filter.formatted.dst_port = fsp->h_u.tcp_ip4_spec.pdst;
+	mask.formatted.dst_port = fsp->m_u.tcp_ip4_spec.pdst;
+
+	/* determine if we need to drop or route the packet */
+	if (fsp->ring_cookie == RX_CLS_FLOW_DISC)
+		input->action = IXGBE_FDIR_DROP_QUEUE;
+	else
+		input->action = fsp->ring_cookie;
+
+	spin_lock(&adapter->fdir_perfect_lock);
+
+	if (hlist_empty(&adapter->fdir_filter_list)) {
+		/* save mask and program input mask into HW */
+		memcpy(&adapter->fdir_mask, &mask, sizeof(mask));
+		err = ixgbe_fdir_set_input_mask_82599(hw, &mask);
+		if (err) {
+			e_err(drv, "Error writing mask\n");
+			goto err_out_w_lock;
+		}
+	} else if (memcmp(&adapter->fdir_mask, &mask, sizeof(mask))) {
+		e_err(drv, "Only one mask supported per port\n");
+		goto err_out_w_lock;
+	}
+
+	/* apply mask and compute/store hash */
+	ixgbe_atr_compute_perfect_hash_82599(&input->filter, &mask);
+
+	/* Don't exceed the available space for filters in the HW */
+	if (adapter->fdir_filter_count >=
+	    ((1024 << adapter->fdir_pballoc) - 2))
+		goto err_out_w_lock;
+
+	/* program filters to filter memory */
+	err = ixgbe_fdir_write_perfect_filter_82599(hw,
+				&input->filter, input->sw_idx,
+				adapter->rx_ring[input->action]->reg_idx);
+	if (err)
+		goto err_out_w_lock;
+
+	ixgbe_update_ethtool_fdir_entry(adapter, input, input->sw_idx);
+
+	spin_unlock(&adapter->fdir_perfect_lock);
+
+	return err;
+err_out_w_lock:
+	spin_unlock(&adapter->fdir_perfect_lock);
+err_out:
+	kfree(input);
+	return -1;
+}
+
+static int ixgbe_del_ethtool_fdir_entry(struct ixgbe_adapter *adapter,
+					struct ethtool_rxnfc *cmd)
+{
+	struct ethtool_rx_flow_spec *fsp =
+		(struct ethtool_rx_flow_spec *)&cmd->fs;
+	int err;
+
+	spin_lock(&adapter->fdir_perfect_lock);
+	err = ixgbe_update_ethtool_fdir_entry(adapter, NULL, fsp->location);
+	spin_unlock(&adapter->fdir_perfect_lock);
+
+	return err;
+}
+
+static int ixgbe_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	int ret = -EOPNOTSUPP;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_SRXCLSRLINS:
+		ret = ixgbe_add_ethtool_fdir_entry(adapter, cmd);
+		break;
+	case ETHTOOL_SRXCLSRLDEL:
+		ret = ixgbe_del_ethtool_fdir_entry(adapter, cmd);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
 static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_settings           = ixgbe_get_settings,
 	.set_settings           = ixgbe_set_settings,
@@ -2489,6 +2720,7 @@ static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_flags              = ethtool_op_get_flags,
 	.set_flags              = ixgbe_set_flags,
 	.get_rxnfc		= ixgbe_get_rxnfc,
+	.set_rxnfc		= ixgbe_set_rxnfc,
 };
 
 void ixgbe_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 6b6f602..bd79ac1 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -3667,6 +3667,28 @@ static void ixgbe_configure_dcb(struct ixgbe_adapter *adapter)
 }
 
 #endif
+static void ixgbe_fdir_filter_restore(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	struct hlist_node *node, *node2;
+	struct ixgbe_fdir_filter *filter;
+
+	spin_lock(&adapter->fdir_perfect_lock);
+
+	if (!hlist_empty(&adapter->fdir_filter_list))
+		ixgbe_fdir_set_input_mask_82599(hw, &adapter->fdir_mask);
+
+	hlist_for_each_entry_safe(filter, node, node2,
+				  &adapter->fdir_filter_list, fdir_node) {
+		ixgbe_fdir_write_perfect_filter_82599(hw,
+						      &filter->filter,
+						      filter->sw_idx,
+						      filter->action);
+	}
+
+	spin_unlock(&adapter->fdir_perfect_lock);
+}
+
 static void ixgbe_configure(struct ixgbe_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
@@ -3690,6 +3712,10 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
 			adapter->tx_ring[i]->atr_sample_rate =
 						       adapter->atr_sample_rate;
 		ixgbe_init_fdir_signature_82599(hw, adapter->fdir_pballoc);
+	} else if (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE) {
+		ixgbe_init_fdir_perfect_82599(&adapter->hw,
+					      adapter->fdir_pballoc);
+		ixgbe_fdir_filter_restore(adapter);
 	}
 	ixgbe_configure_virtualization(adapter);
 
@@ -4075,6 +4101,23 @@ static void ixgbe_clean_all_tx_rings(struct ixgbe_adapter *adapter)
 		ixgbe_clean_tx_ring(adapter->tx_ring[i]);
 }
 
+static void ixgbe_fdir_filter_exit(struct ixgbe_adapter *adapter)
+{
+	struct hlist_node *node, *node2;
+	struct ixgbe_fdir_filter *filter;
+
+	spin_lock(&adapter->fdir_perfect_lock);
+
+	hlist_for_each_entry_safe(filter, node, node2,
+				  &adapter->fdir_filter_list, fdir_node) {
+		hlist_del(&filter->fdir_node);
+		kfree(filter);
+	}
+	adapter->fdir_filter_count = 0;
+
+	spin_unlock(&adapter->fdir_perfect_lock);
+}
+
 void ixgbe_down(struct ixgbe_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
@@ -5534,6 +5577,8 @@ static int ixgbe_close(struct net_device *netdev)
 	ixgbe_down(adapter);
 	ixgbe_free_irq(adapter);
 
+	ixgbe_fdir_filter_exit(adapter);
+
 	ixgbe_free_all_tx_resources(adapter);
 	ixgbe_free_all_rx_resources(adapter);
 

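For readers skimming the hunk above, the "only one mask supported per port" rule in ixgbe_add_ethtool_fdir_entry() can be sketched in plain user-space C. This is only an illustration under assumptions: the demo_* types and names are hypothetical stand-ins, not the driver's structures, and the real code also programs the mask into hardware under fdir_perfect_lock.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for union ixgbe_atr_input (only a few fields). */
struct demo_mask {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Hypothetical per-port state: the saved mask plus a filter counter. */
struct demo_port {
	struct demo_mask saved_mask;	/* mask stored with the first filter */
	int filter_count;		/* number of filters currently stored */
};

/* Returns 0 if the filter may be added, -1 if its mask conflicts. */
static int demo_add_filter(struct demo_port *port,
			   const struct demo_mask *mask)
{
	if (port->filter_count == 0) {
		/* first filter: its mask becomes the port-wide mask */
		memcpy(&port->saved_mask, mask, sizeof(*mask));
	} else if (memcmp(&port->saved_mask, mask, sizeof(*mask)) != 0) {
		/* later filters must use exactly the same mask */
		return -1;
	}

	port->filter_count++;
	return 0;
}
```

In this sketch a second filter with a different mask is rejected until the list is empty again, which appears to be the behaviour the patch intends.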


* [net-next-2.6 PATCH 08/10] [RFC] ixgbe: add support for displaying ntuple filters via the nfc interface
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This code adds support for displaying the filters that were added via the
nfc interface.  This is primarily to test the interface for now, but I am
also looking into the feasibility of moving all of the ntuple filter code
in ixgbe over to the nfc interface since it seems to be better implemented.
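
As a rough illustration of the lookup described above, here is a user-space sketch of how a single rule could be found by location and how all rule locations could be listed. It assumes the filter list is kept sorted by ascending sw_idx (the early break in the lookup suggests this, but the insertion code is not part of this patch). The demo_* names are hypothetical, and the return conventions differ from the driver, which returns 0, -EINVAL or -EMSGSIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct ixgbe_fdir_filter. */
struct demo_rule {
	struct demo_rule *next;	/* assumed sorted by ascending sw_idx */
	uint16_t sw_idx;
	uint16_t action;
};

/* Rule slots available; mirrors the (1024 << pballoc) - 2 expression. */
static uint32_t demo_rule_capacity(unsigned int pballoc)
{
	return (1024u << pballoc) - 2;
}

/* Find the rule stored at 'location', or NULL if there is none. */
static const struct demo_rule *demo_find_rule(const struct demo_rule *head,
					      uint32_t location)
{
	const struct demo_rule *rule;

	/* stop at the first entry whose index is >= the requested one */
	for (rule = head; rule; rule = rule->next) {
		if (location <= rule->sw_idx)
			break;
	}

	if (!rule || rule->sw_idx != location)
		return NULL;

	return rule;
}

/* Copy rule indices into locs; return the count, or -1 if locs is too small. */
static int demo_list_rules(const struct demo_rule *head,
			   uint32_t *locs, uint32_t max_locs)
{
	uint32_t cnt = 0;

	for (; head; head = head->next) {
		if (cnt == max_locs)
			return -1;
		locs[cnt++] = head->sw_idx;
	}

	return (int)cnt;
}
```

A caller would first ask for the rule count, size the location array accordingly, and then fetch each rule by its location.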

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe.h         |   11 ++++
 drivers/net/ixgbe/ixgbe_ethtool.c |   95 +++++++++++++++++++++++++++++++++++++
 2 files changed, 106 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index 0cd75d6..903c828 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -466,6 +466,17 @@ struct ixgbe_adapter {
 	DECLARE_BITMAP(active_vfs, IXGBE_MAX_VF_FUNCTIONS);
 	unsigned int num_vfs;
 	struct vf_data_storage *vfinfo;
+
+	struct hlist_head fdir_filter_list;
+	union ixgbe_atr_input fdir_mask;
+	int fdir_filter_count;
+};
+
+struct ixgbe_fdir_filter {
+	struct  hlist_node fdir_node;
+	union ixgbe_atr_input filter;
+	u16 sw_idx;
+	u16 action;
 };
 
 enum ixbge_state_t {
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 6c17e45..145e018 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2337,6 +2337,89 @@ static int ixgbe_get_rss_hash_opts(struct ixgbe_adapter *adapter,
 	return 0;
 }
 
+static int ixgbe_get_ethtool_fdir_entry(struct ixgbe_adapter *adapter,
+					struct ethtool_rxnfc *cmd)
+{
+	union ixgbe_atr_input *mask = &adapter->fdir_mask;
+	struct ethtool_rx_flow_spec *fsp =
+		(struct ethtool_rx_flow_spec *)&cmd->fs;
+	struct hlist_node *node, *node2;
+	struct ixgbe_fdir_filter *rule = NULL;
+
+	/* report total rule count */
+	cmd->data = (1024 << adapter->fdir_pballoc) - 2;
+
+	hlist_for_each_entry_safe(rule, node, node2,
+				  &adapter->fdir_filter_list, fdir_node) {
+		if (fsp->location <= rule->sw_idx)
+			break;
+	}
+
+	if (!rule || fsp->location != rule->sw_idx)
+		return -EINVAL;
+
+	/* fill out the flow spec entry */
+
+	/* set flow type field */
+	switch (rule->filter.formatted.flow_type) {
+	case IXGBE_ATR_FLOW_TYPE_TCPV4:
+		fsp->flow_type = TCP_V4_FLOW;
+		break;
+	case IXGBE_ATR_FLOW_TYPE_UDPV4:
+		fsp->flow_type = UDP_V4_FLOW;
+		break;
+	case IXGBE_ATR_FLOW_TYPE_SCTPV4:
+		fsp->flow_type = SCTP_V4_FLOW;
+		break;
+	case IXGBE_ATR_FLOW_TYPE_IPV4:
+		fsp->flow_type = IP_USER_FLOW;
+		fsp->h_u.usr_ip4_spec.proto = 0;
+		fsp->m_u.usr_ip4_spec.proto = 0;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	fsp->h_u.tcp_ip4_spec.psrc = rule->filter.formatted.src_port;
+	fsp->m_u.tcp_ip4_spec.psrc = mask->formatted.src_port;
+	fsp->h_u.tcp_ip4_spec.pdst = rule->filter.formatted.dst_port;
+	fsp->m_u.tcp_ip4_spec.pdst = mask->formatted.dst_port;
+	fsp->h_u.tcp_ip4_spec.ip4src = rule->filter.formatted.src_ip[0];
+	fsp->m_u.tcp_ip4_spec.ip4src = mask->formatted.src_ip[0];
+	fsp->h_u.tcp_ip4_spec.ip4dst = rule->filter.formatted.dst_ip[0];
+	fsp->m_u.tcp_ip4_spec.ip4dst = mask->formatted.dst_ip[0];
+
+	/* record action */
+	if (rule->action == IXGBE_FDIR_DROP_QUEUE)
+		fsp->ring_cookie = RX_CLS_FLOW_DISC;
+	else
+		fsp->ring_cookie = rule->action;
+
+	return 0;
+}
+
+static int ixgbe_get_ethtool_fdir_all(struct ixgbe_adapter *adapter,
+				      struct ethtool_rxnfc *cmd,
+				      u32 *rule_locs)
+{
+	struct hlist_node *node, *node2;
+	struct ixgbe_fdir_filter *rule;
+	int cnt = 0;
+
+	/* report total rule count */
+	cmd->data = (1024 << adapter->fdir_pballoc) - 2;
+
+	hlist_for_each_entry_safe(rule, node, node2,
+				  &adapter->fdir_filter_list, fdir_node) {
+		if (cnt == cmd->rule_cnt)
+			return -EMSGSIZE;
+		rule_locs[cnt] = rule->sw_idx;
+		cnt++;
+	}
+
+	return 0;
+}
+
 static int ixgbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
 			   void *rule_locs)
 {
@@ -2351,6 +2434,18 @@ static int ixgbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
 		cmd->data = adapter->num_rx_queues;
 		ret = 0;
 		break;
+	case ETHTOOL_GRXCLSRLCNT:
+		cmd->rule_cnt = adapter->fdir_filter_count;
+		ret = 0;
+		break;
+	case ETHTOOL_GRXCLSRULE:
+		ret = ixgbe_get_ethtool_fdir_entry(adapter, cmd);
+		break;
+	case ETHTOOL_GRXCLSRLALL:
+		ret = ixgbe_get_ethtool_fdir_all(adapter, cmd,
+						 (u32 *)rule_locs);
+		break;
+
 	default:
 		break;
 	}



* [net-next-2.6 PATCH 06/10] [RFC] ixgbe: update perfect filter framework to support retaining filters
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change is meant to update the internal framework of ixgbe so that
perfect filters can be stored and tracked via software.
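
To make the "store and track in software" idea concrete, below is a user-space sketch of the mask-then-hash step that ixgbe_atr_compute_perfect_hash_82599() performs in this patch. It is a sketch under assumptions: the demo_* names are hypothetical, the key constant is a placeholder assumed for illustration, and the result is returned instead of being written to the bkt_hash field.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define DEMO_DWORDS		11
#define DEMO_BUCKET_HASH_KEY	0x3DAD14E2u	/* assumed placeholder key */

/* Hypothetical stand-in for union ixgbe_atr_input, network byte order. */
struct demo_atr_input {
	uint32_t dword_stream[DEMO_DWORDS];
};

/* Mask the input in place, then return the 13-bit bucket hash. */
static uint32_t demo_perfect_hash(struct demo_atr_input *input,
				  const struct demo_atr_input *mask)
{
	uint32_t hi, lo, flow_vm_vlan, common = 0, hash = 0;
	int i;

	/* apply the mask so ignored bits cannot influence the hash */
	for (i = 0; i < DEMO_DWORDS; i++)
		input->dword_stream[i] &= mask->dword_stream[i];

	/* dword 0 carries the flow type, VM pool and VLAN bits */
	flow_vm_vlan = ntohl(input->dword_stream[0]);

	/* fold dwords 1..10 into one common dword */
	for (i = 1; i < DEMO_DWORDS; i++)
		common ^= input->dword_stream[i];

	hi = ntohl(common);
	lo = (hi >> 16) | (hi << 16);	/* word-swapped copy */
	hi ^= flow_vm_vlan ^ (flow_vm_vlan >> 16);

	/* key bits 0 and 16, processed before the VLAN bits reach lo */
	if (DEMO_BUCKET_HASH_KEY & 0x1u)
		hash ^= lo;
	if (DEMO_BUCKET_HASH_KEY & (0x1u << 16))
		hash ^= hi;

	lo ^= flow_vm_vlan ^ (flow_vm_vlan << 16);

	/* remaining 30 key bits, two per iteration */
	for (i = 1; i < 16; i++) {
		if (DEMO_BUCKET_HASH_KEY & (0x1u << i))
			hash ^= lo >> i;
		if (DEMO_BUCKET_HASH_KEY & (0x1u << (i + 16)))
			hash ^= hi >> i;
	}

	/* at most 8K buckets, so keep 13 bits */
	return hash & 0x1FFF;
}
```

Because the mask is applied before hashing, two inputs that differ only in masked-out fields land in the same bucket, which is what lets a stored filter be matched again later without recomputing anything from the original request.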

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe.h       |   18 +
 drivers/net/ixgbe/ixgbe_82599.c |  659 +++++++++++++++++++++------------------
 drivers/net/ixgbe/ixgbe_main.c  |    2 
 drivers/net/ixgbe/ixgbe_type.h  |   24 -
 4 files changed, 376 insertions(+), 327 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index 12769b5..0cd75d6 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -526,16 +526,22 @@ extern void ixgbe_alloc_rx_buffers(struct ixgbe_ring *, u16);
 extern void ixgbe_write_eitr(struct ixgbe_q_vector *);
 extern int ethtool_ioctl(struct ifreq *ifr);
 extern s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw);
-extern s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 pballoc);
-extern s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc);
+extern s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 fdirctrl);
+extern s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 fdirctrl);
 extern s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
 						 union ixgbe_atr_hash_dword input,
 						 union ixgbe_atr_hash_dword common,
                                                  u8 queue);
-extern s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
-                                      union ixgbe_atr_input *input,
-                                      struct ixgbe_atr_input_masks *input_masks,
-                                      u16 soft_id, u8 queue);
+extern s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw *hw,
+					   union ixgbe_atr_input *input_mask);
+extern s32 ixgbe_fdir_write_perfect_filter_82599(struct ixgbe_hw *hw,
+						 union ixgbe_atr_input *input,
+						 u16 soft_id, u8 queue);
+extern s32 ixgbe_fdir_erase_perfect_filter_82599(struct ixgbe_hw *hw,
+						 union ixgbe_atr_input *input,
+						 u16 soft_id);
+extern void ixgbe_atr_compute_perfect_hash_82599(union ixgbe_atr_input *input,
+						 union ixgbe_atr_input *mask);
 extern void ixgbe_configure_rscctl(struct ixgbe_adapter *adapter,
                                    struct ixgbe_ring *ring);
 extern void ixgbe_clear_rscctl(struct ixgbe_adapter *adapter,
diff --git a/drivers/net/ixgbe/ixgbe_82599.c b/drivers/net/ixgbe/ixgbe_82599.c
index c9bfa64..e5fa29d 100644
--- a/drivers/net/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ixgbe/ixgbe_82599.c
@@ -413,14 +413,14 @@ static s32 ixgbe_start_mac_link_82599(struct ixgbe_hw *hw,
 	return status;
 }
 
- /**
-  *  ixgbe_disable_tx_laser_multispeed_fiber - Disable Tx laser
-  *  @hw: pointer to hardware structure
-  *
-  *  The base drivers may require better control over SFP+ module
-  *  PHY states.  This includes selectively shutting down the Tx
-  *  laser on the PHY, effectively halting physical link.
-  **/
+/**
+ *  ixgbe_disable_tx_laser_multispeed_fiber - Disable Tx laser
+ *  @hw: pointer to hardware structure
+ *
+ *  The base drivers may require better control over SFP+ module
+ *  PHY states.  This includes selectively shutting down the Tx
+ *  laser on the PHY, effectively halting physical link.
+ **/
 static void ixgbe_disable_tx_laser_multispeed_fiber(struct ixgbe_hw *hw)
 {
 	u32 esdp_reg = IXGBE_READ_REG(hw, IXGBE_ESDP);
@@ -1060,153 +1060,87 @@ s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw)
 }
 
 /**
- *  ixgbe_init_fdir_signature_82599 - Initialize Flow Director signature filters
+ *  ixgbe_set_fdir_rxpba_82599 - Initialize Flow Director RX packet buffer
  *  @hw: pointer to hardware structure
  *  @pballoc: which mode to allocate filters with
  **/
-s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 pballoc)
+static s32 ixgbe_set_fdir_rxpba_82599(struct ixgbe_hw *hw, const u32 pballoc)
 {
-	u32 fdirctrl = 0;
-	u32 pbsize;
+	u32 fdir_pbsize = hw->mac.rx_pb_size << IXGBE_RXPBSIZE_SHIFT;
+	u32 current_rxpbsize = 0;
 	int i;
 
-	/*
-	 * Before enabling Flow Director, the Rx Packet Buffer size
-	 * must be reduced.  The new value is the current size minus
-	 * flow director memory usage size.
-	 */
-	pbsize = (1 << (IXGBE_FDIR_PBALLOC_SIZE_SHIFT + pballoc));
-	IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(0),
-	    (IXGBE_READ_REG(hw, IXGBE_RXPBSIZE(0)) - pbsize));
-
-	/*
-	 * The defaults in the HW for RX PB 1-7 are not zero and so should be
-	 * initialized to zero for non DCB mode otherwise actual total RX PB
-	 * would be bigger than programmed and filter space would run into
-	 * the PB 0 region.
-	 */
-	for (i = 1; i < 8; i++)
-		IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(i), 0);
-
-	/* Send interrupt when 64 filters are left */
-	fdirctrl |= 4 << IXGBE_FDIRCTRL_FULL_THRESH_SHIFT;
-
-	/* Set the maximum length per hash bucket to 0xA filters */
-	fdirctrl |= 0xA << IXGBE_FDIRCTRL_MAX_LENGTH_SHIFT;
-
+	/* reserve space for Flow Director filters */
 	switch (pballoc) {
-	case IXGBE_FDIR_PBALLOC_64K:
-		/* 8k - 1 signature filters */
-		fdirctrl |= IXGBE_FDIRCTRL_PBALLOC_64K;
+	case IXGBE_FDIR_PBALLOC_256K:
+		fdir_pbsize -= 256 << IXGBE_RXPBSIZE_SHIFT;
 		break;
 	case IXGBE_FDIR_PBALLOC_128K:
-		/* 16k - 1 signature filters */
-		fdirctrl |= IXGBE_FDIRCTRL_PBALLOC_128K;
+		fdir_pbsize -= 128 << IXGBE_RXPBSIZE_SHIFT;
 		break;
-	case IXGBE_FDIR_PBALLOC_256K:
-		/* 32k - 1 signature filters */
-		fdirctrl |= IXGBE_FDIRCTRL_PBALLOC_256K;
+	case IXGBE_FDIR_PBALLOC_64K:
+		fdir_pbsize -= 64 << IXGBE_RXPBSIZE_SHIFT;
 		break;
+	case IXGBE_FDIR_PBALLOC_NONE:
 	default:
-		/* bad value */
-		return IXGBE_ERR_CONFIG;
-	};
-
-	/* Move the flexible bytes to use the ethertype - shift 6 words */
-	fdirctrl |= (0x6 << IXGBE_FDIRCTRL_FLEX_SHIFT);
+		return IXGBE_ERR_PARAM;
+	}
 
+	/* determine current RX packet buffer size */
+	for (i = 0; i < 8; i++)
+		current_rxpbsize += IXGBE_READ_REG(hw, IXGBE_RXPBSIZE(i));
 
-	/* Prime the keys for hashing */
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRHKEY, IXGBE_ATR_BUCKET_HASH_KEY);
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRSKEY, IXGBE_ATR_SIGNATURE_HASH_KEY);
+	/* if there is already room for the filters do nothing */
+	if (current_rxpbsize <= fdir_pbsize)
+		return 0;
 
-	/*
-	 * Poll init-done after we write the register.  Estimated times:
-	 *      10G: PBALLOC = 11b, timing is 60us
-	 *       1G: PBALLOC = 11b, timing is 600us
-	 *     100M: PBALLOC = 11b, timing is 6ms
-	 *
-	 *     Multiple these timings by 4 if under full Rx load
-	 *
-	 * So we'll poll for IXGBE_FDIR_INIT_DONE_POLL times, sleeping for
-	 * 1 msec per poll time.  If we're at line rate and drop to 100M, then
-	 * this might not finish in our poll time, but we can live with that
-	 * for now.
-	 */
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRCTRL, fdirctrl);
-	IXGBE_WRITE_FLUSH(hw);
-	for (i = 0; i < IXGBE_FDIR_INIT_DONE_POLL; i++) {
-		if (IXGBE_READ_REG(hw, IXGBE_FDIRCTRL) &
-		                   IXGBE_FDIRCTRL_INIT_DONE)
-			break;
-		msleep(1);
+	if (current_rxpbsize > hw->mac.rx_pb_size) {
+		/*
+		 * if the total rxpbsize is greater than the HW max then the Rx
+		 * buffer sizes are unconfigured or misconfigured, since the HW
+		 * default is to give the full buffer to each traffic class,
+		 * resulting in a total that is 8x the actual buffer size
+		 *
+		 * This assumes no DCB since the RXPBSIZE registers appear to
+		 * be unconfigured.
+		 */
+		IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(0), fdir_pbsize);
+		for (i = 1; i < 8; i++)
+			IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(i), 0);
+	} else {
+		/*
+		 * Since the Rx packet buffer appears to have already been
+		 * configured we need to shrink each packet buffer by enough
+		 * to make room for the filters.  As such we take each rxpbsize
+		 * value and multiply it by a fraction representing the size
+		 * needed over the size we currently have.
+		 *
+		 * We need to reduce fdir_pbsize and current_rxpbsize to
+		 * 1/1024 of their original values in order to avoid
+		 * overflowing the u32 being used to store rxpbsize.
+		 */
+		fdir_pbsize >>= IXGBE_RXPBSIZE_SHIFT;
+		current_rxpbsize >>= IXGBE_RXPBSIZE_SHIFT;
+		for (i = 0; i < 8; i++) {
+			u32 rxpbsize = IXGBE_READ_REG(hw, IXGBE_RXPBSIZE(i));
+			rxpbsize *= fdir_pbsize;
+			rxpbsize /= current_rxpbsize;
+			IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(i), rxpbsize);
+		}
 	}
-	if (i >= IXGBE_FDIR_INIT_DONE_POLL)
-		hw_dbg(hw, "Flow Director Signature poll time exceeded!\n");
 
 	return 0;
 }
 
 /**
- *  ixgbe_init_fdir_perfect_82599 - Initialize Flow Director perfect filters
+ *  ixgbe_fdir_enable_82599 - Initialize Flow Director control registers
  *  @hw: pointer to hardware structure
- *  @pballoc: which mode to allocate filters with
+ *  @fdirctrl: value to write to flow director control register
  **/
-s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc)
+static void ixgbe_fdir_enable_82599(struct ixgbe_hw *hw, u32 fdirctrl)
 {
-	u32 fdirctrl = 0;
-	u32 pbsize;
 	int i;
 
-	/*
-	 * Before enabling Flow Director, the Rx Packet Buffer size
-	 * must be reduced.  The new value is the current size minus
-	 * flow director memory usage size.
-	 */
-	pbsize = (1 << (IXGBE_FDIR_PBALLOC_SIZE_SHIFT + pballoc));
-	IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(0),
-	    (IXGBE_READ_REG(hw, IXGBE_RXPBSIZE(0)) - pbsize));
-
-	/*
-	 * The defaults in the HW for RX PB 1-7 are not zero and so should be
-	 * initialized to zero for non DCB mode otherwise actual total RX PB
-	 * would be bigger than programmed and filter space would run into
-	 * the PB 0 region.
-	 */
-	for (i = 1; i < 8; i++)
-		IXGBE_WRITE_REG(hw, IXGBE_RXPBSIZE(i), 0);
-
-	/* Send interrupt when 64 filters are left */
-	fdirctrl |= 4 << IXGBE_FDIRCTRL_FULL_THRESH_SHIFT;
-
-	/* Initialize the drop queue to Rx queue 127 */
-	fdirctrl |= (127 << IXGBE_FDIRCTRL_DROP_Q_SHIFT);
-
-	switch (pballoc) {
-	case IXGBE_FDIR_PBALLOC_64K:
-		/* 2k - 1 perfect filters */
-		fdirctrl |= IXGBE_FDIRCTRL_PBALLOC_64K;
-		break;
-	case IXGBE_FDIR_PBALLOC_128K:
-		/* 4k - 1 perfect filters */
-		fdirctrl |= IXGBE_FDIRCTRL_PBALLOC_128K;
-		break;
-	case IXGBE_FDIR_PBALLOC_256K:
-		/* 8k - 1 perfect filters */
-		fdirctrl |= IXGBE_FDIRCTRL_PBALLOC_256K;
-		break;
-	default:
-		/* bad value */
-		return IXGBE_ERR_CONFIG;
-	};
-
-	/* Turn perfect match filtering on */
-	fdirctrl |= IXGBE_FDIRCTRL_PERFECT_MATCH;
-	fdirctrl |= IXGBE_FDIRCTRL_REPORT_STATUS;
-
-	/* Move the flexible bytes to use the ethertype - shift 6 words */
-	fdirctrl |= (0x6 << IXGBE_FDIRCTRL_FLEX_SHIFT);
-
 	/* Prime the keys for hashing */
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRHKEY, IXGBE_ATR_BUCKET_HASH_KEY);
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRSKEY, IXGBE_ATR_SIGNATURE_HASH_KEY);
@@ -1224,10 +1158,6 @@ s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc)
 	 * this might not finish in our poll time, but we can live with that
 	 * for now.
 	 */
-
-	/* Set the maximum length per hash bucket to 0xA filters */
-	fdirctrl |= (0xA << IXGBE_FDIRCTRL_MAX_LENGTH_SHIFT);
-
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRCTRL, fdirctrl);
 	IXGBE_WRITE_FLUSH(hw);
 	for (i = 0; i < IXGBE_FDIR_INIT_DONE_POLL; i++) {
@@ -1236,101 +1166,77 @@ s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc)
 			break;
 		msleep(1);
 	}
-	if (i >= IXGBE_FDIR_INIT_DONE_POLL)
-		hw_dbg(hw, "Flow Director Perfect poll time exceeded!\n");
 
-	return 0;
+	if (i >= IXGBE_FDIR_INIT_DONE_POLL)
+		hw_dbg(hw, "Flow Director poll time exceeded!\n");
 }
 
-
 /**
- *  ixgbe_atr_compute_hash_82599 - Compute the hashes for SW ATR
- *  @stream: input bitstream to compute the hash on
- *  @key: 32-bit hash key
+ *  ixgbe_init_fdir_signature_82599 - Initialize Flow Director signature filters
+ *  @hw: pointer to hardware structure
+ *  @fdirctrl: value to write to flow director control register, initially
+ *             contains just the value of the Rx packet buffer allocation
  **/
-static u32 ixgbe_atr_compute_hash_82599(union ixgbe_atr_input *atr_input,
-					u32 key)
+s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 fdirctrl)
 {
-	/*
-	 * The algorithm is as follows:
-	 *    Hash[15:0] = Sum { S[n] x K[n+16] }, n = 0...350
-	 *    where Sum {A[n]}, n = 0...n is bitwise XOR of A[0], A[1]...A[n]
-	 *    and A[n] x B[n] is bitwise AND between same length strings
-	 *
-	 *    K[n] is 16 bits, defined as:
-	 *       for n modulo 32 >= 15, K[n] = K[n % 32 : (n % 32) - 15]
-	 *       for n modulo 32 < 15, K[n] =
-	 *             K[(n % 32:0) | (31:31 - (14 - (n % 32)))]
-	 *
-	 *    S[n] is 16 bits, defined as:
-	 *       for n >= 15, S[n] = S[n:n - 15]
-	 *       for n < 15, S[n] = S[(n:0) | (350:350 - (14 - n))]
-	 *
-	 *    To simplify for programming, the algorithm is implemented
-	 *    in software this way:
-	 *
-	 *    key[31:0], hi_hash_dword[31:0], lo_hash_dword[31:0], hash[15:0]
-	 *
-	 *    for (i = 0; i < 352; i+=32)
-	 *        hi_hash_dword[31:0] ^= Stream[(i+31):i];
-	 *
-	 *    lo_hash_dword[15:0]  ^= Stream[15:0];
-	 *    lo_hash_dword[15:0]  ^= hi_hash_dword[31:16];
-	 *    lo_hash_dword[31:16] ^= hi_hash_dword[15:0];
-	 *
-	 *    hi_hash_dword[31:0]  ^= Stream[351:320];
-	 *
-	 *    if(key[0])
-	 *        hash[15:0] ^= Stream[15:0];
-	 *
-	 *    for (i = 0; i < 16; i++) {
-	 *        if (key[i])
-	 *            hash[15:0] ^= lo_hash_dword[(i+15):i];
-	 *        if (key[i + 16])
-	 *            hash[15:0] ^= hi_hash_dword[(i+15):i];
-	 *    }
-	 *
-	 */
-	__be32 common_hash_dword = 0;
-	u32 hi_hash_dword, lo_hash_dword, flow_vm_vlan;
-	u32 hash_result = 0;
-	u8 i;
+	s32 err;
 
-	/* record the flow_vm_vlan bits as they are a key part to the hash */
-	flow_vm_vlan = ntohl(atr_input->dword_stream[0]);
+	/* Before enabling Flow Director, verify the Rx Packet Buffer size */
+	err = ixgbe_set_fdir_rxpba_82599(hw, fdirctrl);
+	if (err)
+		return err;
 
-	/* generate common hash dword */
-	for (i = 10; i; i -= 2)
-		common_hash_dword ^= atr_input->dword_stream[i] ^
-				     atr_input->dword_stream[i - 1];
+	/*
+	 * Continue setup of fdirctrl register bits:
+	 *  Move the flexible bytes to use the ethertype - shift 6 words
+	 *  Set the maximum length per hash bucket to 0xA filters
+	 *  Send interrupt when 64 filters are left
+	 */
+	fdirctrl |= (0x6 << IXGBE_FDIRCTRL_FLEX_SHIFT) |
+		    (0xA << IXGBE_FDIRCTRL_MAX_LENGTH_SHIFT) |
+		    (4 << IXGBE_FDIRCTRL_FULL_THRESH_SHIFT);
 
-	hi_hash_dword = ntohl(common_hash_dword);
+	/* write hashes and fdirctrl register, poll for completion */
+	ixgbe_fdir_enable_82599(hw, fdirctrl);
 
-	/* low dword is word swapped version of common */
-	lo_hash_dword = (hi_hash_dword >> 16) | (hi_hash_dword << 16);
+	return 0;
+}
 
-	/* apply flow ID/VM pool/VLAN ID bits to hash words */
-	hi_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan >> 16);
+/**
+ *  ixgbe_init_fdir_perfect_82599 - Initialize Flow Director perfect filters
+ *  @hw: pointer to hardware structure
+ *  @fdirctrl: value to write to flow director control register, initially
+ *             contains just the value of the Rx packet buffer allocation
+ **/
+s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 fdirctrl)
+{
+	s32 err;
 
-	/* Process bits 0 and 16 */
-	if (key & 0x0001) hash_result ^= lo_hash_dword;
-	if (key & 0x00010000) hash_result ^= hi_hash_dword;
+	/* Before enabling Flow Director, verify the Rx Packet Buffer size */
+	err = ixgbe_set_fdir_rxpba_82599(hw, fdirctrl);
+	if (err)
+		return err;
 
 	/*
-	 * apply flow ID/VM pool/VLAN ID bits to lo hash dword, we had to
-	 * delay this because bit 0 of the stream should not be processed
-	 * so we do not add the vlan until after bit 0 was processed
+	 * Continue setup of fdirctrl register bits:
+	 *  Turn perfect match filtering on
+	 *  Report hash in RSS field of Rx wb descriptor
+	 *  Initialize the drop queue
+	 *  Move the flexible bytes to use the ethertype - shift 6 words
+	 *  Set the maximum length per hash bucket to 0xA filters
+	 *  Send interrupt when 64 (0x4 * 16) filters are left
 	 */
-	lo_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan << 16);
-
+	fdirctrl |= IXGBE_FDIRCTRL_PERFECT_MATCH |
+		    IXGBE_FDIRCTRL_REPORT_STATUS |
+		    (IXGBE_FDIR_DROP_QUEUE << IXGBE_FDIRCTRL_DROP_Q_SHIFT) |
+		    (0x6 << IXGBE_FDIRCTRL_FLEX_SHIFT) |
+		    (0xA << IXGBE_FDIRCTRL_MAX_LENGTH_SHIFT) |
+		    (4 << IXGBE_FDIRCTRL_FULL_THRESH_SHIFT);
 
-	/* process the remaining 30 bits in the key 2 bits at a time */
-	for (i = 15; i; i-- ) {
-		if (key & (0x0001 << i)) hash_result ^= lo_hash_dword >> i;
-		if (key & (0x00010000 << i)) hash_result ^= hi_hash_dword >> i;
-	}
+	/* write hashes and fdirctrl register, poll for completion */
+	ixgbe_fdir_enable_82599(hw, fdirctrl);
 
-	return hash_result & IXGBE_ATR_HASH_MASK;
+	return 0;
 }
 
 /*
@@ -1467,7 +1373,6 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
 	 */
 	fdirhashcmd = (u64)fdircmd << 32;
 	fdirhashcmd |= ixgbe_atr_compute_sig_hash_82599(input, common);
-
 	IXGBE_WRITE_REG64(hw, IXGBE_FDIRHASH, fdirhashcmd);
 
 	hw_dbg(hw, "Tx Queue=%x hash=%x\n", queue, (u32)fdirhashcmd);
@@ -1475,6 +1380,101 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
 	return 0;
 }
 
+#define IXGBE_COMPUTE_BKT_HASH_ITERATION(_n) \
+do { \
+	u32 n = (_n); \
+	if (IXGBE_ATR_BUCKET_HASH_KEY & (0x01 << n)) \
+		bucket_hash ^= lo_hash_dword >> n; \
+	if (IXGBE_ATR_BUCKET_HASH_KEY & (0x01 << (n + 16))) \
+		bucket_hash ^= hi_hash_dword >> n; \
+} while (0)
+
+/**
+ *  ixgbe_atr_compute_perfect_hash_82599 - Compute the perfect filter hash
+ *  @atr_input: input bitstream to compute the hash on
+ *  @input_mask: mask for the input bitstream
+ *
+ *  This function serves two main purposes.  First it applies the input_mask
+ *  to the atr_input resulting in a cleaned up atr_input data stream.
+ *  Secondly it computes the hash and stores it in the bkt_hash field at
+ *  the end of the input byte stream.  This way it will be available for
+ *  future use without needing to recompute the hash.
+ **/
+void ixgbe_atr_compute_perfect_hash_82599(union ixgbe_atr_input *input,
+					  union ixgbe_atr_input *input_mask)
+{
+
+	u32 hi_hash_dword, lo_hash_dword, flow_vm_vlan;
+	u32 bucket_hash = 0;
+
+	/* Apply masks to input data */
+	input->dword_stream[0]  &= input_mask->dword_stream[0];
+	input->dword_stream[1]  &= input_mask->dword_stream[1];
+	input->dword_stream[2]  &= input_mask->dword_stream[2];
+	input->dword_stream[3]  &= input_mask->dword_stream[3];
+	input->dword_stream[4]  &= input_mask->dword_stream[4];
+	input->dword_stream[5]  &= input_mask->dword_stream[5];
+	input->dword_stream[6]  &= input_mask->dword_stream[6];
+	input->dword_stream[7]  &= input_mask->dword_stream[7];
+	input->dword_stream[8]  &= input_mask->dword_stream[8];
+	input->dword_stream[9]  &= input_mask->dword_stream[9];
+	input->dword_stream[10] &= input_mask->dword_stream[10];
+
+	/* record the flow_vm_vlan bits as they are a key part to the hash */
+	flow_vm_vlan = ntohl(input->dword_stream[0]);
+
+	/* generate common hash dword */
+	hi_hash_dword = ntohl(input->dword_stream[1] ^
+				    input->dword_stream[2] ^
+				    input->dword_stream[3] ^
+				    input->dword_stream[4] ^
+				    input->dword_stream[5] ^
+				    input->dword_stream[6] ^
+				    input->dword_stream[7] ^
+				    input->dword_stream[8] ^
+				    input->dword_stream[9] ^
+				    input->dword_stream[10]);
+
+	/* low dword is word swapped version of common */
+	lo_hash_dword = (hi_hash_dword >> 16) | (hi_hash_dword << 16);
+
+	/* apply flow ID/VM pool/VLAN ID bits to hash words */
+	hi_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan >> 16);
+
+	/* Process bits 0 and 16 */
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(0);
+
+	/*
+	 * apply flow ID/VM pool/VLAN ID bits to lo hash dword, we had to
+	 * delay this because bit 0 of the stream should not be processed
+	 * so we do not add the vlan until after bit 0 was processed
+	 */
+	lo_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan << 16);
+
+	/* Process remaining 30 bits of the key */
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(1);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(2);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(3);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(4);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(5);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(6);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(7);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(8);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(9);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(10);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(11);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(12);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(13);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(14);
+	IXGBE_COMPUTE_BKT_HASH_ITERATION(15);
+
+	/*
+	 * Limit hash to 13 bits since max bucket count is 8K.
+	 * Store result at the end of the input stream.
+	 */
+	input->formatted.bkt_hash = bucket_hash & 0x1FFF;
+}
+
 /**
  *  ixgbe_get_fdirtcpm_82599 - generate a tcp port from atr_input_masks
  *  @input_mask: mask to be bit swapped
@@ -1484,11 +1484,11 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
  *  generate a correctly swapped value we need to bit swap the mask and that
  *  is what is accomplished by this function.
  **/
-static u32 ixgbe_get_fdirtcpm_82599(struct ixgbe_atr_input_masks *input_masks)
+static u32 ixgbe_get_fdirtcpm_82599(union ixgbe_atr_input *input_mask)
 {
-	u32 mask = ntohs(input_masks->dst_port_mask);
+	u32 mask = ntohs(input_mask->formatted.dst_port);
 	mask <<= IXGBE_FDIRTCPM_DPORTM_SHIFT;
-	mask |= ntohs(input_masks->src_port_mask);
+	mask |= ntohs(input_mask->formatted.src_port);
 	mask = ((mask & 0x55555555) << 1) | ((mask & 0xAAAAAAAA) >> 1);
 	mask = ((mask & 0x33333333) << 2) | ((mask & 0xCCCCCCCC) >> 2);
 	mask = ((mask & 0x0F0F0F0F) << 4) | ((mask & 0xF0F0F0F0) >> 4);
@@ -1510,52 +1510,14 @@ static u32 ixgbe_get_fdirtcpm_82599(struct ixgbe_atr_input_masks *input_masks)
 	IXGBE_WRITE_REG((a), (reg), IXGBE_STORE_AS_BE32(ntohl(value)))
 
 #define IXGBE_STORE_AS_BE16(_value) \
-	(((u16)(_value) >> 8) | ((u16)(_value) << 8))
+	ntohs(((u16)(_value) >> 8) | ((u16)(_value) << 8))
 
-/**
- *  ixgbe_fdir_add_perfect_filter_82599 - Adds a perfect filter
- *  @hw: pointer to hardware structure
- *  @input: input bitstream
- *  @input_masks: bitwise masks for relevant fields
- *  @soft_id: software index into the silicon hash tables for filter storage
- *  @queue: queue index to direct traffic to
- *
- *  Note that the caller to this function must lock before calling, since the
- *  hardware writes must be protected from one another.
- **/
-s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
-                                      union ixgbe_atr_input *input,
-                                      struct ixgbe_atr_input_masks *input_masks,
-                                      u16 soft_id, u8 queue)
+s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw *hw,
+				    union ixgbe_atr_input *input_mask)
 {
-	u32 fdirhash;
-	u32 fdircmd;
-	u32 fdirport, fdirtcpm;
-	u32 fdirvlan;
-	/* start with VLAN, flex bytes, VM pool, and IPv6 destination masked */
-	u32 fdirm = IXGBE_FDIRM_VLANID | IXGBE_FDIRM_VLANP | IXGBE_FDIRM_FLEX |
-		    IXGBE_FDIRM_POOL | IXGBE_FDIRM_DIPv6;
-
-	/*
-	 * Check flow_type formatting, and bail out before we touch the hardware
-	 * if there's a configuration issue
-	 */
-	switch (input->formatted.flow_type) {
-	case IXGBE_ATR_FLOW_TYPE_IPV4:
-		/* use the L4 protocol mask for raw IPv4/IPv6 traffic */
-		fdirm |= IXGBE_FDIRM_L4P;
-	case IXGBE_ATR_FLOW_TYPE_SCTPV4:
-		if (input_masks->dst_port_mask || input_masks->src_port_mask) {
-			hw_dbg(hw, " Error on src/dst port mask\n");
-			return IXGBE_ERR_CONFIG;
-		}
-	case IXGBE_ATR_FLOW_TYPE_TCPV4:
-	case IXGBE_ATR_FLOW_TYPE_UDPV4:
-		break;
-	default:
-		hw_dbg(hw, " Error on flow type input\n");
-		return IXGBE_ERR_CONFIG;
-	}
+	/* mask IPv6 since it is currently not supported */
+	u32 fdirm = IXGBE_FDIRM_DIPv6;
+	u32 fdirtcpm;
 
 	/*
 	 * Program the relevant mask registers.  If src/dst_port or src/dst_addr
@@ -1567,41 +1529,71 @@ s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
 	 * point in time.
 	 */
 
-	/* Program FDIRM */
-	switch (ntohs(input_masks->vlan_id_mask) & 0xEFFF) {
-	case 0xEFFF:
-		/* Unmask VLAN ID - bit 0 and fall through to unmask prio */
-		fdirm &= ~IXGBE_FDIRM_VLANID;
-	case 0xE000:
-		/* Unmask VLAN prio - bit 1 */
-		fdirm &= ~IXGBE_FDIRM_VLANP;
+	/* verify bucket hash is cleared on hash generation */
+	if (input_mask->formatted.bkt_hash)
+		hw_dbg(hw, " bucket hash should always be 0 in mask\n");
+
+	/* Program FDIRM and verify partial masks */
+	switch (input_mask->formatted.vm_pool & 0x7F) {
+	case 0x0:
+		fdirm |= IXGBE_FDIRM_POOL;
+	case 0x7F:
 		break;
-	case 0x0FFF:
-		/* Unmask VLAN ID - bit 0 */
-		fdirm &= ~IXGBE_FDIRM_VLANID;
+	default:
+		hw_dbg(hw, " Error on vm pool mask\n");
+		return IXGBE_ERR_CONFIG;
+	}
+
+	switch (input_mask->formatted.flow_type & IXGBE_ATR_L4TYPE_MASK) {
+	case 0x0:
+		fdirm |= IXGBE_FDIRM_L4P;
+		if (input_mask->formatted.dst_port ||
+		    input_mask->formatted.src_port) {
+			hw_dbg(hw, " Error on src/dst port mask\n");
+			return IXGBE_ERR_CONFIG;
+		}
+	case IXGBE_ATR_L4TYPE_MASK:
 		break;
+	default:
+		hw_dbg(hw, " Error on flow type mask\n");
+		return IXGBE_ERR_CONFIG;
+	}
+
+	switch (ntohs(input_mask->formatted.vlan_id) & 0xEFFF) {
 	case 0x0000:
-		/* do nothing, vlans already masked */
+		/* mask VLAN ID, fall through to mask VLAN priority */
+		fdirm |= IXGBE_FDIRM_VLANID;
+	case 0x0FFF:
+		/* mask VLAN priority */
+		fdirm |= IXGBE_FDIRM_VLANP;
+		break;
+	case 0xE000:
+		/* mask VLAN ID only, fall through */
+		fdirm |= IXGBE_FDIRM_VLANID;
+	case 0xEFFF:
+		/* no VLAN fields masked */
 		break;
 	default:
 		hw_dbg(hw, " Error on VLAN mask\n");
 		return IXGBE_ERR_CONFIG;
 	}
 
-	if (input_masks->flex_mask & 0xFFFF) {
-		if ((input_masks->flex_mask & 0xFFFF) != 0xFFFF) {
-			hw_dbg(hw, " Error on flexible byte mask\n");
-			return IXGBE_ERR_CONFIG;
-		}
-		/* Unmask Flex Bytes - bit 4 */
-		fdirm &= ~IXGBE_FDIRM_FLEX;
+	switch (input_mask->formatted.flex_bytes & 0xFFFF) {
+	case 0x0000:
+		/* Mask Flex Bytes, fall through */
+		fdirm |= IXGBE_FDIRM_FLEX;
+	case 0xFFFF:
+		break;
+	default:
+		hw_dbg(hw, " Error on flexible byte mask\n");
+		return IXGBE_ERR_CONFIG;
 	}
 
 	/* Now mask VM pool and destination IPv6 - bits 5 and 2 */
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRM, fdirm);
 
 	/* store the TCP/UDP port masks, bit reversed from port layout */
-	fdirtcpm = ixgbe_get_fdirtcpm_82599(input_masks);
+	fdirtcpm = ixgbe_get_fdirtcpm_82599(input_mask);
 
 	/* write both the same so that UDP and TCP use the same mask */
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRTCPM, ~fdirtcpm);
@@ -1609,24 +1601,32 @@ s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
 
 	/* store source and destination IP masks (big-enian) */
 	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIP4M,
-			     ~input_masks->src_ip_mask[0]);
+			     ~input_mask->formatted.src_ip[0]);
 	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRDIP4M,
-			     ~input_masks->dst_ip_mask[0]);
+			     ~input_mask->formatted.dst_ip[0]);
 
-	/* Apply masks to input data */
-	input->formatted.vlan_id &= input_masks->vlan_id_mask;
-	input->formatted.flex_bytes &= input_masks->flex_mask;
-	input->formatted.src_port &= input_masks->src_port_mask;
-	input->formatted.dst_port &= input_masks->dst_port_mask;
-	input->formatted.src_ip[0] &= input_masks->src_ip_mask[0];
-	input->formatted.dst_ip[0] &= input_masks->dst_ip_mask[0];
+	return 0;
+}
 
-	/* record vlan (little-endian) and flex_bytes(big-endian) */
-	fdirvlan =
-		IXGBE_STORE_AS_BE16(ntohs(input->formatted.flex_bytes));
-	fdirvlan <<= IXGBE_FDIRVLAN_FLEX_SHIFT;
-	fdirvlan |= ntohs(input->formatted.vlan_id);
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRVLAN, fdirvlan);
+s32 ixgbe_fdir_write_perfect_filter_82599(struct ixgbe_hw *hw,
+					  union ixgbe_atr_input *input,
+					  u16 soft_id, u8 queue)
+{
+	u32 fdirport, fdirvlan, fdirhash, fdircmd;
+
+	/* currently IPv6 is not supported, must be programmed with 0 */
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIPv6(0),
+			     input->formatted.src_ip[0]);
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIPv6(1),
+			     input->formatted.src_ip[1]);
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIPv6(2),
+			     input->formatted.src_ip[2]);
+
+	/* record the source address (big-endian) */
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIPSA, input->formatted.src_ip[0]);
+
+	/* record the first 32 bits of the destination address (big-endian) */
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIPDA, input->formatted.dst_ip[0]);
 
 	/* record source and destination port (little-endian)*/
 	fdirport = ntohs(input->formatted.dst_port);
@@ -1634,29 +1634,80 @@ s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
 	fdirport |= ntohs(input->formatted.src_port);
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRPORT, fdirport);
 
-	/* record the first 32 bits of the destination address (big-endian) */
-	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIPDA, input->formatted.dst_ip[0]);
+	/* record vlan (little-endian) and flex_bytes(big-endian) */
+	fdirvlan = IXGBE_STORE_AS_BE16(input->formatted.flex_bytes);
+	fdirvlan <<= IXGBE_FDIRVLAN_FLEX_SHIFT;
+	fdirvlan |= ntohs(input->formatted.vlan_id);
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRVLAN, fdirvlan);
 
-	/* record the source address (big-endian) */
-	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIPSA, input->formatted.src_ip[0]);
+	/* configure FDIRHASH register */
+	fdirhash = input->formatted.bkt_hash;
+	fdirhash |= soft_id << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT;
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRHASH, fdirhash);
+
+	/*
+	 * flush all previous writes to make certain registers are
+	 * programmed prior to issuing the command
+	 */
+	IXGBE_WRITE_FLUSH(hw);
 
 	/* configure FDIRCMD register */
 	fdircmd = IXGBE_FDIRCMD_CMD_ADD_FLOW | IXGBE_FDIRCMD_FILTER_UPDATE |
 		  IXGBE_FDIRCMD_LAST | IXGBE_FDIRCMD_QUEUE_EN;
+	if (queue == IXGBE_FDIR_DROP_QUEUE)
+		fdircmd |= IXGBE_FDIRCMD_DROP;
 	fdircmd |= input->formatted.flow_type << IXGBE_FDIRCMD_FLOW_TYPE_SHIFT;
 	fdircmd |= (u32)queue << IXGBE_FDIRCMD_RX_QUEUE_SHIFT;
+	fdircmd |= (u32)input->formatted.vm_pool << IXGBE_FDIRCMD_VT_POOL_SHIFT;
 
-	/* we only want the bucket hash so drop the upper 16 bits */
-	fdirhash = ixgbe_atr_compute_hash_82599(input,
-						IXGBE_ATR_BUCKET_HASH_KEY);
-	fdirhash |= soft_id << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT;
-
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRHASH, fdirhash);
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRCMD, fdircmd);
 
 	return 0;
 }
 
+s32 ixgbe_fdir_erase_perfect_filter_82599(struct ixgbe_hw *hw,
+					  union ixgbe_atr_input *input,
+					  u16 soft_id)
+{
+	u32 fdirhash;
+	u32 fdircmd = 0;
+	u32 retry_count;
+	s32 err = 0;
+
+	/* configure FDIRHASH register */
+	fdirhash = input->formatted.bkt_hash;
+	fdirhash |= soft_id << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT;
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRHASH, fdirhash);
+
+	/* flush hash to HW */
+	IXGBE_WRITE_FLUSH(hw);
+
+	/* Query if filter is present */
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRCMD, IXGBE_FDIRCMD_CMD_QUERY_REM_FILT);
+
+	for (retry_count = 10; retry_count; retry_count--) {
+		/* allow 10us for query to process */
+		udelay(10);
+		/* verify query completed successfully */
+		fdircmd = IXGBE_READ_REG(hw, IXGBE_FDIRCMD);
+		if (!(fdircmd & IXGBE_FDIRCMD_CMD_MASK))
+			break;
+	}
+
+	if (!retry_count)
+		err = IXGBE_ERR_FDIR_REINIT_FAILED;
+
+	/* if filter exists in hardware then remove it */
+	if (fdircmd & IXGBE_FDIRCMD_FILTER_VALID) {
+		IXGBE_WRITE_REG(hw, IXGBE_FDIRHASH, fdirhash);
+		IXGBE_WRITE_FLUSH(hw);
+		IXGBE_WRITE_REG(hw, IXGBE_FDIRCMD,
+				IXGBE_FDIRCMD_CMD_REMOVE_FLOW);
+	}
+
+	return err;
+}
+
 /**
  *  ixgbe_read_analog_reg8_82599 - Reads 8 bit Omer analog register
  *  @hw: pointer to hardware structure
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 929f059..6b6f602 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -5137,7 +5137,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 		adapter->atr_sample_rate = 20;
 		adapter->ring_feature[RING_F_FDIR].indices =
 							 IXGBE_MAX_FDIR_INDICES;
-		adapter->fdir_pballoc = 0;
+		adapter->fdir_pballoc = IXGBE_FDIR_PBALLOC_64K;
 #ifdef IXGBE_FCOE
 		adapter->flags |= IXGBE_FLAG_FCOE_CAPABLE;
 		adapter->flags &= ~IXGBE_FLAG_FCOE_ENABLED;
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index 2f8ba73..efcf65e 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -1922,9 +1922,10 @@
 #endif
 
 enum ixgbe_fdir_pballoc_type {
-	IXGBE_FDIR_PBALLOC_64K = 0,
-	IXGBE_FDIR_PBALLOC_128K,
-	IXGBE_FDIR_PBALLOC_256K,
+	IXGBE_FDIR_PBALLOC_NONE = 0,
+	IXGBE_FDIR_PBALLOC_64K  = 1,
+	IXGBE_FDIR_PBALLOC_128K = 2,
+	IXGBE_FDIR_PBALLOC_256K = 3,
 };
 #define IXGBE_FDIR_PBALLOC_SIZE_SHIFT           16
 
@@ -1978,7 +1979,7 @@ enum ixgbe_fdir_pballoc_type {
 #define IXGBE_FDIRCMD_CMD_ADD_FLOW              0x00000001
 #define IXGBE_FDIRCMD_CMD_REMOVE_FLOW           0x00000002
 #define IXGBE_FDIRCMD_CMD_QUERY_REM_FILT        0x00000003
-#define IXGBE_FDIRCMD_CMD_QUERY_REM_HASH        0x00000007
+#define IXGBE_FDIRCMD_FILTER_VALID              0x00000004
 #define IXGBE_FDIRCMD_FILTER_UPDATE             0x00000008
 #define IXGBE_FDIRCMD_IPv6DMATCH                0x00000010
 #define IXGBE_FDIRCMD_L4TYPE_UDP                0x00000020
@@ -1997,6 +1998,7 @@ enum ixgbe_fdir_pballoc_type {
 #define IXGBE_FDIR_INIT_DONE_POLL               10
 #define IXGBE_FDIRCMD_CMD_POLL                  10
 
+#define IXGBE_FDIR_DROP_QUEUE                   127
 /* Transmit Descriptor - Advanced */
 union ixgbe_adv_tx_desc {
 	struct {
@@ -2183,7 +2185,7 @@ union ixgbe_atr_input {
 	 * src_port   - 2 bytes
 	 * dst_port   - 2 bytes
 	 * flex_bytes - 2 bytes
-	 * rsvd0      - 2 bytes - space reserved must be 0.
+	 * bkt_hash   - 2 bytes
 	 */
 	struct {
 		u8     vm_pool;
@@ -2194,7 +2196,7 @@ union ixgbe_atr_input {
 		__be16 src_port;
 		__be16 dst_port;
 		__be16 flex_bytes;
-		__be16 rsvd0;
+		__be16 bkt_hash;
 	} formatted;
 	__be32 dword_stream[11];
 };
@@ -2215,16 +2217,6 @@ union ixgbe_atr_hash_dword {
 	__be32 dword;
 };
 
-struct ixgbe_atr_input_masks {
-	__be16 rsvd0;
-	__be16 vlan_id_mask;
-	__be32 dst_ip_mask[4];
-	__be32 src_ip_mask[4];
-	__be16 src_port_mask;
-	__be16 dst_port_mask;
-	__be16 flex_mask;
-};
-
 enum ixgbe_eeprom_type {
 	ixgbe_eeprom_uninitialized = 0,
 	ixgbe_eeprom_spi,


^ permalink raw reply related
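
For readers following the port-mask handling in ixgbe_get_fdirtcpm_82599() above: the swap-and-mask lines reverse the bit order of the packed mask before it is written to FDIRTCPM. Below is a minimal standalone sketch of that technique, not the driver code itself. The helper names are hypothetical, the final two swap steps are filled in from the standard bit-reversal idiom since the hunk only shows the first three, and the shift of 16 is an assumed stand-in for IXGBE_FDIRTCPM_DPORTM_SHIFT.

```c
#include <stdint.h>

/*
 * Reverse the bit order of a 32-bit word by swapping progressively
 * larger groups: single bits, pairs, nibbles, bytes, then halves.
 */
static uint32_t reverse_bits32(uint32_t v)
{
	v = ((v & 0x55555555u) << 1) | ((v & 0xAAAAAAAAu) >> 1);
	v = ((v & 0x33333333u) << 2) | ((v & 0xCCCCCCCCu) >> 2);
	v = ((v & 0x0F0F0F0Fu) << 4) | ((v & 0xF0F0F0F0u) >> 4);
	v = ((v & 0x00FF00FFu) << 8) | ((v & 0xFF00FF00u) >> 8);
	return (v << 16) | (v >> 16);
}

/*
 * Hypothetical helper: pack host-order port masks as dst:src and
 * bit-reverse the result.  The shift of 16 is an assumption.
 */
static uint32_t pack_port_masks(uint16_t dst_mask, uint16_t src_mask)
{
	uint32_t mask = dst_mask;

	mask <<= 16;
	mask |= src_mask;
	return reverse_bits32(mask);
}
```

With these definitions a full destination mask and an empty source mask, pack_port_masks(0xFFFF, 0x0000), comes back as 0x0000FFFF, i.e. the two 16-bit halves trade places and each is mirrored.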

* [net-next-2.6 PATCH 05/10] [RFC] ixgbe: add support for different Rx packet buffer sizes
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change adds support for hardware that can have different Rx packet
buffer sizes.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe_82598.c |    2 ++
 drivers/net/ixgbe/ixgbe_82599.c |    2 ++
 drivers/net/ixgbe/ixgbe_type.h  |    1 +
 drivers/net/ixgbe/ixgbe_x540.c  |    2 ++
 4 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_82598.c b/drivers/net/ixgbe/ixgbe_82598.c
index d0f1d9d..d2dc50c 100644
--- a/drivers/net/ixgbe/ixgbe_82598.c
+++ b/drivers/net/ixgbe/ixgbe_82598.c
@@ -37,6 +37,7 @@
 #define IXGBE_82598_RAR_ENTRIES   16
 #define IXGBE_82598_MC_TBL_SIZE  128
 #define IXGBE_82598_VFT_TBL_SIZE 128
+#define IXGBE_82598_RX_PB_SIZE   512
 
 static s32 ixgbe_setup_copper_link_82598(struct ixgbe_hw *hw,
                                          ixgbe_link_speed speed,
@@ -123,6 +124,7 @@ static s32 ixgbe_get_invariants_82598(struct ixgbe_hw *hw)
 	mac->mcft_size = IXGBE_82598_MC_TBL_SIZE;
 	mac->vft_size = IXGBE_82598_VFT_TBL_SIZE;
 	mac->num_rar_entries = IXGBE_82598_RAR_ENTRIES;
+	mac->rx_pb_size = IXGBE_82598_RX_PB_SIZE;
 	mac->max_rx_queues = IXGBE_82598_MAX_RX_QUEUES;
 	mac->max_tx_queues = IXGBE_82598_MAX_TX_QUEUES;
 	mac->max_msix_vectors = ixgbe_get_pcie_msix_count_82598(hw);
diff --git a/drivers/net/ixgbe/ixgbe_82599.c b/drivers/net/ixgbe/ixgbe_82599.c
index a21f581..c9bfa64 100644
--- a/drivers/net/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ixgbe/ixgbe_82599.c
@@ -38,6 +38,7 @@
 #define IXGBE_82599_RAR_ENTRIES   128
 #define IXGBE_82599_MC_TBL_SIZE   128
 #define IXGBE_82599_VFT_TBL_SIZE  128
+#define IXGBE_82599_RX_PB_SIZE    512
 
 static void ixgbe_disable_tx_laser_multispeed_fiber(struct ixgbe_hw *hw);
 static void ixgbe_enable_tx_laser_multispeed_fiber(struct ixgbe_hw *hw);
@@ -168,6 +169,7 @@ static s32 ixgbe_get_invariants_82599(struct ixgbe_hw *hw)
 	mac->vft_size = IXGBE_82599_VFT_TBL_SIZE;
 	mac->num_rar_entries = IXGBE_82599_RAR_ENTRIES;
 	mac->max_rx_queues = IXGBE_82599_MAX_RX_QUEUES;
+	mac->rx_pb_size = IXGBE_82599_RX_PB_SIZE;
 	mac->max_tx_queues = IXGBE_82599_MAX_TX_QUEUES;
 	mac->max_msix_vectors = ixgbe_get_pcie_msix_count_generic(hw);
 
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index ab65d13..2f8ba73 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -2571,6 +2571,7 @@ struct ixgbe_mac_info {
 	u32                             vft_size;
 	u32                             num_rar_entries;
 	u32                             rar_highwater;
+	u32                             rx_pb_size;
 	u32                             max_tx_queues;
 	u32                             max_rx_queues;
 	u32                             max_msix_vectors;
diff --git a/drivers/net/ixgbe/ixgbe_x540.c b/drivers/net/ixgbe/ixgbe_x540.c
index f2518b0..fa095f8 100644
--- a/drivers/net/ixgbe/ixgbe_x540.c
+++ b/drivers/net/ixgbe/ixgbe_x540.c
@@ -38,6 +38,7 @@
 #define IXGBE_X540_RAR_ENTRIES   128
 #define IXGBE_X540_MC_TBL_SIZE   128
 #define IXGBE_X540_VFT_TBL_SIZE  128
+#define IXGBE_X540_RX_PB_SIZE    384
 
 static s32 ixgbe_update_flash_X540(struct ixgbe_hw *hw);
 static s32 ixgbe_poll_flash_update_done_X540(struct ixgbe_hw *hw);
@@ -62,6 +63,7 @@ static s32 ixgbe_get_invariants_X540(struct ixgbe_hw *hw)
 	mac->vft_size = IXGBE_X540_VFT_TBL_SIZE;
 	mac->num_rar_entries = IXGBE_X540_RAR_ENTRIES;
 	mac->max_rx_queues = IXGBE_X540_MAX_RX_QUEUES;
+	mac->rx_pb_size = IXGBE_X540_RX_PB_SIZE;
 	mac->max_tx_queues = IXGBE_X540_MAX_TX_QUEUES;
 	mac->max_msix_vectors = ixgbe_get_pcie_msix_count_generic(hw);
 


^ permalink raw reply related
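
As a rough illustration of what the patch above sets up: each MAC type records its own Rx packet buffer size during get_invariants, so later code can read mac->rx_pb_size instead of assuming one constant. The sketch below uses the three values from the patch (512, 512, 384); the enum, struct, and the per_buffer_share() consumer are hypothetical and only show one way such a field might be used.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's MAC types and mac info. */
enum mac_type { MAC_82598, MAC_82599, MAC_X540 };

struct mac_info {
	uint32_t rx_pb_size;	/* Rx packet buffer size, as set by the patch */
};

/* Record the per-MAC buffer size; values are taken from the patch. */
static int get_invariants(enum mac_type type, struct mac_info *mac)
{
	switch (type) {
	case MAC_82598:
	case MAC_82599:
		mac->rx_pb_size = 512;
		return 0;
	case MAC_X540:
		mac->rx_pb_size = 384;
		return 0;
	}
	return -1;	/* unknown MAC type */
}

/* Hypothetical consumer: split the buffer evenly across num_pb buffers. */
static uint32_t per_buffer_share(const struct mac_info *mac, uint32_t num_pb)
{
	return num_pb ? mac->rx_pb_size / num_pb : 0;
}
```

Keeping the size in the mac info structure means a consumer like per_buffer_share() does not need a switch on the MAC type of its own.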

* [net-next-2.6 PATCH 04/10] [RFC] ethtool: remove support for ETHTOOL_GRXNTUPLE
From: Alexander Duyck @ 2011-02-25 23:33 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change removes all support for displaying an ntuple as strings via
ETHTOOL_GRXNTUPLE.  The reason is that multiple issues have been found,
including:
 - Multiple buffer overruns for strings being displayed.
 - Incorrect filters displayed; cleared filters with a ring of -2 are
   still displayed.
 - No rules are displayed if the driver defines get_rx_ntuple.
 - Endianness wrong on displayed values.
 - Hard limit of 1024 filters makes display functionality extremely limited.

To address this I propose moving the display of rules over to the
ETHTOOL_GRXCLSRLCNT/ETHTOOL_GRXCLSRULE/ETHTOOL_GRXCLSRULEALL interfaces;
a follow-on patch will allow this.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 include/linux/ethtool.h   |    7 -
 include/linux/netdevice.h |    3 
 net/core/dev.c            |    5 -
 net/core/ethtool.c        |  299 ---------------------------------------------
 4 files changed, 1 insertions(+), 313 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 3d1f8e0..2f2f27c 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -250,7 +250,6 @@ enum ethtool_stringset {
 	ETH_SS_TEST		= 0,
 	ETH_SS_STATS,
 	ETH_SS_PRIV_FLAGS,
-	ETH_SS_NTUPLE_FILTERS,
 	ETH_SS_FEATURES,
 };
 
@@ -637,8 +636,6 @@ struct ethtool_rx_ntuple_flow_spec_container {
 };
 
 struct ethtool_rx_ntuple_list {
-#define ETHTOOL_MAX_NTUPLE_LIST_ENTRY 1024
-#define ETHTOOL_MAX_NTUPLE_STRING_PER_ENTRY 14
 	struct list_head	list;
 	unsigned int		count;
 };
@@ -659,7 +656,6 @@ u32 ethtool_op_get_ufo(struct net_device *dev);
 int ethtool_op_set_ufo(struct net_device *dev, u32 data);
 u32 ethtool_op_get_flags(struct net_device *dev);
 int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported);
-void ethtool_ntuple_flush(struct net_device *dev);
 
 /**
  * &ethtool_ops - Alter and report network device settings
@@ -776,7 +772,6 @@ struct ethtool_ops {
 	int	(*reset)(struct net_device *, u32 *);
 	int	(*set_rx_ntuple)(struct net_device *,
 				 struct ethtool_rx_ntuple *);
-	int	(*get_rx_ntuple)(struct net_device *, u32 stringset, void *);
 	int	(*get_rxfh_indir)(struct net_device *,
 				  struct ethtool_rxfh_indir *);
 	int	(*set_rxfh_indir)(struct net_device *,
@@ -842,7 +837,7 @@ struct ethtool_ops {
 #define ETHTOOL_FLASHDEV	0x00000033 /* Flash firmware to device */
 #define ETHTOOL_RESET		0x00000034 /* Reset hardware */
 #define ETHTOOL_SRXNTUPLE	0x00000035 /* Add an n-tuple filter to device */
-#define ETHTOOL_GRXNTUPLE	0x00000036 /* Get n-tuple filters from device */
+/* ETHTOOL_GRXNTUPLE		0x00000036 disabled due to multiple issues */
 #define ETHTOOL_GSSET_INFO	0x00000037 /* Get string set info */
 #define ETHTOOL_GRXFHINDIR	0x00000038 /* Get RX flow hash indir'n table */
 #define ETHTOOL_SRXFHINDIR	0x00000039 /* Set RX flow hash indir'n table */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ffe56c1..f20fabe 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1246,9 +1246,6 @@ struct net_device {
 	/* max exchange id for FCoE LRO by ddp */
 	unsigned int		fcoe_ddp_xid;
 #endif
-	/* n-tuple filter list attached to this device */
-	struct ethtool_rx_ntuple_list ethtool_ntuple_list;
-
 	/* phy device may attach itself for hardware timestamping */
 	struct phy_device *phydev;
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 69a3c08..c788c98 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5886,8 +5886,6 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 
 	dev->gso_max_size = GSO_MAX_SIZE;
 
-	INIT_LIST_HEAD(&dev->ethtool_ntuple_list.list);
-	dev->ethtool_ntuple_list.count = 0;
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
@@ -5951,9 +5949,6 @@ void free_netdev(struct net_device *dev)
 	/* Flush device addresses */
 	dev_addr_flush(dev);
 
-	/* Clear ethtool n-tuple list */
-	ethtool_ntuple_flush(dev);
-
 	list_for_each_entry_safe(p, n, &dev->napi_list, dev_list)
 		netif_napi_del(p);
 
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 4843674..3c5ba60 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -152,18 +152,6 @@ int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported)
 }
 EXPORT_SYMBOL(ethtool_op_set_flags);
 
-void ethtool_ntuple_flush(struct net_device *dev)
-{
-	struct ethtool_rx_ntuple_flow_spec_container *fsc, *f;
-
-	list_for_each_entry_safe(fsc, f, &dev->ethtool_ntuple_list.list, list) {
-		list_del(&fsc->list);
-		kfree(fsc);
-	}
-	dev->ethtool_ntuple_list.count = 0;
-}
-EXPORT_SYMBOL(ethtool_ntuple_flush);
-
 /* Handlers for each ethtool command */
 
 #define ETHTOOL_DEV_FEATURE_WORDS	1
@@ -825,34 +813,6 @@ out:
 	return ret;
 }
 
-static void __rx_ntuple_filter_add(struct ethtool_rx_ntuple_list *list,
-			struct ethtool_rx_ntuple_flow_spec *spec,
-			struct ethtool_rx_ntuple_flow_spec_container *fsc)
-{
-
-	/* don't add filters forever */
-	if (list->count >= ETHTOOL_MAX_NTUPLE_LIST_ENTRY) {
-		/* free the container */
-		kfree(fsc);
-		return;
-	}
-
-	/* Copy the whole filter over */
-	fsc->fs.flow_type = spec->flow_type;
-	memcpy(&fsc->fs.h_u, &spec->h_u, sizeof(spec->h_u));
-	memcpy(&fsc->fs.m_u, &spec->m_u, sizeof(spec->m_u));
-
-	fsc->fs.vlan_tag = spec->vlan_tag;
-	fsc->fs.vlan_tag_mask = spec->vlan_tag_mask;
-	fsc->fs.data = spec->data;
-	fsc->fs.data_mask = spec->data_mask;
-	fsc->fs.action = spec->action;
-
-	/* add to the list */
-	list_add_tail_rcu(&fsc->list, &list->list);
-	list->count++;
-}
-
 /*
  * ethtool does not (or did not) set masks for flow parameters that are
  * not specified, so if both value and mask are 0 then this must be
@@ -904,268 +864,12 @@ static noinline_for_stack int ethtool_set_rx_ntuple(struct net_device *dev,
 
 	rx_ntuple_fix_masks(&cmd.fs);
 
-	/*
-	 * Cache filter in dev struct for GET operation only if
-	 * the underlying driver doesn't have its own GET operation, and
-	 * only if the filter was added successfully.  First make sure we
-	 * can allocate the filter, then continue if successful.
-	 */
-	if (!ops->get_rx_ntuple) {
-		fsc = kmalloc(sizeof(*fsc), GFP_ATOMIC);
-		if (!fsc)
-			return -ENOMEM;
-	}
-
 	ret = ops->set_rx_ntuple(dev, &cmd);
 	if (ret) {
 		kfree(fsc);
 		return ret;
 	}
 
-	if (!ops->get_rx_ntuple)
-		__rx_ntuple_filter_add(&dev->ethtool_ntuple_list, &cmd.fs, fsc);
-
-	return ret;
-}
-
-static int ethtool_get_rx_ntuple(struct net_device *dev, void __user *useraddr)
-{
-	struct ethtool_gstrings gstrings;
-	const struct ethtool_ops *ops = dev->ethtool_ops;
-	struct ethtool_rx_ntuple_flow_spec_container *fsc;
-	u8 *data;
-	char *p;
-	int ret, i, num_strings = 0;
-
-	if (!ops->get_sset_count)
-		return -EOPNOTSUPP;
-
-	if (copy_from_user(&gstrings, useraddr, sizeof(gstrings)))
-		return -EFAULT;
-
-	ret = ops->get_sset_count(dev, gstrings.string_set);
-	if (ret < 0)
-		return ret;
-
-	gstrings.len = ret;
-
-	data = kzalloc(gstrings.len * ETH_GSTRING_LEN, GFP_USER);
-	if (!data)
-		return -ENOMEM;
-
-	if (ops->get_rx_ntuple) {
-		/* driver-specific filter grab */
-		ret = ops->get_rx_ntuple(dev, gstrings.string_set, data);
-		goto copy;
-	}
-
-	/* default ethtool filter grab */
-	i = 0;
-	p = (char *)data;
-	list_for_each_entry(fsc, &dev->ethtool_ntuple_list.list, list) {
-		sprintf(p, "Filter %d:\n", i);
-		p += ETH_GSTRING_LEN;
-		num_strings++;
-
-		switch (fsc->fs.flow_type) {
-		case TCP_V4_FLOW:
-			sprintf(p, "\tFlow Type: TCP\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case UDP_V4_FLOW:
-			sprintf(p, "\tFlow Type: UDP\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case SCTP_V4_FLOW:
-			sprintf(p, "\tFlow Type: SCTP\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case AH_ESP_V4_FLOW:
-			sprintf(p, "\tFlow Type: AH ESP\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case ESP_V4_FLOW:
-			sprintf(p, "\tFlow Type: ESP\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case IP_USER_FLOW:
-			sprintf(p, "\tFlow Type: Raw IP\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case IPV4_FLOW:
-			sprintf(p, "\tFlow Type: IPv4\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		default:
-			sprintf(p, "\tFlow Type: Unknown\n");
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			goto unknown_filter;
-		}
-
-		/* now the rest of the filters */
-		switch (fsc->fs.flow_type) {
-		case TCP_V4_FLOW:
-		case UDP_V4_FLOW:
-		case SCTP_V4_FLOW:
-			sprintf(p, "\tSrc IP addr: 0x%x\n",
-				fsc->fs.h_u.tcp_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tSrc IP mask: 0x%x\n",
-				fsc->fs.m_u.tcp_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP addr: 0x%x\n",
-				fsc->fs.h_u.tcp_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP mask: 0x%x\n",
-				fsc->fs.m_u.tcp_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tSrc Port: %d, mask: 0x%x\n",
-				fsc->fs.h_u.tcp_ip4_spec.psrc,
-				fsc->fs.m_u.tcp_ip4_spec.psrc);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest Port: %d, mask: 0x%x\n",
-				fsc->fs.h_u.tcp_ip4_spec.pdst,
-				fsc->fs.m_u.tcp_ip4_spec.pdst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tTOS: %d, mask: 0x%x\n",
-				fsc->fs.h_u.tcp_ip4_spec.tos,
-				fsc->fs.m_u.tcp_ip4_spec.tos);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case AH_ESP_V4_FLOW:
-		case ESP_V4_FLOW:
-			sprintf(p, "\tSrc IP addr: 0x%x\n",
-				fsc->fs.h_u.ah_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tSrc IP mask: 0x%x\n",
-				fsc->fs.m_u.ah_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP addr: 0x%x\n",
-				fsc->fs.h_u.ah_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP mask: 0x%x\n",
-				fsc->fs.m_u.ah_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tSPI: %d, mask: 0x%x\n",
-				fsc->fs.h_u.ah_ip4_spec.spi,
-				fsc->fs.m_u.ah_ip4_spec.spi);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tTOS: %d, mask: 0x%x\n",
-				fsc->fs.h_u.ah_ip4_spec.tos,
-				fsc->fs.m_u.ah_ip4_spec.tos);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case IP_USER_FLOW:
-			sprintf(p, "\tSrc IP addr: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tSrc IP mask: 0x%x\n",
-				fsc->fs.m_u.usr_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP addr: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP mask: 0x%x\n",
-				fsc->fs.m_u.usr_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		case IPV4_FLOW:
-			sprintf(p, "\tSrc IP addr: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tSrc IP mask: 0x%x\n",
-				fsc->fs.m_u.usr_ip4_spec.ip4src);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP addr: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tDest IP mask: 0x%x\n",
-				fsc->fs.m_u.usr_ip4_spec.ip4dst);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tL4 bytes: 0x%x, mask: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.l4_4_bytes,
-				fsc->fs.m_u.usr_ip4_spec.l4_4_bytes);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tTOS: %d, mask: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.tos,
-				fsc->fs.m_u.usr_ip4_spec.tos);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tIP Version: %d, mask: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.ip_ver,
-				fsc->fs.m_u.usr_ip4_spec.ip_ver);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			sprintf(p, "\tProtocol: %d, mask: 0x%x\n",
-				fsc->fs.h_u.usr_ip4_spec.proto,
-				fsc->fs.m_u.usr_ip4_spec.proto);
-			p += ETH_GSTRING_LEN;
-			num_strings++;
-			break;
-		}
-		sprintf(p, "\tVLAN: %d, mask: 0x%x\n",
-			fsc->fs.vlan_tag, fsc->fs.vlan_tag_mask);
-		p += ETH_GSTRING_LEN;
-		num_strings++;
-		sprintf(p, "\tUser-defined: 0x%Lx\n", fsc->fs.data);
-		p += ETH_GSTRING_LEN;
-		num_strings++;
-		sprintf(p, "\tUser-defined mask: 0x%Lx\n", fsc->fs.data_mask);
-		p += ETH_GSTRING_LEN;
-		num_strings++;
-		if (fsc->fs.action == ETHTOOL_RXNTUPLE_ACTION_DROP)
-			sprintf(p, "\tAction: Drop\n");
-		else
-			sprintf(p, "\tAction: Direct to queue %d\n",
-				fsc->fs.action);
-		p += ETH_GSTRING_LEN;
-		num_strings++;
-unknown_filter:
-		i++;
-	}
-copy:
-	/* indicate to userspace how many strings we actually have */
-	gstrings.len = num_strings;
-	ret = -EFAULT;
-	if (copy_to_user(useraddr, &gstrings, sizeof(gstrings)))
-		goto out;
-	useraddr += sizeof(gstrings);
-	if (copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
-		goto out;
-	ret = 0;
-
-out:
-	kfree(data);
 	return ret;
 }
 
@@ -1902,9 +1606,6 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_SRXNTUPLE:
 		rc = ethtool_set_rx_ntuple(dev, useraddr);
 		break;
-	case ETHTOOL_GRXNTUPLE:
-		rc = ethtool_get_rx_ntuple(dev, useraddr);
-		break;
 	case ETHTOOL_GSSET_INFO:
 		rc = ethtool_get_sset_info(dev, useraddr);
 		break;


^ permalink raw reply related
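
To make the buffer overrun point from the patch above concrete: the removed code used unbounded sprintf() into slots that are ETH_GSTRING_LEN bytes apart. By my count, the "User-defined mask" line with a full 64-bit value needs 39 characters plus the terminator, which does not fit if the slot is 32 bytes. The sketch below is a userspace illustration, not kernel code; SLOT_LEN is an assumed mirror of ETH_GSTRING_LEN and format_slot() is a hypothetical helper that shows how a bounded write detects the problem.

```c
#include <stdio.h>
#include <stdint.h>

#define SLOT_LEN 32	/* assumed to match ETH_GSTRING_LEN */

/*
 * Format one line into a fixed-size slot with a bounded write.
 * Returns 1 if the whole line fit, 0 if it had to be truncated.
 */
static int format_slot(char slot[SLOT_LEN], uint64_t data_mask)
{
	int n = snprintf(slot, SLOT_LEN, "\tUser-defined mask: 0x%llx\n",
			 (unsigned long long)data_mask);

	/* snprintf reports the length it wanted, not what it wrote */
	return n >= 0 && n < SLOT_LEN;
}
```

With this format string, values of up to eight hex digits fit in the slot; anything wider is truncated here, whereas an unbounded sprintf() would have written past the end of the slot.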

* [net-next-2.6 PATCH 03/10] [RFC] ixgbe: remove ntuple filtering
From: Alexander Duyck @ 2011-02-25 23:32 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

Due to numerous issues in ntuple filters it has been decided to move the
interface over to the network flow classification interface.  As a first
step toward this I need to remove the old ntuple interface.

In addition I am removing the requirement that ntuple filters have the same
number of queues and requirements as ATR.  As a result this change will
make it so that all the ntuple flag does is disable ATR for now.
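
The flag handling described above can be sketched as a small state toggle, loosely modeled on the ixgbe_set_flags() hunk in this patch. This is a simplified illustration under assumptions: the flag bit values are hypothetical, and re-enabling ATR unconditionally when ntuple is turned off is an assumption, since the driver may apply further conditions that are not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the driver's real flag constants differ. */
#define FLAG_FDIR_HASH_CAPABLE    (1u << 0)	/* ATR */
#define FLAG_FDIR_PERFECT_CAPABLE (1u << 1)	/* ntuple / perfect filters */

/* Apply the requested ntuple state; return true when a reset is needed. */
static bool set_ntuple(uint32_t *flags, bool ntuple_on)
{
	if (!(*flags & FLAG_FDIR_PERFECT_CAPABLE)) {
		if (!ntuple_on)
			return false;	/* already off, nothing to do */
		/* turn off ATR, enable perfect filters and reset */
		*flags &= ~FLAG_FDIR_HASH_CAPABLE;
		*flags |= FLAG_FDIR_PERFECT_CAPABLE;
		return true;
	}

	if (ntuple_on)
		return false;		/* already on, nothing to do */

	/* turn off perfect filters, set ATR and reset (assumed unconditional) */
	*flags &= ~FLAG_FDIR_PERFECT_CAPABLE;
	*flags |= FLAG_FDIR_HASH_CAPABLE;
	return true;
}
```

The point of the shape is that a reset is only requested when the state actually changes, and that ATR and the perfect-filter flag are never both set.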

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 drivers/net/ixgbe/ixgbe_dcb_nl.c  |   47 +++++------
 drivers/net/ixgbe/ixgbe_ethtool.c |  162 +++----------------------------------
 drivers/net/ixgbe/ixgbe_main.c    |   45 +++-------
 3 files changed, 48 insertions(+), 206 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_dcb_nl.c b/drivers/net/ixgbe/ixgbe_dcb_nl.c
index a977df3..47fd9f6 100644
--- a/drivers/net/ixgbe/ixgbe_dcb_nl.c
+++ b/drivers/net/ixgbe/ixgbe_dcb_nl.c
@@ -114,11 +114,12 @@ static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 	u8 err = 0;
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 
+	/* verify there is something to do, if not then exit */
+	if (!!state != !(adapter->flags & IXGBE_FLAG_DCB_ENABLED))
+		return err;
+
 	if (state > 0) {
 		/* Turn on DCB */
-		if (adapter->flags & IXGBE_FLAG_DCB_ENABLED)
-			goto out;
-
 		if (!(adapter->flags & IXGBE_FLAG_MSIX_ENABLED)) {
 			e_err(drv, "Enable failed, needs MSI-X\n");
 			err = 1;
@@ -138,7 +139,6 @@ static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 		case ixgbe_mac_82599EB:
 		case ixgbe_mac_X540:
 			adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
-			adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
 			break;
 		default:
 			break;
@@ -150,29 +150,28 @@ static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 			netdev->netdev_ops->ndo_open(netdev);
 	} else {
 		/* Turn off DCB */
-		if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
-			if (netif_running(netdev))
-				netdev->netdev_ops->ndo_stop(netdev);
-			ixgbe_clear_interrupt_scheme(adapter);
+		if (netif_running(netdev))
+			netdev->netdev_ops->ndo_stop(netdev);
+		ixgbe_clear_interrupt_scheme(adapter);
 
-			adapter->hw.fc.requested_mode = adapter->last_lfc_mode;
-			adapter->temp_dcb_cfg.pfc_mode_enable = false;
-			adapter->dcb_cfg.pfc_mode_enable = false;
-			adapter->flags &= ~IXGBE_FLAG_DCB_ENABLED;
-			adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
-			switch (adapter->hw.mac.type) {
-			case ixgbe_mac_82599EB:
-			case ixgbe_mac_X540:
+		adapter->hw.fc.requested_mode = adapter->last_lfc_mode;
+		adapter->temp_dcb_cfg.pfc_mode_enable = false;
+		adapter->dcb_cfg.pfc_mode_enable = false;
+		adapter->flags &= ~IXGBE_FLAG_DCB_ENABLED;
+		adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
+		switch (adapter->hw.mac.type) {
+		case ixgbe_mac_82599EB:
+		case ixgbe_mac_X540:
+			if (!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))
 				adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
-				break;
-			default:
-				break;
-			}
-
-			ixgbe_init_interrupt_scheme(adapter);
-			if (netif_running(netdev))
-				netdev->netdev_ops->ndo_open(netdev);
+			break;
+		default:
+			break;
 		}
+
+		ixgbe_init_interrupt_scheme(adapter);
+		if (netif_running(netdev))
+			netdev->netdev_ops->ndo_open(netdev);
 	}
 out:
 	return err;
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 309272f..1b40c02 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -1031,9 +1031,6 @@ static int ixgbe_get_sset_count(struct net_device *netdev, int sset)
 		return IXGBE_TEST_LEN;
 	case ETH_SS_STATS:
 		return IXGBE_STATS_LEN;
-	case ETH_SS_NTUPLE_FILTERS:
-		return ETHTOOL_MAX_NTUPLE_LIST_ENTRY *
-		       ETHTOOL_MAX_NTUPLE_STRING_PER_ENTRY;
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -2277,20 +2274,19 @@ static int ixgbe_set_flags(struct net_device *netdev, u32 data)
 	 * Check if Flow Director n-tuple support was enabled or disabled.  If
 	 * the state changed, we need to reset.
 	 */
-	if ((adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE) &&
-	    (!(data & ETH_FLAG_NTUPLE))) {
-		/* turn off Flow Director perfect, set hash and reset */
+	if (!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)) {
+		/* turn off ATR, enable perfect filters and reset */
+		if (data & ETH_FLAG_NTUPLE) {
+			adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
+			adapter->flags |= IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
+			need_reset = true;
+		}
+	} else if (!(data & ETH_FLAG_NTUPLE)) {
+		/* turn off Flow Director, set ATR and reset */
 		adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
-		adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
-		need_reset = true;
-	} else if ((!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)) &&
-	           (data & ETH_FLAG_NTUPLE)) {
-		/* turn off Flow Director hash, enable perfect and reset */
-		adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
-		adapter->flags |= IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
+		if (adapter->flags & IXGBE_FLAG_RSS_ENABLED)
+			adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
 		need_reset = true;
-	} else {
-		/* no state change */
 	}
 
 	if (need_reset) {
@@ -2303,141 +2299,6 @@ static int ixgbe_set_flags(struct net_device *netdev, u32 data)
 	return 0;
 }
 
-static int ixgbe_set_rx_ntuple(struct net_device *dev,
-                               struct ethtool_rx_ntuple *cmd)
-{
-	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	struct ethtool_rx_ntuple_flow_spec *fs = &cmd->fs;
-	union ixgbe_atr_input input_struct;
-	struct ixgbe_atr_input_masks input_masks;
-	int target_queue;
-	int err;
-
-	if (adapter->hw.mac.type == ixgbe_mac_82598EB)
-		return -EOPNOTSUPP;
-
-	/*
-	 * Don't allow programming if the action is a queue greater than
-	 * the number of online Tx queues.
-	 */
-	if ((fs->action >= adapter->num_tx_queues) ||
-	    (fs->action < ETHTOOL_RXNTUPLE_ACTION_DROP))
-		return -EINVAL;
-
-	memset(&input_struct, 0, sizeof(union ixgbe_atr_input));
-	memset(&input_masks, 0, sizeof(struct ixgbe_atr_input_masks));
-
-	/* record flow type */
-	switch (fs->flow_type) {
-	case IPV4_FLOW:
-		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_IPV4;
-		break;
-	case TCP_V4_FLOW:
-		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_TCPV4;
-		break;
-	case UDP_V4_FLOW:
-		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_UDPV4;
-		break;
-	case SCTP_V4_FLOW:
-		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_SCTPV4;
-		break;
-	default:
-		return -1;
-	}
-
-	/* copy vlan tag minus the CFI bit */
-	if ((fs->vlan_tag & 0xEFFF) || (~fs->vlan_tag_mask & 0xEFFF)) {
-		input_struct.formatted.vlan_id = htons(fs->vlan_tag & 0xEFFF);
-		if (!fs->vlan_tag_mask) {
-			input_masks.vlan_id_mask = htons(0xEFFF);
-		} else {
-			switch (~fs->vlan_tag_mask & 0xEFFF) {
-			/* all of these are valid vlan-mask values */
-			case 0xEFFF:
-			case 0xE000:
-			case 0x0FFF:
-			case 0x0000:
-				input_masks.vlan_id_mask =
-					htons(~fs->vlan_tag_mask);
-				break;
-			/* exit with error if vlan-mask is invalid */
-			default:
-				e_err(drv, "Partial VLAN ID or "
-				      "priority mask in vlan-mask is not "
-				      "supported by hardware\n");
-				return -1;
-			}
-		}
-	}
-
-	/* make sure we only use the first 2 bytes of user data */
-	if ((fs->data & 0xFFFF) || (~fs->data_mask & 0xFFFF)) {
-		input_struct.formatted.flex_bytes = htons(fs->data & 0xFFFF);
-		if (!(fs->data_mask & 0xFFFF)) {
-			input_masks.flex_mask = 0xFFFF;
-		} else if (~fs->data_mask & 0xFFFF) {
-			e_err(drv, "Partial user-def-mask is not "
-			      "supported by hardware\n");
-			return -1;
-		}
-	}
-
-	/*
-	 * Copy input into formatted structures
-	 *
-	 * These assignments are based on the following logic
-	 * If neither input or mask are set assume value is masked out.
-	 * If input is set, but mask is not mask should default to accept all.
-	 * If input is not set, but mask is set then mask likely results in 0.
-	 * If input is set and mask is set then assign both.
-	 */
-	if (fs->h_u.tcp_ip4_spec.ip4src || ~fs->m_u.tcp_ip4_spec.ip4src) {
-		input_struct.formatted.src_ip[0] = fs->h_u.tcp_ip4_spec.ip4src;
-		if (!fs->m_u.tcp_ip4_spec.ip4src)
-			input_masks.src_ip_mask[0] = 0xFFFFFFFF;
-		else
-			input_masks.src_ip_mask[0] =
-				~fs->m_u.tcp_ip4_spec.ip4src;
-	}
-	if (fs->h_u.tcp_ip4_spec.ip4dst || ~fs->m_u.tcp_ip4_spec.ip4dst) {
-		input_struct.formatted.dst_ip[0] = fs->h_u.tcp_ip4_spec.ip4dst;
-		if (!fs->m_u.tcp_ip4_spec.ip4dst)
-			input_masks.dst_ip_mask[0] = 0xFFFFFFFF;
-		else
-			input_masks.dst_ip_mask[0] =
-				~fs->m_u.tcp_ip4_spec.ip4dst;
-	}
-	if (fs->h_u.tcp_ip4_spec.psrc || ~fs->m_u.tcp_ip4_spec.psrc) {
-		input_struct.formatted.src_port = fs->h_u.tcp_ip4_spec.psrc;
-		if (!fs->m_u.tcp_ip4_spec.psrc)
-			input_masks.src_port_mask = 0xFFFF;
-		else
-			input_masks.src_port_mask = ~fs->m_u.tcp_ip4_spec.psrc;
-	}
-	if (fs->h_u.tcp_ip4_spec.pdst || ~fs->m_u.tcp_ip4_spec.pdst) {
-		input_struct.formatted.dst_port = fs->h_u.tcp_ip4_spec.pdst;
-		if (!fs->m_u.tcp_ip4_spec.pdst)
-			input_masks.dst_port_mask = 0xFFFF;
-		else
-			input_masks.dst_port_mask = ~fs->m_u.tcp_ip4_spec.pdst;
-	}
-
-	/* determine if we need to drop or route the packet */
-	if (fs->action == ETHTOOL_RXNTUPLE_ACTION_DROP)
-		target_queue = MAX_RX_QUEUES - 1;
-	else
-		target_queue = fs->action;
-
-	spin_lock(&adapter->fdir_perfect_lock);
-	err = ixgbe_fdir_add_perfect_filter_82599(&adapter->hw,
-						  &input_struct,
-						  &input_masks, 0,
-						  target_queue);
-	spin_unlock(&adapter->fdir_perfect_lock);
-
-	return err ? -1 : 0;
-}
-
 static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_settings           = ixgbe_get_settings,
 	.set_settings           = ixgbe_set_settings,
@@ -2473,7 +2334,6 @@ static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.set_coalesce           = ixgbe_set_coalesce,
 	.get_flags              = ethtool_op_get_flags,
 	.set_flags              = ixgbe_set_flags,
-	.set_rx_ntuple          = ixgbe_set_rx_ntuple,
 };
 
 void ixgbe_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index f0d0c5a..929f059 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -1568,9 +1568,8 @@ static void ixgbe_configure_msix(struct ixgbe_adapter *adapter)
 			q_vector->eitr = adapter->rx_eitr_param;
 
 		ixgbe_write_eitr(q_vector);
-		/* If Flow Director is enabled, set interrupt affinity */
-		if ((adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) ||
-		    (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)) {
+		/* If ATR is enabled, set interrupt affinity */
+		if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
 			/*
 			 * Allocate the affinity_hint cpumask, assign the mask
 			 * for this vector, and set our affinity_hint for
@@ -2453,8 +2452,7 @@ static inline void ixgbe_irq_enable(struct ixgbe_adapter *adapter, bool queues,
 	default:
 		break;
 	}
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE ||
-	    adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
 		mask |= IXGBE_EIMS_FLOW_DIR;
 
 	IXGBE_WRITE_REG(&adapter->hw, IXGBE_EIMS, mask);
@@ -3692,8 +3690,6 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
 			adapter->tx_ring[i]->atr_sample_rate =
 						       adapter->atr_sample_rate;
 		ixgbe_init_fdir_signature_82599(hw, adapter->fdir_pballoc);
-	} else if (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE) {
-		ixgbe_init_fdir_perfect_82599(hw, adapter->fdir_pballoc);
 	}
 	ixgbe_configure_virtualization(adapter);
 
@@ -4138,8 +4134,7 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 		free_cpumask_var(q_vector->affinity_mask);
 	}
 
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE ||
-	    adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
 		cancel_work_sync(&adapter->fdir_reinit_task);
 
 	if (adapter->flags2 & IXGBE_FLAG2_TEMP_SENSOR_CAPABLE)
@@ -4164,9 +4159,6 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 		break;
 	}
 
-	/* clear n-tuple filters that are cached */
-	ethtool_ntuple_flush(netdev);
-
 	if (!pci_channel_offline(adapter->pdev))
 		ixgbe_reset(adapter);
 
@@ -4313,15 +4305,13 @@ static inline bool ixgbe_set_fdir_queues(struct ixgbe_adapter *adapter)
 	f_fdir->mask = 0;
 
 	/* Flow Director must have RSS enabled */
-	if (adapter->flags & IXGBE_FLAG_RSS_ENABLED &&
-	    ((adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE ||
-	     (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)))) {
+	if ((adapter->flags & IXGBE_FLAG_RSS_ENABLED) &&
+	    (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)) {
 		adapter->num_tx_queues = f_fdir->indices;
 		adapter->num_rx_queues = f_fdir->indices;
 		ret = true;
 	} else {
 		adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
-		adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
 	}
 	return ret;
 }
@@ -4354,8 +4344,7 @@ static inline bool ixgbe_set_fcoe_queues(struct ixgbe_adapter *adapter)
 #endif
 		if (adapter->flags & IXGBE_FLAG_RSS_ENABLED) {
 			e_info(probe, "FCoE enabled with RSS\n");
-			if ((adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) ||
-			    (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))
+			if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
 				ixgbe_set_fdir_queues(adapter);
 			else
 				ixgbe_set_rss_queues(adapter);
@@ -4598,9 +4587,8 @@ static inline bool ixgbe_cache_ring_fdir(struct ixgbe_adapter *adapter)
 	int i;
 	bool ret = false;
 
-	if (adapter->flags & IXGBE_FLAG_RSS_ENABLED &&
-	    ((adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) ||
-	     (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))) {
+	if ((adapter->flags & IXGBE_FLAG_RSS_ENABLED) &&
+	    (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)) {
 		for (i = 0; i < adapter->num_rx_queues; i++)
 			adapter->rx_ring[i]->reg_idx = i;
 		for (i = 0; i < adapter->num_tx_queues; i++)
@@ -4656,8 +4644,7 @@ static inline bool ixgbe_cache_ring_fcoe(struct ixgbe_adapter *adapter)
 	}
 #endif /* CONFIG_IXGBE_DCB */
 	if (adapter->flags & IXGBE_FLAG_RSS_ENABLED) {
-		if ((adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) ||
-		    (adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))
+		if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
 			ixgbe_cache_ring_fdir(adapter);
 		else
 			ixgbe_cache_ring_rss(adapter);
@@ -4837,14 +4824,12 @@ static int ixgbe_set_interrupt_capability(struct ixgbe_adapter *adapter)
 
 	adapter->flags &= ~IXGBE_FLAG_DCB_ENABLED;
 	adapter->flags &= ~IXGBE_FLAG_RSS_ENABLED;
-	if (adapter->flags & (IXGBE_FLAG_FDIR_HASH_CAPABLE |
-			      IXGBE_FLAG_FDIR_PERFECT_CAPABLE)) {
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
 		e_err(probe,
-		      "Flow Director is not supported while multiple "
+		      "ATR is not supported while multiple "
 		      "queues are disabled.  Disabling Flow Director\n");
 	}
 	adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
-	adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
 	adapter->atr_sample_rate = 0;
 	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
 		ixgbe_disable_sriov(adapter);
@@ -7446,8 +7431,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 	/* carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);
 
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE ||
-	    adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
 		INIT_WORK(&adapter->fdir_reinit_task, ixgbe_fdir_reinit_task);
 
 	if (adapter->flags2 & IXGBE_FLAG2_TEMP_SENSOR_CAPABLE)
@@ -7524,8 +7508,7 @@ static void __devexit ixgbe_remove(struct pci_dev *pdev)
 	cancel_work_sync(&adapter->sfp_task);
 	cancel_work_sync(&adapter->multispeed_fiber_task);
 	cancel_work_sync(&adapter->sfp_config_module_task);
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE ||
-	    adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
 		cancel_work_sync(&adapter->fdir_reinit_task);
 	if (adapter->flags2 & IXGBE_FLAG2_TEMP_SENSOR_CAPABLE)
 		cancel_work_sync(&adapter->check_overtemp_task);
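
The early-exit test added at the top of ixgbe_dcbnl_set_state() in the diff above is compact enough to be easy to misread. The following is only a sketch of that comparison in plain C. DEMO_FLAG_DCB_ENABLED and demo_dcb_state_unchanged() are hypothetical names, not the driver's flag value or an ixgbe function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IXGBE_FLAG_DCB_ENABLED; the real bit differs. */
#define DEMO_FLAG_DCB_ENABLED (1u << 3)

/*
 * Returns true when the requested state already matches the current
 * state, i.e. there is nothing to do.  "!!state" normalizes the request
 * to 0 or 1, and "!(flags & bit)" is 1 when DCB is currently disabled,
 * so the two values differ exactly when request and current state agree.
 */
static bool demo_dcb_state_unchanged(uint8_t state, uint32_t flags)
{
	return !!state != !(flags & DEMO_FLAG_DCB_ENABLED);
}
```

Under these assumptions the function is true for "enable while enabled" and "disable while disabled", and false for the two real transitions, which is what lets the patch drop the per-branch DCB_ENABLED checks.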


^ permalink raw reply related

* [net-next-2.6 PATCH 02/10] ethtool: add ntuple flow specifier to network flow classifier
From: Alexander Duyck @ 2011-02-25 23:32 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change is meant to add an ntuple define type to the rx network flow
classification specifiers.  The idea is to allow ntuple to be displayed and
possibly configured via the network flow classification interface.  To do
this I added a ntuple_flow_spec_ext to the list of supported filters, and
added a flow_type_ext value to the structure in an unused hole within the
ethtool_rx_flow_spec structure.

Due to the fact that the flow specifier structures are only 4 byte aligned
instead of 8 I had to break the user data field into 2 sections.  In
addition I added the vlan ethertype field since this is what ixgbe was
using the user-data for currently and it allows for the fields to stay 4
byte aligned while occupying space at the end of the flow_spec.

In order to guarantee byte ordering I also thought it best to keep all
fields in the flow_spec area a big endian value, as such I added vlan, vlan
ethertype, and data as big endian values.
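
The layout constraint described above can be checked with a small userspace sketch. It mirrors the struct from the diff below with fixed-width types standing in for __be32/__be16 (an assumption: same size and alignment), and the demo_ name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace mirror of the proposed extension.  uint32_t/uint16_t stand in
 * for the kernel's __be32/__be16; byte order is not modelled here.
 */
struct demo_ntuple_spec_ext {
	uint32_t unused[15];   /* space not used by the extension (60 bytes) */
	uint16_t vlan_etype;   /* offset 60 */
	uint16_t vlan_tci;     /* offset 62 */
	uint32_t data[2];      /* offset 64; two 32-bit words need only 4-byte alignment */
};

/* The extension is meant to overlay the 72-byte hdata area of the union. */
_Static_assert(sizeof(struct demo_ntuple_spec_ext) == 72,
	       "extension must be exactly as large as hdata[72]");
_Static_assert(offsetof(struct demo_ntuple_spec_ext, vlan_etype) == 60,
	       "vlan_etype follows the unused area");
_Static_assert(offsetof(struct demo_ntuple_spec_ext, data) == 64,
	       "data occupies the last 8 bytes");
```

Splitting the user data into two 32-bit words keeps the whole struct 4-byte aligned, so no padding is introduced and the new fields sit at the very end of the 72 bytes.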

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 include/linux/ethtool.h |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index aac3e2e..3d1f8e0 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -378,10 +378,25 @@ struct ethtool_usrip4_spec {
 };
 
 /**
+ * struct ethtool_ntuple_spec_ext - flow spec extension for ntuple in nfc
+ * @unused: space unused by extension
+ * @vlan_etype: EtherType for vlan tagged packet to match
+ * @vlan_tci: VLAN tag to match
+ * @data: Driver-dependent data to match
+ */
+struct ethtool_ntuple_spec_ext {
+	__be32	unused[15];
+	__be16	vlan_etype;
+	__be16	vlan_tci;
+	__be32	data[2];
+};
+
+/**
  * struct ethtool_rx_flow_spec - specification for RX flow filter
  * @flow_type: Type of match to perform, e.g. %TCP_V4_FLOW
  * @h_u: Flow fields to match (dependent on @flow_type)
  * @m_u: Masks for flow field bits to be ignored
+ * @flow_type_ext: Type of extended match to perform, e.g. %NTUPLE_FLOW_EXT
  * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
  *	if packets should be discarded
  * @location: Index of filter in hardware table
@@ -396,8 +411,10 @@ struct ethtool_rx_flow_spec {
 		struct ethtool_ah_espip4_spec		esp_ip4_spec;
 		struct ethtool_usrip4_spec		usr_ip4_spec;
 		struct ethhdr				ether_spec;
+		struct ethtool_ntuple_spec_ext		ntuple_spec;
 		__u8					hdata[72];
 	} h_u, m_u;
+	__u32		flow_type_ext;
 	__u64		ring_cookie;
 	__u32		location;
 };
@@ -955,6 +972,9 @@ struct ethtool_ops {
 #define	IPV6_FLOW	0x11	/* hash only */
 #define	ETHER_FLOW	0x12	/* spec only (ether_spec) */
 
+/* Flow extension types for network flow classifier */
+#define NTUPLE_FLOW_EXT	0x01 /* indicates ntuple in nfc */
+
 /* L3-L4 network traffic flow hash options */
 #define	RXH_L2DA	(1 << 1)
 #define	RXH_VLAN	(1 << 2)


^ permalink raw reply related

* [net-next-2.6 PATCH 01/10] ethtool: prevent null pointer dereference with NTUPLE set but no set_rx_ntuple
From: Alexander Duyck @ 2011-02-25 23:32 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev
In-Reply-To: <20110225232357.7920.58559.stgit@gitlad.jf.intel.com>

This change is meant to prevent a possible null pointer dereference if
NETIF_F_NTUPLE is defined but the set_rx_ntuple function pointer is not.

This issue appears to affect all kernels since 2.6.34.
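
As a rough illustration of the failure mode, the sketch below models an ops table whose callback is optional. All demo_ names are hypothetical and the struct is a cut-down stand-in, not the real struct ethtool_ops; only the order of the two checks follows the diff below.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down ops table with one optional callback. */
struct demo_ops {
	int (*set_rx_ntuple)(int filter_id);
};

#define DEMO_F_NTUPLE 0x1u /* stand-in for NETIF_F_NTUPLE */

/* Trivial callback used to exercise the success path. */
static int demo_accept_filter(int filter_id)
{
	(void)filter_id;
	return 0;
}

static int demo_set_rx_ntuple(const struct demo_ops *ops,
			      unsigned int features, int filter_id)
{
	/* Without this check, a NULL callback would be called below. */
	if (!ops->set_rx_ntuple)
		return -EOPNOTSUPP;

	if (!(features & DEMO_F_NTUPLE))
		return -EINVAL;

	return ops->set_rx_ntuple(filter_id);
}
```

The feature bit alone says nothing about whether the driver filled in the callback, which is why the pointer is tested first.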

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 net/core/ethtool.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index c1a71bb..4843674 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -893,6 +893,9 @@ static noinline_for_stack int ethtool_set_rx_ntuple(struct net_device *dev,
 	struct ethtool_rx_ntuple_flow_spec_container *fsc = NULL;
 	int ret;
 
+	if (!ops->set_rx_ntuple)
+		return -EOPNOTSUPP;
+
 	if (!(dev->features & NETIF_F_NTUPLE))
 		return -EINVAL;
 


^ permalink raw reply related

* [net-next-2.6 PATCH 00/10] Workarounds and fixes for ntuple filters
From: Alexander Duyck @ 2011-02-25 23:32 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

This patch series is meant to address issues in the implementation of ntuple
filters in both ixgbe and the kernel.  The main issue being that ntuple
filters didn't support the ability for drivers to retain and display
filters.

This series addresses it by removing the ETHTOOL_GRXNTUPLE interface
from the kernel and extending the ethtool network flow classifier interface
to handle the same fields that ntuple filters provide.  I have a separate
set of patches I will be emailing out in the next few minutes that will
contain the ethtool user-space portion of these changes.

The patches labeled with RFC are currently just for comments, as I am not
submitting them directly to the netdev tree and will instead be submitting
them to Jeff Kirsher's Intel network driver tree.

---

Alexander Duyck (10):
      [RFC] ixgbe: Add support for using the same fields as ntuple in nfc
      [RFC] ixgbe: add support for nfc addition and removal of filters
      [RFC] ixgbe: add support for displaying ntuple filters via the nfc interface
      [RFC] ixgbe: add basic support for setting and getting nfc controls
      [RFC] ixgbe: update perfect filter framework to support retaining filters
      [RFC] ixgbe: add support for different Rx packet buffer sizes
      [RFC] ethtool: remove support for ETHTOOL_GRXNTUPLE
      [RFC] ixgbe: remove ntuple filtering
      ethtool: add ntuple flow specifier to network flow classifier
      ethtool: prevent null pointer dereference with NTUPLE set but no set_rx_ntuple

 drivers/net/ixgbe/ixgbe.h         |   29 +-
 drivers/net/ixgbe/ixgbe_82598.c   |    2 
 drivers/net/ixgbe/ixgbe_82599.c   |  661 ++++++++++++++++++++-----------------
 drivers/net/ixgbe/ixgbe_dcb_nl.c  |   47 +--
 drivers/net/ixgbe/ixgbe_ethtool.c |  504 ++++++++++++++++++++++------
 drivers/net/ixgbe/ixgbe_main.c    |   90 +++--
 drivers/net/ixgbe/ixgbe_type.h    |   25 +
 drivers/net/ixgbe/ixgbe_x540.c    |    2 
 include/linux/ethtool.h           |   27 +-
 include/linux/netdevice.h         |    3 
 net/core/dev.c                    |    5 
 net/core/ethtool.c                |  302 -----------------
 12 files changed, 884 insertions(+), 813 deletions(-)

-- 

^ permalink raw reply

* Re: SO_REUSEPORT - can it be done in kernel?
From: Rick Jones @ 2011-02-25 23:15 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Tom Herbert, Bill Sommerfeld, Daniel Baluta, netdev
In-Reply-To: <20110225224846.GC9763@canuck.infradead.org>

On Fri, 2011-02-25 at 17:48 -0500, Thomas Graf wrote:
> On Fri, Feb 25, 2011 at 11:18:15AM -0800, Rick Jones wrote:
> > I think the idea is goodness, but will ask, was the (first) bottleneck
> > actually in the kernel, or was it in bind itself?  I've seen
> > single-instance, single-byte burst-mode netperf TCP_RR do in excess of
> > 300K transactions per second (with TCP_NODELAY set) on an X5560 core.
> > 
> > ftp://ftp.netperf.org/netperf/misc/dl380g6_X5560_rhel54_ad386_cxgb3_1.4.1.2_b2b_to_same_agg_1500mtu_20100513-2.csv
> > 
> > and that was with now ancient RHEL5.4 bits...  yes, there is a bit of
> > apples, oranges and kumquats but still, I am wondering if this didn't
> > also "work around" some internal BIND scaling issues as well.
> 
> Yes it is. We have observed two separate bottlenecks.
> 
> The first we have discovered is within BIND. As soon as more than 1
> worker thread is being used strace showed a ton of futex() system
> calls to the kernel as soon as the number of queries crossed a magic
> barrier. This suggested heavy lock contention within BIND.

The more things change, the more they remain the same, or perhaps "Code
may come and go, but lock contention is forever":

ftp://ftp.cup.hp.com/dist/networking/briefs/bind9_perf.txt

rick jones

The system ftp.cup.hp.com is probably going away before long; I will
probably put its collection of ancient writeups somewhere on netperf.org

> 
> This BIND lock contention was not visible on all systems having scalability
> issues though. Some machines were not able to deliver enough queries to
> BIND in order for the lock contention to appear.



^ permalink raw reply

* Re: [PATCH net-2.6] bonding: drop frames received with master's source MAC
From: Nicolas de Pesloüan @ 2011-02-25 23:08 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: netdev, David Miller, Herbert Xu, Jay Vosburgh, Jiri Pirko
In-Reply-To: <20110225222455.GI11864@gospo.rdu.redhat.com>

Le 25/02/2011 23:24, Andy Gospodarek a écrit :
> On Fri, Feb 25, 2011 at 11:04:27PM +0100, Nicolas de Pesloüan wrote:
>> Le 25/02/2011 22:13, Andy Gospodarek a écrit :
>>> I was looking at my system and wondering why I sometimes saw these
>>> DAD messages in my logs:
>>>
>>> bond0: IPv6 duplicate address fe80::21b:21ff:fe38:2ec4 detected!
>>>
>>> I traced it back and realized the IPv6 Neighbor Solicitations I was
>>> sending were also coming back into the stack on the slave(s) that did
>>> not transmit the frames.  I could not think of a compelling reason to
>>> notify the user that a NS we sent came back, so I set out to just drop
>>> the frame silently in ndisc_recv_ns drop.
>>>
>>> That seemed to work well, but when I thought about it I could not think
>>> of a compelling reason to save any of these frames.  Dropping them as soon as
>>> we get them seems like a much better idea as it fixes other issues that
>>> may exist for more than just IPv6 DAD.
>>>
>>> I chose to check the incoming frame against the master's MAC address as
>>> that should be the MAC address used anytime a broadcast frame is sent by
>>> the bonding driver that had the chance to make its way back into one of
>>> the other devices.
>>
>> I think this could break the ARP monitoring. ARP monitoring relies on a
>> normal protocol handler, registered in bond_main.c.
>>
>> void bond_register_arp(struct bonding *bond)
>> {
>>          struct packet_type *pt =&bond->arp_mon_pt;
>>
>>          if (pt->type)
>>                  return;
>>
>>          pt->type = htons(ETH_P_ARP);
>>          pt->dev = bond->dev;
>>          pt->func = bond_arp_rcv;
>>          dev_add_pack(pt);
>> }
>>
>> As far as I understand, some variants of arp_validate require the
>> backup interfaces to receive ARP requests sent from the master, through
>> the active interface, presumably with the master MAC as the source MAC.
>>
>> As this protocol handler is registered at the master level, the exact
>> match logic in __netif_receive_skb(), which applies at the slave level,
>> shouldn't deliver this skb to bond_arp_rcv().
>>
>> Can someone confirm ? Jay ?
>>
>> 	Nicolas.
>>
>
> I confirmed your suspicion, this breaks ARP monitoring.  I would still
> welcome other opinions though as I think it would be nice to fix this as
> low as possible.

Why do you want to fix it earlier than in ndisc_recv_ns drop? Your original idea of silently 
dropping the frame there seems perfect to me.

	Nicolas.

^ permalink raw reply

* Re: SO_REUSEPORT - can it be done in kernel?
From: Thomas Graf @ 2011-02-25 22:58 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Bill Sommerfeld, Daniel Baluta, netdev
In-Reply-To: <AANLkTinNost8Swh2fhQh8UXVdPTW_bToS7LXmyuwqNNQ@mail.gmail.com>

On Fri, Feb 25, 2011 at 11:51:13AM -0800, Tom Herbert wrote:
> On the UDP side, I believe the patch is functional, but as Eric
> pointed out it probably could be further optimized.  I'll split out
> the UDP bits into a separate patch and post that...

Cool! I will be happy to assist in improving it further.

We already see 97-99% CPU utilization spread evenly over all cores
(with about 20-30% spent in softirq) so at least it already scales
perfectly well.

^ permalink raw reply

* Re: 2.6.37 regression: adding main interface to a bridge breaks vlan interface RX
From: Jesse Gross @ 2011-02-25 22:57 UTC (permalink / raw)
  To: chriss; +Cc: netdev
In-Reply-To: <loom.20110214T141934-88@post.gmane.org>

On Mon, Feb 14, 2011 at 5:22 AM, chriss <mail_to_chriss@gmx.net> wrote:
> Nicolas de Pesloüan <nicolas.2p.debian <at> gmail.com> writes:
>
>> I think you should have a look at the ebtables command, in particular the
>> BROUTING chain of the broute table. If this chain asks for the packet to be
>> dropped, then the bridge will ignore it and give the upper layer a chance
>> to use it. The upper layer might be IP or, in your particular setup, VLAN.
>>
>> HTH,
>>
>>       Nicolas.
>
> Thank you very much for the ebtables hint.
>
> I also tried to add the vlan to my bridge device but only dropping the vlan
> tagged packet with ebtables got it working.
>
> I'm not sure if this is the wanted behavior for bridging vlan actions.
> ..or my network setup is just to ..f%%%'ed up?!

What driver is in use with the NIC you are seeing this on?

^ permalink raw reply

* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset
From: Phil Oester @ 2011-02-25 22:56 UTC (permalink / raw)
  To: David Miller; +Cc: bhutchings, andy, netdev, fubar
In-Reply-To: <20110223.155451.70202591.davem@davemloft.net>

On Wed, Feb 23, 2011 at 03:54:51PM -0800, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Wed, 23 Feb 2011 23:37:49 +0000
> 
> > On Wed, 2011-02-23 at 15:13 -0800, David Miller wrote:
> >> From: Phil Oester <kernel@linuxace.com>
> >> Date: Wed, 23 Feb 2011 15:08:44 -0800
> >> 
> >> > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote:
> >> >> +     while (txq >= dev->real_num_tx_queues) {
> >> >> +             /* let the user know if we do not have enough tx queues */
> >> >> +             if (net_ratelimit())
> >> >> +                     pr_warning("%s selects invalid tx queue %d.  Consider"
> >> >> +                                " setting module option tx_queues > %d.",
> >> >> +                                dev->name, txq, dev->real_num_tx_queues);
> >> >> +             txq -= dev->real_num_tx_queues;
> >> >> +     }
> >> > 
> >> > Think this would be better as a WARN_ONCE, as otherwise syslog will still
> >> > get flooded with this - even when ratelimited.  See get_rps_cpu in 
> >> > net/core/dev.c as an example.
> >> 
> >> Agreed.
> > 
> > This shouldn't WARN at all.  It is perfectly valid (though non-optimal)
> > to have different numbers of queues on two different multiqueue devices.
> 
> That's also a good point.

The patch works as expected.  Do we have any agreement on a final version?

Phil
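
For readers following the quoted hunk, this is a minimal sketch of what its loop computes, with the warning left out. demo_wrap_txq() is a hypothetical helper, not a function in the bonding driver, and it assumes real_num_tx_queues is non-zero (otherwise the loop would never terminate).

```c
/*
 * Fold a recorded queue index into the range the device actually has by
 * repeated subtraction, as the quoted loop does.  For unsigned values this
 * gives the same result as txq % real_num_tx_queues.
 */
static unsigned int demo_wrap_txq(unsigned int txq,
				  unsigned int real_num_tx_queues)
{
	while (txq >= real_num_tx_queues)
		txq -= real_num_tx_queues;
	return txq;
}
```

An index that is already in range is returned unchanged, so the only open question in the thread is whether and how loudly to report the out-of-range case.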

^ permalink raw reply
