netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [ethtool PATCH 0/4] Add support for network flow classifier
@ 2011-05-03 16:12 Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 1/4] ethtool: remove strings based approach for displaying n-tuple Alexander Duyck
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Alexander Duyck @ 2011-05-03 16:12 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

This series is version 5 of the patches for adding network flow classifier
rules support to ethtool.  The main changes are that I split a general header
cleanup off into a separate patch, I combined the documentation update into
the addition of the RX packet classification interface, and I generally
cleaned up the code further for when we need to maintain the rule table and
when we don't.  As a result I dropped a few dozen lines of code related to
finding and deleting a rule from the table since we only need the table to
determine which rules are present.

I will be submitting a set of patches through Jeff Kirsher once these patches
are accepted to allow for ixgbe to make use of the network flow classifier
rules interface.

Thanks,

Alex

---

Alexander Duyck (3):
      Add support for __be64 and bitops, centralize several needed macros
      Cleanup defines and header includes to address several issues
      ethtool: remove strings based approach for displaying n-tuple

Santwona Behera (1):
      v5 Add RX packet classification interface


 Makefile.am      |    3 
 ethtool-bitops.h |   25 +
 ethtool-util.h   |   47 ++
 ethtool.8.in     |  194 +++++-----
 ethtool.c        |  449 ++++++++++++-----------
 rxclass.c        | 1039 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1433 insertions(+), 324 deletions(-)
 create mode 100644 ethtool-bitops.h
 create mode 100644 rxclass.c

-- 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [ethtool PATCH 1/4] ethtool: remove strings based approach for displaying n-tuple
  2011-05-03 16:12 [ethtool PATCH 0/4] Add support for network flow classifier Alexander Duyck
@ 2011-05-03 16:12 ` Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 2/4] Cleanup defines and header includes to address several issues Alexander Duyck
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2011-05-03 16:12 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

This change is meant to remove the strings based approach for displaying
n-tuple filters.  A follow-on patch will replace that functionality with a
network flow classification based approach that will get the number of
filters, get their locations, and then request and display them
individually.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool.c |   44 --------------------------------------------
 1 files changed, 0 insertions(+), 44 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index cfdac65..9ad7000 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -3194,50 +3194,6 @@ static int do_srxntuple(int fd, struct ifreq *ifr)
 
 static int do_grxntuple(int fd, struct ifreq *ifr)
 {
-	struct ethtool_sset_info *sset_info;
-	struct ethtool_gstrings *strings;
-	int sz_str, n_strings, err, i;
-
-	sset_info = malloc(sizeof(struct ethtool_sset_info) + sizeof(u32));
-	sset_info->cmd = ETHTOOL_GSSET_INFO;
-	sset_info->sset_mask = (1ULL << ETH_SS_NTUPLE_FILTERS);
-	ifr->ifr_data = (caddr_t)sset_info;
-	err = send_ioctl(fd, ifr);
-
-	if ((err < 0) ||
-	    (!(sset_info->sset_mask & (1ULL << ETH_SS_NTUPLE_FILTERS)))) {
-		perror("Cannot get driver strings info");
-		return 100;
-	}
-
-	n_strings = sset_info->data[0];
-	free(sset_info);
-	sz_str = n_strings * ETH_GSTRING_LEN;
-
-	strings = calloc(1, sz_str + sizeof(struct ethtool_gstrings));
-	if (!strings) {
-		fprintf(stderr, "no memory available\n");
-		return 95;
-	}
-
-	strings->cmd = ETHTOOL_GRXNTUPLE;
-	strings->string_set = ETH_SS_NTUPLE_FILTERS;
-	strings->len = n_strings;
-	ifr->ifr_data = (caddr_t) strings;
-	err = send_ioctl(fd, ifr);
-	if (err < 0) {
-		perror("Cannot get Rx n-tuple information");
-		free(strings);
-		return 101;
-	}
-
-	n_strings = strings->len;
-	fprintf(stdout, "Rx n-tuple filters:\n");
-	for (i = 0; i < n_strings; i++)
-		fprintf(stdout, "%s", &strings->data[i * ETH_GSTRING_LEN]);
-
-	free(strings);
-
 	return 0;
 }
 


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [ethtool PATCH 2/4] Cleanup defines and header includes to address several issues
  2011-05-03 16:12 [ethtool PATCH 0/4] Add support for network flow classifier Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 1/4] ethtool: remove strings based approach for displaying n-tuple Alexander Duyck
@ 2011-05-03 16:12 ` Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 3/4] Add support for __be64 and bitops, centralize several needed macros Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 4/4] v5 Add RX packet classification interface Alexander Duyck
  3 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2011-05-03 16:12 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

This change is meant to address several issues.  First it moves the check
for ethtool-config.h into ethtool-util.h the reason for this change is so
that any references to ethtool-util.h outside of ethtool.c will use the
correct defines for the endian types.

In addition I have pulled several headers that will be common to both
ethtool.c and rxclass.c into the ethtool-util.h header file.  I am also
centralizing several macros that will be needed across multiple files when
I implement the network flow classifier rules.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool-util.h |   17 +++++++++++++++--
 ethtool.c      |   17 +----------------
 2 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/ethtool-util.h b/ethtool-util.h
index f053028..6a4f3f4 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -3,8 +3,13 @@
 #ifndef ETHTOOL_UTIL_H__
 #define ETHTOOL_UTIL_H__
 
+#ifdef HAVE_CONFIG_H
+#include "ethtool-config.h"
+#endif
 #include <sys/types.h>
 #include <endian.h>
+#include <sys/ioctl.h>
+#include <net/if.h>
 
 /* ethtool.h expects these to be defined by <linux/types.h> */
 #ifndef HAVE_BE_TYPES
@@ -12,14 +17,14 @@ typedef __uint16_t __be16;
 typedef __uint32_t __be32;
 #endif
 
-#include "ethtool-copy.h"
-
 typedef unsigned long long u64;
 typedef __uint32_t u32;
 typedef __uint16_t u16;
 typedef __uint8_t u8;
 typedef __int32_t s32;
 
+#include "ethtool-copy.h"
+
 #if __BYTE_ORDER == __BIG_ENDIAN
 static inline u16 cpu_to_be16(u16 value)
 {
@@ -40,6 +45,14 @@ static inline u32 cpu_to_be32(u32 value)
 }
 #endif
 
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
+#ifndef SIOCETHTOOL
+#define SIOCETHTOOL     0x8946
+#endif
+
 /* National Semiconductor DP83815, DP83816 */
 int natsemi_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 int natsemi_dump_eeprom(struct ethtool_drvinfo *info,
diff --git a/ethtool.c b/ethtool.c
index 9ad7000..24d4e4f 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -21,19 +21,12 @@
  *   * show settings for all devices
  */
 
-#ifdef HAVE_CONFIG_H
-#  include "ethtool-config.h"
-#endif
-
-#include <sys/types.h>
+#include "ethtool-util.h"
 #include <string.h>
 #include <stdlib.h>
-#include <sys/types.h>
-#include <sys/ioctl.h>
 #include <sys/stat.h>
 #include <stdio.h>
 #include <errno.h>
-#include <net/if.h>
 #include <sys/utsname.h>
 #include <limits.h>
 #include <ctype.h>
@@ -43,18 +36,10 @@
 #include <arpa/inet.h>
 
 #include <linux/sockios.h>
-#include "ethtool-util.h"
 
-
-#ifndef SIOCETHTOOL
-#define SIOCETHTOOL     0x8946
-#endif
 #ifndef MAX_ADDR_LEN
 #define MAX_ADDR_LEN	32
 #endif
-#ifndef ARRAY_SIZE
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
-#endif
 
 #ifndef HAVE_NETIF_MSG
 enum {


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [ethtool PATCH 3/4] Add support for __be64 and bitops, centralize several needed macros
  2011-05-03 16:12 [ethtool PATCH 0/4] Add support for network flow classifier Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 1/4] ethtool: remove strings based approach for displaying n-tuple Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 2/4] Cleanup defines and header includes to address several issues Alexander Duyck
@ 2011-05-03 16:12 ` Alexander Duyck
  2011-05-03 16:12 ` [ethtool PATCH 4/4] v5 Add RX packet classification interface Alexander Duyck
  3 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2011-05-03 16:12 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

This change is meant to add support for __be64 values and bitops to
ethtool.  In addition the patch pulls the SIOCETHTOOL define and the
ARRAY_SIZE define into ethtool-util.h for later use by the rxclass files.
These changes will be needed in order to support network flow
classifier rule configuration.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool-bitops.h |   25 +++++++++++++++++++++++++
 ethtool-util.h   |   16 ++++++++++++++--
 2 files changed, 39 insertions(+), 2 deletions(-)
 create mode 100644 ethtool-bitops.h

diff --git a/ethtool-bitops.h b/ethtool-bitops.h
new file mode 100644
index 0000000..b1eb426
--- /dev/null
+++ b/ethtool-bitops.h
@@ -0,0 +1,25 @@
+#ifndef ETHTOOL_BITOPS_H__
+#define ETHTOOL_BITOPS_H__
+
+#define BITS_PER_BYTE		8
+#define BITS_PER_LONG		(BITS_PER_BYTE * sizeof(long))
+#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_LONG)
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
+}
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+	addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
+}
+
+static inline int test_bit(unsigned int nr, const unsigned long *addr)
+{
+	return !!((1UL << (nr % BITS_PER_LONG)) &
+		  (((unsigned long *)addr)[nr / BITS_PER_LONG]));
+}
+
+#endif
diff --git a/ethtool-util.h b/ethtool-util.h
index 6a4f3f4..d8b621c 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -15,6 +15,7 @@
 #ifndef HAVE_BE_TYPES
 typedef __uint16_t __be16;
 typedef __uint32_t __be32;
+typedef unsigned long long __be64;
 #endif
 
 typedef unsigned long long u64;
@@ -28,11 +29,15 @@ typedef __int32_t s32;
 #if __BYTE_ORDER == __BIG_ENDIAN
 static inline u16 cpu_to_be16(u16 value)
 {
-    return value;
+	return value;
 }
 static inline u32 cpu_to_be32(u32 value)
 {
-    return value;
+	return value;
+}
+static inline u64 cpu_to_be64(u64 value)
+{
+	return value;
 }
 #else
 static inline u16 cpu_to_be16(u16 value)
@@ -43,8 +48,15 @@ static inline u32 cpu_to_be32(u32 value)
 {
 	return cpu_to_be16(value >> 16) | (cpu_to_be16(value) << 16);
 }
+static inline u64 cpu_to_be64(u64 value)
+{
+	return cpu_to_be32(value >> 32) | ((u64)cpu_to_be32(value) << 32);
+}
 #endif
 
+#define ntohll cpu_to_be64
+#define htonll cpu_to_be64
+
 #ifndef ARRAY_SIZE
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 #endif


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-03 16:12 [ethtool PATCH 0/4] Add support for network flow classifier Alexander Duyck
                   ` (2 preceding siblings ...)
  2011-05-03 16:12 ` [ethtool PATCH 3/4] Add support for __be64 and bitops, centralize several needed macros Alexander Duyck
@ 2011-05-03 16:12 ` Alexander Duyck
  2011-05-03 23:23   ` Dimitris Michailidis
  3 siblings, 1 reply; 23+ messages in thread
From: Alexander Duyck @ 2011-05-03 16:12 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, bhutchings; +Cc: netdev

From: Santwona Behera <santwona.behera@sun.com>

This patch was originally introduced as:
  [PATCH 1/3] [ethtool] Add rx pkt classification interface
  Signed-off-by: Santwona Behera <santwona.behera@sun.com>
  http://patchwork.ozlabs.org/patch/23223/

v2:
I have updated it to address a number of issues.  As a result I removed the
local caching of rules due to the fact that there were memory leaks in this
code and the rule manager would consume over 1Mb of space for an 8K table
when all that was needed was 1K in order to store which rules were active
and which were not.

In addition I dropped the use of regions as there were multiple issue found
including the fact that the regions were not properly expanding beyond 2
and the fact that the regions required reading all of the rules in order to
correctly expand beyond 2.  By dropping the regions from the rule manager
it is possible to write a much cleaner interface leaving region management
to be done by either the driver or by external management scripts.

v3:
The latest update to this patch now inverts the masks to match the mask
types used for n-tuple.  As such a network flow classifier is defined using
the exact same syntax as n-tuple, and the tool will correct for the fact
that NFC uses the 1's compliment of the n-tuple mask.

I also updated the ordering of new rules being added.  All new rules will
take the highest numbered open rule when no location is specified.

Since NFC now uses the same syntax as n-tuple I added code such that now
when location is not specified the -U option will first try to add a new
n-tuple rule, and if that fails with a ENOTSUPP it will then try to add the
rule via the NFC interface.

Finally I split out the addition of bitops and the updates to documentation
into separate patches.  This makes the total patch size a bit more
manageable since the addition of NFC and the merging of it with n-tuple were
combined into this patch.

v4:
This change merges the ntuple and network flow classifier rules so that if
we setup a rule and the device has the NTUPLE flag set we will first try to
use set_rx_ntuple.  If that fails with EOPNOTSUPP we then will attempt to
use the network flow classifier rule insertion.  This way we can support
legacy configurations such as niu on kernels prior to 2.6.40 that support
network flow classifier but not ntuple, but for drivers such as ixgbe we
can test for ntuple first, and then network flow classifier.

This patch has also updated the output to make use of the updated network
flow classifier extensions that have been accepted into the kernel.

v5:
This change merged the documentation update into this patch.  In addition the
documentation changes were made such that there is only one listing of the
individual options and they are all listed as optional.

In addition this patch contains several fixes to address things such as the
fact that we were maintaining the table logic even though we only need it for
displaying all of the rules, or when adding a rule with no location specified.
As such all of the logic for deleting or finding rules in the table has been
removed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 Makefile.am    |    3 
 ethtool-util.h |   14 +
 ethtool.8.in   |  194 ++++++----
 ethtool.c      |  400 ++++++++++++----------
 rxclass.c      | 1039 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 1384 insertions(+), 266 deletions(-)
 create mode 100644 rxclass.c

diff --git a/Makefile.am b/Makefile.am
index a0d2116..0262c31 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -8,7 +8,8 @@ ethtool_SOURCES = ethtool.c ethtool-copy.h ethtool-util.h	\
 		  amd8111e.c de2104x.c e100.c e1000.c igb.c	\
 		  fec_8xx.c ibm_emac.c ixgb.c ixgbe.c natsemi.c	\
 		  pcnet32.c realtek.c tg3.c marvell.c vioc.c	\
-		  smsc911x.c at76c50x-usb.c sfc.c stmmac.c
+		  smsc911x.c at76c50x-usb.c sfc.c stmmac.c	\
+		  rxclass.c
 
 dist-hook:
 	cp $(top_srcdir)/ethtool.spec $(distdir)
diff --git a/ethtool-util.h b/ethtool-util.h
index d8b621c..79be7f2 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -65,6 +65,8 @@ static inline u64 cpu_to_be64(u64 value)
 #define SIOCETHTOOL     0x8946
 #endif
 
+#define	RX_CLS_LOC_UNSPEC	0xffffffffUL
+
 /* National Semiconductor DP83815, DP83816 */
 int natsemi_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 int natsemi_dump_eeprom(struct ethtool_drvinfo *info,
@@ -128,4 +130,14 @@ int sfc_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 int st_mac100_dump_regs(struct ethtool_drvinfo *info,
 			struct ethtool_regs *regs);
 int st_gmac_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
-#endif
+
+/* Rx flow classification */
+int rxclass_parse_ruleopts(char **optstr, int opt_cnt,
+			   struct ethtool_rx_flow_spec *fsp);
+int rxclass_rule_getall(int fd, struct ifreq *ifr);
+int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc);
+int rxclass_rule_ins(int fd, struct ifreq *ifr,
+		     struct ethtool_rx_flow_spec *fsp);
+int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc);
+
+#endif /* ETHTOOL_UTIL_H__ */
diff --git a/ethtool.8.in b/ethtool.8.in
index 9f484fb..c923319 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -42,10 +42,20 @@
 [\\fB\\$1\\fP\ \\fIN\\fP]
 ..
 .\"
+.\"	.BM - same as above but has a mask field for format "[value N [m N]]"
+.\"
+.de BM
+[\\fB\\$1\\fP\ \\fIN\\fP\ [\\fBm\\fP\ \\fIN\\fP]]
+..
+.\"
 .\"	\(*MA - mac address
 .\"
 .ds MA \fIxx\fP\fB:\fP\fIyy\fP\fB:\fP\fIzz\fP\fB:\fP\fIaa\fP\fB:\fP\fIbb\fP\fB:\fP\fIcc\fP
 .\"
+.\"	\(*PA - IP address
+.\"
+.ds PA \fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP
+.\"
 .\"	\(*WO - wol flags
 .\"
 .ds WO \fBp\fP|\fBu\fP|\fBm\fP|\fBb\fP|\fBa\fP|\fBg\fP|\fBs\fP|\fBd\fP...
@@ -57,6 +67,12 @@
 .\"	\(*HO - hash options
 .\"
 .ds HO \fBm\fP|\fBv\fP|\fBt\fP|\fBs\fP|\fBd\fP|\fBf\fP|\fBn\fP|\fBr\fP...
+.\"
+.\"	\(*NC - Network Classifier type values
+.\"
+.ds NC \fBether\fP|\fBip4\fP|\fBtcp4\fP|\fBudp4\fP|\fBsctp4\fP|\fBah4\fP|\fBesp4\fP
+
+.\"
 .\" Start URL.
 .de UR
 .  ds m1 \\$1\"
@@ -236,8 +252,7 @@ ethtool \- query or control network driver and hardware settings
 .HP
 .B ethtool \-N
 .I ethX
-.RB [ rx\-flow\-hash \ \*(FL
-.RB \ \*(HO]
+.RB [ rx\-flow\-hash \ \*(FL \  \*(HO]
 .HP
 .B ethtool \-x|\-\-show\-rxfh\-indir
 .I ethX
@@ -257,50 +272,28 @@ ethtool \- query or control network driver and hardware settings
 .HP
 .B ethtool \-u|\-\-show\-ntuple
 .I ethX
-.TP
+.BN rule
+.HP
+
 .BI ethtool\ \-U|\-\-config\-ntuple \ ethX
-.RB {
-.A3 flow\-type tcp4 udp4 sctp4
-.RB [ src\-ip
-.IR addr
-.RB [ src\-ip\-mask
-.IR mask ]]
-.RB [ dst\-ip
-.IR addr
-.RB [ dst\-ip\-mask
-.IR mask ]]
-.RB [ src\-port
-.IR port
-.RB [ src\-port\-mask
-.IR mask ]]
-.RB [ dst\-port
-.IR port
-.RB [ dst\-port\-mask
-.IR mask ]]
-.br
-.RB | \ flow\-type\ ether
-.RB [ src
-.IR mac\-addr
-.RB [ src\-mask
-.IR mask ]]
-.RB [ dst
-.IR mac\-addr
-.RB [ dst\-mask
-.IR mask ]]
-.RB [ proto
-.IR N
-.RB [ proto\-mask
-.IR mask ]]\ }
-.br
-.RB [ vlan
-.IR VLAN\-tag
-.RB [ vlan\-mask
-.IR mask ]]
-.RB [ user\-def
-.IR data
-.RB [ user\-def\-mask
-.IR mask ]]
-.RI action \ N
+.BN delete
+.RB [\  flow\-type \ \*(NC
+.RB [ src \ \*(MA\ [ m \ \*(MA]]
+.RB [ dst \ \*(MA\ [ m \ \*(MA]]
+.BM proto
+.RB [ src\-ip \ \*(PA\ [ m \ \*(PA]]
+.RB [ dst\-ip \ \*(PA\ [ m \ \*(PA]]
+.BM tos
+.BM l4proto
+.BM src\-port
+.BM dst\-port
+.BM spi
+.BM vlan\-etype
+.BM vlan
+.BM user\-def
+.BN action
+.BN loc
+.RB ]
 .
 .\" Adjust lines (i.e. full justification) and hyphenate.
 .ad
@@ -630,77 +623,90 @@ Default region is 0 which denotes all regions in the flash.
 .TP
 .B \-u \-\-show\-ntuple
 Get Rx ntuple filters and actions, then display them to the user.
+.TP
+.BI rule \ N
+Retrieves the RX classification rule with the given ID.
 .PD
 .RE
 .TP
 .B \-U \-\-config\-ntuple
 Configure Rx ntuple filters and actions
 .TP
-.B flow\-type tcp4|udp4|sctp4|ether
+.BI delete \ N
+Deletes the RX classification rule with the given ID.
+.TP
+.B flow\-type \*(NC
 .TS
 nokeep;
 lB	l.
+ether	Ethernet
+ip4	Raw IPv4
 tcp4	TCP over IPv4
 udp4	UDP over IPv4
 sctp4	SCTP over IPv4
-ether	Ethernet
+ah4	IPSEC AH over IPv4
+esp4	IPSEC ESP over IPv4
 .TE
+.PP
+All fields below that include a mask option may either use "m" to
+indicate a mask, or may use the full name of the field with a "-mask"
+appended to indicate that this is the mask for a given field.
+.PD
+.RE
 .TP
-.BI src\-ip \ addr
-Includes the source IP address, specified using dotted-quad notation
-or as a single 32-bit number.
-.TP
-.BI src\-ip\-mask \ mask
-Specify a mask for the source IP address.
-.TP
-.BI dst\-ip \ addr
-Includes the destination IP address.
-.TP
-.BI dst\-ip\-mask \ mask
-Specify a mask for the destination IP address.
-.TP
-.BI src\-port \ port
-Includes the source port.
-.TP
-.BI src\-port\-mask \ mask
-Specify a mask for the source port.
+.BR src \ \*(MA\ [ m \ \*(MA]
+Includes the source MAC address, specified as 6 bytes in hexadecimal
+separated by colons, along with an optional mask.  Valid only for
+flow-type ether.
 .TP
-.BI dst\-port \ port
-Includes the destination port.
+.BR dst \ \*(MA\ [ m \ \*(MA]
+Includes the destination MAC address, specified as 6 bytes in hexadecimal
+separated by colons, along with an optional mask.  Valid only for
+flow-type ether.
 .TP
-.BI dst\-port\-mask \ mask
-Specify a mask for the destination port.
+.BI proto \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Includes the Ethernet protocol number (ethertype) and an optional mask.
+Valid only for flow-type ether.
 .TP
-.BI src \ mac\-addr
-Includes the source MAC address, specified as 6 bytes in hexadecimal
-separated by colons.
+.BR src\-ip \ \*(PA\ [ m \ \*(PA]
+Specify the source IP address of the incoming packet to match along with
+an optional mask.  Valid for all IPv4 based flow-types.
 .TP
-.BI src\-mask \ mask
-Specify a mask for the source MAC address.
+.BR dst\-ip \ \*(PA\ [ m \ \*(PA]
+Specify the destination IP address of the incoming packet to match along
+with an optional mask.  Valid for all IPv4 based flow-types.
 .TP
-.BI dst \ mac\-addr
-Includes the destination MAC address.
+.BI tos \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the Type of Service field in the incoming packet to
+match along with an optional mask.  Applies to all IPv4 based flow-types.
 .TP
-.BI dst\-mask \ mask
-Specify a mask for the destination MAC address.
+.BI l4proto \ N \\fR\ [\\fPl4m \ N \\fR]\\fP
+Includes the layer 4 protocol number and optional mask.  Valid only for
+flow-type ip4.
 .TP
-.BI proto \ N
-Includes the Ethernet protocol number (ethertype).
+.BI src\-port \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the source port field (applicable to TCP/UDP packets)
+in the incoming packet to match along with an optional mask.  Valid for
+flow-types ip4, tcp4, udp4, and sctp4.
 .TP
-.BI proto\-mask \ mask
-Specify a mask for the Ethernet protocol number.
+.BI dst\-port \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the destination port field (applicable to TCP/UDP
+packets)in the incoming packet to match along with an optional mask.
+Valid for flow-types ip4, tcp4, udp4, and sctp4.
 .TP
-.BI vlan \ VLAN\-tag
-Includes the VLAN tag.
+.BI spi \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the security parameter index field (applicable to
+AH/ESP packets)in the incoming packet to match along with an optional
+mask.  Valid for flow-types ip4, ah4, and esp4.
 .TP
-.BI vlan\-mask \ mask
-Specify a mask for the VLAN tag.
+.BI vlan\-etype \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Includes the VLAN tag Ethertype and an optional mask.
 .TP
-.BI user\-def \ data
-Includes 64-bits of user-specific data.
+.BI vlan \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Includes the VLAN tag and an optional mask.
 .TP
-.BI user\-def\-mask \ mask
-Specify a mask for the user-specific data.
+.BI user\-def \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Includes 64-bits of user-specific data and an optional mask.
 .TP
 .BI action \ N
 Specifies the Rx queue to send packets to, or some other action.
@@ -711,6 +717,11 @@ lB	l.
 -1	Drop the matched flow
 0 or higher	Rx queue to route the flow
 .TE
+.TP
+.BI loc \ N
+Specify the location/ID to insert the rule. This will overwrite
+any rule present in that location and will not go through any
+of the rule ordering process.
 .SH BUGS
 Not supported (in part or whole) on all network drivers.
 .SH AUTHOR
@@ -724,7 +735,8 @@ Jakub Jelinek,
 Andre Majorel,
 Eli Kupermann,
 Scott Feldman,
-Andi Kleen.
+Andi Kleen,
+Alexander Duyck.
 .SH AVAILABILITY
 .B ethtool
 is available from
diff --git a/ethtool.c b/ethtool.c
index 24d4e4f..2e04d87 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -6,6 +6,7 @@
  * Kernel 2.4 update Copyright 2001 Jeff Garzik <jgarzik@mandrakesoft.com>
  * Wake-on-LAN,natsemi,misc support by Tim Hockin <thockin@sun.com>
  * Portions Copyright 2002 Intel
+ * Portions Copyright (C) Sun Microsystems 2008
  * do_test support by Eli Kupermann <eli.kupermann@intel.com>
  * ETHTOOL_PHYS_ID support by Chris Leech <christopher.leech@intel.com>
  * e1000 support by Scott Feldman <scott.feldman@intel.com>
@@ -14,6 +15,7 @@
  * amd8111e support by Reeja John <reeja.john@amd.com>
  * long arguments by Andi Kleen.
  * SMSC LAN911x support by Steve Glendinning <steve.glendinning@smsc.com>
+ * Rx Network Flow Control configuration support <santwona.behera@sun.com>
  * Various features by Ben Hutchings <bhutchings@solarflare.com>;
  *	Copyright 2009, 2010 Solarflare Communications
  *
@@ -85,14 +87,13 @@ static int do_gstats(int fd, struct ifreq *ifr);
 static int rxflow_str_to_type(const char *str);
 static int parse_rxfhashopts(char *optstr, u32 *data);
 static char *unparse_rxfhashopts(u64 opts);
-static void parse_rxntupleopts(int argc, char **argp, int first_arg);
 static int dump_rxfhash(int fhash, u64 val);
 static int do_srxclass(int fd, struct ifreq *ifr);
 static int do_grxclass(int fd, struct ifreq *ifr);
 static int do_grxfhindir(int fd, struct ifreq *ifr);
 static int do_srxfhindir(int fd, struct ifreq *ifr);
-static int do_srxntuple(int fd, struct ifreq *ifr);
-static int do_grxntuple(int fd, struct ifreq *ifr);
+static int do_srxclsrule(int fd, struct ifreq *ifr);
+static int do_grxclsrule(int fd, struct ifreq *ifr);
 static int do_flash(int fd, struct ifreq *ifr);
 static int do_permaddr(int fd, struct ifreq *ifr);
 
@@ -123,8 +124,8 @@ static enum {
 	MODE_SNFC,
 	MODE_GRXFHINDIR,
 	MODE_SRXFHINDIR,
-	MODE_SNTUPLE,
-	MODE_GNTUPLE,
+	MODE_SCLSRULE,
+	MODE_GCLSRULE,
 	MODE_FLASHDEV,
 	MODE_PERMADDR,
 } mode = MODE_GSET;
@@ -230,22 +231,28 @@ static struct option {
 		"indirection" },
     { "-X", "--set-rxfh-indir", MODE_SRXFHINDIR, "Set Rx flow hash indirection",
 		"		equal N | weight W0 W1 ...\n" },
-    { "-U", "--config-ntuple", MODE_SNTUPLE, "Configure Rx ntuple filters "
+    { "-U", "--config-ntuple", MODE_SCLSRULE, "Configure Rx ntuple filters "
 		"and actions",
-		"		{ flow-type tcp4|udp4|sctp4\n"
-		"		  [ src-ip ADDR [src-ip-mask MASK] ]\n"
-		"		  [ dst-ip ADDR [dst-ip-mask MASK] ]\n"
-		"		  [ src-port PORT [src-port-mask MASK] ]\n"
-		"		  [ dst-port PORT [dst-port-mask MASK] ]\n"
-		"		| flow-type ether\n"
-		"		  [ src MAC-ADDR [src-mask MASK] ]\n"
-		"		  [ dst MAC-ADDR [dst-mask MASK] ]\n"
-		"		  [ proto N [proto-mask MASK] ] }\n"
-		"		[ vlan VLAN-TAG [vlan-mask MASK] ]\n"
-		"		[ user-def DATA [user-def-mask MASK] ]\n"
-		"		action N\n" },
-    { "-u", "--show-ntuple", MODE_GNTUPLE,
-		"Get Rx ntuple filters and actions\n" },
+		"		[ delete %d ] |\n"
+		"		[ flow-type ether|ip4|tcp4|udp4|sctp4|ah4|esp4\n"
+		"			[ src %x:%x:%x:%x:%x:%x [m %x:%x:%x:%x:%x:%x] ]\n"
+		"			[ dst %x:%x:%x:%x:%x:%x [m %x:%x:%x:%x:%x:%x] ]\n"
+		"			[ proto %d [m %x] ]\n"
+		"			[ src-ip %d.%d.%d.%d [m %d.%d.%d.%d] ]\n"
+		"			[ dst-ip %d.%d.%d.%d [m %d.%d.%d.%d] ]\n"
+		"			[ tos %d [m %x] ]\n"
+		"			[ l4proto %d [m %x] ]\n"
+		"			[ src-port %d [m %x] ]\n"
+		"			[ dst-port %d [m %x] ]\n"
+		"			[ spi %d [m %x] ]\n"
+		"			[ vlan-etype %x [m %x] ]\n"
+		"			[ vlan %x [m %x] ]\n"
+		"			[ user-def %x [m %x] ]\n"
+		"			[ action %d ]\n"
+		"			[ loc %d]]\n" },
+    { "-u", "--show-ntuple", MODE_GCLSRULE,
+		"Get Rx ntuple filters and actions",
+		"		[ rule %d ]\n"},
     { "-P", "--show-permaddr", MODE_PERMADDR,
 		"Show permanent hardware address" },
     { "-h", "--help", MODE_HELP, "Show this help" },
@@ -371,26 +378,6 @@ static u32 rx_fhash_val = 0;
 static int rx_fhash_changed = 0;
 static int rxfhindir_equal = 0;
 static char **rxfhindir_weight = NULL;
-static int sntuple_changed = 0;
-static struct ethtool_rx_ntuple_flow_spec ntuple_fs;
-static int ntuple_ip4src_seen = 0;
-static int ntuple_ip4src_mask_seen = 0;
-static int ntuple_ip4dst_seen = 0;
-static int ntuple_ip4dst_mask_seen = 0;
-static int ntuple_psrc_seen = 0;
-static int ntuple_psrc_mask_seen = 0;
-static int ntuple_pdst_seen = 0;
-static int ntuple_pdst_mask_seen = 0;
-static int ntuple_ether_dst_seen = 0;
-static int ntuple_ether_dst_mask_seen = 0;
-static int ntuple_ether_src_seen = 0;
-static int ntuple_ether_src_mask_seen = 0;
-static int ntuple_ether_proto_seen = 0;
-static int ntuple_ether_proto_mask_seen = 0;
-static int ntuple_vlan_tag_seen = 0;
-static int ntuple_vlan_tag_mask_seen = 0;
-static int ntuple_user_def_seen = 0;
-static int ntuple_user_def_mask_seen = 0;
 static char *flash_file = NULL;
 static int flash = -1;
 static int flash_region = -1;
@@ -399,6 +386,11 @@ static int msglvl_changed;
 static u32 msglvl_wanted = 0;
 static u32 msglvl_mask = 0;
 
+static int rx_class_rule_get = -1;
+static int rx_class_rule_del = -1;
+static int rx_class_rule_added = 0;
+static struct ethtool_rx_flow_spec rx_rule_fs;
+
 static enum {
 	ONLINE=0,
 	OFFLINE,
@@ -511,58 +503,6 @@ static struct cmdline_info cmdline_coalesce[] = {
 	{ "tx-frames-high", CMDL_S32, &coal_tx_frames_high_wanted, &ecoal.tx_max_coalesced_frames_high },
 };
 
-static struct cmdline_info cmdline_ntuple_tcp_ip4[] = {
-	{ "src-ip", CMDL_IP4, &ntuple_fs.h_u.tcp_ip4_spec.ip4src, NULL,
-	  0, &ntuple_ip4src_seen },
-	{ "src-ip-mask", CMDL_IP4, &ntuple_fs.m_u.tcp_ip4_spec.ip4src, NULL,
-	  0, &ntuple_ip4src_mask_seen },
-	{ "dst-ip", CMDL_IP4, &ntuple_fs.h_u.tcp_ip4_spec.ip4dst, NULL,
-	  0, &ntuple_ip4dst_seen },
-	{ "dst-ip-mask", CMDL_IP4, &ntuple_fs.m_u.tcp_ip4_spec.ip4dst, NULL,
-	  0, &ntuple_ip4dst_mask_seen },
-	{ "src-port", CMDL_BE16, &ntuple_fs.h_u.tcp_ip4_spec.psrc, NULL,
-	  0, &ntuple_psrc_seen },
-	{ "src-port-mask", CMDL_BE16, &ntuple_fs.m_u.tcp_ip4_spec.psrc, NULL,
-	  0, &ntuple_psrc_mask_seen },
-	{ "dst-port", CMDL_BE16, &ntuple_fs.h_u.tcp_ip4_spec.pdst, NULL,
-	  0, &ntuple_pdst_seen },
-	{ "dst-port-mask", CMDL_BE16, &ntuple_fs.m_u.tcp_ip4_spec.pdst, NULL,
-	  0, &ntuple_pdst_mask_seen },
-	{ "vlan", CMDL_U16, &ntuple_fs.vlan_tag, NULL,
-	  0, &ntuple_vlan_tag_seen },
-	{ "vlan-mask", CMDL_U16, &ntuple_fs.vlan_tag_mask, NULL,
-	  0, &ntuple_vlan_tag_mask_seen },
-	{ "user-def", CMDL_U64, &ntuple_fs.data, NULL,
-	  0, &ntuple_user_def_seen },
-	{ "user-def-mask", CMDL_U64, &ntuple_fs.data_mask, NULL,
-	  0, &ntuple_user_def_mask_seen },
-	{ "action", CMDL_S32, &ntuple_fs.action, NULL },
-};
-
-static struct cmdline_info cmdline_ntuple_ether[] = {
-	{ "dst", CMDL_MAC, ntuple_fs.h_u.ether_spec.h_dest, NULL,
-	  0, &ntuple_ether_dst_seen },
-	{ "dst-mask", CMDL_MAC, ntuple_fs.m_u.ether_spec.h_dest, NULL,
-	  0, &ntuple_ether_dst_mask_seen },
-	{ "src", CMDL_MAC, ntuple_fs.h_u.ether_spec.h_source, NULL,
-	  0, &ntuple_ether_src_seen },
-	{ "src-mask", CMDL_MAC, ntuple_fs.m_u.ether_spec.h_source, NULL,
-	  0, &ntuple_ether_src_mask_seen },
-	{ "proto", CMDL_BE16, &ntuple_fs.h_u.ether_spec.h_proto, NULL,
-	  0, &ntuple_ether_proto_seen },
-	{ "proto-mask", CMDL_BE16, &ntuple_fs.m_u.ether_spec.h_proto, NULL,
-	  0, &ntuple_ether_proto_mask_seen },
-	{ "vlan", CMDL_U16, &ntuple_fs.vlan_tag, NULL,
-	  0, &ntuple_vlan_tag_seen },
-	{ "vlan-mask", CMDL_U16, &ntuple_fs.vlan_tag_mask, NULL,
-	  0, &ntuple_vlan_tag_mask_seen },
-	{ "user-def", CMDL_U64, &ntuple_fs.data, NULL,
-	  0, &ntuple_user_def_seen },
-	{ "user-def-mask", CMDL_U64, &ntuple_fs.data_mask, NULL,
-	  0, &ntuple_user_def_mask_seen },
-	{ "action", CMDL_S32, &ntuple_fs.action, NULL },
-};
-
 static struct cmdline_info cmdline_msglvl[] = {
 	{ "drv", CMDL_FLAG, &msglvl_wanted, NULL,
 	  NETIF_MSG_DRV, &msglvl_mask },
@@ -833,8 +773,8 @@ static void parse_cmdline(int argc, char **argp)
 			    (mode == MODE_SNFC) ||
 			    (mode == MODE_GRXFHINDIR) ||
 			    (mode == MODE_SRXFHINDIR) ||
-			    (mode == MODE_SNTUPLE) ||
-			    (mode == MODE_GNTUPLE) ||
+			    (mode == MODE_SCLSRULE) ||
+			    (mode == MODE_GCLSRULE) ||
 			    (mode == MODE_PHYS_ID) ||
 			    (mode == MODE_FLASHDEV) ||
 			    (mode == MODE_PERMADDR)) {
@@ -918,16 +858,45 @@ static void parse_cmdline(int argc, char **argp)
 				i = argc;
 				break;
 			}
-			if (mode == MODE_SNTUPLE) {
+			if (mode == MODE_SCLSRULE) {
 				if (!strcmp(argp[i], "flow-type")) {
 					i += 1;
 					if (i >= argc) {
 						exit_bad_args();
 						break;
 					}
-					parse_rxntupleopts(argc, argp, i);
-					i = argc;
-					break;
+					if (rxclass_parse_ruleopts(&argp[i],
+							argc - i,
+							&rx_rule_fs) < 0) {
+						exit_bad_args();
+					} else {
+						i = argc;
+						rx_class_rule_added = 1;
+					}
+				} else if (!strcmp(argp[i], "delete")) {
+					i += 1;
+					if (i >= argc) {
+						exit_bad_args();
+						break;
+					}
+					rx_class_rule_del =
+						get_uint_range(argp[i], 0,
+							       INT_MAX);
+				} else {
+					exit_bad_args();
+				}
+				break;
+			}
+			if (mode == MODE_GCLSRULE) {
+				if (!strcmp(argp[i], "rule")) {
+					i += 1;
+					if (i >= argc) {
+						exit_bad_args();
+						break;
+					}
+					rx_class_rule_get =
+						get_uint_range(argp[i], 0,
+							       INT_MAX);
 				} else {
 					exit_bad_args();
 				}
@@ -1598,66 +1567,6 @@ static char *unparse_rxfhashopts(u64 opts)
 	return buf;
 }
 
-static void parse_rxntupleopts(int argc, char **argp, int i)
-{
-	ntuple_fs.flow_type = rxflow_str_to_type(argp[i]);
-
-	switch (ntuple_fs.flow_type) {
-	case TCP_V4_FLOW:
-	case UDP_V4_FLOW:
-	case SCTP_V4_FLOW:
-		parse_generic_cmdline(argc, argp, i + 1,
-				      &sntuple_changed,
-				      cmdline_ntuple_tcp_ip4,
-				      ARRAY_SIZE(cmdline_ntuple_tcp_ip4));
-		if (!ntuple_ip4src_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.ip4src = 0xffffffff;
-		if (!ntuple_ip4dst_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.ip4dst = 0xffffffff;
-		if (!ntuple_psrc_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.psrc = 0xffff;
-		if (!ntuple_pdst_seen)
-			ntuple_fs.m_u.tcp_ip4_spec.pdst = 0xffff;
-		ntuple_fs.m_u.tcp_ip4_spec.tos = 0xff;
-		break;
-	case ETHER_FLOW:
-		parse_generic_cmdline(argc, argp, i + 1,
-				      &sntuple_changed,
-				      cmdline_ntuple_ether,
-				      ARRAY_SIZE(cmdline_ntuple_ether));
-		if (!ntuple_ether_dst_seen)
-			memset(ntuple_fs.m_u.ether_spec.h_dest, 0xff, ETH_ALEN);
-		if (!ntuple_ether_src_seen)
-			memset(ntuple_fs.m_u.ether_spec.h_source, 0xff,
-			       ETH_ALEN);
-		if (!ntuple_ether_proto_seen)
-			ntuple_fs.m_u.ether_spec.h_proto = 0xffff;
-		break;
-	default:
-		fprintf(stderr, "Unsupported flow type \"%s\"\n", argp[i]);
-		exit(106);
-		break;
-	}
-
-	if (!ntuple_vlan_tag_seen)
-		ntuple_fs.vlan_tag_mask = 0xffff;
-	if (!ntuple_user_def_seen)
-		ntuple_fs.data_mask = 0xffffffffffffffffULL;
-
-	if ((ntuple_ip4src_mask_seen && !ntuple_ip4src_seen) ||
-	    (ntuple_ip4dst_mask_seen && !ntuple_ip4dst_seen) ||
-	    (ntuple_psrc_mask_seen && !ntuple_psrc_seen) ||
-	    (ntuple_pdst_mask_seen && !ntuple_pdst_seen) ||
-	    (ntuple_ether_dst_mask_seen && !ntuple_ether_dst_seen) ||
-	    (ntuple_ether_src_mask_seen && !ntuple_ether_src_seen) ||
-	    (ntuple_ether_proto_mask_seen && !ntuple_ether_proto_seen) ||
-	    (ntuple_vlan_tag_mask_seen && !ntuple_vlan_tag_seen) ||
-	    (ntuple_user_def_mask_seen && !ntuple_user_def_seen)) {
-		fprintf(stderr, "Cannot specify mask without value\n");
-		exit(107);
-	}
-}
-
 static struct {
 	const char *name;
 	int (*func)(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
@@ -2019,10 +1928,10 @@ static int doit(void)
 		return do_grxfhindir(fd, &ifr);
 	} else if (mode == MODE_SRXFHINDIR) {
 		return do_srxfhindir(fd, &ifr);
-	} else if (mode == MODE_SNTUPLE) {
-		return do_srxntuple(fd, &ifr);
-	} else if (mode == MODE_GNTUPLE) {
-		return do_grxntuple(fd, &ifr);
+	} else if (mode == MODE_SCLSRULE) {
+		return do_srxclsrule(fd, &ifr);
+	} else if (mode == MODE_GCLSRULE) {
+		return do_grxclsrule(fd, &ifr);
 	} else if (mode == MODE_FLASHDEV) {
 		return do_flash(fd, &ifr);
 	} else if (mode == MODE_PERMADDR) {
@@ -3155,21 +3064,143 @@ static int do_permaddr(int fd, struct ifreq *ifr)
 	return err;
 }
 
+static int flow_spec_to_ntuple(struct ethtool_rx_flow_spec *fsp,
+			       struct ethtool_rx_ntuple_flow_spec *ntuple)
+{
+	size_t i;
+
+	/* verify location is not specified */
+	if (fsp->location != RX_CLS_LOC_UNSPEC)
+		return -1;
+
+	/* verify ring cookie can transfer to action */
+	if (fsp->ring_cookie > INT_MAX && fsp->ring_cookie < (u64)(-2))
+		return -1;
+
+	/* verify only one field is setting data field */
+	if ((fsp->flow_type & FLOW_EXT) &&
+	    (fsp->m_ext.data[0] || fsp->m_ext.data[1]) &&
+	    fsp->m_ext.vlan_etype)
+		return -1;
+
+	/* Set entire ntuple to ~0 to guarantee all masks are set */
+	memset(ntuple, ~0, sizeof(*ntuple));
+
+	/* set non-filter values */
+	ntuple->flow_type = fsp->flow_type;
+	ntuple->action = fsp->ring_cookie;
+
+	/*
+	 * Copy over header union, they are identical in layout however
+	 * the ntuple union contains additional padding on the end
+	 */
+	memcpy(&ntuple->h_u, &fsp->h_u, sizeof(fsp->h_u));
+
+	/*
+	 * The same rule mentioned above applies to the mask union.  However,
+	 * in addition we need to invert the mask bits to match the ntuple
+	 * mask which is 1 for masked, versus 0 for masked as seen in nfc.
+	 */
+	memcpy(&ntuple->m_u, &fsp->m_u, sizeof(fsp->m_u));
+	for (i = 0; i < sizeof(fsp->m_u); i++)
+		ntuple->m_u.hdata[i] ^= 0xFF;
+
+	/* copy extended fields */
+	if (fsp->flow_type & FLOW_EXT) {
+		ntuple->vlan_tag =
+			ntohs(fsp->h_ext.vlan_tci);
+		ntuple->vlan_tag_mask =
+			~ntohs(fsp->m_ext.vlan_tci);
+		if (fsp->m_ext.vlan_etype) {
+			/*
+			 * vlan_etype and user data are mutually exclusive
+			 * in ntuple configuration as they occupy the same
+			 * space.
+			 */
+			if (fsp->m_ext.data[0] || fsp->m_ext.data[1])
+				return -1;
+			ntuple->data =
+				ntohl(fsp->h_ext.vlan_etype);
+			ntuple->data_mask =
+				~(u64)ntohl(fsp->m_ext.vlan_etype);
+		} else {
+			ntuple->data =
+				(u64)ntohl(fsp->h_ext.data[0]) << 32;
+			ntuple->data |=
+				(u64)ntohl(fsp->h_ext.data[1]);
+			ntuple->data_mask =
+				(u64)ntohl(~fsp->m_ext.data[0]) << 32;
+			ntuple->data_mask |=
+				(u64)ntohl(~fsp->m_ext.data[1]);
+		}
+	}
+
+	return 0;
+}
+
 static int do_srxntuple(int fd, struct ifreq *ifr)
 {
+	struct ethtool_rx_ntuple ntuplecmd;
+	struct ethtool_value eval;
 	int err;
 
-	if (sntuple_changed) {
-		struct ethtool_rx_ntuple ntuplecmd;
+	/* attempt to convert the flow classifier to an ntuple classifier */
+	err = flow_spec_to_ntuple(&rx_rule_fs, &ntuplecmd.fs);
+	if (err)
+		return -1;
 
-		ntuplecmd.cmd = ETHTOOL_SRXNTUPLE;
-		memcpy(&ntuplecmd.fs, &ntuple_fs,
-		       sizeof(struct ethtool_rx_ntuple_flow_spec));
+	/*
+	 * Check to see if the flag is set for N-tuple, this allows
+	 * us to avoid the possible EINVAL response for the N-tuple
+	 * flag not being set on the device
+	 */
+	eval.cmd = ETHTOOL_GFLAGS;
+	ifr->ifr_data = (caddr_t)&eval;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err || !(eval.data & ETH_FLAG_NTUPLE))
+		return -1;
 
-		ifr->ifr_data = (caddr_t)&ntuplecmd;
-		err = ioctl(fd, SIOCETHTOOL, ifr);
-		if (err < 0)
-			perror("Cannot add new RX n-tuple filter");
+	/* send rule via N-tuple */
+	ntuplecmd.cmd = ETHTOOL_SRXNTUPLE;
+	ifr->ifr_data = (caddr_t)&ntuplecmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+
+	/*
+	 * Display error only if reponse is something other than op not
+	 * supported.  It is possible that the interface uses the network
+	 * flow classifier interface instead of N-tuple. 
+	 */ 
+	if (err && errno != EOPNOTSUPP)
+		perror("Cannot add new rule via N-tuple");
+
+	return err;
+}
+
+static int do_srxclsrule(int fd, struct ifreq *ifr)
+{
+	int err;
+
+	if (rx_class_rule_added) {
+		/* attempt to add rule via N-tuple specifier */
+		err = do_srxntuple(fd, ifr);
+		if (!err)
+			return 0;
+
+		/* attempt to add rule via network flow classifier */
+		err = rxclass_rule_ins(fd, ifr, &rx_rule_fs);
+		if (err < 0) {
+			fprintf(stderr, "Cannot insert"
+				" classification rule\n");
+			return 1;
+		}
+	} else if (rx_class_rule_del >= 0) {
+		err = rxclass_rule_del(fd, ifr, rx_class_rule_del);
+
+		if (err < 0) {
+			fprintf(stderr, "Cannot delete"
+				" classification rule\n");
+			return 1;
+		}
 	} else {
 		exit_bad_args();
 	}
@@ -3177,9 +3208,32 @@ static int do_srxntuple(int fd, struct ifreq *ifr)
 	return 0;
 }
 
-static int do_grxntuple(int fd, struct ifreq *ifr)
+static int do_grxclsrule(int fd, struct ifreq *ifr)
 {
-	return 0;
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	if (rx_class_rule_get >= 0) {
+		err = rxclass_rule_get(fd, ifr, rx_class_rule_get);
+		if (err < 0)
+			fprintf(stderr, "Cannot get RX classification rule\n");
+		return err ? 1 : 0;
+	}
+
+	nfccmd.cmd = ETHTOOL_GRXRINGS;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0)
+		perror("Cannot get RX rings");
+	else
+		fprintf(stdout, "%d RX rings available\n",
+			(int)nfccmd.data);
+
+	err = rxclass_rule_getall(fd, ifr);
+	if (err < 0)
+		fprintf(stderr, "RX classification rule retrieval failed\n");
+
+	return err ? 1 : 0;
 }
 
 static int send_ioctl(int fd, struct ifreq *ifr)
diff --git a/rxclass.c b/rxclass.c
new file mode 100644
index 0000000..1e6b2a6
--- /dev/null
+++ b/rxclass.c
@@ -0,0 +1,1039 @@
+/*
+ * Copyright (C) 2008 Sun Microsystems, Inc. All rights reserved.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stddef.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include <linux/sockios.h>
+#include <arpa/inet.h>
+#include "ethtool-util.h"
+#include "ethtool-bitops.h"
+
+/*
+ * This is a rule manager implementation for ordering rx flow
+ * classification rules in a longest prefix first match order.
+ * The assumption is that this rule manager is the only one adding rules to
+ * the device's hardware classifier.
+ */
+
+struct rmgr_ctrl {
+	/* slot contains a bitmap indicating which filters are valid */
+	unsigned long		*slot;
+	__u32			n_rules;
+	__u32			size;
+};
+
+static struct rmgr_ctrl rmgr;
+static int rmgr_init_done = 0;
+
+static void invert_flow_mask(struct ethtool_rx_flow_spec *fsp)
+{
+	size_t i;
+
+	for (i = 0; i < sizeof(fsp->m_u); i++)
+		fsp->m_u.hdata[i] ^= 0xFF;
+}
+
+static void rmgr_print_ipv4_rule(__be32 sip, __be32 sipm, __be32 dip,
+				 __be32 dipm, u8 tos, u8 tosm)
+{
+	char sip_str[INET_ADDRSTRLEN];
+	char sipm_str[INET_ADDRSTRLEN];
+	char dip_str[INET_ADDRSTRLEN];
+	char dipm_str[INET_ADDRSTRLEN];
+
+	fprintf(stdout,
+		"\tSrc IP addr: %s mask: %s\n"
+		"\tDest IP addr: %s mask: %s\n"
+		"\tTOS: 0x%x mask: 0x%x\n",
+		inet_ntop(AF_INET, &sip, sip_str, INET_ADDRSTRLEN),
+		inet_ntop(AF_INET, &sipm, sipm_str, INET_ADDRSTRLEN),
+		inet_ntop(AF_INET, &dip, dip_str, INET_ADDRSTRLEN),
+		inet_ntop(AF_INET, &dipm, dipm_str, INET_ADDRSTRLEN),
+		tos, tosm);
+}
+
+static void rmgr_print_nfc_spec_ext(struct ethtool_rx_flow_spec *fsp)
+{
+	u64 data, datam;
+	__u16 etype, etypem, tci, tcim;
+
+	if (!(fsp->flow_type & FLOW_EXT))
+		return;
+
+	etype = ntohs(fsp->h_ext.vlan_etype);
+	etypem = ntohs(~fsp->m_ext.vlan_etype);
+	tci = ntohs(fsp->h_ext.vlan_tci);
+	tcim = ntohs(~fsp->m_ext.vlan_tci);
+	data = (u64)ntohl(fsp->h_ext.data[0]) << 32;
+	data = (u64)ntohl(fsp->h_ext.data[1]);
+	datam = (u64)ntohl(~fsp->m_ext.data[0]) << 32;
+	datam |= (u64)ntohl(~fsp->m_ext.data[1]);
+
+	fprintf(stdout,
+		"\tVLAN EtherType: 0x%x mask: 0x%x\n"
+		"\tVLAN: 0x%x mask: 0x%x\n"
+		"\tUser-defined: 0x%llx mask: 0x%llx\n",
+		etype, etypem, tci, tcim, data, datam);
+}
+
+static void rmgr_print_nfc_rule(struct ethtool_rx_flow_spec *fsp)
+{
+	unsigned char	*smac, *smacm, *dmac, *dmacm;
+	__u32		flow_type;
+
+	if (fsp->location != RX_CLS_LOC_UNSPEC)
+		fprintf(stdout,	"Filter: %d\n", fsp->location);
+	else
+		fprintf(stdout,	"Filter: Unspecified\n");
+
+	flow_type = fsp->flow_type & ~FLOW_EXT;
+
+	invert_flow_mask(fsp);
+
+	switch (flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+		if (flow_type == TCP_V4_FLOW)
+			fprintf(stdout, "\tRule Type: TCP over IPv4\n");
+		else if (flow_type == UDP_V4_FLOW)
+			fprintf(stdout, "\tRule Type: UDP over IPv4\n");
+		else
+			fprintf(stdout, "\tRule Type: SCTP over IPv4\n");
+		rmgr_print_ipv4_rule(fsp->h_u.tcp_ip4_spec.ip4src,
+				     fsp->m_u.tcp_ip4_spec.ip4src,
+				     fsp->h_u.tcp_ip4_spec.ip4dst,
+				     fsp->m_u.tcp_ip4_spec.ip4dst,
+				     fsp->h_u.tcp_ip4_spec.tos,
+				     fsp->m_u.tcp_ip4_spec.tos);
+		fprintf(stdout,
+			"\tSrc port: %d mask: 0x%x\n"
+			"\tDest port: %d mask: 0x%x\n",
+			ntohs(fsp->h_u.tcp_ip4_spec.psrc),
+			ntohs(fsp->m_u.tcp_ip4_spec.psrc),
+			ntohs(fsp->h_u.tcp_ip4_spec.pdst),
+			ntohs(fsp->m_u.tcp_ip4_spec.pdst));
+		break;
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+		if (flow_type == AH_V4_FLOW)
+			fprintf(stdout, "\tRule Type: IPSEC AH over IPv4\n");
+		else
+			fprintf(stdout, "\tRule Type: IPSEC ESP over IPv4\n");
+		rmgr_print_ipv4_rule(fsp->h_u.ah_ip4_spec.ip4src,
+				     fsp->m_u.ah_ip4_spec.ip4src,
+				     fsp->h_u.ah_ip4_spec.ip4dst,
+				     fsp->m_u.ah_ip4_spec.ip4dst,
+				     fsp->h_u.ah_ip4_spec.tos,
+				     fsp->m_u.ah_ip4_spec.tos);
+		fprintf(stdout,
+			"\tSPI: %d mask: 0x%x\n",
+			ntohl(fsp->h_u.esp_ip4_spec.spi),
+			ntohl(fsp->m_u.esp_ip4_spec.spi));
+		break;
+	case IP_USER_FLOW:
+		fprintf(stdout, "\tRule Type: Raw IPv4\n");
+		rmgr_print_ipv4_rule(fsp->h_u.usr_ip4_spec.ip4src,
+				     fsp->m_u.usr_ip4_spec.ip4src,
+				     fsp->h_u.usr_ip4_spec.ip4dst,
+				     fsp->m_u.usr_ip4_spec.ip4dst,
+				     fsp->h_u.usr_ip4_spec.tos,
+				     fsp->m_u.usr_ip4_spec.tos);
+		fprintf(stdout,
+			"\tProtocol: %d mask: 0x%x\n"
+			"\tL4 bytes: 0x%x mask: 0x%x\n",
+			fsp->h_u.usr_ip4_spec.proto,
+			fsp->m_u.usr_ip4_spec.proto,
+			ntohl(fsp->h_u.usr_ip4_spec.l4_4_bytes),
+			ntohl(fsp->m_u.usr_ip4_spec.l4_4_bytes));
+		break;
+	case ETHER_FLOW:
+		dmac = fsp->h_u.ether_spec.h_dest;
+		dmacm = fsp->m_u.ether_spec.h_dest;
+		smac = fsp->h_u.ether_spec.h_source;
+		smacm = fsp->m_u.ether_spec.h_source;
+
+		fprintf(stdout,
+			"\tFlow Type: Raw Ethernet\n"
+			"\tSrc MAC addr: %02X:%02X:%02X:%02X:%02X:%02X"
+			" mask: %02X:%02X:%02X:%02X:%02X:%02X\n"
+			"\tDest MAC addr: %02X:%02X:%02X:%02X:%02X:%02X"
+			" mask: %02X:%02X:%02X:%02X:%02X:%02X\n"
+			"\tEthertype: 0x%X mask: 0x%X\n",
+			smac[0], smac[1], smac[2], smac[3], smac[4], smac[5],
+			smacm[0], smacm[1], smacm[2], smacm[3], smacm[4],
+			smacm[5], dmac[0], dmac[1], dmac[2], dmac[3], dmac[4],
+			dmac[5], dmacm[0], dmacm[1], dmacm[2], dmacm[3],
+			dmacm[4], dmacm[5],
+			ntohs(fsp->h_u.ether_spec.h_proto),
+			ntohs(fsp->m_u.ether_spec.h_proto));
+		break;
+	default:
+		fprintf(stdout,
+			"\tUnknown Flow type: %d\n", flow_type);
+		break;
+	}
+
+	rmgr_print_nfc_spec_ext(fsp);
+
+	if (fsp->ring_cookie != RX_CLS_FLOW_DISC)
+		fprintf(stdout, "\tAction: Direct to queue %llu\n",
+			fsp->ring_cookie);
+	else
+		fprintf(stdout, "\tAction: Drop\n");
+
+	fprintf(stdout, "\n");
+}
+
+static void rmgr_print_rule(struct ethtool_rx_flow_spec *fsp)
+{
+	/* print the rule in this location */
+	switch (fsp->flow_type & ~FLOW_EXT) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case ETHER_FLOW:
+		rmgr_print_nfc_rule(fsp);
+		break;
+	case IP_USER_FLOW:
+		if (fsp->h_u.usr_ip4_spec.ip_ver == ETH_RX_NFC_IP4) {
+			rmgr_print_nfc_rule(fsp);
+			break;
+		}
+		/* IPv6 User Flow falls through to the case below */
+	case TCP_V6_FLOW:
+	case UDP_V6_FLOW:
+	case SCTP_V6_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V6_FLOW:
+		fprintf(stderr, "IPv6 flows not implemented\n");
+		break;
+	default:
+		fprintf(stderr, "rmgr: Unknown flow type\n");
+		break;
+	}
+}
+
+static int rmgr_ins(__u32 loc)
+{
+	/* verify location is in rule manager range */
+	if (loc >= rmgr.size) {
+		fprintf(stderr, "rmgr: Location out of range\n");
+		return -1;
+	}
+
+	/* set bit for the rule */
+	set_bit(loc, rmgr.slot);
+
+	return 0;
+}
+
+static int rmgr_add(struct ethtool_rx_flow_spec *fsp)
+{
+	__u32 loc = fsp->location;
+	__u32 slot_num;
+
+	/* start at the end of the list since it is lowest priority */
+	loc = rmgr.size - 1;
+
+	/* locate the first slot a rule can be placed in */
+	slot_num = loc / BITS_PER_LONG;
+
+	/*
+	 * Avoid testing individual bits by inverting the word and checking
+	 * to see if any bits are left set, if so there are empty spots.  By
+	 * moving 1 + loc % BITS_PER_LONG we align ourselves to the last bit
+	 * in the previous word.
+	 *
+	 * If loc rolls over it should be greater than or equal to rmgr.size
+	 * and as such we know we have reached the end of the list.
+	 */
+	if (!~(rmgr.slot[slot_num] | (~1UL << rmgr.size % BITS_PER_LONG))) {
+		loc -= 1 + (loc % BITS_PER_LONG);
+		slot_num--;
+	}
+
+	/*
+	 * Now that we are aligned with the last bit in each long we can just
+	 * go though and eliminate all the longs with no free bits
+	 */
+	while (loc < rmgr.size && !~(rmgr.slot[slot_num])) {
+		loc -= BITS_PER_LONG;
+		slot_num--;
+	}
+
+	/*
+	 * If we still are inside the range, test individual bits as one is
+	 * likely available for our use.
+	 */
+	while (loc < rmgr.size && test_bit(loc, rmgr.slot))
+		loc--;
+
+	/* location found, insert rule */
+	if (loc < rmgr.size) {
+		fsp->location = loc;
+		return rmgr_ins(loc);
+	}
+
+	/* No space to add this rule */
+	fprintf(stderr, "rmgr: Cannot find appropriate slot to insert rule\n");
+
+	return -1;
+}
+
+static int rmgr_init(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxnfc *nfccmd;
+	int err, i;
+	__u32 *rule_locs;
+
+	if (rmgr_init_done)
+		return 0;
+
+	/* clear rule manager settings */
+	memset(&rmgr, 0, sizeof(struct rmgr_ctrl));
+
+	/* allocate memory for count request */
+	nfccmd = calloc(1, sizeof(*nfccmd));
+	if (!nfccmd) {
+		perror("rmgr: Cannot allocate memory for RX class rule data");
+		return -1;
+	}
+
+	/* request count and store in rmgr.n_rules */
+	nfccmd->cmd = ETHTOOL_GRXCLSRLCNT;
+	ifr->ifr_data = (caddr_t)nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	rmgr.n_rules = nfccmd->rule_cnt;
+	free(nfccmd);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rule count");
+		return -1;
+	}
+
+	/* alloc memory for request of location list */
+	nfccmd = calloc(1, sizeof(*nfccmd) + (rmgr.n_rules * sizeof(__u32)));
+	if (!nfccmd) {
+		perror("rmgr: Cannot allocate memory for"
+		       " RX class rule locations");
+		return -1;
+	}
+
+	/* request location list */
+	nfccmd->cmd = ETHTOOL_GRXCLSRLALL;
+	nfccmd->rule_cnt = rmgr.n_rules;
+	ifr->ifr_data = (caddr_t)nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rules");
+		free(nfccmd);
+		return -1;
+	}
+
+	/* make certain the table size is valid */
+	rmgr.size = nfccmd->data;
+	if (rmgr.size == 0 || rmgr.size < rmgr.n_rules) {
+		perror("rmgr: Invalid RX class rules table size");
+		return -1;
+	}
+
+	/* initialize bitmap for storage of valid locations */
+	rmgr.slot = calloc(1, BITS_TO_LONGS(rmgr.size) * sizeof(long));
+	if (!rmgr.slot) {
+		perror("rmgr: Cannot allocate memory for RX class rules");
+		return -1;
+	}
+
+	/* write locations to bitmap */
+	rule_locs = nfccmd->rule_locs;
+	for (i = 0; i < rmgr.n_rules; i++) {
+		err = rmgr_ins(rule_locs[i]);
+		if (err < 0)
+			break;
+	}
+
+	/* free memory and set flag to avoid reinit */
+	free(nfccmd);
+	rmgr_init_done = 1;
+
+	return err;
+}
+
+static void rmgr_cleanup(void)
+{
+	if (!rmgr_init_done)
+		return;
+
+	rmgr_init_done = 0;
+
+	free(rmgr.slot);
+	rmgr.slot = NULL;
+	rmgr.size = 0;
+}
+
+int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc)
+{
+	struct ethtool_rxnfc nfccmd;
+
+	/* fetch rule from netdev and display */
+	nfccmd.cmd = ETHTOOL_GRXCLSRULE;
+	memset(&nfccmd.fs, 0, sizeof(struct ethtool_rx_flow_spec));
+	nfccmd.fs.location = loc;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	if (ioctl(fd, SIOCETHTOOL, ifr) < 0) {
+		perror("rmgr: Cannot get RX class rule");
+		return -1;
+	}
+
+	rmgr_print_rule(&nfccmd.fs);
+	return 0;
+}
+
+int rxclass_rule_getall(int fd, struct ifreq *ifr)
+{
+	int err, i, j;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	fprintf(stdout, "Total %d rules\n\n", rmgr.n_rules);
+
+	/* fetch and display all available rules */
+	for (i = 0; i < rmgr.size; i += BITS_PER_LONG) {
+		if (!rmgr.slot[i / BITS_PER_LONG])
+			continue;
+		for (j = 0; j < BITS_PER_LONG; j++) {
+			if (!test_bit(i + j, rmgr.slot))
+				continue;
+			err = rxclass_rule_get(fd, ifr, i + j);
+			if (err < 0) {
+				rmgr_cleanup();
+				return err;
+			}
+		}
+	}
+
+	rmgr_cleanup();
+	return 0;
+}
+
+int rxclass_rule_ins(int fd, struct ifreq *ifr,
+		     struct ethtool_rx_flow_spec *fsp)
+{
+	struct ethtool_rxnfc nfccmd;
+	__u32 loc = fsp->location;
+	int err;
+
+	/*
+	 * if location is unspecified pull rules from device
+	 * and allocate a free rule for our use
+	 */
+	if (loc == RX_CLS_LOC_UNSPEC) {
+		/* init table of available rules */
+		err = rmgr_init(fd, ifr);
+		if (err < 0)
+			return err;
+
+		/* verify rule location */
+		err = rmgr_add(fsp);
+		if (err < 0)
+			return err;
+
+		/* cleanup table and free resources */
+		rmgr_cleanup();
+	}
+
+	/* notify netdev of new rule */
+	nfccmd.cmd = ETHTOOL_SRXCLSRLINS;
+	nfccmd.fs = *fsp;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0)
+		perror("rmgr: Cannot insert RX class rule");
+	else if (loc == RX_CLS_LOC_UNSPEC)
+		printf("Added rule with ID %d\n", fsp->location);
+
+	return 0;
+}
+
+int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* notify netdev of rule removal */
+	nfccmd.cmd = ETHTOOL_SRXCLSRLDEL;
+	nfccmd.fs.location = loc;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0)
+		perror("rmgr: Cannot delete RX class rule");
+
+	return err;
+}
+
+typedef enum {
+	OPT_NONE = 0,
+	OPT_S32,
+	OPT_U8,
+	OPT_U16,
+	OPT_U32,
+	OPT_U64,
+	OPT_BE16,
+	OPT_BE32,
+	OPT_BE64,
+	OPT_IP4,
+	OPT_MAC,
+} rule_opt_type_t;
+
+#define NFC_FLAG_RING		0x001
+#define NFC_FLAG_LOC		0x002
+#define NFC_FLAG_SADDR		0x004
+#define NFC_FLAG_DADDR		0x008
+#define NFC_FLAG_SPORT		0x010
+#define NFC_FLAG_DPORT		0x020
+#define NFC_FLAG_SPI		0x030
+#define NFC_FLAG_TOS		0x040
+#define NFC_FLAG_PROTO		0x080
+#define NTUPLE_FLAG_VLAN	0x100
+#define NTUPLE_FLAG_UDEF	0x200
+#define NTUPLE_FLAG_VETH	0x400
+
+struct rule_opts {
+	const char	*name;
+	rule_opt_type_t	type;
+	u32		flag;
+	int		offset;
+	int		moffset;
+};
+
+static struct rule_opts rule_nfc_tcp_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.tos) },
+	{ "src-port", OPT_BE16, NFC_FLAG_SPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.psrc),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.psrc) },
+	{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.pdst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.pdst) },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.data) },
+};
+
+static struct rule_opts rule_nfc_esp_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.tos) },
+	{ "spi", OPT_BE32, NFC_FLAG_SPI,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.spi),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.spi) },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.data) },
+};
+
+static struct rule_opts rule_nfc_usr_ip4[] = {
+	{ "src-ip", OPT_IP4, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.ip4src),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.ip4src) },
+	{ "dst-ip", OPT_IP4, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.ip4dst),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.ip4dst) },
+	{ "tos", OPT_U8, NFC_FLAG_TOS,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.tos),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.tos) },
+	{ "l4proto", OPT_U8, NFC_FLAG_PROTO,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.proto),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.proto) },
+	{ "spi", OPT_BE32, NFC_FLAG_SPI,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) },
+	{ "src-port", OPT_BE16, NFC_FLAG_SPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) },
+	{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes) + 2,
+	  offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) + 2 },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.data) },
+};
+
+static struct rule_opts rule_nfc_ether[] = {
+	{ "src", OPT_MAC, NFC_FLAG_SADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_dest),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_dest) },
+	{ "dst", OPT_MAC, NFC_FLAG_DADDR,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_source),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_source) },
+	{ "proto", OPT_BE16, NFC_FLAG_PROTO,
+	  offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_proto),
+	  offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_proto) },
+	{ "action", OPT_U64, NFC_FLAG_RING,
+	  offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
+	{ "loc", OPT_U32, NFC_FLAG_LOC,
+	  offsetof(struct ethtool_rx_flow_spec, location), -1 },
+	{ "vlan-etype", OPT_BE16, NTUPLE_FLAG_VETH,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_etype),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_etype) },
+	{ "vlan", OPT_BE16, NTUPLE_FLAG_VLAN,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.vlan_tci),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.vlan_tci) },
+	{ "user-def", OPT_BE64, NTUPLE_FLAG_UDEF,
+	  offsetof(struct ethtool_rx_flow_spec, h_ext.data),
+	  offsetof(struct ethtool_rx_flow_spec, m_ext.data) },
+};
+
+static int rxclass_get_long(char *str, long long *val, int size)
+{
+	long long max = ~0ULL >> (65 - size);
+	char *endp;
+
+	errno = 0;
+
+	*val = strtoll(str, &endp, 0);
+
+	if (*endp || errno || (*val > max) || (*val < ~max))
+		return -1;
+
+	return 0;
+}
+
+static int rxclass_get_ulong(char *str, unsigned long long *val, int size)
+{
+	long long max = ~0ULL >> (64 - size);
+	char *endp;
+
+	errno = 0;
+
+	*val = strtoull(str, &endp, 0);
+
+	if (*endp || errno || (*val > max))
+		return -1;
+
+	return 0;
+}
+
+static int rxclass_get_ipv4(char *str, __be32 *val)
+{
+	if (!inet_pton(AF_INET, str, val))
+		return -1;
+
+	return 0;
+}
+
+static int rxclass_get_ether(char *str, unsigned char *val)
+{
+	unsigned int buf[ETH_ALEN];
+	int count;
+
+	if (!strchr(str, ':'))
+		return -1;
+
+	count = sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x",
+		       &buf[0], &buf[1], &buf[2],
+		       &buf[3], &buf[4], &buf[5]);
+
+	if (count != ETH_ALEN)
+		return -1;
+
+	do {
+		count--;
+		val[count] = buf[count];
+	} while (count);
+
+	return 0;
+}
+
+static int rxclass_get_val(char *str, unsigned char *p, u32 *flags,
+			   const struct rule_opts *opt)
+{
+	unsigned long long mask = ~0ULL;
+	int err = 0;
+
+	if (*flags & opt->flag)
+		return -1;
+
+	*flags |= opt->flag;
+
+	switch (opt->type) {
+	case OPT_S32: {
+		long long val;
+		err = rxclass_get_long(str, &val, 32);
+		if (err)
+			return -1;
+		*(int *)&p[opt->offset] = (int)val;
+		if (opt->moffset >= 0)
+			*(int *)&p[opt->moffset] = (int)mask;
+		break;
+	}
+	case OPT_U8: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 8);
+		if (err)
+			return -1;
+		*(u8 *)&p[opt->offset] = (u8)val;
+		if (opt->moffset >= 0)
+			*(u8 *)&p[opt->moffset] = (u8)mask;
+		break;
+	}
+	case OPT_U16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(u16 *)&p[opt->offset] = (u16)val;
+		if (opt->moffset >= 0)
+			*(u16 *)&p[opt->moffset] = (u16)mask;
+		break;
+	}
+	case OPT_U32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(u32 *)&p[opt->offset] = (u32)val;
+		if (opt->moffset >= 0)
+			*(u32 *)&p[opt->moffset] = (u32)mask;
+		break;
+	}
+	case OPT_U64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(u64 *)&p[opt->offset] = (u64)val;
+		if (opt->moffset >= 0)
+			*(u64 *)&p[opt->moffset] = (u64)mask;
+		break;
+	}
+	case OPT_BE16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(__be16 *)&p[opt->offset] = htons((u16)val);
+		if (opt->moffset >= 0)
+			*(__be16 *)&p[opt->moffset] = (__be16)mask;
+		break;
+	}
+	case OPT_BE32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->offset] = htonl((u32)val);
+		if (opt->moffset >= 0)
+			*(__be32 *)&p[opt->moffset] = (__be32)mask;
+		break;
+	}
+	case OPT_BE64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(__be64 *)&p[opt->offset] = htonll((u64)val);
+		if (opt->moffset >= 0)
+			*(__be64 *)&p[opt->moffset] = (__be64)mask;
+		break;
+	}
+	case OPT_IP4: {
+		__be32 val;
+		err = rxclass_get_ipv4(str, &val);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->offset] = val;
+		if (opt->moffset >= 0)
+			*(__be32 *)&p[opt->moffset] = (__be32)mask;
+		break;
+	}
+	case OPT_MAC: {
+		unsigned char val[ETH_ALEN];
+		err = rxclass_get_ether(str, val);
+		if (err)
+			return -1;
+		memcpy(&p[opt->offset], val, ETH_ALEN);
+		if (opt->moffset >= 0)
+			memcpy(&p[opt->moffset], &mask, ETH_ALEN);
+		break;
+	}
+	case OPT_NONE:
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
+static int rxclass_get_mask(char *str, unsigned char *p,
+			    const struct rule_opts *opt)
+{
+	int err = 0;
+
+	if (opt->moffset < 0)
+		return -1;
+
+	switch (opt->type) {
+	case OPT_S32: {
+		long long val;
+		err = rxclass_get_long(str, &val, 32);
+		if (err)
+			return -1;
+		*(int *)&p[opt->moffset] = ~(int)val;
+		break;
+	}
+	case OPT_U8: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 8);
+		if (err)
+			return -1;
+		*(u8 *)&p[opt->moffset] = ~(u8)val;
+		break;
+	}
+	case OPT_U16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(u16 *)&p[opt->moffset] = ~(u16)val;
+		break;
+	}
+	case OPT_U32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(u32 *)&p[opt->moffset] = ~(u32)val;
+		break;
+	}
+	case OPT_U64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(u64 *)&p[opt->moffset] = ~(u64)val;
+		break;
+	}
+	case OPT_BE16: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 16);
+		if (err)
+			return -1;
+		*(__be16 *)&p[opt->moffset] = ~htons((u16)val);
+		break;
+	}
+	case OPT_BE32: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 32);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->moffset] = ~htonl((u32)val);
+		break;
+	}
+	case OPT_BE64: {
+		unsigned long long val;
+		err = rxclass_get_ulong(str, &val, 64);
+		if (err)
+			return -1;
+		*(__be64 *)&p[opt->moffset] = ~htonll((u64)val);
+		break;
+	}
+	case OPT_IP4: {
+		__be32 val;
+		err = rxclass_get_ipv4(str, &val);
+		if (err)
+			return -1;
+		*(__be32 *)&p[opt->moffset] = ~val;
+		break;
+	}
+	case OPT_MAC: {
+		unsigned char val[ETH_ALEN];
+		int i;
+		err = rxclass_get_ether(str, val);
+		if (err)
+			return -1;
+
+		for (i = 0; i < ETH_ALEN; i++)
+			val[i] = ~val[i];
+
+		memcpy(&p[opt->moffset], val, ETH_ALEN);
+		break;
+	}
+	case OPT_NONE:
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
+int rxclass_parse_ruleopts(char **argp, int argc,
+			   struct ethtool_rx_flow_spec *fsp)
+{
+	const struct rule_opts *options;
+	unsigned char *p = (unsigned char *)fsp;
+	int i = 0, n_opts, err;
+	u32 flags = 0;
+	int flow_type;
+
+	if (argc < 1)
+		goto syntax_err;
+
+	if (!strcmp(argp[0], "tcp4"))
+		flow_type = TCP_V4_FLOW;
+	else if (!strcmp(argp[0], "udp4"))
+		flow_type = UDP_V4_FLOW;
+	else if (!strcmp(argp[0], "sctp4"))
+		flow_type = SCTP_V4_FLOW;
+	else if (!strcmp(argp[0], "ah4"))
+		flow_type = AH_V4_FLOW;
+	else if (!strcmp(argp[0], "esp4"))
+		flow_type = ESP_V4_FLOW;
+	else if (!strcmp(argp[0], "ip4"))
+		flow_type = IP_USER_FLOW;
+	else if (!strcmp(argp[0], "ether"))
+		flow_type = ETHER_FLOW;
+	else
+		goto syntax_err;
+
+	switch (flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+		options = rule_nfc_tcp_ip4;
+		n_opts = ARRAY_SIZE(rule_nfc_tcp_ip4);
+		break;
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+		options = rule_nfc_esp_ip4;
+		n_opts = ARRAY_SIZE(rule_nfc_esp_ip4);
+		break;
+	case IP_USER_FLOW:
+		options = rule_nfc_usr_ip4;
+		n_opts = ARRAY_SIZE(rule_nfc_usr_ip4);
+		break;
+	case ETHER_FLOW:
+		options = rule_nfc_ether;
+		n_opts = ARRAY_SIZE(rule_nfc_ether);
+		break;
+	default:
+		fprintf(stderr, "Add rule, invalid rule type[%s]\n", argp[0]);
+		return -1;
+	}
+
+	memset(p, 0, sizeof(*fsp));
+	fsp->flow_type = flow_type;
+	fsp->location = RX_CLS_LOC_UNSPEC;
+
+	for (i = 1; i < argc;) {
+		const struct rule_opts *opt;
+		int idx;
+		for (opt = options, idx = 0; idx < n_opts; idx++, opt++) {
+			char mask_name[16];
+
+			if (strcmp(argp[i], opt->name))
+				continue;
+
+			i++;
+			if (i >= argc)
+				break;
+
+			err = rxclass_get_val(argp[i], p, &flags, opt);
+			if (err) {
+				fprintf(stderr, "Invalid %s value[%s]\n",
+					opt->name, argp[i]);
+				return -1;
+			}
+
+			i++;
+			if (i >= argc)
+				break;
+
+			sprintf(mask_name, "%s-mask", opt->name);
+			if (strcmp(argp[i], "m") && strcmp(argp[i], mask_name))
+				break;
+
+			i++;
+			if (i >= argc)
+				goto syntax_err;
+
+			err = rxclass_get_mask(argp[i], p, opt);
+			if (err) {
+				fprintf(stderr, "Invalid %s mask[%s]\n",
+					opt->name, argp[i]);
+				return -1;
+			}
+
+			i++;
+
+			break;
+		}
+		if (idx == n_opts) {
+			fprintf(stdout, "Add rule, unrecognized option[%s]\n",
+				argp[i]);
+			return -1;
+		}
+	}
+
+	if (flags & (NTUPLE_FLAG_VLAN | NTUPLE_FLAG_UDEF | NTUPLE_FLAG_VETH))
+		fsp->flow_type |= FLOW_EXT;
+
+	return 0;
+
+syntax_err:
+	fprintf(stderr, "Add rule, invalid syntax\n");
+	return -1;
+}


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-03 16:12 ` [ethtool PATCH 4/4] v5 Add RX packet classification interface Alexander Duyck
@ 2011-05-03 23:23   ` Dimitris Michailidis
  2011-05-03 23:34     ` Ben Hutchings
  0 siblings, 1 reply; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-03 23:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: davem, jeffrey.t.kirsher, bhutchings, netdev

On 05/03/2011 09:12 AM, Alexander Duyck wrote:

[...]

> +static int rmgr_add(struct ethtool_rx_flow_spec *fsp)
> +{
> +	__u32 loc = fsp->location;

Unneeded assignment.  Couple lines later you assign a new value to loc.

> +	__u32 slot_num;
> +
> +	/* start at the end of the list since it is lowest priority */
> +	loc = rmgr.size - 1;
> +
> +	/* locate the first slot a rule can be placed in */
> +	slot_num = loc / BITS_PER_LONG;
> +
> +	/*
> +	 * Avoid testing individual bits by inverting the word and checking
> +	 * to see if any bits are left set, if so there are empty spots.  By
> +	 * moving 1 + loc % BITS_PER_LONG we align ourselves to the last bit
> +	 * in the previous word.
> +	 *
> +	 * If loc rolls over it should be greater than or equal to rmgr.size
> +	 * and as such we know we have reached the end of the list.
> +	 */
> +	if (!~(rmgr.slot[slot_num] | (~1UL << rmgr.size % BITS_PER_LONG))) {
> +		loc -= 1 + (loc % BITS_PER_LONG);
> +		slot_num--;
> +	}
> +
> +	/*
> +	 * Now that we are aligned with the last bit in each long we can just
> +	 * go though and eliminate all the longs with no free bits
> +	 */
> +	while (loc < rmgr.size && !~(rmgr.slot[slot_num])) {
> +		loc -= BITS_PER_LONG;
> +		slot_num--;
> +	}
> +
> +	/*
> +	 * If we still are inside the range, test individual bits as one is
> +	 * likely available for our use.
> +	 */
> +	while (loc < rmgr.size && test_bit(loc, rmgr.slot))
> +		loc--;
> +
> +	/* location found, insert rule */
> +	if (loc < rmgr.size) {
> +		fsp->location = loc;
> +		return rmgr_ins(loc);
> +	}
> +
> +	/* No space to add this rule */
> +	fprintf(stderr, "rmgr: Cannot find appropriate slot to insert rule\n");
> +
> +	return -1;
> +}

[...]

> +int rxclass_rule_ins(int fd, struct ifreq *ifr,
> +		     struct ethtool_rx_flow_spec *fsp)
> +{
> +	struct ethtool_rxnfc nfccmd;
> +	__u32 loc = fsp->location;
> +	int err;
> +
> +	/*
> +	 * if location is unspecified pull rules from device
> +	 * and allocate a free rule for our use
> +	 */
> +	if (loc == RX_CLS_LOC_UNSPEC) {
> +		/* init table of available rules */
> +		err = rmgr_init(fd, ifr);
> +		if (err < 0)
> +			return err;
> +
> +		/* verify rule location */
> +		err = rmgr_add(fsp);
> +		if (err < 0)
> +			return err;
> +
> +		/* cleanup table and free resources */
> +		rmgr_cleanup();
> +	}

This logic where ethtool tries to select a filter slot when a user provides 
RX_CLS_LOC_UNSPEC does not work in general.  It assumes that all slots are 
equal and a new filter can go into any available slot. But a device may have 
restrictions on where a filter may go that ethtool doesn't know.  I 
mentioned during a previous review that for cxgb4 some filters require a 
slot number that is a multiple of 4.  There are some other constraints as 
well depending on the type of filter being added. For such a device ethtool 
doesn't know enough to handle RX_CLS_LOC_UNSPEC correctly.

I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is 
enough knowledge to pick an appropriate slot.  So I'd remove the

     if (loc == RX_CLS_LOC_UNSPEC)

block above, let the driver pick a slot, and then pass the selected location 
back for ethtool to report.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-03 23:23   ` Dimitris Michailidis
@ 2011-05-03 23:34     ` Ben Hutchings
  2011-05-04  0:29       ` Alexander Duyck
  2011-05-04 17:09       ` Dimitris Michailidis
  0 siblings, 2 replies; 23+ messages in thread
From: Ben Hutchings @ 2011-05-03 23:34 UTC (permalink / raw)
  To: Dimitris Michailidis; +Cc: Alexander Duyck, davem, jeffrey.t.kirsher, netdev

On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
> On 05/03/2011 09:12 AM, Alexander Duyck wrote:
[...]
> > +int rxclass_rule_ins(int fd, struct ifreq *ifr,
> > +		     struct ethtool_rx_flow_spec *fsp)
> > +{
> > +	struct ethtool_rxnfc nfccmd;
> > +	__u32 loc = fsp->location;
> > +	int err;
> > +
> > +	/*
> > +	 * if location is unspecified pull rules from device
> > +	 * and allocate a free rule for our use
> > +	 */
> > +	if (loc == RX_CLS_LOC_UNSPEC) {
> > +		/* init table of available rules */
> > +		err = rmgr_init(fd, ifr);
> > +		if (err < 0)
> > +			return err;
> > +
> > +		/* verify rule location */
> > +		err = rmgr_add(fsp);
> > +		if (err < 0)
> > +			return err;
> > +
> > +		/* cleanup table and free resources */
> > +		rmgr_cleanup();
> > +	}
> 
> This logic where ethtool tries to select a filter slot when a user provides 
> RX_CLS_LOC_UNSPEC does not work in general.  It assumes that all slots are 
> equal and a new filter can go into any available slot. But a device may have 
> restrictions on where a filter may go that ethtool doesn't know.

I agree.  And if filter lookup is largely hash-based (as it is in
Solarflare hardware) the user will also find it very difficult to
specify the right location!

> I mentioned during a previous review that for cxgb4 some filters require a 
> slot number that is a multiple of 4.  There are some other constraints as 
> well depending on the type of filter being added. For such a device ethtool 
> doesn't know enough to handle RX_CLS_LOC_UNSPEC correctly.
>
> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is 
> enough knowledge to pick an appropriate slot.  So I'd remove the
> 
>      if (loc == RX_CLS_LOC_UNSPEC)
> 
> block above, let the driver pick a slot, and then pass the selected location 
> back for ethtool to report.

But first we have to specify this in the ethtool API.  So please propose
a patch to ethtool.h.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-03 23:34     ` Ben Hutchings
@ 2011-05-04  0:29       ` Alexander Duyck
  2011-05-04  1:35         ` Dimitris Michailidis
  2011-05-04 17:09       ` Dimitris Michailidis
  1 sibling, 1 reply; 23+ messages in thread
From: Alexander Duyck @ 2011-05-04  0:29 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Dimitris Michailidis, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 5/3/2011 4:34 PM, Ben Hutchings wrote:
> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>> On 05/03/2011 09:12 AM, Alexander Duyck wrote:
> [...]
>>> +int rxclass_rule_ins(int fd, struct ifreq *ifr,
>>> +		     struct ethtool_rx_flow_spec *fsp)
>>> +{
>>> +	struct ethtool_rxnfc nfccmd;
>>> +	__u32 loc = fsp->location;
>>> +	int err;
>>> +
>>> +	/*
>>> +	 * if location is unspecified pull rules from device
>>> +	 * and allocate a free rule for our use
>>> +	 */
>>> +	if (loc == RX_CLS_LOC_UNSPEC) {
>>> +		/* init table of available rules */
>>> +		err = rmgr_init(fd, ifr);
>>> +		if (err<  0)
>>> +			return err;
>>> +
>>> +		/* verify rule location */
>>> +		err = rmgr_add(fsp);
>>> +		if (err<  0)
>>> +			return err;
>>> +
>>> +		/* cleanup table and free resources */
>>> +		rmgr_cleanup();
>>> +	}
>>
>> This logic where ethtool tries to select a filter slot when a user provides
>> RX_CLS_LOC_UNSPEC does not work in general.  It assumes that all slots are
>> equal and a new filter can go into any available slot. But a device may have
>> restrictions on where a filter may go that ethtool doesn't know.
>
> I agree.  And if filter lookup is largely hash-based (as it is in
> Solarflare hardware) the user will also find it very difficult to
> specify the right location!

The thing to keep in mind is that the index doesn't have to be a 
hardware index.  In ixgbe we have a field in the hardware which is meant 
to just be a unique software index and that is what I am using as the 
location field for our filters.  All the location information in the 
rules really provides is a logical way of tracking rules.  It doesn't 
necessarily have to represent the physical location of the rule in hardware.

>> I mentioned during a previous review that for cxgb4 some filters require a
>> slot number that is a multiple of 4.  There are some other constraints as
>> well depending on the type of filter being added. For such a device ethtool
>> doesn't know enough to handle RX_CLS_LOC_UNSPEC correctly.
>>
>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is
>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>
>>       if (loc == RX_CLS_LOC_UNSPEC)
>>
>> block above, let the driver pick a slot, and then pass the selected location
>> back for ethtool to report.
>
> But first we have to specify this in the ethtool API.  So please propose
> a patch to ethtool.h.
>
> Ben.

The other thing to keep in mind with this is that it doesn't lock you 
into the network flow classifier configuration.  If you want to be able 
to specify a rule without having any location information included there 
is always the option of ntuple which accepts almost all the same fields 
but doesn't include any specific location information.

Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04  0:29       ` Alexander Duyck
@ 2011-05-04  1:35         ` Dimitris Michailidis
  2011-05-04  3:10           ` Alexander Duyck
  0 siblings, 1 reply; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-04  1:35 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Ben Hutchings, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 05/03/2011 05:29 PM, Alexander Duyck wrote:
> On 5/3/2011 4:34 PM, Ben Hutchings wrote:
>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>>> On 05/03/2011 09:12 AM, Alexander Duyck wrote:
>> [...]
>>>> +int rxclass_rule_ins(int fd, struct ifreq *ifr,
>>>> +             struct ethtool_rx_flow_spec *fsp)
>>>> +{
>>>> +    struct ethtool_rxnfc nfccmd;
>>>> +    __u32 loc = fsp->location;
>>>> +    int err;
>>>> +
>>>> +    /*
>>>> +     * if location is unspecified pull rules from device
>>>> +     * and allocate a free rule for our use
>>>> +     */
>>>> +    if (loc == RX_CLS_LOC_UNSPEC) {
>>>> +        /* init table of available rules */
>>>> +        err = rmgr_init(fd, ifr);
>>>> +        if (err<  0)
>>>> +            return err;
>>>> +
>>>> +        /* verify rule location */
>>>> +        err = rmgr_add(fsp);
>>>> +        if (err<  0)
>>>> +            return err;
>>>> +
>>>> +        /* cleanup table and free resources */
>>>> +        rmgr_cleanup();
>>>> +    }
>>>
>>> This logic where ethtool tries to select a filter slot when a user 
>>> provides
>>> RX_CLS_LOC_UNSPEC does not work in general.  It assumes that all 
>>> slots are
>>> equal and a new filter can go into any available slot. But a device 
>>> may have
>>> restrictions on where a filter may go that ethtool doesn't know.
>>
>> I agree.  And if filter lookup is largely hash-based (as it is in
>> Solarflare hardware) the user will also find it very difficult to
>> specify the right location!
> 
> The thing to keep in mind is that the index doesn't have to be a 
> hardware index.  In ixgbe we have a field in the hardware which is meant 
> to just be a unique software index and that is what I am using as the 
> location field for our filters.  All the location information in the 
> rules really provides is a logical way of tracking rules.  It doesn't 
> necessarily have to represent the physical location of the rule in 
> hardware.

I appreciate the intent but there are couple problems.
a) ethtool.h documents location as

  * @location: Index of filter in hardware table

i.e., physical location.  But we could change that.

b) for TCAMs physical location is important and if ethtool offers to provide 
only a logical index and massages the original user input to do so where 
will a driver get the physical location it ultimately needs?  For a device 
with physical indices a multiple of 4 the logical index ethtool picks will 
very frequently be illegal as physical location.  E.g., if location 1 is 
available ethtool will keep selecting it and the driver will need to deal 
with these requests without the benefit of knowing what the user really 
asked for.

Another problem with ethtool selecting locations is it assumes it's the sole 
allocator (I think there's a comment in the code pointing this out).  But 
this isn't a valid assumption, e.g., HW RFS comes to mind as another allocator.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04  1:35         ` Dimitris Michailidis
@ 2011-05-04  3:10           ` Alexander Duyck
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2011-05-04  3:10 UTC (permalink / raw)
  To: Dimitris Michailidis
  Cc: Alexander Duyck, Ben Hutchings, davem@davemloft.net,
	Kirsher, Jeffrey T, netdev@vger.kernel.org

On Tue, May 3, 2011 at 6:35 PM, Dimitris Michailidis <dm@chelsio.com> wrote:
> On 05/03/2011 05:29 PM, Alexander Duyck wrote:
>> The thing to keep in mind is that the index doesn't have to be a hardware
>> index.  In ixgbe we have a field in the hardware which is meant to just be a
>> unique software index and that is what I am using as the location field for
>> our filters.  All the location information in the rules really provides is a
>> logical way of tracking rules.  It doesn't necessarily have to represent the
>> physical location of the rule in hardware.
>
> I appreciate the intent but there are couple problems.
> a) ethtool.h documents location as
>
>  * @location: Index of filter in hardware table
>
> i.e., physical location.  But we could change that.

I will probably want to change that.  The fact is as I recall even niu
was using a hash in addition to the location index.  As such it isn't
really the true location in the hardware since there is a second piece
to determining the actual location in hardware.

> b) for TCAMs physical location is important and if ethtool offers to provide
> only a logical index and massages the original user input to do so where
> will a driver get the physical location it ultimately needs?  For a device
> with physical indices a multiple of 4 the logical index ethtool picks will
> very frequently be illegal as physical location.  E.g., if location 1 is
> available ethtool will keep selecting it and the driver will need to deal
> with these requests without the benefit of knowing what the user really
> asked for.
>
> Another problem with ethtool selecting locations is it assumes it's the sole
> allocator (I think there's a comment in the code pointing this out).  But
> this isn't a valid assumption, e.g., HW RFS comes to mind as another
> allocator.

The other thing to keep in mind is that this doesn't preclude the
option of adding an ethtool command at some point in the future for
handling the determination of filter location.  The fact is all that
would need to be done is to add an extra ioctl call to determine the
location based on the filter and if that returns op not supported we
fall back to the current rule manager.

The way I have things setup now provides a good foundation in
user-space for managing the rules.  I fully admit it doesn't fit all
solutions, and I welcome follow-on patches to add extra functionality,
but at this time I really would prefer avoiding adding extra
functionality for yet to be implemented features in device drivers.
The ntuple display functionality provides a good example of why I
would prefer to avoid it.  Everything looked like it should have
worked when get_rx_ntuple was implemented in the device drivers, but
as soon as I implemented a get_rx_ntuple for ixgbe I quickly
discovered it didn't work and as such I am now stuck moving everything
over to network flow classifier for ixgbe.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-03 23:34     ` Ben Hutchings
  2011-05-04  0:29       ` Alexander Duyck
@ 2011-05-04 17:09       ` Dimitris Michailidis
  2011-05-04 17:24         ` Ben Hutchings
  1 sibling, 1 reply; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-04 17:09 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Alexander Duyck, davem, jeffrey.t.kirsher, netdev

On 05/03/2011 04:34 PM, Ben Hutchings wrote:
> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is 
>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>
>>      if (loc == RX_CLS_LOC_UNSPEC)
>>
>> block above, let the driver pick a slot, and then pass the selected location 
>> back for ethtool to report.
> 
> But first we have to specify this in the ethtool API.  So please propose
> a patch to ethtool.h.

In the past we discussed that being able to specify the first available slot or 
the last available would be useful, so something like the below?

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 4194a20..909ef79 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -442,7 +442,8 @@ struct ethtool_flow_ext {
   *	includes the %FLOW_EXT flag.
   * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
   *	if packets should be discarded
- * @location: Index of filter in hardware table
+ * @location: Index of filter in hardware table, or %RX_CLS_FLOW_FIRST_LOC for
+ *	first available index, or %RX_CLS_FLOW_LAST_LOC for last available
   */
  struct ethtool_rx_flow_spec {
  	__u32		flow_type;
@@ -1142,6 +1143,10 @@ struct ethtool_ops {

  #define	RX_CLS_FLOW_DISC	0xffffffffffffffffULL

+/* special values for ethtool_rx_flow_spec.location */
+#define RX_CLS_FLOW_FIRST_LOC	0xffffffffU;
+#define RX_CLS_FLOW_LAST_LOC	0xfffffffeU;
+
  /* Reset flags */
  /* The reset() operation must clear the flags for the components which
   * were actually reset.  On successful return, the flags indicate the

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 17:09       ` Dimitris Michailidis
@ 2011-05-04 17:24         ` Ben Hutchings
  2011-05-04 17:33           ` Dimitris Michailidis
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Hutchings @ 2011-05-04 17:24 UTC (permalink / raw)
  To: Dimitris Michailidis; +Cc: Alexander Duyck, davem, jeffrey.t.kirsher, netdev

On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
> > On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
> >> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is 
> >> enough knowledge to pick an appropriate slot.  So I'd remove the
> >>
> >>      if (loc == RX_CLS_LOC_UNSPEC)
> >>
> >> block above, let the driver pick a slot, and then pass the selected location 
> >> back for ethtool to report.
> > 
> > But first we have to specify this in the ethtool API.  So please propose
> > a patch to ethtool.h.
> 
> In the past we discussed that being able to specify the first available slot or 
> the last available would be useful, so something like the below?
>
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index 4194a20..909ef79 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
>    *	includes the %FLOW_EXT flag.
>    * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
>    *	if packets should be discarded
> - * @location: Index of filter in hardware table
> + * @location: Index of filter in hardware table, or %RX_CLS_FLOW_FIRST_LOC for
> + *	first available index, or %RX_CLS_FLOW_LAST_LOC for last available
[...]

I think that's reasonable.  We should also explicitly state that
location determines priority, i.e. if a packet matches two filters then
the one with the lower location wins.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 17:24         ` Ben Hutchings
@ 2011-05-04 17:33           ` Dimitris Michailidis
  2011-05-04 17:41             ` Alexander Duyck
  0 siblings, 1 reply; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-04 17:33 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Alexander Duyck, davem, jeffrey.t.kirsher, netdev

On 05/04/2011 10:24 AM, Ben Hutchings wrote:
> On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
>> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
>>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>>>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is 
>>>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>>>
>>>>      if (loc == RX_CLS_LOC_UNSPEC)
>>>>
>>>> block above, let the driver pick a slot, and then pass the selected location 
>>>> back for ethtool to report.
>>> But first we have to specify this in the ethtool API.  So please propose
>>> a patch to ethtool.h.
>> In the past we discussed that being able to specify the first available slot or 
>> the last available would be useful, so something like the below?
>>
>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>> index 4194a20..909ef79 100644
>> --- a/include/linux/ethtool.h
>> +++ b/include/linux/ethtool.h
>> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
>>    *	includes the %FLOW_EXT flag.
>>    * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
>>    *	if packets should be discarded
>> - * @location: Index of filter in hardware table
>> + * @location: Index of filter in hardware table, or %RX_CLS_FLOW_FIRST_LOC for
>> + *	first available index, or %RX_CLS_FLOW_LAST_LOC for last available
> [...]
> 
> I think that's reasonable.  We should also explicitly state that
> location determines priority, i.e. if a packet matches two filters then
> the one with the lower location wins.

Easy and true for a TCAM.  For hashing would you use the location to decide how 
to order filters that fall in the same bucket?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 17:33           ` Dimitris Michailidis
@ 2011-05-04 17:41             ` Alexander Duyck
  2011-05-04 18:05               ` Ben Hutchings
  2011-05-04 18:18               ` Dimitris Michailidis
  0 siblings, 2 replies; 23+ messages in thread
From: Alexander Duyck @ 2011-05-04 17:41 UTC (permalink / raw)
  To: Dimitris Michailidis
  Cc: Ben Hutchings, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 5/4/2011 10:33 AM, Dimitris Michailidis wrote:
> On 05/04/2011 10:24 AM, Ben Hutchings wrote:
>> On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
>>> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
>>>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>>>>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is
>>>>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>>>>
>>>>>       if (loc == RX_CLS_LOC_UNSPEC)
>>>>>
>>>>> block above, let the driver pick a slot, and then pass the selected location
>>>>> back for ethtool to report.
>>>> But first we have to specify this in the ethtool API.  So please propose
>>>> a patch to ethtool.h.
>>> In the past we discussed that being able to specify the first available slot or
>>> the last available would be useful, so something like the below?
>>>
>>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>>> index 4194a20..909ef79 100644
>>> --- a/include/linux/ethtool.h
>>> +++ b/include/linux/ethtool.h
>>> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
>>>     *	includes the %FLOW_EXT flag.
>>>     * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
>>>     *	if packets should be discarded
>>> - * @location: Index of filter in hardware table
>>> + * @location: Index of filter in hardware table, or %RX_CLS_FLOW_FIRST_LOC for
>>> + *	first available index, or %RX_CLS_FLOW_LAST_LOC for last available
>> [...]
>>
>> I think that's reasonable.  We should also explicitly state that
>> location determines priority, i.e. if a packet matches two filters then
>> the one with the lower location wins.
>
> Easy and true for a TCAM.  For hashing would you use the location to decide how
> to order filters that fall in the same bucket?
>

The problem is none of this is backwards compatible.  The niu driver has 
supported the network flow classifier rules since 2.6.30.  Adding this 
would cause all rule setups for niu to fail because these locations 
would have to exist outside of the current rule locations.

This is why I was suggesting that the best approach would be to update 
the kernel to add a separate ioctl for letting the driver setup the 
location.  We can just attempt to make that call and when we get the 
EOPNOTSUPP errno we know the device driver doesn't support it and can 
then let the rule manager take over.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 17:41             ` Alexander Duyck
@ 2011-05-04 18:05               ` Ben Hutchings
  2011-05-04 18:21                 ` Alexander Duyck
  2011-05-04 19:06                 ` Dimitris Michailidis
  2011-05-04 18:18               ` Dimitris Michailidis
  1 sibling, 2 replies; 23+ messages in thread
From: Ben Hutchings @ 2011-05-04 18:05 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Dimitris Michailidis, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On Wed, 2011-05-04 at 10:41 -0700, Alexander Duyck wrote:
> On 5/4/2011 10:33 AM, Dimitris Michailidis wrote:
> > On 05/04/2011 10:24 AM, Ben Hutchings wrote:
> >> On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
> >>> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
> >>>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
> >>>>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is
> >>>>> enough knowledge to pick an appropriate slot.  So I'd remove the
> >>>>>
> >>>>>       if (loc == RX_CLS_LOC_UNSPEC)
> >>>>>
> >>>>> block above, let the driver pick a slot, and then pass the selected location
> >>>>> back for ethtool to report.
> >>>> But first we have to specify this in the ethtool API.  So please propose
> >>>> a patch to ethtool.h.
> >>> In the past we discussed that being able to specify the first available slot or
> >>> the last available would be useful, so something like the below?
> >>>
> >>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> >>> index 4194a20..909ef79 100644
> >>> --- a/include/linux/ethtool.h
> >>> +++ b/include/linux/ethtool.h
> >>> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
> >>>     *	includes the %FLOW_EXT flag.
> >>>     * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
> >>>     *	if packets should be discarded
> >>> - * @location: Index of filter in hardware table
> >>> + * @location: Index of filter in hardware table, or %RX_CLS_FLOW_FIRST_LOC for
> >>> + *	first available index, or %RX_CLS_FLOW_LAST_LOC for last available
> >> [...]
> >>
> >> I think that's reasonable.  We should also explicitly state that
> >> location determines priority, i.e. if a packet matches two filters then
> >> the one with the lower location wins.
> >
> > Easy and true for a TCAM.  For hashing would you use the location to decide how
> > to order filters that fall in the same bucket?
> >
> 
> The problem is none of this is backwards compatible.  The niu driver has 
> supported the network flow classifier rules since 2.6.30.  Adding this 
> would cause all rule setups for niu to fail because these locations 
> would have to exist outside of the current rule locations.

Those rule setups would have to be using some unofficial patched
ethtool.  I don't think that should be a major concern for us.
However...

> This is why I was suggesting that the best approach would be to update 
> the kernel to add a separate ioctl for letting the driver setup the 
> location.
>
> We can just attempt to make that call and when we get the 
> EOPNOTSUPP errno we know the device driver doesn't support it and can 
> then let the rule manager take over.

How about having ETHTOOL_GRXCLSRLCNT set a flag in the 'data' field to
indicate that the driver can assign locations?  (We would have to
specify that for compatibility with older kernels the application must
initialise this filed to 0.)

rmgr_init() would then check for this flag.

I hope someone is actually going to test this on niu, as it sounds like
that will be the only driver using a TCAM... David?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 17:41             ` Alexander Duyck
  2011-05-04 18:05               ` Ben Hutchings
@ 2011-05-04 18:18               ` Dimitris Michailidis
  2011-05-04 18:35                 ` Alexander Duyck
  1 sibling, 1 reply; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-04 18:18 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Ben Hutchings, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 05/04/2011 10:41 AM, Alexander Duyck wrote:
> On 5/4/2011 10:33 AM, Dimitris Michailidis wrote:
>> On 05/04/2011 10:24 AM, Ben Hutchings wrote:
>>> On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
>>>> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
>>>>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>>>>>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where 
>>>>>> there is
>>>>>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>>>>>
>>>>>>       if (loc == RX_CLS_LOC_UNSPEC)
>>>>>>
>>>>>> block above, let the driver pick a slot, and then pass the 
>>>>>> selected location
>>>>>> back for ethtool to report.
>>>>> But first we have to specify this in the ethtool API.  So please 
>>>>> propose
>>>>> a patch to ethtool.h.
>>>> In the past we discussed that being able to specify the first 
>>>> available slot or
>>>> the last available would be useful, so something like the below?
>>>>
>>>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>>>> index 4194a20..909ef79 100644
>>>> --- a/include/linux/ethtool.h
>>>> +++ b/include/linux/ethtool.h
>>>> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
>>>>     *    includes the %FLOW_EXT flag.
>>>>     * @ring_cookie: RX ring/queue index to deliver to, or 
>>>> %RX_CLS_FLOW_DISC
>>>>     *    if packets should be discarded
>>>> - * @location: Index of filter in hardware table
>>>> + * @location: Index of filter in hardware table, or 
>>>> %RX_CLS_FLOW_FIRST_LOC for
>>>> + *    first available index, or %RX_CLS_FLOW_LAST_LOC for last 
>>>> available
>>> [...]
>>>
>>> I think that's reasonable.  We should also explicitly state that
>>> location determines priority, i.e. if a packet matches two filters then
>>> the one with the lower location wins.
>>
>> Easy and true for a TCAM.  For hashing would you use the location to 
>> decide how
>> to order filters that fall in the same bucket?
>>
> 
> The problem is none of this is backwards compatible.  The niu driver has 
> supported the network flow classifier rules since 2.6.30.  Adding this 
> would cause all rule setups for niu to fail because these locations 
> would have to exist outside of the current rule locations.

Looking at niu it already has a problem with its handling of location because 
it does

	u16 idx;

	idx = nfc->fs.location;

i.e., it disregards the upper 16 bits of location.  With the current code niu 
would return -EINVAL for the two new constants, which seems correct.

> 
> This is why I was suggesting that the best approach would be to update 
> the kernel to add a separate ioctl for letting the driver setup the 
> location.  We can just attempt to make that call and when we get the 
> EOPNOTSUPP errno we know the device driver doesn't support it and can 
> then let the rule manager take over.

The problem with this is the location is dependent on the type of filter being 
added.  I.e., the ioctl would need to get all the information the existing 
ioctl carries making the new ioctl a small superset of the current one.
Additionally, if the driver only allocates a location in a separate ioctl how 
does it know that the app is actually going to use it?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 18:05               ` Ben Hutchings
@ 2011-05-04 18:21                 ` Alexander Duyck
  2011-05-04 18:45                   ` Ben Hutchings
  2011-05-04 19:06                 ` Dimitris Michailidis
  1 sibling, 1 reply; 23+ messages in thread
From: Alexander Duyck @ 2011-05-04 18:21 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Dimitris Michailidis, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 5/4/2011 11:05 AM, Ben Hutchings wrote:
> On Wed, 2011-05-04 at 10:41 -0700, Alexander Duyck wrote:
>> On 5/4/2011 10:33 AM, Dimitris Michailidis wrote:
>>> On 05/04/2011 10:24 AM, Ben Hutchings wrote:
>>>> On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
>>>>> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
>>>>>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>>>>>>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where there is
>>>>>>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>>>>>>
>>>>>>>        if (loc == RX_CLS_LOC_UNSPEC)
>>>>>>>
>>>>>>> block above, let the driver pick a slot, and then pass the selected location
>>>>>>> back for ethtool to report.
>>>>>> But first we have to specify this in the ethtool API.  So please propose
>>>>>> a patch to ethtool.h.
>>>>> In the past we discussed that being able to specify the first available slot or
>>>>> the last available would be useful, so something like the below?
>>>>>
>>>>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>>>>> index 4194a20..909ef79 100644
>>>>> --- a/include/linux/ethtool.h
>>>>> +++ b/include/linux/ethtool.h
>>>>> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
>>>>>      *	includes the %FLOW_EXT flag.
>>>>>      * @ring_cookie: RX ring/queue index to deliver to, or %RX_CLS_FLOW_DISC
>>>>>      *	if packets should be discarded
>>>>> - * @location: Index of filter in hardware table
>>>>> + * @location: Index of filter in hardware table, or %RX_CLS_FLOW_FIRST_LOC for
>>>>> + *	first available index, or %RX_CLS_FLOW_LAST_LOC for last available
>>>> [...]
>>>>
>>>> I think that's reasonable.  We should also explicitly state that
>>>> location determines priority, i.e. if a packet matches two filters then
>>>> the one with the lower location wins.
>>>
>>> Easy and true for a TCAM.  For hashing would you use the location to decide how
>>> to order filters that fall in the same bucket?
>>>
>>
>> The problem is none of this is backwards compatible.  The niu driver has
>> supported the network flow classifier rules since 2.6.30.  Adding this
>> would cause all rule setups for niu to fail because these locations
>> would have to exist outside of the current rule locations.
>
> Those rule setups would have to be using some unofficial patched
> ethtool.  I don't think that should be a major concern for us.
> However...

No, what I am saying is that if we were to add those locations to the 
ETHTOOL_SRXCLSRLINS then niu would not work anymore as it would treat 
them as invalid locations.  This existing setup will at least work with 
the existing niu from what I can tell.  If we were to try adding rules 
with locations outside of actual allowable rules then we would likely 
receive an indication that we provided an invalid argument.

>> This is why I was suggesting that the best approach would be to update
>> the kernel to add a separate ioctl for letting the driver setup the
>> location.
>>
>> We can just attempt to make that call and when we get the
>> EOPNOTSUPP errno we know the device driver doesn't support it and can
>> then let the rule manager take over.
>
> How about having ETHTOOL_GRXCLSRLCNT set a flag in the 'data' field to
> indicate that the driver can assign locations?  (We would have to
> specify that for compatibility with older kernels the application must
> initialise this filed to 0.)
>
> rmgr_init() would then check for this flag.
>
> I hope someone is actually going to test this on niu, as it sounds like
> that will be the only driver using a TCAM... David?
>
> Ben.
>

Honestly what I would prefer to see is a seperate call added such as an 
ETHTOOL_GRSCLSRLLOC that we could pass the flow specifier to and perhaps 
include first/last location call in that and then let the driver return 
where it wants to drop the rule.  That way we can avoid having to create 
an overly complicated rule manager that can handle all the bizarre rule 
ordering options that I am sure all the different network devices support.

The only reason I am not implementing this now is because there aren't 
any drivers in place that would currently use it.  I figure once cxgb 
has a means in place of supporting flow classifier rules then Dimitris 
can add the necessary code to ethtool and the kernel to allow the driver 
to specify rule locations.  I would prefer to avoid adding features 
based on speculation of what will be needed and would like to be able to 
actually see how the features will be used.

Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 18:18               ` Dimitris Michailidis
@ 2011-05-04 18:35                 ` Alexander Duyck
  2011-05-04 18:50                   ` Dimitris Michailidis
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Duyck @ 2011-05-04 18:35 UTC (permalink / raw)
  To: Dimitris Michailidis
  Cc: Ben Hutchings, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 5/4/2011 11:18 AM, Dimitris Michailidis wrote:
> On 05/04/2011 10:41 AM, Alexander Duyck wrote:
>> On 5/4/2011 10:33 AM, Dimitris Michailidis wrote:
>>> On 05/04/2011 10:24 AM, Ben Hutchings wrote:
>>>> On Wed, 2011-05-04 at 10:09 -0700, Dimitris Michailidis wrote:
>>>>> On 05/03/2011 04:34 PM, Ben Hutchings wrote:
>>>>>> On Tue, 2011-05-03 at 16:23 -0700, Dimitris Michailidis wrote:
>>>>>>> I think RX_CLS_LOC_UNSPEC should be passed to the driver, where
>>>>>>> there is
>>>>>>> enough knowledge to pick an appropriate slot.  So I'd remove the
>>>>>>>
>>>>>>>        if (loc == RX_CLS_LOC_UNSPEC)
>>>>>>>
>>>>>>> block above, let the driver pick a slot, and then pass the
>>>>>>> selected location
>>>>>>> back for ethtool to report.
>>>>>> But first we have to specify this in the ethtool API.  So please
>>>>>> propose
>>>>>> a patch to ethtool.h.
>>>>> In the past we discussed that being able to specify the first
>>>>> available slot or
>>>>> the last available would be useful, so something like the below?
>>>>>
>>>>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>>>>> index 4194a20..909ef79 100644
>>>>> --- a/include/linux/ethtool.h
>>>>> +++ b/include/linux/ethtool.h
>>>>> @@ -442,7 +442,8 @@ struct ethtool_flow_ext {
>>>>>      *    includes the %FLOW_EXT flag.
>>>>>      * @ring_cookie: RX ring/queue index to deliver to, or
>>>>> %RX_CLS_FLOW_DISC
>>>>>      *    if packets should be discarded
>>>>> - * @location: Index of filter in hardware table
>>>>> + * @location: Index of filter in hardware table, or
>>>>> %RX_CLS_FLOW_FIRST_LOC for
>>>>> + *    first available index, or %RX_CLS_FLOW_LAST_LOC for last
>>>>> available
>>>> [...]
>>>>
>>>> I think that's reasonable.  We should also explicitly state that
>>>> location determines priority, i.e. if a packet matches two filters then
>>>> the one with the lower location wins.
>>>
>>> Easy and true for a TCAM.  For hashing would you use the location to
>>> decide how
>>> to order filters that fall in the same bucket?
>>>
>>
>> The problem is none of this is backwards compatible.  The niu driver has
>> supported the network flow classifier rules since 2.6.30.  Adding this
>> would cause all rule setups for niu to fail because these locations
>> would have to exist outside of the current rule locations.
>
> Looking at niu it already has a problem with its handling of location because
> it does
>
> 	u16 idx;
>
> 	idx = nfc->fs.location;
>
> i.e., it disregards the upper 16 bits of location.  With the current code niu
> would return -EINVAL for the two new constants, which seems correct.

Actually it looks like this is going to trigger multiple bugs. 
Specifically since the upper 16 bits are being discarded it is possible 
to specify a value such as 0x10001 hex and it will misread that as 
0x0001.  This is something that will probably need to be fixed in niu.

>>
>> This is why I was suggesting that the best approach would be to update
>> the kernel to add a separate ioctl for letting the driver setup the
>> location.  We can just attempt to make that call and when we get the
>> EOPNOTSUPP errno we know the device driver doesn't support it and can
>> then let the rule manager take over.
>
> The problem with this is the location is dependent on the type of filter being
> added.  I.e., the ioctl would need to get all the information the existing
> ioctl carries making the new ioctl a small superset of the current one.
> Additionally, if the driver only allocates a location in a separate ioctl how
> does it know that the app is actually going to use it?

It doesn't know that the application is actually going to use it.  What 
should happen is that the location should be verified by the driver when 
it is used in the rule insertion call.  After all it is fully possible 
for the user to specify a location out of range since the insert call 
does no validation in ethtool if the user specified the location.  That 
responsibility now lies with the driver.

Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 18:21                 ` Alexander Duyck
@ 2011-05-04 18:45                   ` Ben Hutchings
  2011-05-04 21:07                     ` Alexander Duyck
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Hutchings @ 2011-05-04 18:45 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Dimitris Michailidis, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On Wed, 2011-05-04 at 11:21 -0700, Alexander Duyck wrote:
[...]
> Honestly what I would prefer to see is a seperate call added such as an 
> ETHTOOL_GRSCLSRLLOC that we could pass the flow specifier to and perhaps 
> include first/last location call in that and then let the driver return 
> where it wants to drop the rule.

This must not be done as a separate operation because it's racy (in fact
that's an inherent problem with the rule manager).  In the sfc driver
(and probably others in future) filters could be inserted for RFS at any
time.

> That way we can avoid having to create 
> an overly complicated rule manager that can handle all the bizarre rule 
> ordering options that I am sure all the different network devices support.

Right, the rule manager can't implement that.

> The only reason I am not implementing this now is because there aren't 
> any drivers in place that would currently use it.  I figure once cxgb 
> has a means in place of supporting flow classifier rules then Dimitris 
> can add the necessary code to ethtool and the kernel to allow the driver 
> to specify rule locations.  I would prefer to avoid adding features 
> based on speculation of what will be needed and would like to be able to 
> actually see how the features will be used.

If you are going to implement the same interface in ixgbe as in niu
(modulo bugs), then I have no objection to going ahead with this and
then adding the option for driver-assigned locations later.

Please can you confirm that the location specified for
ETHTOOL_SRXCLSRLINS will indeed be used as a priority in case of
overlapping filters?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 18:35                 ` Alexander Duyck
@ 2011-05-04 18:50                   ` Dimitris Michailidis
  0 siblings, 0 replies; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-04 18:50 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Ben Hutchings, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 05/04/2011 11:35 AM, Alexander Duyck wrote:
> On 5/4/2011 11:18 AM, Dimitris Michailidis wrote:
>> On 05/04/2011 10:41 AM, Alexander Duyck wrote:
>>> This is why I was suggesting that the best approach would be to update
>>> the kernel to add a separate ioctl for letting the driver setup the
>>> location.  We can just attempt to make that call and when we get the
>>> EOPNOTSUPP errno we know the device driver doesn't support it and can
>>> then let the rule manager take over.
>>
>> The problem with this is the location is dependent on the type of 
>> filter being
>> added.  I.e., the ioctl would need to get all the information the 
>> existing
>> ioctl carries making the new ioctl a small superset of the current one.
>> Additionally, if the driver only allocates a location in a separate 
>> ioctl how
>> does it know that the app is actually going to use it?
> 
> It doesn't know that the application is actually going to use it.  What 
> should happen is that the location should be verified by the driver when 
> it is used in the rule insertion call.  After all it is fully possible 
> for the user to specify a location out of range since the insert call 
> does no validation in ethtool if the user specified the location.  That 
> responsibility now lies with the driver.

That's not the problem I was pointing out.  Of course the driver will verify 
the location it is given.  The problem is if you have a separate ioctl call 
that only reserves a location (and it needs to reserve otherwise several of 
these calls can get the same value), if you don't use it you leave the driver 
with orphan reservations.  Imagine you hit ^C between the two ioctls in ethtool.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 18:05               ` Ben Hutchings
  2011-05-04 18:21                 ` Alexander Duyck
@ 2011-05-04 19:06                 ` Dimitris Michailidis
  1 sibling, 0 replies; 23+ messages in thread
From: Dimitris Michailidis @ 2011-05-04 19:06 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Alexander Duyck, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 05/04/2011 11:05 AM, Ben Hutchings wrote:
> How about having ETHTOOL_GRXCLSRLCNT set a flag in the 'data' field to
> indicate that the driver can assign locations?  (We would have to
> specify that for compatibility with older kernels the application must
> initialise this filed to 0.)
> 
> rmgr_init() would then check for this flag.

I think this is a good suggestion if we want to support location selection by 
either the driver or ethtool.  I also think ethtool's assumption that it is the 
only allocator and can allocate race-free is fundamentally flawed (take two 
parallel ethtools).

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 18:45                   ` Ben Hutchings
@ 2011-05-04 21:07                     ` Alexander Duyck
  2011-05-04 21:54                       ` Ben Hutchings
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Duyck @ 2011-05-04 21:07 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Dimitris Michailidis, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On 5/4/2011 11:45 AM, Ben Hutchings wrote:
> On Wed, 2011-05-04 at 11:21 -0700, Alexander Duyck wrote:
> [...]
>> Honestly what I would prefer to see is a seperate call added such as an
>> ETHTOOL_GRSCLSRLLOC that we could pass the flow specifier to and perhaps
>> include first/last location call in that and then let the driver return
>> where it wants to drop the rule.
>
> This must not be done as a separate operation because it's racy (in fact
> that's an inherent problem with the rule manager).  In the sfc driver
> (and probably others in future) filters could be inserted for RFS at any
> time.
>
>> That way we can avoid having to create
>> an overly complicated rule manager that can handle all the bizarre rule
>> ordering options that I am sure all the different network devices support.
>
> Right, the rule manager can't implement that.
>
>> The only reason I am not implementing this now is because there aren't
>> any drivers in place that would currently use it.  I figure once cxgb
>> has a means in place of supporting flow classifier rules then Dimitris
>> can add the necessary code to ethtool and the kernel to allow the driver
>> to specify rule locations.  I would prefer to avoid adding features
>> based on speculation of what will be needed and would like to be able to
>> actually see how the features will be used.
>
> If you are going to implement the same interface in ixgbe as in niu
> (modulo bugs), then I have no objection to going ahead with this and
> then adding the option for driver-assigned locations later.
>
> Please can you confirm that the location specified for
> ETHTOOL_SRXCLSRLINS will indeed be used as a priority in case of
> overlapping filters?
>
> Ben.
>

The ixgbe approach should be nearly identical in terms of how the 
priorities are based on the location of the filters.  The original 
version from Santwona had the rule manager breaking the rules up into 7 
sections so that rules that specified fewer fields would be near the end 
of the list.  I'm pretty sure that was all due to priorities from what I 
could see in the niu driver since the filters that covered wider ranges 
were being made lower priority to be matched last.

In terms of overloading the get count call, that probably would be the 
best route in terms of changing rule manager behavior.  The only thing I 
am having a hard time seeing is how the rule manager would be able to 
distinguish between low priority and high priority filter rules, or is 
this something that new keywords would be added to the parser for?

I just put out version 6 of the patches.  Essentially I have reduced the 
size of the rule manager to being used only on insertion without any 
rule location specified.  The one thing to keep in mind with this rule 
manager is that the rule at table size - 1 is always going to be the 
lowest priority rule.  So if it was reserved for unspecified rules it 
would be easy to use something like that to achieve an "auto-select" 
location that the driver could then reassign as rules were added to it.

Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [ethtool PATCH 4/4] v5 Add RX packet classification interface
  2011-05-04 21:07                     ` Alexander Duyck
@ 2011-05-04 21:54                       ` Ben Hutchings
  0 siblings, 0 replies; 23+ messages in thread
From: Ben Hutchings @ 2011-05-04 21:54 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Dimitris Michailidis, davem@davemloft.net, Kirsher, Jeffrey T,
	netdev@vger.kernel.org

On Wed, 2011-05-04 at 14:07 -0700, Alexander Duyck wrote:
> On 5/4/2011 11:45 AM, Ben Hutchings wrote:
[...]
> > Please can you confirm that the location specified for
> > ETHTOOL_SRXCLSRLINS will indeed be used as a priority in case of
> > overlapping filters?
> >
> > Ben.
> >
> 
> The ixgbe approach should be nearly identical in terms of how the 
> priorities are based on the location of the filters.

OK, good.

> The original 
> version from Santwona had the rule manager breaking the rules up into 7 
> sections so that rules that specified fewer fields would be near the end 
> of the list.  I'm pretty sure that was all due to priorities from what I 
> could see in the niu driver since the filters that covered wider ranges 
> were being made lower priority to be matched last.

That would make sense.

> In terms of overloading the get count call, that probably would be the 
> best route in terms of changing rule manager behavior.  The only thing I 
> am having a hard time seeing is how the rule manager would be able to 
> distinguish between low priority and high priority filter rules, or is 
> this something that new keywords would be added to the parser for?

Right, there would have to be keywords to specify that.

> I just put out version 6 of the patches.  Essentially I have reduced the 
> size of the rule manager to being used only on insertion without any 
> rule location specified.  The one thing to keep in mind with this rule 
> manager is that the rule at table size - 1 is always going to be the 
> lowest priority rule.  So if it was reserved for unspecified rules it 
> would be easy to use something like that to achieve an "auto-select" 
> location that the driver could then reassign as rules were added to it.

I don't think any location value within the physical table size should
select this special behaviour.  The special location values for
auto-select (with whatever priority) should be distinct from all the
physical location values.

I still need to review your patches but it sounds like they will be
ready to apply.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2011-05-04 21:54 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-03 16:12 [ethtool PATCH 0/4] Add support for network flow classifier Alexander Duyck
2011-05-03 16:12 ` [ethtool PATCH 1/4] ethtool: remove strings based approach for displaying n-tuple Alexander Duyck
2011-05-03 16:12 ` [ethtool PATCH 2/4] Cleanup defines and header includes to address several issues Alexander Duyck
2011-05-03 16:12 ` [ethtool PATCH 3/4] Add support for __be64 and bitops, centralize several needed macros Alexander Duyck
2011-05-03 16:12 ` [ethtool PATCH 4/4] v5 Add RX packet classification interface Alexander Duyck
2011-05-03 23:23   ` Dimitris Michailidis
2011-05-03 23:34     ` Ben Hutchings
2011-05-04  0:29       ` Alexander Duyck
2011-05-04  1:35         ` Dimitris Michailidis
2011-05-04  3:10           ` Alexander Duyck
2011-05-04 17:09       ` Dimitris Michailidis
2011-05-04 17:24         ` Ben Hutchings
2011-05-04 17:33           ` Dimitris Michailidis
2011-05-04 17:41             ` Alexander Duyck
2011-05-04 18:05               ` Ben Hutchings
2011-05-04 18:21                 ` Alexander Duyck
2011-05-04 18:45                   ` Ben Hutchings
2011-05-04 21:07                     ` Alexander Duyck
2011-05-04 21:54                       ` Ben Hutchings
2011-05-04 19:06                 ` Dimitris Michailidis
2011-05-04 18:18               ` Dimitris Michailidis
2011-05-04 18:35                 ` Alexander Duyck
2011-05-04 18:50                   ` Dimitris Michailidis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).