Netdev List


* Re: [PATCH net-next v1 0/3] net: fec: add Wake-on-LAN support
From: Fabio Estevam @ 2014-12-31 19:22 UTC
  To: David Miller
  Cc: Duan Fugang-B38611, Shawn Guo, netdev@vger.kernel.org,
	Ben Hutchings, Stephen Hemminger
In-Reply-To: <20141231.142034.494966377580113808.davem@davemloft.net>

On Wed, Dec 31, 2014 at 5:20 PM, David Miller <davem@davemloft.net> wrote:
> From: Fabio Estevam <festevam@gmail.com>
> Date: Wed, 31 Dec 2014 17:03:28 -0200
>
>> ,but when we see your patches applied they appear with Nimrod Andy as
>> the author instead:
>> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=de40ed31b3c577cefd7b54972365a272ecbe9dd6
>>
>> It would be nice if you could use the From field to match the name in
>> the Signed-off tag.
>
> It's his business to setup things properly so they match, not
> mine.  I just take what is in the patch as-is.

Exactly. This is what I suggested he do.


* [net-next PATCH v1 00/11] A flow API
From: John Fastabend @ 2014-12-31 19:45 UTC
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy

So... I could continue to mull over this and tweak bits and pieces
here and there, but I decided it's best to get a wider group of folks
looking at it and, with any luck, using it, so here it is.

This set creates a new netlink family and set of messages to configure
flow tables in hardware. I tried to make the commit messages
reasonably verbose at least in the flow_table patches.

What we get at the end of this series is a working API to get device
capabilities and program flows using the rocker switch.

I created a user space tool, 'flow', that I use to configure and query
the devices. It is posted here:

	https://github.com/jrfastab/iprotue2-flow-tool

For now it is a stand-alone tool but once the kernel bits get sorted
out (I'm guessing there will need to be a few versions of this series
to get it right) I would like to port it into the iproute2 package.
This way we can keep all of our tooling in one package; see 'bridge'
for example.

As far as testing, I've tested various combinations of tables and
rules on the rocker switch and it seems to work. I have not tested
100% of the rocker code paths though. It would be great to get some
sort of automated framework around the API to do this. I don't
think this should gate the inclusion of the API though.

I could use some help reviewing,

  (a) error paths and netlink validation code paths

  (b) Breakdown of structures vs netlink attributes. I
      am trying to balance the flexibility given by having
      netlink TLV attributes against conciseness. So some
      things are passed as structures.

  (c) Are there any devices that have pipelines that we
      can't represent with this API? It would be good to
      know about these so we can design support for them
      in, probably in a future series.

For some examples and maybe a bit more illustrative description I
posted a quickly typed up set of notes on github io pages. Here we
can show the description along with images produced by the flow tool
showing the pipeline. Once we settle a bit more on the API we should
probably do a clean up of this and other threads happening and commit
something to the Documentation directory.

 http://jrfastab.github.io/jekyll/update/2014/12/21/flow-api.html

Finally I have more patches to add support for creating and destroying
tables. This allows users to define the pipeline at runtime rather
than statically as rocker does now. After this set gets some traction
I'll look at pushing them in a next round. However it likely requires
adding another "world" to rocker. Another piece that I want to add is
a description of the actions and metadata. This way user space can
"learn" what an action is and how metadata interacts with the system.
This work is under development.

Thanks! Any comments/feedback always welcome.

And also thanks to everyone who helped with this flow API so far. All
the folks at Dusseldorf LPC, OVS summit Santa Clara, P4 authors for
some inspiration, the collection of IETF ForCES documents I mulled
over, Netfilter workshop where I started to realize fixing ethtool
was most likely not going to work, etc.

---

John Fastabend (11):
      net: flow_table: create interface for hw match/action tables
      net: flow_table: add flow, delete flow
      net: flow_table: add apply action argument to tables
      rocker: add pipeline model for rocker switch
      net: rocker: add set flow rules
      net: rocker: add group_id slices and drop explicit goto
      net: rocker: add multicast path to bridging
      net: rocker: add get flow API operation
      net: rocker: add cookie to group acls and use flow_id to set cookie
      net: rocker: have flow api calls set cookie value
      net: rocker: implement delete flow routine

 drivers/net/ethernet/rocker/rocker.c          | 1641 +++++++++++++++++++++++++
 drivers/net/ethernet/rocker/rocker_pipeline.h |  793 ++++++++++++
 include/linux/if_flow.h                       |  115 ++
 include/linux/netdevice.h                     |   20 
 include/uapi/linux/if_flow.h                  |  413 ++++++
 net/Kconfig                                   |    7 
 net/core/Makefile                             |    1 
 net/core/flow_table.c                         | 1339 ++++++++++++++++++++
 8 files changed, 4312 insertions(+), 17 deletions(-)
 create mode 100644 drivers/net/ethernet/rocker/rocker_pipeline.h
 create mode 100644 include/linux/if_flow.h
 create mode 100644 include/uapi/linux/if_flow.h
 create mode 100644 net/core/flow_table.c

-- 
Signature


* [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: John Fastabend @ 2014-12-31 19:45 UTC
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Currently, we do not have an interface to query hardware and learn
the capabilities of the device. This makes it very difficult to use
hardware flow tables.

At the moment the only interface we have to work with hardware flow
tables is ethtool. This has many deficiencies. First, it is ioctl based,
making it difficult to use in systems that need to monitor interfaces
because there is no support for multicast, notifiers, etc.

The next big gap is that it doesn't support querying devices for
capabilities. The only way to learn what hardware entries are supported
is a "try and see" operation. An error may indicate that the device
cannot support your request, but it could also have other causes;
maybe a table is full, for example. The existing flow interface only
supports a single ingress table, which is sufficient for some of the
existing NIC host interfaces but limiting for more advanced NIC
interfaces and switch devices.

Also it is not extensible without recompiling both drivers and core
interfaces. It may be possible to reprogram a device with additional
header types, new protocols, whatever and it would be great if the
flow table infrastructure can handle this.

So rather than extending the ethtool flow classifier interface, this
patch creates a new flow table interface. It is expected that devices
that support the existing ethtool interface today can support both
interfaces without too much difficulty. As a proof point I tried this
on the ixgbe driver, choosing ixgbe only because I have an 82599 10Gbps
device in my development system. A more thorough implementation
was done for the rocker switch, showing how to use the interface.

In this patch we create interfaces to get the headers a device
supports, the actions it supports, a header graph showing the
relationship between headers the device supports, the tables
supported by the device and how they are connected.

This patch _only_ provides the get routines in an attempt to
make the patch sequence manageable.

get_headers :

   report a set of headers/fields the device supports. These
   are specified as length/offsets so we can support standard
   protocols or vendor specific headers. This is more flexible
   than bitmasks of pre-defined packet types. In 'tc' for example
   I may use u32 to match on proprietary or vendor specific fields.
   A bitmask approach does not allow for this, but defining the
   fields as a set of offsets and lengths allows for this.

   A device that supports OpenFlow version 1.x for example could
   provide the set of field/offsets that are equivalent to the
   specification.

   One property of this type of interface is I don't have to
   rebuild my kernel/driver header interfaces, etc to support the
   latest and greatest trendy protocol foo.

   For some types of metadata the device understands we also
   use header fields to represent these. One example of this is
   we may have an ingress_port metadata field to report the
   port a packet was received on. At the moment we expect the
   metadata fields to be defined outside the interface. We can
   standardize on common ones such as "ingress_port" across devices.

   Some examples of outside definitions specifying metadata
   might be OVS, internal definitions like skb->mark, or some
   ForCES definitions.
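
   As a rough illustration (not part of this patch), a header reported
   by 'get_headers' can be modelled in user space as a named list of
   fields with bit widths. The sketch below reuses the shape of struct
   net_flow_field from this patch; the VLAN field list, its uids and the
   helper field_bit_offset() are hypothetical, and it assumes fields are
   packed back to back in the order they are reported.

```c
#include <string.h>

#define NET_FLOW_NAMSIZ 80

/* Same shape as struct net_flow_field in this patch. */
struct net_flow_field {
	char name[NET_FLOW_NAMSIZ];
	int uid;
	int bitwidth;
};

/* Same shape as the kernel-side struct net_flow_header in this patch. */
struct net_flow_header {
	char name[NET_FLOW_NAMSIZ];
	int uid;
	int field_sz;
	struct net_flow_field *fields;
};

/* Hypothetical example data: an 802.1Q tag described field by field. */
struct net_flow_field vlan_fields[] = {
	{ "pcp", 1, 3 },
	{ "cfi", 2, 1 },
	{ "vid", 3, 12 },
	{ "ethertype", 4, 16 },
};

struct net_flow_header vlan_header = { "vlan", 1, 4, vlan_fields };

/*
 * Hypothetical helper: return the bit offset of the field called 'name',
 * assuming fields are laid out contiguously in array order, or -1 if the
 * header does not report such a field.
 */
int field_bit_offset(const struct net_flow_header *h, const char *name)
{
	int i, off = 0;

	for (i = 0; i < h->field_sz; i++) {
		if (strcmp(h->fields[i].name, name) == 0)
			return off;
		off += h->fields[i].bitwidth;
	}
	return -1;
}
```

   With this layout a vendor specific header is just another table of
   names and widths, so nothing needs to be recompiled to describe it.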

get_header_graph :

   Simply providing a header/field offset I support is not sufficient
   to learn how many nested 802.1Q tags I can support and other
   similar cases where the ordering of headers matters.

   So we use this operation to query the device for a header
   graph showing how the headers need to be related.
   With this operation and the 'get_headers' operation you can
   interrogate the driver with questions like "do you support
   Q'in'Q?", "how many VLAN tags can I nest before the parser
   breaks?", "Do you support MPLS?", "How about Foo Header in
   a VXLAN tunnel?".
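
   To make the "how many VLAN tags can I nest" question concrete, here
   is a small user space sketch, assuming the header graph is acyclic
   and has already been decoded into an array. struct hdr_node is a
   simplified stand-in for struct net_flow_hdr_node (one header per
   node, successors as a uid list); the example graph, the HDR_* uids
   and max_nested() are hypothetical.

```c
#include <stddef.h>

#define NET_FLOW_JUMP_TABLE_DONE -1
#define MAX_PARSE_DEPTH 16	/* hypothetical guard against cyclic graphs */

enum { HDR_ETHERNET = 1, HDR_VLAN, HDR_IPV4 };	/* made-up header uids */

/* Simplified stand-in for struct net_flow_hdr_node. */
struct hdr_node {
	int uid;
	int hdr;		/* uid of the header parsed at this node */
	const int *next;	/* successor node uids, DONE terminated */
};

/* Hypothetical parser: ethernet, up to two VLAN tags, then IPv4. */
const int eth_next[]   = { 2, 4, NET_FLOW_JUMP_TABLE_DONE };
const int vlan0_next[] = { 3, 4, NET_FLOW_JUMP_TABLE_DONE };
const int vlan1_next[] = { 4, NET_FLOW_JUMP_TABLE_DONE };
const int ipv4_next[]  = { NET_FLOW_JUMP_TABLE_DONE };

const struct hdr_node example_hdr_graph[] = {
	{ 1, HDR_ETHERNET, eth_next },
	{ 2, HDR_VLAN, vlan0_next },
	{ 3, HDR_VLAN, vlan1_next },
	{ 4, HDR_IPV4, ipv4_next },
};

const struct hdr_node *find_hdr_node(const struct hdr_node *g, int n, int uid)
{
	int i;

	for (i = 0; i < n; i++)
		if (g[i].uid == uid)
			return &g[i];
	return NULL;
}

/*
 * Largest number of nodes parsing header 'hdr' on any path that starts
 * at node 'uid'. Unknown uids and overly deep paths count as zero.
 */
int max_nested(const struct hdr_node *g, int n, int uid, int hdr, int depth)
{
	const struct hdr_node *node = find_hdr_node(g, n, uid);
	int i, best = 0;

	if (!node || depth >= MAX_PARSE_DEPTH)
		return 0;

	for (i = 0; node->next[i] != NET_FLOW_JUMP_TABLE_DONE; i++) {
		int c = max_nested(g, n, node->next[i], hdr, depth + 1);

		if (c > best)
			best = c;
	}
	return best + (node->hdr == hdr ? 1 : 0);
}
```

   Calling max_nested(example_hdr_graph, 4, 1, HDR_VLAN, 0) on this
   example graph answers the Q-in-Q question for the made-up device.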

get_actions :

   Report a list of actions supported by the device along with the
   arguments they take. So the "drop_packet" action takes no arguments
   and the "set_field" action takes two arguments, a field and a value.

   This suffers again from being slightly opaque. Meaning if a device
   reports back action "foo_bar" with three arguments how do I as a
   consumer of this "know" what that action is? The easy thing to do
   is punt on it and say it should be described outside the driver
   somewhere. OVS for example defines a set of actions. If my ForCES
   quick read is correct they define actions using text in the
   messaging interface. A follow up patch series could use a
   description language to describe actions. Possibly using something
   from eBPF or nftables for example. This patch will not try to
   solve the issue now and expects actions to be defined outside the
   API or to be well known.
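
   A minimal sketch of what an action signature looks like to a
   consumer, assuming the argument list is terminated by an entry of
   type NET_FLOW_ACTION_ARG_TYPE_NULL as in the netlink put routine of
   this patch. The structs are simplified stand-ins (the value union is
   dropped), and the uids, argument types and action_arg_count() are
   hypothetical.

```c
#include <stddef.h>

#define NET_FLOW_NAMSIZ 80

/* Argument types as defined by enum net_flow_action_arg_type here. */
enum net_flow_action_arg_type {
	NET_FLOW_ACTION_ARG_TYPE_NULL,
	NET_FLOW_ACTION_ARG_TYPE_U8,
	NET_FLOW_ACTION_ARG_TYPE_U16,
	NET_FLOW_ACTION_ARG_TYPE_U32,
	NET_FLOW_ACTION_ARG_TYPE_U64,
};

/* Simplified stand-in for struct net_flow_action_arg (no value union). */
struct action_arg {
	char name[NET_FLOW_NAMSIZ];
	enum net_flow_action_arg_type type;
};

/* Simplified stand-in for struct net_flow_action. */
struct action {
	char name[NET_FLOW_NAMSIZ];
	int uid;
	const struct action_arg *args;	/* NULL when there are no arguments */
};

/* Hypothetical signatures; the uids and argument types are made up. */
const struct action_arg set_field_args[] = {
	{ "field", NET_FLOW_ACTION_ARG_TYPE_U32 },
	{ "value", NET_FLOW_ACTION_ARG_TYPE_U64 },
	{ "", NET_FLOW_ACTION_ARG_TYPE_NULL },	/* terminator */
};

const struct action drop_packet = { "drop_packet", 1, NULL };
const struct action set_field = { "set_field", 2, set_field_args };

/* Count the arguments in an action signature. */
int action_arg_count(const struct action *a)
{
	int n = 0;

	if (!a->args)
		return 0;
	while (a->args[n].type != NET_FLOW_ACTION_ARG_TYPE_NULL)
		n++;
	return n;
}
```

   The signature tells a consumer how many arguments to supply and of
   what width, but not what the action means; that part stays external.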

get_tables :

   Hardware may support one or more tables. Each table supports a set
   of matches and a set of actions. The match fields supported are
   defined above by the 'get_headers' operations. Similarly the actions
   supported are defined by the 'get_actions' operation.

   This allows the hardware to report several tables all with distinct
   capabilities. Tables also have table attributes used to describe
   features of the table. Because netlink messages are TLV based we
   can easily add new table attributes as needed.

   Currently a table has two attributes, size and source. The size
   indicates how many "slots" are in the table for flow entries. One
   caveat here is a rule in the flow table may consume multiple slots
   in the table. We deal with this in a subsequent patch.

   The source field is used to indicate table boundaries where actions
   are applied. A table will not "see" actions from other tables that
   have the same source value. An example where this is relevant is an
   action that rewrites the destination IP address of a packet. If you
   have a match rule in a table with the same source that matches on
   the new IP address, it will not be hit. However, if the rule is in a
   table with a different source value _and_ that table gets applied
   afterwards, the rule will be hit. See the next operation for
   querying table ordering.

   Some basic hardware may only support a single table which simplifies
   some things. But even the simple 10/40Gbps NICs support multiple
   tables and different tables depending on ingress/egress.
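
   The source rule can be written down as a one line predicate. This is
   only a sketch of the semantics described above: struct flow_table is
   a simplified stand-in for struct net_flow_table, the three example
   tables are made up, and sees_rewrites() is a hypothetical helper that
   assumes the second table really is applied after the first.

```c
/* Simplified stand-in for struct net_flow_table. */
struct flow_table {
	int uid;
	int source;
	int size;	/* number of slots, or -1 for unbounded */
};

/* Hypothetical tables: two ACL tables share a source, routing does not. */
const struct flow_table acl_a   = { 1, 1, 1024 };
const struct flow_table acl_b   = { 2, 1, 1024 };
const struct flow_table routing = { 3, 2, 512 };

/*
 * Returns 1 if a match rule in 'later' is evaluated against the packet
 * as modified by actions from 'earlier', 0 if it still sees the packet
 * as it was before those actions. Assumes 'later' is applied after
 * 'earlier'; that ordering comes from the table graph, not from here.
 */
int sees_rewrites(const struct flow_table *earlier,
		  const struct flow_table *later)
{
	return earlier->source != later->source;
}
```

   In the destination IP example, a rule in acl_b matching the new
   address would not be hit after a rewrite in acl_a, while the same
   rule in routing would.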

get_table_graph :

   When a device supports multiple tables we need to identify how the
   tables are connected and when each table is executed.

   To do this we provide a table graph which gives the pipeline of the
   device. The graph gives nodes representing each table and the edges
   indicate the criteria to progress to the next flow table. There are
   examples of this type of thing in both ForCES and OVS. OVS
   prescribes a set of tables reachable with goto actions and ForCES a
   slightly more flexible arrangement. In software, tc's u32 classifier
   allows "linking" hash tables together. The OVS dataplane with the
   support of 'goto' action is completely connected. Without the
   'goto' action the tables are progressed linearly.

   By querying the graph from hardware we can "learn" what table flows
   are supported and map them into software.

   We also provide a bit to indicate if the node is a root node of the
   ingress pipeline or egress pipeline. This is used on devices that
   have different pipelines for ingress and egress. This appears to be
   fairly common for devices. The Realtek chip presented at LPC in
   Dusseldorf for example appeared to have a separate ingress/egress
   pipeline.
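
   A user space consumer could walk the table graph roughly as follows.
   This is a sketch only: struct tbl_node simplifies struct
   net_flow_tbl_node by replacing the jump table with a plain list of
   successor uids, the example pipeline is invented, and root_table()
   and table_reachable() are hypothetical helpers. The two root flags
   use the values defined in this patch.

```c
#include <stddef.h>

#define NET_FLOW_TABLE_EGRESS_ROOT  1
#define NET_FLOW_TABLE_INGRESS_ROOT 2
#define NET_FLOW_JUMP_TABLE_DONE    -1
#define MAX_PIPELINE_DEPTH 16	/* hypothetical guard against cycles */

/* Simplified stand-in for struct net_flow_tbl_node. */
struct tbl_node {
	int uid;
	unsigned int flags;
	const int *next;	/* successor table uids, DONE terminated */
};

/* Hypothetical pipeline: vlan(1) -> bridge(2) -> acl(3); egress(4). */
const int vlan_tbl_next[]   = { 2, 3, NET_FLOW_JUMP_TABLE_DONE };
const int bridge_tbl_next[] = { 3, NET_FLOW_JUMP_TABLE_DONE };
const int leaf_tbl_next[]   = { NET_FLOW_JUMP_TABLE_DONE };

const struct tbl_node example_tbl_graph[] = {
	{ 1, NET_FLOW_TABLE_INGRESS_ROOT, vlan_tbl_next },
	{ 2, 0, bridge_tbl_next },
	{ 3, 0, leaf_tbl_next },
	{ 4, NET_FLOW_TABLE_EGRESS_ROOT, leaf_tbl_next },
};

/* Return the uid of the first table carrying 'flag', or -1 if none. */
int root_table(const struct tbl_node *g, int n, unsigned int flag)
{
	int i;

	for (i = 0; i < n; i++)
		if (g[i].flags & flag)
			return g[i].uid;
	return -1;
}

/* Return 1 if table 'to' can be reached from table 'from'. */
int table_reachable(const struct tbl_node *g, int n, int from, int to,
		    int depth)
{
	const struct tbl_node *node = NULL;
	int i;

	if (from == to)
		return 1;
	if (depth >= MAX_PIPELINE_DEPTH)
		return 0;
	for (i = 0; i < n; i++)
		if (g[i].uid == from)
			node = &g[i];
	if (!node)
		return 0;
	for (i = 0; node->next[i] != NET_FLOW_JUMP_TABLE_DONE; i++)
		if (table_reachable(g, n, node->next[i], to, depth + 1))
			return 1;
	return 0;
}
```

   Software can use the same walk to reject a flow that targets a table
   the ingress pipeline can never reach.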

With these five operations software can learn what types of fields
the hardware flow table supports and how they are arranged. Subsequent
patches will address programming the flow tables.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 include/linux/if_flow.h      |   93 +++++
 include/linux/netdevice.h    |   12 +
 include/uapi/linux/if_flow.h |  363 ++++++++++++++++++
 net/Kconfig                  |    7 
 net/core/Makefile            |    1 
 net/core/flow_table.c        |  837 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1313 insertions(+)
 create mode 100644 include/linux/if_flow.h
 create mode 100644 include/uapi/linux/if_flow.h
 create mode 100644 net/core/flow_table.c

diff --git a/include/linux/if_flow.h b/include/linux/if_flow.h
new file mode 100644
index 0000000..1b6c1ea
--- /dev/null
+++ b/include/linux/if_flow.h
@@ -0,0 +1,93 @@
+/*
+ * include/linux/if_flow.h - Flow table interface for Switch devices
+ * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com>
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Author: John Fastabend <john.r.fastabend@intel.com>
+ */
+
+#ifndef _IF_FLOW_H
+#define _IF_FLOW_H
+
+#include <uapi/linux/if_flow.h>
+
+/**
+ * @struct net_flow_header
+ * @brief defines a match (header/field) an endpoint can use
+ *
+ * @uid unique identifier for header
+ * @field_sz number of fields in the set
+ * @fields the set of fields in the net_flow_header
+ */
+struct net_flow_header {
+	char name[NET_FLOW_NAMSIZ];
+	int uid;
+	int field_sz;
+	struct net_flow_field *fields;
+};
+
+/**
+ * @struct net_flow_action
+ * @brief a description of an endpoint defined action
+ *
+ * @name printable name
+ * @uid unique action identifier
+ * @args NET_FLOW_ACTION_ARG_TYPE_NULL terminated list of action arguments
+ */
+struct net_flow_action {
+	char name[NET_FLOW_NAMSIZ];
+	int uid;
+	struct net_flow_action_arg *args;
+};
+
+/**
+ * @struct net_flow_table
+ * @brief define flow table with supported match/actions
+ *
+ * @uid unique identifier for table
+ * @source uid of parent table
+ * @size max number of entries for table or -1 for unbounded
+ * @matches null terminated set of supported match types given by match uid
+ * @actions null terminated set of supported action types given by action uid
+ * @flows set of flows
+ */
+struct net_flow_table {
+	char name[NET_FLOW_NAMSIZ];
+	int uid;
+	int source;
+	int size;
+	struct net_flow_field_ref *matches;
+	int *actions;
+};
+
+/* net_flow_hdr_node: node in a header graph of header fields.
+ *
+ * @uid : unique id of the graph node
+ * @hdrs : identify the hdrs that can be handled by this node
+ * @jump : give a case jump statement
+ */
+struct net_flow_hdr_node {
+	char name[NET_FLOW_NAMSIZ];
+	int uid;
+	int *hdrs;
+	struct net_flow_jump_table *jump;
+};
+
+struct net_flow_tbl_node {
+	int uid;
+	__u32 flags;
+	struct net_flow_jump_table *jump;
+};
+#endif
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 29c92ee..3c3c856 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,11 @@
 #include <linux/neighbour.h>
 #include <uapi/linux/netdevice.h>
 
+#ifdef CONFIG_NET_FLOW_TABLES
+#include <linux/if_flow.h>
+#include <uapi/linux/if_flow.h>
+#endif
+
 struct netpoll_info;
 struct device;
 struct phy_device;
@@ -1186,6 +1191,13 @@ struct net_device_ops {
 	int			(*ndo_switch_port_stp_update)(struct net_device *dev,
 							      u8 state);
 #endif
+#ifdef CONFIG_NET_FLOW_TABLES
+	struct net_flow_action  **(*ndo_flow_get_actions)(struct net_device *dev);
+	struct net_flow_table	**(*ndo_flow_get_tables)(struct net_device *dev);
+	struct net_flow_header	**(*ndo_flow_get_headers)(struct net_device *dev);
+	struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev);
+	struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct net_device *dev);
+#endif
 };
 
 /**
diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h
new file mode 100644
index 0000000..2acdb38
--- /dev/null
+++ b/include/uapi/linux/if_flow.h
@@ -0,0 +1,363 @@
+/*
+ * include/uapi/linux/if_flow.h - Flow table interface for Switch devices
+ * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com>
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Author: John Fastabend <john.r.fastabend@intel.com>
+ */
+
+/* Netlink description:
+ *
+ * Table definition used to describe running tables. The following
+ * describes the netlink message returned from flow API messages.
+ *
+ * Flow table definitions used to define tables.
+ *
+ * [NET_FLOW_TABLE_IDENTIFIER_TYPE]
+ * [NET_FLOW_TABLE_IDENTIFIER]
+ * [NET_FLOW_TABLE_TABLES]
+ *     [NET_FLOW_TABLE]
+ *       [NET_FLOW_TABLE_ATTR_NAME]
+ *       [NET_FLOW_TABLE_ATTR_UID]
+ *       [NET_FLOW_TABLE_ATTR_SOURCE]
+ *       [NET_FLOW_TABLE_ATTR_SIZE]
+ *	 [NET_FLOW_TABLE_ATTR_MATCHES]
+ *	   [NET_FLOW_FIELD_REF]
+ *	   [NET_FLOW_FIELD_REF]
+ *	     [...]
+ *	   [...]
+ *	 [NET_FLOW_TABLE_ATTR_ACTIONS]
+ *	   [NET_FLOW_ACTION]
+ *	     [NET_FLOW_ACTION_ATTR_NAME]
+ *	     [NET_FLOW_ACTION_ATTR_UID]
+ *	     [NET_FLOW_ACTION_ATTR_SIGNATURE]
+ *		 [NET_FLOW_ACTION_ARG]
+ *	         [NET_FLOW_ACTION_ARG]
+ *	         [...]
+ *	   [NET_FLOW_ACTION]
+ *	     [...]
+ *	   [...]
+ *     [NET_FLOW_TABLE]
+ *       [...]
+ *
+ * Header definitions used to define headers with user friendly
+ * names.
+ *
+ * [NET_FLOW_TABLE_HEADERS]
+ *   [NET_FLOW_HEADER]
+ *	[NET_FLOW_HEADER_ATTR_NAME]
+ *	[NET_FLOW_HEADER_ATTR_UID]
+ *	[NET_FLOW_HEADER_ATTR_FIELDS]
+ *	  [NET_FLOW_HEADER_ATTR_FIELD]
+ *	    [NET_FLOW_FIELD_ATTR_NAME]
+ *	    [NET_FLOW_FIELD_ATTR_UID]
+ *	    [NET_FLOW_FIELD_ATTR_BITWIDTH]
+ *	  [NET_FLOW_HEADER_ATTR_FIELD]
+ *	    [...]
+ *	  [...]
+ *   [NET_FLOW_HEADER]
+ *      [...]
+ *   [...]
+ *
+ * Action definitions supported by tables
+ *
+ * [NET_FLOW_TABLE_ACTIONS]
+ *   [NET_FLOW_TABLE_ATTR_ACTIONS]
+ *	[NET_FLOW_ACTION]
+ *	  [NET_FLOW_ACTION_ATTR_NAME]
+ *	  [NET_FLOW_ACTION_ATTR_UID]
+ *	  [NET_FLOW_ACTION_ATTR_SIGNATURE]
+ *		 [NET_FLOW_ACTION_ARG]
+ *	         [NET_FLOW_ACTION_ARG]
+ *               [...]
+ *	[NET_FLOW_ACTION]
+ *	     [...]
+ *
+ * Parser definition used to unambiguously define match headers.
+ *
+ * [NET_FLOW_TABLE_PARSE_GRAPH]
+ *
+ * Primitive Type descriptions
+ *
+ * Get Table Graph <Request> only requires msg preamble.
+ *
+ * Get Table Graph <Reply> description
+ *
+ * [NET_FLOW_TABLE_TABLE_GRAPH]
+ *   [TABLE_GRAPH_NODE]
+ *	[TABLE_GRAPH_NODE_UID]
+ *	[TABLE_GRAPH_NODE_JUMP]
+ *	  [NET_FLOW_JUMP_TABLE_ENTRY]
+ *	  [NET_FLOW_JUMP_TABLE_ENTRY]
+ *	    [...]
+ *   [TABLE_GRAPH_NODE]
+ *	[..]
+ */
+
+#ifndef _UAPI_LINUX_IF_FLOW
+#define _UAPI_LINUX_IF_FLOW
+
+#include <linux/types.h>
+#include <linux/netlink.h>
+#include <linux/if.h>
+
+#define NET_FLOW_NAMSIZ 80
+
+/**
+ * @struct net_flow_field
+ * @brief defines a field in a header
+ */
+struct net_flow_field {
+	char name[NET_FLOW_NAMSIZ];
+	int uid;
+	int bitwidth;
+};
+
+enum {
+	NET_FLOW_FIELD_UNSPEC,
+	NET_FLOW_FIELD,
+	__NET_FLOW_FIELD_MAX,
+};
+#define NET_FLOW_FIELD_MAX (__NET_FLOW_FIELD_MAX - 1)
+
+enum {
+	NET_FLOW_FIELD_ATTR_UNSPEC,
+	NET_FLOW_FIELD_ATTR_NAME,
+	NET_FLOW_FIELD_ATTR_UID,
+	NET_FLOW_FIELD_ATTR_BITWIDTH,
+	__NET_FLOW_FIELD_ATTR_MAX,
+};
+#define NET_FLOW_FIELD_ATTR_MAX (__NET_FLOW_FIELD_ATTR_MAX - 1)
+
+enum {
+	NET_FLOW_HEADER_UNSPEC,
+	NET_FLOW_HEADER,
+	__NET_FLOW_HEADER_MAX,
+};
+#define NET_FLOW_HEADER_MAX (__NET_FLOW_HEADER_MAX - 1)
+
+enum {
+	NET_FLOW_HEADER_ATTR_UNSPEC,
+	NET_FLOW_HEADER_ATTR_NAME,
+	NET_FLOW_HEADER_ATTR_UID,
+	NET_FLOW_HEADER_ATTR_FIELDS,
+	__NET_FLOW_HEADER_ATTR_MAX,
+};
+#define NET_FLOW_HEADER_ATTR_MAX (__NET_FLOW_HEADER_ATTR_MAX - 1)
+
+enum {
+	NET_FLOW_MASK_TYPE_UNSPEC,
+	NET_FLOW_MASK_TYPE_EXACT,
+	NET_FLOW_MASK_TYPE_LPM,
+};
+
+/**
+ * @struct net_flow_field_ref
+ * @brief uniquely identify field as header:field tuple
+ */
+struct net_flow_field_ref {
+	int instance;
+	int header;
+	int field;
+	int mask_type;
+	int type;
+	union {	/* Are these all the required data types */
+		__u8 value_u8;
+		__u16 value_u16;
+		__u32 value_u32;
+		__u64 value_u64;
+	};
+	union {	/* Are these all the required data types */
+		__u8 mask_u8;
+		__u16 mask_u16;
+		__u32 mask_u32;
+		__u64 mask_u64;
+	};
+};
+
+enum {
+	NET_FLOW_FIELD_REF_UNSPEC,
+	NET_FLOW_FIELD_REF,
+	__NET_FLOW_FIELD_REF_MAX,
+};
+#define NET_FLOW_FIELD_REF_MAX (__NET_FLOW_FIELD_REF_MAX - 1)
+
+enum {
+	NET_FLOW_FIELD_REF_ATTR_TYPE_UNSPEC,
+	NET_FLOW_FIELD_REF_ATTR_TYPE_U8,
+	NET_FLOW_FIELD_REF_ATTR_TYPE_U16,
+	NET_FLOW_FIELD_REF_ATTR_TYPE_U32,
+	NET_FLOW_FIELD_REF_ATTR_TYPE_U64,
+	/* Need more types for ether.addrs, ip.addrs, ... */
+};
+
+enum net_flow_action_arg_type {
+	NET_FLOW_ACTION_ARG_TYPE_NULL,
+	NET_FLOW_ACTION_ARG_TYPE_U8,
+	NET_FLOW_ACTION_ARG_TYPE_U16,
+	NET_FLOW_ACTION_ARG_TYPE_U32,
+	NET_FLOW_ACTION_ARG_TYPE_U64,
+	__NET_FLOW_ACTION_ARG_TYPE_VAL_MAX,
+};
+
+struct net_flow_action_arg {
+	char name[NET_FLOW_NAMSIZ];
+	enum net_flow_action_arg_type type;
+	union {
+		__u8  value_u8;
+		__u16 value_u16;
+		__u32 value_u32;
+		__u64 value_u64;
+	};
+};
+
+enum {
+	NET_FLOW_ACTION_ARG_UNSPEC,
+	NET_FLOW_ACTION_ARG,
+	__NET_FLOW_ACTION_ARG_MAX,
+};
+#define NET_FLOW_ACTION_ARG_MAX (__NET_FLOW_ACTION_ARG_MAX - 1)
+
+enum {
+	NET_FLOW_ACTION_UNSPEC,
+	NET_FLOW_ACTION,
+	__NET_FLOW_ACTION_MAX,
+};
+#define NET_FLOW_ACTION_MAX (__NET_FLOW_ACTION_MAX - 1)
+
+enum {
+	NET_FLOW_ACTION_ATTR_UNSPEC,
+	NET_FLOW_ACTION_ATTR_NAME,
+	NET_FLOW_ACTION_ATTR_UID,
+	NET_FLOW_ACTION_ATTR_SIGNATURE,
+	__NET_FLOW_ACTION_ATTR_MAX,
+};
+#define NET_FLOW_ACTION_ATTR_MAX (__NET_FLOW_ACTION_ATTR_MAX - 1)
+
+enum {
+	NET_FLOW_ACTION_SET_UNSPEC,
+	NET_FLOW_ACTION_SET_ACTIONS,
+	__NET_FLOW_ACTION_SET_MAX,
+};
+#define NET_FLOW_ACTION_SET_MAX (__NET_FLOW_ACTION_SET_MAX - 1)
+
+enum {
+	NET_FLOW_TABLE_UNSPEC,
+	NET_FLOW_TABLE,
+	__NET_FLOW_TABLE_MAX,
+};
+#define NET_FLOW_TABLE_MAX (__NET_FLOW_TABLE_MAX - 1)
+
+enum {
+	NET_FLOW_TABLE_ATTR_UNSPEC,
+	NET_FLOW_TABLE_ATTR_NAME,
+	NET_FLOW_TABLE_ATTR_UID,
+	NET_FLOW_TABLE_ATTR_SOURCE,
+	NET_FLOW_TABLE_ATTR_SIZE,
+	NET_FLOW_TABLE_ATTR_MATCHES,
+	NET_FLOW_TABLE_ATTR_ACTIONS,
+	__NET_FLOW_TABLE_ATTR_MAX,
+};
+#define NET_FLOW_TABLE_ATTR_MAX (__NET_FLOW_TABLE_ATTR_MAX - 1)
+
+struct net_flow_jump_table {
+	struct net_flow_field_ref field;
+	int node; /* <0 is a parser error */
+};
+
+#define NET_FLOW_JUMP_TABLE_DONE	-1
+
+enum {
+	NET_FLOW_JUMP_TABLE_ENTRY_UNSPEC,
+	NET_FLOW_JUMP_TABLE_ENTRY,
+	__NET_FLOW_JUMP_TABLE_ENTRY_MAX,
+};
+
+enum {
+	NET_FLOW_HEADER_NODE_HDRS_UNSPEC,
+	NET_FLOW_HEADER_NODE_HDRS_VALUE,
+	__NET_FLOW_HEADER_NODE_HDRS_MAX,
+};
+#define NET_FLOW_HEADER_NODE_HDRS_MAX (__NET_FLOW_HEADER_NODE_HDRS_MAX - 1)
+
+enum {
+	NET_FLOW_HEADER_NODE_UNSPEC,
+	NET_FLOW_HEADER_NODE_NAME,
+	NET_FLOW_HEADER_NODE_UID,
+	NET_FLOW_HEADER_NODE_HDRS,
+	NET_FLOW_HEADER_NODE_JUMP,
+	__NET_FLOW_HEADER_NODE_MAX,
+};
+#define NET_FLOW_HEADER_NODE_MAX (__NET_FLOW_HEADER_NODE_MAX - 1)
+
+enum {
+	NET_FLOW_HEADER_GRAPH_UNSPEC,
+	NET_FLOW_HEADER_GRAPH_NODE,
+	__NET_FLOW_HEADER_GRAPH_MAX,
+};
+#define NET_FLOW_HEADER_GRAPH_MAX (__NET_FLOW_HEADER_GRAPH_MAX - 1)
+
+#define NET_FLOW_TABLE_EGRESS_ROOT 1
+#define	NET_FLOW_TABLE_INGRESS_ROOT 2
+
+enum {
+	NET_FLOW_TABLE_GRAPH_NODE_UNSPEC,
+	NET_FLOW_TABLE_GRAPH_NODE_UID,
+	NET_FLOW_TABLE_GRAPH_NODE_FLAGS,
+	NET_FLOW_TABLE_GRAPH_NODE_JUMP,
+	__NET_FLOW_TABLE_GRAPH_NODE_MAX,
+};
+#define NET_FLOW_TABLE_GRAPH_NODE_MAX (__NET_FLOW_TABLE_GRAPH_NODE_MAX - 1)
+
+enum {
+	NET_FLOW_TABLE_GRAPH_UNSPEC,
+	NET_FLOW_TABLE_GRAPH_NODE,
+	__NET_FLOW_TABLE_GRAPH_MAX,
+};
+#define NET_FLOW_TABLE_GRAPH_MAX (__NET_FLOW_TABLE_GRAPH_MAX - 1)
+
+enum {
+	NET_FLOW_IDENTIFIER_IFINDEX, /* net_device ifindex */
+};
+
+enum {
+	NET_FLOW_UNSPEC,
+	NET_FLOW_IDENTIFIER_TYPE,
+	NET_FLOW_IDENTIFIER,
+
+	NET_FLOW_TABLES,
+	NET_FLOW_HEADERS,
+	NET_FLOW_ACTIONS,
+	NET_FLOW_HEADER_GRAPH,
+	NET_FLOW_TABLE_GRAPH,
+
+	__NET_FLOW_MAX,
+	NET_FLOW_MAX = (__NET_FLOW_MAX - 1),
+};
+
+enum {
+	NET_FLOW_TABLE_CMD_GET_TABLES,
+	NET_FLOW_TABLE_CMD_GET_HEADERS,
+	NET_FLOW_TABLE_CMD_GET_ACTIONS,
+	NET_FLOW_TABLE_CMD_GET_HDR_GRAPH,
+	NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH,
+
+	__NET_FLOW_CMD_MAX,
+	NET_FLOW_CMD_MAX = (__NET_FLOW_CMD_MAX - 1),
+};
+
+#define NET_FLOW_GENL_NAME "net_flow_table"
+#define NET_FLOW_GENL_VERSION 0x1
+#endif /* _UAPI_LINUX_IF_FLOW */
diff --git a/net/Kconfig b/net/Kconfig
index ff9ffc1..8380bfe 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -293,6 +293,13 @@ config NET_FLOW_LIMIT
 	  with many clients some protection against DoS by a single (spoofed)
 	  flow that greatly exceeds average workload.
 
+config NET_FLOW_TABLES
+	boolean "Support network flow tables"
+	---help---
+	This feature provides an interface for device drivers to report
+	flow tables and supported matches and actions. If you do not
+	want to support hardware offloads for flow tables, say N here.
+
 menu "Network testing"
 
 config NET_PKTGEN
diff --git a/net/core/Makefile b/net/core/Makefile
index 235e6c5..1eea785 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -23,3 +23,4 @@ obj-$(CONFIG_NETWORK_PHY_TIMESTAMPING) += timestamping.o
 obj-$(CONFIG_NET_PTP_CLASSIFY) += ptp_classifier.o
 obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o
 obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o
+obj-$(CONFIG_NET_FLOW_TABLES) += flow_table.o
diff --git a/net/core/flow_table.c b/net/core/flow_table.c
new file mode 100644
index 0000000..ec3f06d
--- /dev/null
+++ b/net/core/flow_table.c
@@ -0,0 +1,837 @@
+/*
+ * net/core/flow_table.c - Flow table interface for Switch devices
+ * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com>
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Author: John Fastabend <john.r.fastabend@intel.com>
+ */
+
+#include <uapi/linux/if_flow.h>
+#include <linux/if_flow.h>
+#include <linux/if_bridge.h>
+#include <linux/types.h>
+#include <net/netlink.h>
+#include <net/genetlink.h>
+#include <net/rtnetlink.h>
+#include <linux/module.h>
+
+static struct genl_family net_flow_nl_family = {
+	.id		= GENL_ID_GENERATE,
+	.name		= NET_FLOW_GENL_NAME,
+	.version	= NET_FLOW_GENL_VERSION,
+	.maxattr	= NET_FLOW_MAX,
+	.netnsok	= true,
+};
+
+static struct net_device *net_flow_get_dev(struct genl_info *info)
+{
+	struct net *net = genl_info_net(info);
+	int type, ifindex;
+
+	if (!info->attrs[NET_FLOW_IDENTIFIER_TYPE] ||
+	    !info->attrs[NET_FLOW_IDENTIFIER])
+		return NULL;
+
+	type = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER_TYPE]);
+	switch (type) {
+	case NET_FLOW_IDENTIFIER_IFINDEX:
+		ifindex = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER]);
+		break;
+	default:
+		return NULL;
+	}
+
+	return dev_get_by_index(net, ifindex);
+}
+
+static int net_flow_put_act_types(struct sk_buff *skb,
+				  struct net_flow_action_arg *args)
+{
+	int i, err;
+
+	for (i = 0; args[i].type; i++) {
+		err = nla_put(skb, NET_FLOW_ACTION_ARG,
+			      sizeof(struct net_flow_action_arg), &args[i]);
+		if (err)
+			return -EMSGSIZE;
+	}
+	return 0;
+}
+
+static const
+struct nla_policy net_flow_action_policy[NET_FLOW_ACTION_ATTR_MAX + 1] = {
+	[NET_FLOW_ACTION_ATTR_NAME]	 = {.type = NLA_STRING,
+					    .len = NET_FLOW_NAMSIZ-1 },
+	[NET_FLOW_ACTION_ATTR_UID]	 = {.type = NLA_U32 },
+	[NET_FLOW_ACTION_ATTR_SIGNATURE] = {.type = NLA_NESTED },
+};
+
+static int net_flow_put_action(struct sk_buff *skb, struct net_flow_action *a)
+{
+	struct net_flow_action_arg *this;
+	struct nlattr *nest;
+	int err, args = 0;
+
+	if (nla_put_string(skb, NET_FLOW_ACTION_ATTR_NAME, a->name))
+		return -EMSGSIZE;
+
+	if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid))
+		return -EMSGSIZE;
+
+	if (!a->args)
+		return 0;
+
+	for (this = &a->args[0]; strlen(this->name) > 0; this++)
+		args++;
+
+	if (args) {
+		nest = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
+		if (!nest)
+			goto nest_put_failure;
+
+		err = net_flow_put_act_types(skb, a->args);
+		if (err) {
+			nla_nest_cancel(skb, nest);
+			return err;
+		}
+		nla_nest_end(skb, nest);
+	}
+
+	return 0;
+nest_put_failure:
+	return -EMSGSIZE;
+}
+
+static int net_flow_put_actions(struct sk_buff *skb,
+				struct net_flow_action **acts)
+{
+	struct nlattr *actions;
+	int err, i;
+
+	actions = nla_nest_start(skb, NET_FLOW_ACTIONS);
+	if (!actions)
+		return -EMSGSIZE;
+
+	for (i = 0; acts[i]->uid; i++) {
+		struct nlattr *action = nla_nest_start(skb, NET_FLOW_ACTION);
+
+		if (!action)
+			goto action_put_failure;
+
+		err = net_flow_put_action(skb, acts[i]);
+		if (err)
+			goto action_put_failure;
+		nla_nest_end(skb, action);
+	}
+	nla_nest_end(skb, actions);
+
+	return 0;
+action_put_failure:
+	nla_nest_cancel(skb, actions);
+	return -EMSGSIZE;
+}
+
+struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a,
+					   struct net_device *dev,
+					   u32 portid, int seq, u8 cmd)
+{
+	struct genlmsghdr *hdr;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-ENOBUFS);
+
+	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!hdr)
+		goto out;
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
+		err = -ENOBUFS;
+		goto out;
+	}
+
+	err = net_flow_put_actions(skb, a);
+	if (err < 0)
+		goto out;
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0)
+		goto out;
+
+	return skb;
+out:
+	nlmsg_free(skb);
+	return ERR_PTR(err);
+}
+
+static int net_flow_cmd_get_actions(struct sk_buff *skb,
+				    struct genl_info *info)
+{
+	struct net_flow_action **a;
+	struct net_device *dev;
+	struct sk_buff *msg;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_get_actions) {
+		dev_put(dev);
+		return -EOPNOTSUPP;
+	}
+
+	a = dev->netdev_ops->ndo_flow_get_actions(dev);
+	if (!a)
+		return -EBUSY;
+
+	msg = net_flow_build_actions_msg(a, dev,
+					 info->snd_portid,
+					 info->snd_seq,
+					 NET_FLOW_TABLE_CMD_GET_ACTIONS);
+	dev_put(dev);
+
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	return genlmsg_reply(msg, info);
+}
+
+static int net_flow_put_table(struct net_device *dev,
+			      struct sk_buff *skb,
+			      struct net_flow_table *t)
+{
+	struct nlattr *matches, *actions;
+	int i;
+
+	if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) ||
+	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) ||
+	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) ||
+	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size))
+		return -EMSGSIZE;
+
+	matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES);
+	if (!matches)
+		return -EMSGSIZE;
+
+	for (i = 0; t->matches[i].instance; i++)
+		nla_put(skb, NET_FLOW_FIELD_REF,
+			sizeof(struct net_flow_field_ref),
+			&t->matches[i]);
+	nla_nest_end(skb, matches);
+
+	actions = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_ACTIONS);
+	if (!actions)
+		return -EMSGSIZE;
+
+	for (i = 0; t->actions[i]; i++) {
+		if (nla_put_u32(skb,
+				NET_FLOW_ACTION_ATTR_UID,
+				t->actions[i])) {
+			nla_nest_cancel(skb, actions);
+			return -EMSGSIZE;
+		}
+	}
+	nla_nest_end(skb, actions);
+
+	return 0;
+}
+
+static int net_flow_put_tables(struct net_device *dev,
+			       struct sk_buff *skb,
+			       struct net_flow_table **tables)
+{
+	struct nlattr *nest, *t;
+	int i, err = 0;
+
+	nest = nla_nest_start(skb, NET_FLOW_TABLES);
+	if (!nest)
+		return -EMSGSIZE;
+
+	for (i = 0; tables[i]->uid; i++) {
+		t = nla_nest_start(skb, NET_FLOW_TABLE);
+		if (!t) {
+			err = -EMSGSIZE;
+			goto errout;
+		}
+
+		err = net_flow_put_table(dev, skb, tables[i]);
+		if (err) {
+			nla_nest_cancel(skb, t);
+			goto errout;
+		}
+		nla_nest_end(skb, t);
+	}
+	nla_nest_end(skb, nest);
+	return 0;
+errout:
+	nla_nest_cancel(skb, nest);
+	return err;
+}
+
+static struct sk_buff *net_flow_build_tables_msg(struct net_flow_table **t,
+						 struct net_device *dev,
+						 u32 portid, int seq, u8 cmd)
+{
+	struct genlmsghdr *hdr;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-ENOBUFS);
+
+	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!hdr)
+		goto out;
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
+		err = -ENOBUFS;
+		goto out;
+	}
+
+	err = net_flow_put_tables(dev, skb, t);
+	if (err < 0)
+		goto out;
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0)
+		goto out;
+
+	return skb;
+out:
+	nlmsg_free(skb);
+	return ERR_PTR(err);
+}
+
+static int net_flow_cmd_get_tables(struct sk_buff *skb,
+				   struct genl_info *info)
+{
+	struct net_flow_table **tables;
+	struct net_device *dev;
+	struct sk_buff *msg;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_get_tables) {
+		dev_put(dev);
+		return -EOPNOTSUPP;
+	}
+
+	tables = dev->netdev_ops->ndo_flow_get_tables(dev);
+	if (!tables) /* transient failure should always have some table */
+		return -EBUSY;
+
+	msg = net_flow_build_tables_msg(tables, dev,
+					info->snd_portid,
+					info->snd_seq,
+					NET_FLOW_TABLE_CMD_GET_TABLES);
+	dev_put(dev);
+
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	return genlmsg_reply(msg, info);
+}
+
+static
+int net_flow_put_fields(struct sk_buff *skb, const struct net_flow_header *h)
+{
+	struct net_flow_field *f;
+	int count = h->field_sz;
+	struct nlattr *field;
+
+	for (f = h->fields; count; count--, f++) {
+		field = nla_nest_start(skb, NET_FLOW_FIELD);
+		if (!field)
+			goto field_put_failure;
+
+		if (nla_put_string(skb, NET_FLOW_FIELD_ATTR_NAME, f->name) ||
+		    nla_put_u32(skb, NET_FLOW_FIELD_ATTR_UID, f->uid) ||
+		    nla_put_u32(skb, NET_FLOW_FIELD_ATTR_BITWIDTH, f->bitwidth))
+			goto out;
+
+		nla_nest_end(skb, field);
+	}
+
+	return 0;
+out:
+	nla_nest_cancel(skb, field);
+field_put_failure:
+	return -EMSGSIZE;
+}
+
+static int net_flow_put_headers(struct sk_buff *skb,
+				struct net_flow_header **headers)
+{
+	struct nlattr *nest, *hdr, *fields;
+	struct net_flow_header *h;
+	int i, err;
+
+	nest = nla_nest_start(skb, NET_FLOW_HEADERS);
+	if (!nest)
+		return -EMSGSIZE;
+
+	for (i = 0; headers[i]->uid; i++) {
+		err = -EMSGSIZE;
+		h = headers[i];
+
+		hdr = nla_nest_start(skb, NET_FLOW_HEADER);
+		if (!hdr)
+			goto hdr_put_failure;
+
+		if (nla_put_string(skb, NET_FLOW_HEADER_ATTR_NAME, h->name) ||
+		    nla_put_u32(skb, NET_FLOW_HEADER_ATTR_UID, h->uid))
+			goto attr_put_failure;
+
+		fields = nla_nest_start(skb, NET_FLOW_HEADER_ATTR_FIELDS);
+		if (!fields)
+			goto attr_put_failure;
+
+		err = net_flow_put_fields(skb, h);
+		if (err)
+			goto fields_put_failure;
+
+		nla_nest_end(skb, fields);
+
+		nla_nest_end(skb, hdr);
+	}
+	nla_nest_end(skb, nest);
+
+	return 0;
+fields_put_failure:
+	nla_nest_cancel(skb, fields);
+attr_put_failure:
+	nla_nest_cancel(skb, hdr);
+hdr_put_failure:
+	nla_nest_cancel(skb, nest);
+	return err;
+}
+
+static struct sk_buff *net_flow_build_headers_msg(struct net_flow_header **h,
+						  struct net_device *dev,
+						  u32 portid, int seq, u8 cmd)
+{
+	struct genlmsghdr *hdr;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-ENOBUFS);
+
+	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!hdr)
+		goto out;
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
+		err = -ENOBUFS;
+		goto out;
+	}
+
+	err = net_flow_put_headers(skb, h);
+	if (err < 0)
+		goto out;
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0)
+		goto out;
+
+	return skb;
+out:
+	nlmsg_free(skb);
+	return ERR_PTR(err);
+}
+
+static int net_flow_cmd_get_headers(struct sk_buff *skb,
+				    struct genl_info *info)
+{
+	struct net_flow_header **h;
+	struct net_device *dev;
+	struct sk_buff *msg;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_get_headers) {
+		dev_put(dev);
+		return -EOPNOTSUPP;
+	}
+
+	h = dev->netdev_ops->ndo_flow_get_headers(dev);
+	if (!h)
+		return -EBUSY;
+
+	msg = net_flow_build_headers_msg(h, dev,
+					 info->snd_portid,
+					 info->snd_seq,
+					 NET_FLOW_TABLE_CMD_GET_HEADERS);
+	dev_put(dev);
+
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	return genlmsg_reply(msg, info);
+}
+
+static int net_flow_put_header_node(struct sk_buff *skb,
+				    struct net_flow_hdr_node *node)
+{
+	struct nlattr *hdrs, *jumps;
+	int i, err;
+
+	if (nla_put_string(skb, NET_FLOW_HEADER_NODE_NAME, node->name) ||
+	    nla_put_u32(skb, NET_FLOW_HEADER_NODE_UID, node->uid))
+		return -EMSGSIZE;
+
+	/* Insert the set of headers that get extracted at this node */
+	hdrs = nla_nest_start(skb, NET_FLOW_HEADER_NODE_HDRS);
+	if (!hdrs)
+		return -EMSGSIZE;
+	for (i = 0; node->hdrs[i]; i++) {
+		if (nla_put_u32(skb, NET_FLOW_HEADER_NODE_HDRS_VALUE,
+				node->hdrs[i])) {
+			nla_nest_cancel(skb, hdrs);
+			return -EMSGSIZE;
+		}
+	}
+	nla_nest_end(skb, hdrs);
+
+	/* Then give the jump table to find next header node in graph */
+	jumps = nla_nest_start(skb, NET_FLOW_HEADER_NODE_JUMP);
+	if (!jumps)
+		return -EMSGSIZE;
+
+	for (i = 0; node->jump[i].node; i++) {
+		err = nla_put(skb, NET_FLOW_JUMP_TABLE_ENTRY,
+			      sizeof(struct net_flow_jump_table),
+			      &node->jump[i]);
+		if (err) {
+			nla_nest_cancel(skb, jumps);
+			return -EMSGSIZE;
+		}
+	}
+	nla_nest_end(skb, jumps);
+
+	return 0;
+}
+
+static int net_flow_put_header_graph(struct sk_buff *skb,
+				     struct net_flow_hdr_node **g)
+{
+	struct nlattr *nodes, *node;
+	int err, i;
+
+	nodes = nla_nest_start(skb, NET_FLOW_HEADER_GRAPH);
+	if (!nodes)
+		return -EMSGSIZE;
+
+	for (i = 0; g[i]->uid; i++) {
+		node = nla_nest_start(skb, NET_FLOW_HEADER_GRAPH_NODE);
+		if (!node) {
+			err = -EMSGSIZE;
+			goto nodes_put_error;
+		}
+
+		err = net_flow_put_header_node(skb, g[i]);
+		if (err)
+			goto node_put_error;
+
+		nla_nest_end(skb, node);
+	}
+
+	nla_nest_end(skb, nodes);
+	return 0;
+node_put_error:
+	nla_nest_cancel(skb, node);
+nodes_put_error:
+	nla_nest_cancel(skb, nodes);
+	return err;
+}
+
+static
+struct sk_buff *net_flow_build_header_graph_msg(struct net_flow_hdr_node **g,
+						struct net_device *dev,
+						u32 portid, int seq, u8 cmd)
+{
+	struct genlmsghdr *hdr;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-ENOBUFS);
+
+	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!hdr)
+		goto out;
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
+		err = -ENOBUFS;
+		goto out;
+	}
+
+	err = net_flow_put_header_graph(skb, g);
+	if (err < 0)
+		goto out;
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0)
+		goto out;
+
+	return skb;
+out:
+	nlmsg_free(skb);
+	return ERR_PTR(err);
+}
+
+static int net_flow_cmd_get_header_graph(struct sk_buff *skb,
+					 struct genl_info *info)
+{
+	struct net_flow_hdr_node **h;
+	struct net_device *dev;
+	struct sk_buff *msg;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_get_hdr_graph) {
+		dev_put(dev);
+		return -EOPNOTSUPP;
+	}
+
+	h = dev->netdev_ops->ndo_flow_get_hdr_graph(dev);
+	if (!h)
+		return -EBUSY;
+
+	msg = net_flow_build_header_graph_msg(h, dev,
+					      info->snd_portid,
+					      info->snd_seq,
+					      NET_FLOW_TABLE_CMD_GET_HDR_GRAPH);
+	dev_put(dev);
+
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	return genlmsg_reply(msg, info);
+}
+
+static int net_flow_put_table_node(struct sk_buff *skb,
+				   struct net_flow_tbl_node *node)
+{
+	struct nlattr *nest, *jump;
+	int i, err = -EMSGSIZE;
+
+	nest = nla_nest_start(skb, NET_FLOW_TABLE_GRAPH_NODE);
+	if (!nest)
+		return err;
+
+	if (nla_put_u32(skb, NET_FLOW_TABLE_GRAPH_NODE_UID, node->uid) ||
+	    nla_put_u32(skb, NET_FLOW_TABLE_GRAPH_NODE_FLAGS, node->flags))
+		goto node_put_failure;
+
+	jump = nla_nest_start(skb, NET_FLOW_TABLE_GRAPH_NODE_JUMP);
+	if (!jump)
+		goto node_put_failure;
+
+	for (i = 0; node->jump[i].node; i++) {
+		err = nla_put(skb, NET_FLOW_JUMP_TABLE_ENTRY,
+			      sizeof(struct net_flow_jump_table),
+			      &node->jump[i]);
+		if (err)
+			goto jump_put_failure;
+	}
+
+	nla_nest_end(skb, jump);
+	nla_nest_end(skb, nest);
+	return 0;
+jump_put_failure:
+	nla_nest_cancel(skb, jump);
+node_put_failure:
+	nla_nest_cancel(skb, nest);
+	return err;
+}
+
+static int net_flow_put_table_graph(struct sk_buff *skb,
+				    struct net_flow_tbl_node **nodes)
+{
+	struct nlattr *graph;
+	int err, i = 0;
+
+	graph = nla_nest_start(skb, NET_FLOW_TABLE_GRAPH);
+	if (!graph)
+		return -EMSGSIZE;
+
+	for (i = 0; nodes[i]->uid; i++) {
+		err = net_flow_put_table_node(skb, nodes[i]);
+		if (err) {
+			nla_nest_cancel(skb, graph);
+			return -EMSGSIZE;
+		}
+	}
+
+	nla_nest_end(skb, graph);
+	return 0;
+}
+
+static
+struct sk_buff *net_flow_build_graph_msg(struct net_flow_tbl_node **g,
+					 struct net_device *dev,
+					 u32 portid, int seq, u8 cmd)
+{
+	struct genlmsghdr *hdr;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-ENOBUFS);
+
+	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!hdr)
+		goto out;
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
+		err = -ENOBUFS;
+		goto out;
+	}
+
+	err = net_flow_put_table_graph(skb, g);
+	if (err < 0)
+		goto out;
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0)
+		goto out;
+
+	return skb;
+out:
+	nlmsg_free(skb);
+	return ERR_PTR(err);
+}
+
+static int net_flow_cmd_get_table_graph(struct sk_buff *skb,
+					struct genl_info *info)
+{
+	struct net_flow_tbl_node **g;
+	struct net_device *dev;
+	struct sk_buff *msg;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_get_tbl_graph) {
+		dev_put(dev);
+		return -EOPNOTSUPP;
+	}
+
+	g = dev->netdev_ops->ndo_flow_get_tbl_graph(dev);
+	if (!g)
+		return -EBUSY;
+
+	msg = net_flow_build_graph_msg(g, dev,
+				       info->snd_portid,
+				       info->snd_seq,
+				       NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH);
+	dev_put(dev);
+
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	return genlmsg_reply(msg, info);
+}
+
+static const struct nla_policy net_flow_cmd_policy[NET_FLOW_MAX + 1] = {
+	[NET_FLOW_IDENTIFIER_TYPE] = {.type = NLA_U32, },
+	[NET_FLOW_IDENTIFIER]	   = {.type = NLA_U32, },
+	[NET_FLOW_TABLES]	   = {.type = NLA_NESTED, },
+	[NET_FLOW_HEADERS]	   = {.type = NLA_NESTED, },
+	[NET_FLOW_ACTIONS]	   = {.type = NLA_NESTED, },
+	[NET_FLOW_HEADER_GRAPH]	   = {.type = NLA_NESTED, },
+	[NET_FLOW_TABLE_GRAPH]	   = {.type = NLA_NESTED, },
+};
+
+static const struct genl_ops net_flow_table_nl_ops[] = {
+	{
+		.cmd = NET_FLOW_TABLE_CMD_GET_TABLES,
+		.doit = net_flow_cmd_get_tables,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_GET_HEADERS,
+		.doit = net_flow_cmd_get_headers,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_GET_ACTIONS,
+		.doit = net_flow_cmd_get_actions,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_GET_HDR_GRAPH,
+		.doit = net_flow_cmd_get_header_graph,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH,
+		.doit = net_flow_cmd_get_table_graph,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+};
+
+static int __init net_flow_nl_module_init(void)
+{
+	return genl_register_family_with_ops(&net_flow_nl_family,
+					     net_flow_table_nl_ops);
+}
+
+static void net_flow_nl_module_fini(void)
+{
+	genl_unregister_family(&net_flow_nl_family);
+}
+
+module_init(net_flow_nl_module_init);
+module_exit(net_flow_nl_module_fini);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("John Fastabend <john.r.fastabend@intel.com>");
+MODULE_DESCRIPTION("Netlink interface to Flow Tables");
+MODULE_ALIAS_GENL_FAMILY(NET_FLOW_GENL_NAME);

* [net-next PATCH v1 02/11] net: flow_table: add flow, delete flow
From: John Fastabend @ 2014-12-31 19:46 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Now that the device capabilities are exposed we can add support to
add and delete flows from the tables.

The two operations are

table_set_flows :

  The set flows operation is used to program a set of flows into a
  hardware device table. The request arrives as a netlink encoded
  message, which is then decoded into a null terminated array of
  flow entry structures. A flow entry structure is defined as

     struct net_flow_flow {
             int table_id;
             int uid;
             int priority;
             struct net_flow_field_ref *matches;
             struct net_flow_action *actions;
     };

  The table id is the _uid_ returned from the 'get_tables' operation.
  Matches is a set of match criteria for packets. The criteria are
  combined with a logical AND, so a packet must satisfy every entry
  in the set. Actions provide a set of actions to perform when the
  flow rule is hit. Both matches and actions are null terminated
  arrays.
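
  As an illustration only, the null terminated convention can be
  sketched in user space C. The types below are simplified stand-ins,
  not the kernel structures: flow_field_ref, flow_action, flow_entry
  and the helper names are hypothetical, and the only detail taken from
  the patch is that a zero 'header' ends the matches array and a zero
  'uid' ends the actions array.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical). */
struct flow_field_ref {
	int header;		/* 0 terminates the matches array */
	int field;
	unsigned int value;
};

struct flow_action {
	int uid;		/* 0 terminates the actions array */
};

struct flow_entry {
	int table_id;
	int uid;
	int priority;
	struct flow_field_ref *matches;
	struct flow_action *actions;
};

/* Count matches up to, but not including, the zeroed terminator. */
static size_t flow_count_matches(const struct flow_entry *flow)
{
	size_t n = 0;

	assert(flow != NULL);
	if (!flow->matches)
		return 0;
	while (flow->matches[n].header)
		n++;
	return n;
}

/* Count actions up to, but not including, the zeroed terminator. */
static size_t flow_count_actions(const struct flow_entry *flow)
{
	size_t n = 0;

	assert(flow != NULL);
	if (!flow->actions)
		return 0;
	while (flow->actions[n].uid)
		n++;
	return n;
}

/* Build one example flow: two match criteria (ANDed), one action. */
static struct flow_entry flow_example(void)
{
	static struct flow_field_ref matches[] = {
		{ .header = 1, .field = 2, .value = 0x0800 },
		{ .header = 2, .field = 5, .value = 6 },
		{ 0 },	/* terminator */
	};
	static struct flow_action actions[] = {
		{ .uid = 3 },
		{ 0 },	/* terminator */
	};
	struct flow_entry flow = {
		.table_id = 1, .uid = 10, .priority = 1,
		.matches = matches, .actions = actions,
	};

	return flow;
}
```

  A consequence of this convention is that 0 cannot be used as a valid
  header or action uid, since it would be read as the end of the array.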

  The flows are configured in hardware using an ndo op. We do not
  provide a commit operation at the moment and expect hardware to
  commit the flows one at a time. Future work may require a commit
  operation to tell the hardware we are done loading flow rules. On
  some hardware this will help bulk updates.

  It is possible for hardware to return an error from a flow set
  operation. This can occur for many reasons, including transient
  failures and resource constraints. We have different error handling
  strategies built in, listed here:

    *_ERROR_ABORT      abort on first error with errmsg

    *_ERROR_CONTINUE   continue programming flows no errmsg

    *_ERROR_ABORT_LOG  abort on first error and return flow that
 		       failed to user space in reply msg

    *_ERROR_CONT_LOG   continue programming flows and return a list
		       of flows that failed to user space in a reply
		       msg.

  Notably missing is a rollback error strategy. I don't have a
  use for this in software yet, but the strategy can be added with
  *_ERROR_ROLLBACK for example.
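
  To make the four strategies concrete, here is a small user space
  sketch of the loop as described above. It follows the descriptions in
  this message rather than the exact control flow of the patch, and all
  names (flow_error_strategy, flow_result, flow_program_all,
  flow_reject_odd) are hypothetical. "Logged" stands for a failed flow
  that would be echoed back to user space in the reply message.

```c
#include <assert.h>
#include <stddef.h>

enum flow_error_strategy {
	FLOW_ERROR_ABORT,	/* abort on first error, errmsg only */
	FLOW_ERROR_CONTINUE,	/* ignore errors, no errmsg */
	FLOW_ERROR_ABORT_LOG,	/* abort on first error, log the flow */
	FLOW_ERROR_CONT_LOG,	/* keep going, log every failed flow */
};

struct flow_result {
	size_t programmed;	/* flows accepted by the device */
	size_t logged;		/* failed flows echoed back to the caller */
	int err;		/* first reported error, or 0 */
};

typedef int (*flow_program_fn)(int flow_uid);

static struct flow_result flow_program_all(const int *uids, size_t n,
					   enum flow_error_strategy strategy,
					   flow_program_fn program)
{
	struct flow_result res = { 0, 0, 0 };
	size_t i;

	assert(uids != NULL && program != NULL);

	for (i = 0; i < n; i++) {
		int err = program(uids[i]);

		if (!err) {
			res.programmed++;
			continue;
		}

		/* CONTINUE swallows the error entirely. */
		if (strategy == FLOW_ERROR_CONTINUE)
			continue;

		if (!res.err)
			res.err = err;

		if (strategy == FLOW_ERROR_ABORT_LOG ||
		    strategy == FLOW_ERROR_CONT_LOG)
			res.logged++;

		if (strategy == FLOW_ERROR_ABORT ||
		    strategy == FLOW_ERROR_ABORT_LOG)
			break;
	}
	return res;
}

/* Hypothetical device: rejects flows with an odd uid. */
static int flow_reject_odd(int flow_uid)
{
	return (flow_uid % 2) ? -1 : 0;
}
```

  With flows { 2, 3, 4, 5 } and the device above, the two abort
  strategies stop after programming one flow, while the two continue
  strategies program both even numbered flows.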

table_del_flows :

  The delete flow operation uses the same structures and error
  handling strategies as the table_set_flows operation. On delete,
  however, messages omit the matches/actions arrays because they are
  not needed to look up the flow.
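
  A minimal sketch of why the arrays can be omitted, assuming a flow
  is identified by its table id and uid alone. How a driver actually
  stores and finds flows is up to the driver; flow_key, flow_slot and
  flow_table_del are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed lookup key: the table a flow lives in plus its uid. */
struct flow_key {
	int table_id;
	int uid;
};

struct flow_slot {
	struct flow_key key;
	int in_use;
};

/*
 * Remove the flow identified by key. Returns 0 on success, or -1 if no
 * such flow is installed. Matches and actions play no part in the
 * lookup.
 */
static int flow_table_del(struct flow_slot *slots, size_t n,
			  struct flow_key key)
{
	size_t i;

	assert(slots != NULL);

	for (i = 0; i < n; i++) {
		if (slots[i].in_use &&
		    slots[i].key.table_id == key.table_id &&
		    slots[i].key.uid == key.uid) {
			slots[i].in_use = 0;
			return 0;
		}
	}
	return -1;
}
```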

Also thanks to Simon Horman for fixes and other help.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 include/linux/if_flow.h      |   21 ++
 include/linux/netdevice.h    |    8 +
 include/uapi/linux/if_flow.h |   49 ++++
 net/core/flow_table.c        |  501 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 579 insertions(+)

diff --git a/include/linux/if_flow.h b/include/linux/if_flow.h
index 1b6c1ea..20fa752 100644
--- a/include/linux/if_flow.h
+++ b/include/linux/if_flow.h
@@ -90,4 +90,25 @@ struct net_flow_tbl_node {
 	__u32 flags;
 	struct net_flow_jump_table *jump;
 };
+
+/**
+ * @struct net_flow_flow
+ * @brief describes the match/action entry
+ *
+ * @uid unique identifier for flow
+ * @priority priority to execute flow match/action in table
+ * @matches null terminated set of match criteria
+ * @actions null terminated set of actions to apply to match
+ *
+ * Flows must match all entries in match set.
+ */
+struct net_flow_flow {
+	int table_id;
+	int uid;
+	int priority;
+	struct net_flow_field_ref *matches;
+	struct net_flow_action *actions;
+};
+
+int net_flow_put_flow(struct sk_buff *skb, struct net_flow_flow *flow);
 #endif
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3c3c856..be8d4e4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1197,6 +1197,14 @@ struct net_device_ops {
 	struct net_flow_header	**(*ndo_flow_get_headers)(struct net_device *dev);
 	struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev);
 	struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct net_device *dev);
+	int		        (*ndo_flow_get_flows)(struct sk_buff *skb,
+						      struct net_device *dev,
+						      int table,
+						      int min, int max);
+	int		        (*ndo_flow_set_flows)(struct net_device *dev,
+						      struct net_flow_flow *f);
+	int		        (*ndo_flow_del_flows)(struct net_device *dev,
+						      struct net_flow_flow *f);
 #endif
 };
 
diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h
index 2acdb38..125cdc6 100644
--- a/include/uapi/linux/if_flow.h
+++ b/include/uapi/linux/if_flow.h
@@ -329,6 +329,48 @@ enum {
 #define NET_FLOW_TABLE_GRAPH_MAX (__NET_FLOW_TABLE_GRAPH_MAX - 1)
 
 enum {
+	NET_FLOW_NET_FLOW_UNSPEC,
+	NET_FLOW_FLOW,
+	__NET_FLOW_NET_FLOW_MAX,
+};
+#define NET_FLOW_NET_FLOW_MAX (__NET_FLOW_NET_FLOW_MAX - 1)
+
+enum {
+	NET_FLOW_TABLE_FLOWS_UNSPEC,
+	NET_FLOW_TABLE_FLOWS_TABLE,
+	NET_FLOW_TABLE_FLOWS_MINPRIO,
+	NET_FLOW_TABLE_FLOWS_MAXPRIO,
+	NET_FLOW_TABLE_FLOWS_FLOWS,
+	__NET_FLOW_TABLE_FLOWS_MAX,
+};
+#define NET_FLOW_TABLE_FLOWS_MAX (__NET_FLOW_TABLE_FLOWS_MAX - 1)
+
+enum {
+	/* Abort with normal errmsg */
+	NET_FLOW_FLOWS_ERROR_ABORT,
+	/* Ignore errors and continue without logging */
+	NET_FLOW_FLOWS_ERROR_CONTINUE,
+	/* Abort and reply with invalid flow fields */
+	NET_FLOW_FLOWS_ERROR_ABORT_LOG,
+	/* Continue and reply with list of invalid flows */
+	NET_FLOW_FLOWS_ERROR_CONT_LOG,
+	__NET_FLOWS_FLOWS_ERROR_MAX,
+};
+#define NET_FLOWS_FLOWS_ERROR_MAX (__NET_FLOWS_FLOWS_ERROR_MAX - 1)
+
+enum {
+	NET_FLOW_ATTR_UNSPEC,
+	NET_FLOW_ATTR_ERROR,
+	NET_FLOW_ATTR_TABLE,
+	NET_FLOW_ATTR_UID,
+	NET_FLOW_ATTR_PRIORITY,
+	NET_FLOW_ATTR_MATCHES,
+	NET_FLOW_ATTR_ACTIONS,
+	__NET_FLOW_ATTR_MAX,
+};
+#define NET_FLOW_ATTR_MAX (__NET_FLOW_ATTR_MAX - 1)
+
+enum {
 	NET_FLOW_IDENTIFIER_IFINDEX, /* net_device ifindex */
 };
 
@@ -343,6 +385,9 @@ enum {
 	NET_FLOW_HEADER_GRAPH,
 	NET_FLOW_TABLE_GRAPH,
 
+	NET_FLOW_FLOWS,
+	NET_FLOW_FLOWS_ERROR,
+
 	__NET_FLOW_MAX,
 	NET_FLOW_MAX = (__NET_FLOW_MAX - 1),
 };
@@ -354,6 +399,10 @@ enum {
 	NET_FLOW_TABLE_CMD_GET_HDR_GRAPH,
 	NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH,
 
+	NET_FLOW_TABLE_CMD_GET_FLOWS,
+	NET_FLOW_TABLE_CMD_SET_FLOWS,
+	NET_FLOW_TABLE_CMD_DEL_FLOWS,
+
 	__NET_FLOW_CMD_MAX,
 	NET_FLOW_CMD_MAX = (__NET_FLOW_CMD_MAX - 1),
 };
diff --git a/net/core/flow_table.c b/net/core/flow_table.c
index ec3f06d..f4cf293 100644
--- a/net/core/flow_table.c
+++ b/net/core/flow_table.c
@@ -774,6 +774,489 @@ static int net_flow_cmd_get_table_graph(struct sk_buff *skb,
 	return genlmsg_reply(msg, info);
 }
 
+static struct sk_buff *net_flow_build_flows_msg(struct net_device *dev,
+						u32 portid, int seq, u8 cmd,
+						int min, int max, int table)
+{
+	struct genlmsghdr *hdr;
+	struct nlattr *flows;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-ENOBUFS);
+
+	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!hdr)
+		goto out;
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
+		err = -ENOBUFS;
+		goto out;
+	}
+
+	flows = nla_nest_start(skb, NET_FLOW_FLOWS);
+	if (!flows) {
+		err = -EMSGSIZE;
+		goto out;
+	}
+
+	err = dev->netdev_ops->ndo_flow_get_flows(skb, dev, table, min, max);
+	if (err < 0)
+		goto out_cancel;
+
+	nla_nest_end(skb, flows);
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0)
+		goto out;
+
+	return skb;
+out_cancel:
+	nla_nest_cancel(skb, flows);
+out:
+	nlmsg_free(skb);
+	return ERR_PTR(err);
+}
+
+static const
+struct nla_policy net_flow_table_flows_policy[NET_FLOW_TABLE_FLOWS_MAX + 1] = {
+	[NET_FLOW_TABLE_FLOWS_TABLE]   = { .type = NLA_U32,},
+	[NET_FLOW_TABLE_FLOWS_MINPRIO] = { .type = NLA_U32,},
+	[NET_FLOW_TABLE_FLOWS_MAXPRIO] = { .type = NLA_U32,},
+	[NET_FLOW_TABLE_FLOWS_FLOWS]   = { .type = NLA_NESTED,},
+};
+
+static int net_flow_table_cmd_get_flows(struct sk_buff *skb,
+					struct genl_info *info)
+{
+	struct nlattr *tb[NET_FLOW_TABLE_FLOWS_MAX+1];
+	int table, min = -1, max = -1;
+	struct net_device *dev;
+	struct sk_buff *msg;
+	int err = -EINVAL;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_get_flows) {
+		dev_put(dev);
+		return -EOPNOTSUPP;
+	}
+
+	if (!info->attrs[NET_FLOW_IDENTIFIER_TYPE] ||
+	    !info->attrs[NET_FLOW_IDENTIFIER] ||
+	    !info->attrs[NET_FLOW_FLOWS])
+		goto out;
+
+	err = nla_parse_nested(tb, NET_FLOW_TABLE_FLOWS_MAX,
+			       info->attrs[NET_FLOW_FLOWS],
+			       net_flow_table_flows_policy);
+	if (err)
+		goto out;
+
+	if (!tb[NET_FLOW_TABLE_FLOWS_TABLE])
+		goto out;
+
+	table = nla_get_u32(tb[NET_FLOW_TABLE_FLOWS_TABLE]);
+
+	if (tb[NET_FLOW_TABLE_FLOWS_MINPRIO])
+		min = nla_get_u32(tb[NET_FLOW_TABLE_FLOWS_MINPRIO]);
+	if (tb[NET_FLOW_TABLE_FLOWS_MAXPRIO])
+		max = nla_get_u32(tb[NET_FLOW_TABLE_FLOWS_MAXPRIO]);
+
+	msg = net_flow_build_flows_msg(dev,
+				       info->snd_portid,
+				       info->snd_seq,
+				       NET_FLOW_TABLE_CMD_GET_FLOWS,
+				       min, max, table);
+	dev_put(dev);
+
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	return genlmsg_reply(msg, info);
+out:
+	dev_put(dev);
+	return err;
+}
+
+static struct sk_buff *net_flow_start_errmsg(struct net_device *dev,
+					     struct genlmsghdr **hdr,
+					     u32 portid, int seq, u8 cmd)
+{
+	struct genlmsghdr *h;
+	struct sk_buff *skb;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return ERR_PTR(-EMSGSIZE);
+
+	h = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
+	if (!h)
+		return ERR_PTR(-EMSGSIZE);
+
+	if (nla_put_u32(skb,
+			NET_FLOW_IDENTIFIER_TYPE,
+			NET_FLOW_IDENTIFIER_IFINDEX) ||
+	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex))
+		return ERR_PTR(-EMSGSIZE);
+
+	*hdr = h;
+	return skb;
+}
+
+static struct sk_buff *net_flow_end_flow_errmsg(struct sk_buff *skb,
+						struct genlmsghdr *hdr)
+{
+	int err;
+
+	err = genlmsg_end(skb, hdr);
+	if (err < 0) {
+		nlmsg_free(skb);
+		return ERR_PTR(err);
+	}
+
+	return skb;
+}
+
+static int net_flow_put_flow_action(struct sk_buff *skb,
+				    struct net_flow_action *a)
+{
+	struct nlattr *action, *sigs;
+	int i, err = 0;
+
+	action = nla_nest_start(skb, NET_FLOW_ACTION);
+	if (!action)
+		return -EMSGSIZE;
+
+	if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid))
+		return -EMSGSIZE;
+
+	if (!a->args)
+		goto done;
+
+	for (i = 0; a->args[i].type; i++) {
+		sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
+		if (!sigs) {
+			nla_nest_cancel(skb, action);
+			return -EMSGSIZE;
+		}
+
+		err = net_flow_put_act_types(skb, a[i].args);
+		if (err) {
+			nla_nest_cancel(skb, action);
+			nla_nest_cancel(skb, sigs);
+			return err;
+		}
+		nla_nest_end(skb, sigs);
+	}
+
+done:
+	nla_nest_end(skb, action);
+	return 0;
+}
+
+int net_flow_put_flow(struct sk_buff *skb, struct net_flow_flow *flow)
+{
+	struct nlattr *flows, *matches;
+	struct nlattr *actions = NULL; /* must be null to unwind */
+	int err, j, i = 0;
+
+	flows = nla_nest_start(skb, NET_FLOW_FLOW);
+	if (!flows)
+		goto put_failure;
+
+	if (nla_put_u32(skb, NET_FLOW_ATTR_TABLE, flow->table_id) ||
+	    nla_put_u32(skb, NET_FLOW_ATTR_UID, flow->uid) ||
+	    nla_put_u32(skb, NET_FLOW_ATTR_PRIORITY, flow->priority))
+		goto flows_put_failure;
+
+	if (flow->matches) {
+		matches = nla_nest_start(skb, NET_FLOW_ATTR_MATCHES);
+		if (!matches)
+			goto flows_put_failure;
+
+		for (j = 0; flow->matches && flow->matches[j].header; j++) {
+			struct net_flow_field_ref *f = &flow->matches[j];
+
+			if (!f->header)
+				continue;
+
+			nla_put(skb, NET_FLOW_FIELD_REF, sizeof(*f), f);
+		}
+		nla_nest_end(skb, matches);
+	}
+
+	if (flow->actions) {
+		actions = nla_nest_start(skb, NET_FLOW_ATTR_ACTIONS);
+		if (!actions)
+			goto flows_put_failure;
+
+		for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+			err = net_flow_put_flow_action(skb, &flow->actions[i]);
+			if (err) {
+				nla_nest_cancel(skb, actions);
+				goto flows_put_failure;
+			}
+		}
+		nla_nest_end(skb, actions);
+	}
+
+	nla_nest_end(skb, flows);
+	return 0;
+
+flows_put_failure:
+	nla_nest_cancel(skb, flows);
+put_failure:
+	return -EMSGSIZE;
+}
+EXPORT_SYMBOL(net_flow_put_flow);
+
+static int net_flow_get_field(struct net_flow_field_ref *field,
+			      struct nlattr *nla)
+{
+	if (nla_type(nla) != NET_FLOW_FIELD_REF)
+		return -EINVAL;
+
+	if (nla_len(nla) < sizeof(*field))
+		return -EINVAL;
+
+	*field = *(struct net_flow_field_ref *)nla_data(nla);
+	return 0;
+}
+
+static int net_flow_get_action(struct net_flow_action *a, struct nlattr *attr)
+{
+	struct nlattr *act[NET_FLOW_ACTION_ATTR_MAX+1];
+	struct nlattr *args;
+	int rem;
+	int err, count = 0;
+
+	if (nla_type(attr) != NET_FLOW_ACTION) {
+		pr_warn("%s: expected NET_FLOW_ACTION\n", __func__);
+		return 0;
+	}
+
+	err = nla_parse_nested(act, NET_FLOW_ACTION_ATTR_MAX,
+			       attr, net_flow_action_policy);
+	if (err < 0)
+		return err;
+
+	if (!act[NET_FLOW_ACTION_ATTR_UID] ||
+	    !act[NET_FLOW_ACTION_ATTR_SIGNATURE])
+		return -EINVAL;
+
+	a->uid = nla_get_u32(act[NET_FLOW_ACTION_ATTR_UID]);
+
+	nla_for_each_nested(args, act[NET_FLOW_ACTION_ATTR_SIGNATURE], rem)
+		count++; /* unoptimized max possible */
+
+	a->args = kcalloc(count + 1,
+			  sizeof(struct net_flow_action_arg),
+			  GFP_KERNEL);
+	count = 0;
+
+	nla_for_each_nested(args, act[NET_FLOW_ACTION_ATTR_SIGNATURE], rem) {
+		if (nla_type(args) != NET_FLOW_ACTION_ARG)
+			continue;
+
+		if (nla_len(args) < sizeof(struct net_flow_action_arg)) {
+			kfree(a->args);
+			return -EINVAL;
+		}
+
+		a->args[count++] = *(struct net_flow_action_arg *)nla_data(args);
+	}
+	return 0;
+}
+
+static const
+struct nla_policy net_flow_flow_policy[NET_FLOW_ATTR_MAX + 1] = {
+	[NET_FLOW_ATTR_TABLE]		= { .type = NLA_U32 },
+	[NET_FLOW_ATTR_UID]		= { .type = NLA_U32 },
+	[NET_FLOW_ATTR_PRIORITY]	= { .type = NLA_U32 },
+	[NET_FLOW_ATTR_MATCHES]		= { .type = NLA_NESTED },
+	[NET_FLOW_ATTR_ACTIONS]		= { .type = NLA_NESTED },
+};
+
+static int net_flow_get_flow(struct net_flow_flow *flow, struct nlattr *attr)
+{
+	struct nlattr *f[NET_FLOW_ATTR_MAX+1];
+	struct nlattr *attr2;
+	int rem, err;
+	int count = 0;
+
+	err = nla_parse_nested(f, NET_FLOW_ATTR_MAX,
+			       attr, net_flow_flow_policy);
+	if (err < 0)
+		return -EINVAL;
+
+	if (!f[NET_FLOW_ATTR_TABLE] || !f[NET_FLOW_ATTR_UID] ||
+	    !f[NET_FLOW_ATTR_PRIORITY])
+		return -EINVAL;
+
+	flow->table_id = nla_get_u32(f[NET_FLOW_ATTR_TABLE]);
+	flow->uid = nla_get_u32(f[NET_FLOW_ATTR_UID]);
+	flow->priority = nla_get_u32(f[NET_FLOW_ATTR_PRIORITY]);
+
+	flow->matches = NULL;
+	flow->actions = NULL;
+
+	if (f[NET_FLOW_ATTR_MATCHES]) {
+		nla_for_each_nested(attr2, f[NET_FLOW_ATTR_MATCHES], rem)
+			count++;
+
+		/* Null terminated list of matches */
+		flow->matches = kcalloc(count + 1,
+					sizeof(struct net_flow_field_ref),
+					GFP_KERNEL);
+		if (!flow->matches)
+			return -ENOMEM;
+
+		count = 0;
+		nla_for_each_nested(attr2, f[NET_FLOW_ATTR_MATCHES], rem) {
+			err = net_flow_get_field(&flow->matches[count], attr2);
+			if (err) {
+				kfree(flow->matches);
+				return err;
+			}
+			count++;
+		}
+	}
+
+	if (f[NET_FLOW_ATTR_ACTIONS]) {
+		count = 0;
+		nla_for_each_nested(attr2, f[NET_FLOW_ATTR_ACTIONS], rem)
+			count++;
+
+		/* Null terminated list of actions */
+		flow->actions = kcalloc(count + 1,
+					sizeof(struct net_flow_action),
+					GFP_KERNEL);
+		if (!flow->actions) {
+			kfree(flow->matches);
+			return -ENOMEM;
+		}
+
+		count = 0;
+		nla_for_each_nested(attr2, f[NET_FLOW_ATTR_ACTIONS], rem) {
+			err = net_flow_get_action(&flow->actions[count], attr2);
+			if (err) {
+				kfree(flow->matches);
+				kfree(flow->actions);
+				return err;
+			}
+			count++;
+		}
+	}
+
+	return 0;
+}
+
+static int net_flow_table_cmd_flows(struct sk_buff *recv_skb,
+				    struct genl_info *info)
+{
+	int rem, err_handle = NET_FLOW_FLOWS_ERROR_ABORT;
+	struct sk_buff *skb = NULL;
+	struct net_flow_flow this;
+	struct genlmsghdr *hdr;
+	struct net_device *dev;
+	struct nlattr *flow, *flows;
+	int cmd = info->genlhdr->cmd;
+	int err = -EOPNOTSUPP;
+
+	dev = net_flow_get_dev(info);
+	if (!dev)
+		return -EINVAL;
+
+	if (!dev->netdev_ops->ndo_flow_set_flows ||
+	    !dev->netdev_ops->ndo_flow_del_flows)
+		goto out;
+
+	if (!info->attrs[NET_FLOW_IDENTIFIER_TYPE] ||
+	    !info->attrs[NET_FLOW_IDENTIFIER] ||
+	    !info->attrs[NET_FLOW_FLOWS]) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (info->attrs[NET_FLOW_FLOWS_ERROR])
+		err_handle = nla_get_u32(info->attrs[NET_FLOW_FLOWS_ERROR]);
+
+	nla_for_each_nested(flow, info->attrs[NET_FLOW_FLOWS], rem) {
+		if (nla_type(flow) != NET_FLOW_FLOW)
+			continue;
+
+		err = net_flow_get_flow(&this, flow);
+		if (err)
+			goto out;
+
+		switch (cmd) {
+		case NET_FLOW_TABLE_CMD_SET_FLOWS:
+			err = dev->netdev_ops->ndo_flow_set_flows(dev, &this);
+			break;
+		case NET_FLOW_TABLE_CMD_DEL_FLOWS:
+			err = dev->netdev_ops->ndo_flow_del_flows(dev, &this);
+			break;
+		default:
+			err = -EOPNOTSUPP;
+			break;
+		}
+
+		if (err && err_handle != NET_FLOW_FLOWS_ERROR_CONTINUE) {
+			if (!skb) {
+				skb = net_flow_start_errmsg(dev, &hdr,
+							    info->snd_portid,
+							    info->snd_seq,
+							    cmd);
+				if (IS_ERR(skb)) {
+					err = PTR_ERR(skb);
+					goto out_plus_free;
+				}
+
+				flows = nla_nest_start(skb, NET_FLOW_FLOWS);
+				if (!flows) {
+					err = -EMSGSIZE;
+					goto out_plus_free;
+				}
+			}
+
+			net_flow_put_flow(skb, &this);
+		}
+
+		/* Cleanup flow */
+		kfree(this.matches);
+		kfree(this.actions);
+
+		if (err && err_handle == NET_FLOW_FLOWS_ERROR_ABORT)
+			goto out;
+	}
+
+	dev_put(dev);
+
+	if (skb) {
+		nla_nest_end(skb, flows);
+		net_flow_end_flow_errmsg(skb, hdr);
+		return genlmsg_reply(skb, info);
+	}
+	return 0;
+
+out_plus_free:
+	kfree(this.matches);
+	kfree(this.actions);
+out:
+	if (!IS_ERR_OR_NULL(skb))
+		nlmsg_free(skb);
+	dev_put(dev);
+	return err;
+}
+
 static const struct nla_policy net_flow_cmd_policy[NET_FLOW_MAX + 1] = {
 	[NET_FLOW_IDENTIFIER_TYPE] = {.type = NLA_U32, },
 	[NET_FLOW_IDENTIFIER]	   = {.type = NLA_U32, },
@@ -815,6 +1298,24 @@ static const struct genl_ops net_flow_table_nl_ops[] = {
 		.policy = net_flow_cmd_policy,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_GET_FLOWS,
+		.doit = net_flow_table_cmd_get_flows,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_SET_FLOWS,
+		.doit = net_flow_table_cmd_flows,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = NET_FLOW_TABLE_CMD_DEL_FLOWS,
+		.doit = net_flow_table_cmd_flows,
+		.policy = net_flow_cmd_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static int __init net_flow_nl_module_init(void)

* [net-next PATCH v1 03/11] net: flow_table: add apply action argument to tables
From: John Fastabend @ 2014-12-31 19:46 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Actions may not always be applied after exiting a table. For example,
some pipelines may accumulate actions and then apply them at the end
of a pipeline.

To model this we use a table type called APPLY. Tables that share an
apply identifier have their actions applied in one step.
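
As a rough illustration of the grouping described above (not part of
this patch): the sketch below uses a hypothetical, reduced
struct sketch_table in place of struct net_flow_table, and the apply
ids in sketch_pipeline are made up for the example. It only shows how
tables sharing an apply identifier could be counted as one apply step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct net_flow_table. */
struct sketch_table {
	const char *name;
	int uid;		/* uid 0 terminates a table list */
	int apply_action;	/* tables sharing this id apply together */
};

/* Count how many tables in a uid-0-terminated list belong to the
 * given apply group.
 */
static int sketch_apply_group_size(const struct sketch_table *tables,
				   int apply_id)
{
	int i, n = 0;

	assert(tables != NULL);
	for (i = 0; tables[i].uid; i++) {
		if (tables[i].apply_action == apply_id)
			n++;
	}
	return n;
}

/* Example pipeline (invented values): the first two tables accumulate
 * actions that are applied in one step (group 1); the third table
 * applies its actions on its own (group 2).
 */
static const struct sketch_table sketch_pipeline[] = {
	{ .name = "vlan",     .uid = 1, .apply_action = 1 },
	{ .name = "term_mac", .uid = 2, .apply_action = 1 },
	{ .name = "acl",      .uid = 3, .apply_action = 2 },
	{ .name = "",         .uid = 0, .apply_action = 0 },
};
```

With this data, sketch_apply_group_size(sketch_pipeline, 1) counts the
two tables whose actions would be applied together.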

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 include/linux/if_flow.h      |    1 +
 include/uapi/linux/if_flow.h |    1 +
 net/core/flow_table.c        |    1 +
 3 files changed, 3 insertions(+)

diff --git a/include/linux/if_flow.h b/include/linux/if_flow.h
index 20fa752..a042a3d 100644
--- a/include/linux/if_flow.h
+++ b/include/linux/if_flow.h
@@ -67,6 +67,7 @@ struct net_flow_table {
 	char name[NET_FLOW_NAMSIZ];
 	int uid;
 	int source;
+	int apply_action;
 	int size;
 	struct net_flow_field_ref *matches;
 	int *actions;
diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h
index 125cdc6..3c1a860 100644
--- a/include/uapi/linux/if_flow.h
+++ b/include/uapi/linux/if_flow.h
@@ -265,6 +265,7 @@ enum {
 	NET_FLOW_TABLE_ATTR_NAME,
 	NET_FLOW_TABLE_ATTR_UID,
 	NET_FLOW_TABLE_ATTR_SOURCE,
+	NET_FLOW_TABLE_ATTR_APPLY,
 	NET_FLOW_TABLE_ATTR_SIZE,
 	NET_FLOW_TABLE_ATTR_MATCHES,
 	NET_FLOW_TABLE_ATTR_ACTIONS,
diff --git a/net/core/flow_table.c b/net/core/flow_table.c
index f4cf293..97cdf92 100644
--- a/net/core/flow_table.c
+++ b/net/core/flow_table.c
@@ -223,6 +223,7 @@ static int net_flow_put_table(struct net_device *dev,
 	if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) ||
 	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) ||
 	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) ||
+	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_APPLY, t->apply_action) ||
 	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size))
 		return -EMSGSIZE;
 

* [net-next PATCH v1 04/11] rocker: add pipeline model for rocker switch
From: John Fastabend @ 2014-12-31 19:47 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

This adds rocker support for the net_flow_get_* operations. With this
we can interrogate rocker.

Here we see that for static configurations enabling the get operations
is simply a matter of defining a pipeline model and returning the
structures for the core infrastructure to encapsulate into netlink
messages.
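
A minimal sketch of that pattern, under stated assumptions: the
struct sketch_flow_table type and all sketch_* names are hypothetical
reductions of the real structures, and the walk in
sketch_count_tables() only stands in for whatever the core does when
it encodes the list into a netlink reply. The driver side returns a
static, terminated list and the core side iterates until the uid-0
entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of struct net_flow_table. */
struct sketch_flow_table {
	const char *name;
	int uid;	/* uid 0 marks the terminating entry */
};

static struct sketch_flow_table sketch_ingress = { "ingress_port", 1 };
static struct sketch_flow_table sketch_vlan = { "vlan", 2 };
static struct sketch_flow_table sketch_null = { "", 0 };

/* Static pipeline model, terminated by the uid-0 entry. */
static struct sketch_flow_table *sketch_table_list[] = {
	&sketch_ingress,
	&sketch_vlan,
	&sketch_null,
};

/* Driver side: the "get" callback only returns the static model. */
static struct sketch_flow_table **sketch_get_tables(void)
{
	return sketch_table_list;
}

/* Core side: walk the list up to the terminator and count the tables
 * that would be encoded into the reply message.
 */
static int sketch_count_tables(struct sketch_flow_table **tables)
{
	int n = 0;

	assert(tables != NULL);
	while (tables[n]->uid)
		n++;
	return n;
}
```

The terminating entry is what lets the core iterate without a separate
length field, which is why every list in the pipeline model ends in a
null element.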

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c          |   35 +
 drivers/net/ethernet/rocker/rocker_pipeline.h |  673 +++++++++++++++++++++++++
 2 files changed, 708 insertions(+)
 create mode 100644 drivers/net/ethernet/rocker/rocker_pipeline.h

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index fded127..4c6787a 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -36,6 +36,7 @@
 #include <generated/utsrelease.h>
 
 #include "rocker.h"
+#include "rocker_pipeline.h"
 
 static const char rocker_driver_name[] = "rocker";
 
@@ -3780,6 +3781,33 @@ static int rocker_port_switch_port_stp_update(struct net_device *dev, u8 state)
 	return rocker_port_stp_update(rocker_port, state);
 }
 
+#ifdef CONFIG_NET_FLOW_TABLES
+static struct net_flow_table **rocker_get_tables(struct net_device *d)
+{
+	return rocker_table_list;
+}
+
+static struct net_flow_header **rocker_get_headers(struct net_device *d)
+{
+	return rocker_header_list;
+}
+
+static struct net_flow_action **rocker_get_actions(struct net_device *d)
+{
+	return rocker_action_list;
+}
+
+static struct net_flow_tbl_node **rocker_get_tgraph(struct net_device *d)
+{
+	return rocker_table_nodes;
+}
+
+static struct net_flow_hdr_node **rocker_get_hgraph(struct net_device *d)
+{
+	return rocker_header_nodes;
+}
+#endif
+
 static const struct net_device_ops rocker_port_netdev_ops = {
 	.ndo_open			= rocker_port_open,
 	.ndo_stop			= rocker_port_stop,
@@ -3794,6 +3822,13 @@ static const struct net_device_ops rocker_port_netdev_ops = {
 	.ndo_bridge_getlink		= rocker_port_bridge_getlink,
 	.ndo_switch_parent_id_get	= rocker_port_switch_parent_id_get,
 	.ndo_switch_port_stp_update	= rocker_port_switch_port_stp_update,
+#ifdef CONFIG_NET_FLOW_TABLES
+	.ndo_flow_get_tables		= rocker_get_tables,
+	.ndo_flow_get_headers		= rocker_get_headers,
+	.ndo_flow_get_actions		= rocker_get_actions,
+	.ndo_flow_get_tbl_graph		= rocker_get_tgraph,
+	.ndo_flow_get_hdr_graph		= rocker_get_hgraph,
+#endif
 };
 
 /********************
diff --git a/drivers/net/ethernet/rocker/rocker_pipeline.h b/drivers/net/ethernet/rocker/rocker_pipeline.h
new file mode 100644
index 0000000..9544339
--- /dev/null
+++ b/drivers/net/ethernet/rocker/rocker_pipeline.h
@@ -0,0 +1,673 @@
+#ifndef _MY_PIPELINE_H_
+#define _MY_PIPELINE_H_
+
+#include <linux/if_flow.h>
+
+/* header definition */
+#define HEADER_ETHERNET_SRC_MAC 1
+#define HEADER_ETHERNET_DST_MAC 2
+#define HEADER_ETHERNET_ETHERTYPE 3
+struct net_flow_field ethernet_fields[3] = {
+	{ .name = "src_mac", .uid = HEADER_ETHERNET_SRC_MAC, .bitwidth = 48},
+	{ .name = "dst_mac", .uid = HEADER_ETHERNET_DST_MAC, .bitwidth = 48},
+	{ .name = "ethertype",
+	  .uid = HEADER_ETHERNET_ETHERTYPE,
+	  .bitwidth = 16},
+};
+
+#define HEADER_ETHERNET 1
+struct net_flow_header ethernet = {
+	.name = "ethernet",
+	.uid = HEADER_ETHERNET,
+	.field_sz = 3,
+	.fields = ethernet_fields,
+};
+
+#define HEADER_VLAN_PCP 1
+#define HEADER_VLAN_CFI 2
+#define HEADER_VLAN_VID 3
+#define HEADER_VLAN_ETHERTYPE 4
+struct net_flow_field vlan_fields[4] = {
+	{ .name = "pcp", .uid = HEADER_VLAN_PCP, .bitwidth = 3,},
+	{ .name = "cfi", .uid = HEADER_VLAN_CFI, .bitwidth = 1,},
+	{ .name = "vid", .uid = HEADER_VLAN_VID, .bitwidth = 12,},
+	{ .name = "ethertype", .uid = HEADER_VLAN_ETHERTYPE, .bitwidth = 16,},
+};
+
+#define HEADER_VLAN 2
+struct net_flow_header vlan = {
+	.name = "vlan",
+	.uid = HEADER_VLAN,
+	.field_sz = 4,
+	.fields = vlan_fields,
+};
+
+#define HEADER_IPV4_VERSION 1
+#define HEADER_IPV4_IHL 2
+#define HEADER_IPV4_DSCP 3
+#define HEADER_IPV4_ECN 4
+#define HEADER_IPV4_LENGTH 5
+#define HEADER_IPV4_IDENTIFICATION 6
+#define HEADER_IPV4_FLAGS 7
+#define HEADER_IPV4_FRAGMENT_OFFSET 8
+#define HEADER_IPV4_TTL 9
+#define HEADER_IPV4_PROTOCOL 10
+#define HEADER_IPV4_CSUM 11
+#define HEADER_IPV4_SRC_IP 12
+#define HEADER_IPV4_DST_IP 13
+#define HEADER_IPV4_OPTIONS 14
+struct net_flow_field ipv4_fields[14] = {
+	{ .name = "version",
+	  .uid = HEADER_IPV4_VERSION,
+	  .bitwidth = 4,},
+	{ .name = "ihl",
+	  .uid = HEADER_IPV4_IHL,
+	  .bitwidth = 4,},
+	{ .name = "dscp",
+	  .uid = HEADER_IPV4_DSCP,
+	  .bitwidth = 6,},
+	{ .name = "ecn",
+	  .uid = HEADER_IPV4_ECN,
+	  .bitwidth = 2,},
+	{ .name = "length",
+	  .uid = HEADER_IPV4_LENGTH,
+	  .bitwidth = 16,},
+	{ .name = "identification",
+	  .uid = HEADER_IPV4_IDENTIFICATION,
+	  .bitwidth = 16,},
+	{ .name = "flags",
+	  .uid = HEADER_IPV4_FLAGS,
+	  .bitwidth = 3,},
+	{ .name = "fragment_offset",
+	  .uid = HEADER_IPV4_FRAGMENT_OFFSET,
+	  .bitwidth = 13,},
+	{ .name = "ttl",
+	  .uid = HEADER_IPV4_TTL,
+	  .bitwidth = 8,},
+	{ .name = "protocol",
+	  .uid = HEADER_IPV4_PROTOCOL,
+	  .bitwidth = 8,},
+	{ .name = "csum",
+	  .uid = HEADER_IPV4_CSUM,
+	  .bitwidth = 16,},
+	{ .name = "src_ip",
+	  .uid = HEADER_IPV4_SRC_IP,
+	  .bitwidth = 32,},
+	{ .name = "dst_ip",
+	  .uid = HEADER_IPV4_DST_IP,
+	  .bitwidth = 32,},
+	{ .name = "options",
+	  .uid = HEADER_IPV4_OPTIONS,
+	  .bitwidth = -1,},
+};
+
+#define HEADER_IPV4 3
+struct net_flow_header ipv4 = {
+	.name = "ipv4",
+	.uid = HEADER_IPV4,
+	.field_sz = 14,
+	.fields = ipv4_fields,
+};
+
+#define HEADER_METADATA_IN_LPORT 1
+#define HEADER_METADATA_GOTO_TBL 2
+#define HEADER_METADATA_GROUP_ID 3
+struct net_flow_field metadata_fields[3] = {
+	{ .name = "in_lport",
+	  .uid = HEADER_METADATA_IN_LPORT,
+	  .bitwidth = 32,},
+	{ .name = "goto_tbl",
+	  .uid = HEADER_METADATA_GOTO_TBL,
+	  .bitwidth = 16,},
+	{ .name = "group_id",
+	  .uid = HEADER_METADATA_GROUP_ID,
+	  .bitwidth = 32,},
+};
+
+#define HEADER_METADATA 4
+struct net_flow_header metadata_t = {
+	.name = "metadata_t",
+	.uid = HEADER_METADATA,
+	.field_sz = 3,
+	.fields = metadata_fields,
+};
+
+struct net_flow_header null_hdr = {.name = "",
+				   .uid = 0,
+				   .field_sz = 0,
+				   .fields = NULL};
+
+struct net_flow_header *rocker_header_list[8] = {
+	&ethernet,
+	&vlan,
+	&ipv4,
+	&metadata_t,
+	&null_hdr,
+};
+
+/* action definitions */
+struct net_flow_action_arg null_args[1] = {
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+struct net_flow_action null_action = {
+	.name = "", .uid = 0, .args = NULL,
+};
+
+struct net_flow_action_arg set_goto_table_args[2] = {
+	{
+		.name = "table",
+		.type = NET_FLOW_ACTION_ARG_TYPE_U16,
+		.value_u16 = 0,
+	},
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+#define ACTION_SET_GOTO_TABLE 1
+struct net_flow_action set_goto_table = {
+	.name = "set_goto_table",
+	.uid = ACTION_SET_GOTO_TABLE,
+	.args = set_goto_table_args,
+};
+
+struct net_flow_action_arg set_vlan_id_args[2] = {
+	{
+		.name = "vlan_id",
+		.type = NET_FLOW_ACTION_ARG_TYPE_U16,
+		.value_u16 = 0,
+	},
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+#define ACTION_SET_VLAN_ID 2
+struct net_flow_action set_vlan_id = {
+	.name = "set_vlan_id",
+	.uid = ACTION_SET_VLAN_ID,
+	.args = set_vlan_id_args,
+};
+
+/* TBD: what is the untagged bool about in vlan table */
+#define ACTION_COPY_TO_CPU 3
+struct net_flow_action copy_to_cpu = {
+	.name = "copy_to_cpu",
+	.uid = ACTION_COPY_TO_CPU,
+	.args = null_args,
+};
+
+struct net_flow_action_arg set_group_id_args[2] = {
+	{
+		.name = "group_id",
+		.type = NET_FLOW_ACTION_ARG_TYPE_U32,
+		.value_u32 = 0,
+	},
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+#define ACTION_SET_GROUP_ID 4
+struct net_flow_action set_group_id = {
+	.name = "set_group_id",
+	.uid = ACTION_SET_GROUP_ID,
+	.args = set_group_id_args,
+};
+
+#define ACTION_POP_VLAN 5
+struct net_flow_action pop_vlan = {
+	.name = "pop_vlan",
+	.uid = ACTION_POP_VLAN,
+	.args = null_args,
+};
+
+struct net_flow_action_arg set_eth_src_args[2] = {
+	{
+		.name = "eth_src",
+		.type = NET_FLOW_ACTION_ARG_TYPE_U64,
+		.value_u64 = 0,
+	},
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+#define ACTION_SET_ETH_SRC 6
+struct net_flow_action set_eth_src = {
+	.name = "set_eth_src",
+	.uid = ACTION_SET_ETH_SRC,
+	.args = set_eth_src_args,
+};
+
+struct net_flow_action_arg set_eth_dst_args[2] = {
+	{
+		.name = "eth_dst",
+		.type = NET_FLOW_ACTION_ARG_TYPE_U64,
+		.value_u64 = 0,
+	},
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+#define ACTION_SET_ETH_DST 7
+struct net_flow_action set_eth_dst = {
+	.name = "set_eth_dst",
+	.uid = ACTION_SET_ETH_DST,
+	.args = set_eth_dst_args,
+};
+
+struct net_flow_action_arg set_out_port_args[2] = {
+	{
+		.name = "set_out_port",
+		.type = NET_FLOW_ACTION_ARG_TYPE_U32,
+		.value_u32 = 0,
+	},
+	{
+		.name = "",
+		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
+	},
+};
+
+#define ACTION_SET_OUT_PORT 8
+struct net_flow_action set_out_port = {
+	.name = "set_out_port",
+	.uid = ACTION_SET_OUT_PORT,
+	.args = set_out_port_args,
+};
+
+struct net_flow_action *rocker_action_list[8] = {
+	&set_goto_table,
+	&set_vlan_id,
+	&copy_to_cpu,
+	&set_group_id,
+	&pop_vlan,
+	&set_eth_src,
+	&set_eth_dst,
+	&null_action,
+};
+
+/* headers graph */
+#define HEADER_INSTANCE_ETHERNET 1
+#define HEADER_INSTANCE_VLAN_OUTER 2
+#define HEADER_INSTANCE_IPV4 3
+#define HEADER_INSTANCE_IN_LPORT 4
+#define HEADER_INSTANCE_GOTO_TABLE 5
+#define HEADER_INSTANCE_GROUP_ID 6
+
+struct net_flow_jump_table parse_ethernet[3] = {
+	{
+		.field = {
+		   .header = HEADER_ETHERNET,
+		   .field = HEADER_ETHERNET_ETHERTYPE,
+		   .type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16,
+		   .value_u16 = 0x0800,
+		},
+		.node = HEADER_INSTANCE_IPV4,
+	},
+	{
+		.field = {
+		   .header = HEADER_ETHERNET,
+		   .field = HEADER_ETHERNET_ETHERTYPE,
+		   .type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16,
+		   .value_u16 = 0x8100,
+		},
+		.node = HEADER_INSTANCE_VLAN_OUTER,
+	},
+	{
+		.field = {0},
+		.node = 0,
+	},
+};
+
+int ethernet_headers[2] = {HEADER_ETHERNET, 0};
+
+struct net_flow_hdr_node ethernet_header_node = {
+	.name = "ethernet",
+	.uid = HEADER_INSTANCE_ETHERNET,
+	.hdrs = ethernet_headers,
+	.jump = parse_ethernet,
+};
+
+struct net_flow_jump_table parse_vlan[2] = {
+	{
+		.field = {
+		   .header = HEADER_VLAN,
+		   .field = HEADER_VLAN_ETHERTYPE,
+		   .type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16,
+		   .value_u16 = 0x0800,
+		},
+		.node = HEADER_INSTANCE_IPV4,
+	},
+	{
+		.field = {0},
+		.node = 0,
+	},
+};
+
+int vlan_headers[2] = {HEADER_VLAN, 0};
+struct net_flow_hdr_node vlan_header_node = {
+	.name = "vlan",
+	.uid = HEADER_INSTANCE_VLAN_OUTER,
+	.hdrs = vlan_headers,
+	.jump = parse_vlan,
+};
+
+struct net_flow_jump_table terminal_headers[2] = {
+	{
+		.field = {0},
+		.node = NET_FLOW_JUMP_TABLE_DONE,
+	},
+	{
+		.field = {0},
+		.node = 0,
+	},
+};
+
+int ipv4_headers[2] = {HEADER_IPV4, 0};
+struct net_flow_hdr_node ipv4_header_node = {
+	.name = "ipv4",
+	.uid = HEADER_INSTANCE_IPV4,
+	.hdrs = ipv4_headers,
+	.jump = terminal_headers,
+};
+
+int metadata_headers[2] = {HEADER_METADATA, 0};
+struct net_flow_hdr_node in_lport_header_node = {
+	.name = "in_lport",
+	.uid = HEADER_INSTANCE_IN_LPORT,
+	.hdrs = metadata_headers,
+	.jump = terminal_headers,
+};
+
+struct net_flow_hdr_node goto_table_header_node = {
+	.name = "goto_table",
+	.uid = HEADER_INSTANCE_GOTO_TABLE,
+	.hdrs = metadata_headers,
+	.jump = terminal_headers,
+};
+
+struct net_flow_hdr_node group_id_header_node = {
+	.name = "group_id",
+	.uid = HEADER_INSTANCE_GROUP_ID,
+	.hdrs = metadata_headers,
+	.jump = terminal_headers,
+};
+
+struct net_flow_hdr_node null_header = {.name = "", .uid = 0,};
+
+struct net_flow_hdr_node *rocker_header_nodes[7] = {
+	&ethernet_header_node,
+	&vlan_header_node,
+	&ipv4_header_node,
+	&in_lport_header_node,
+	&goto_table_header_node,
+	&group_id_header_node,
+	&null_header,
+};
+
+/* table definition */
+struct net_flow_field_ref matches_ig_port[2] = {
+	{ .instance = HEADER_INSTANCE_IN_LPORT,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_IN_LPORT,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_vlan[3] = {
+	{ .instance = HEADER_INSTANCE_IN_LPORT,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_IN_LPORT,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_VLAN_OUTER,
+	  .header = HEADER_VLAN,
+	  .field = HEADER_VLAN_VID,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_term_mac[5] = {
+	{ .instance = HEADER_INSTANCE_IN_LPORT,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_IN_LPORT,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_ETHERTYPE,
+	  .mask_type = NET_FLOW_MASK_TYPE_EXACT},
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_DST_MAC,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_VLAN_OUTER,
+	  .header = HEADER_VLAN,
+	  .field = HEADER_VLAN_VID,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_ucast_routing[3] = {
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_ETHERTYPE,
+	  .mask_type = NET_FLOW_MASK_TYPE_EXACT},
+	{ .instance = HEADER_INSTANCE_IPV4,
+	  .header = HEADER_IPV4,
+	  .field = HEADER_IPV4_DST_IP,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_bridge[3] = {
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_DST_MAC,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_VLAN_OUTER,
+	  .header = HEADER_VLAN,
+	  .field = HEADER_VLAN_VID,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_acl[8] = {
+	{ .instance = HEADER_INSTANCE_IN_LPORT,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_IN_LPORT,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_SRC_MAC,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_DST_MAC,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_ETHERNET,
+	  .header = HEADER_ETHERNET,
+	  .field = HEADER_ETHERNET_ETHERTYPE,
+	  .mask_type = NET_FLOW_MASK_TYPE_EXACT},
+	{ .instance = HEADER_INSTANCE_VLAN_OUTER,
+	  .header = HEADER_VLAN,
+	  .field = HEADER_VLAN_VID,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_IPV4,
+	  .header = HEADER_IPV4,
+	  .field = HEADER_IPV4_PROTOCOL,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = HEADER_INSTANCE_IPV4,
+	  .header = HEADER_IPV4,
+	  .field = HEADER_IPV4_DSCP,
+	  .mask_type = NET_FLOW_MASK_TYPE_LPM},
+	{ .instance = 0, .field = 0},
+};
+
+int actions_ig_port[2] = {ACTION_SET_GOTO_TABLE, 0};
+int actions_vlan[3] = {ACTION_SET_GOTO_TABLE, ACTION_SET_VLAN_ID, 0};
+int actions_term_mac[3] = {ACTION_SET_GOTO_TABLE, ACTION_COPY_TO_CPU, 0};
+int actions_ucast_routing[3] = {ACTION_SET_GOTO_TABLE, ACTION_SET_GROUP_ID, 0};
+int actions_bridge[4] = {ACTION_SET_GOTO_TABLE,
+			 ACTION_SET_GROUP_ID,
+			 ACTION_COPY_TO_CPU, 0};
+int actions_acl[2] = {ACTION_SET_GROUP_ID, 0};
+
+enum rocker_flow_table_id_space {
+	ROCKER_FLOW_TABLE_ID_INGRESS_PORT = 1,
+	ROCKER_FLOW_TABLE_ID_VLAN,
+	ROCKER_FLOW_TABLE_ID_TERMINATION_MAC,
+	ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING,
+	ROCKER_FLOW_TABLE_ID_BRIDGING,
+	ROCKER_FLOW_TABLE_ID_ACL_POLICY,
+	ROCKER_FLOW_TABLE_NULL = 0,
+};
+
+struct net_flow_table ingress_port_table = {
+	.name = "ingress_port",
+	.uid = ROCKER_FLOW_TABLE_ID_INGRESS_PORT,
+	.source = 1,
+	.size = -1,
+	.matches = matches_ig_port,
+	.actions = actions_ig_port,
+};
+
+struct net_flow_table vlan_table = {
+	.name = "vlan",
+	.uid = ROCKER_FLOW_TABLE_ID_VLAN,
+	.source = 1,
+	.size = -1,
+	.matches = matches_vlan,
+	.actions = actions_vlan,
+};
+
+struct net_flow_table term_mac_table = {
+	.name = "term_mac",
+	.uid = ROCKER_FLOW_TABLE_ID_TERMINATION_MAC,
+	.source = 1,
+	.size = -1,
+	.matches = matches_term_mac,
+	.actions = actions_term_mac,
+};
+
+struct net_flow_table ucast_routing_table = {
+	.name = "ucast_routing",
+	.uid = ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING,
+	.source = 1,
+	.size = -1,
+	.matches = matches_ucast_routing,
+	.actions = actions_ucast_routing,
+};
+
+struct net_flow_table bridge_table = {
+	.name = "bridge",
+	.uid = ROCKER_FLOW_TABLE_ID_BRIDGING,
+	.source = 1,
+	.size = -1,
+	.matches = matches_bridge,
+	.actions = actions_bridge,
+};
+
+struct net_flow_table acl_table = {
+	.name = "acl",
+	.uid = ROCKER_FLOW_TABLE_ID_ACL_POLICY,
+	.source = 1,
+	.size = -1,
+	.matches = matches_acl,
+	.actions = actions_acl,
+};
+
+struct net_flow_table null_table = {
+	.name = "",
+	.uid = 0,
+	.source = 0,
+	.size = 0,
+	.matches = NULL,
+	.actions = NULL,
+};
+
+struct net_flow_table *rocker_table_list[7] = {
+	&ingress_port_table,
+	&vlan_table,
+	&term_mac_table,
+	&ucast_routing_table,
+	&bridge_table,
+	&acl_table,
+	&null_table,
+};
+
+/* Define the table graph layout */
+struct net_flow_jump_table table_node_ig_port_next[2] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_VLAN},
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_ingress_port = {
+	.uid = ROCKER_FLOW_TABLE_ID_INGRESS_PORT,
+	.jump = table_node_ig_port_next};
+
+struct net_flow_jump_table table_node_vlan_next[2] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_TERMINATION_MAC},
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_vlan = {
+	.uid = ROCKER_FLOW_TABLE_ID_VLAN,
+	.jump = table_node_vlan_next};
+
+struct net_flow_jump_table table_node_term_mac_next[2] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING},
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_term_mac = {
+	.uid = ROCKER_FLOW_TABLE_ID_TERMINATION_MAC,
+	.jump = table_node_term_mac_next};
+
+struct net_flow_jump_table table_node_bridge_next[2] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_ACL_POLICY},
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_bridge = {
+	.uid = ROCKER_FLOW_TABLE_ID_BRIDGING,
+	.jump = table_node_bridge_next};
+
+struct net_flow_jump_table table_node_ucast_routing_next[2] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_ACL_POLICY},
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_ucast_routing = {
+	.uid = ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING,
+	.jump = table_node_ucast_routing_next};
+
+struct net_flow_jump_table table_node_acl_next[1] = {
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_acl = {
+	.uid = ROCKER_FLOW_TABLE_ID_ACL_POLICY,
+	.jump = table_node_acl_next};
+
+struct net_flow_tbl_node table_node_nil = {.uid = 0, .jump = NULL};
+
+struct net_flow_tbl_node *rocker_table_nodes[7] = {
+	&table_node_ingress_port,
+	&table_node_vlan,
+	&table_node_term_mac,
+	&table_node_ucast_routing,
+	&table_node_bridge,
+	&table_node_acl,
+	&table_node_nil,
+};
+#endif /* _MY_PIPELINE_H_ */

* [net-next PATCH v1 05/11] net: rocker: add set flow rules
From: John Fastabend @ 2014-12-31 19:47 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Implement set flow operations for existing rocker tables.
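
Before a flow is programmed, each set operation checks it against the
table's static model. A rough sketch of the match-field part of that
check follows, assuming a hypothetical, reduced struct sketch_field_ref
and invented example data; it mirrors the zero-terminated list
convention of the pipeline model but is not the driver code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of struct net_flow_field_ref. */
struct sketch_field_ref {
	int header;	/* header 0 terminates a list */
	int field;
};

/* Return 1 if the (header, field) pair of f is in the supported list. */
static int sketch_match_supported(const struct sketch_field_ref *f,
				  const struct sketch_field_ref *supported)
{
	int i;

	assert(f != NULL && supported != NULL);
	for (i = 0; supported[i].header; i++) {
		if (f->header == supported[i].header &&
		    f->field == supported[i].field)
			return 1;
	}
	return 0;
}

/* Return 1 only if every match in the flow is supported by the table. */
static int sketch_flow_valid(const struct sketch_field_ref *matches,
			     const struct sketch_field_ref *supported)
{
	int i;

	for (i = 0; matches[i].header; i++) {
		if (!sketch_match_supported(&matches[i], supported))
			return 0;
	}
	return 1;
}

/* Invented example data: the table accepts pairs (4, 1) and (2, 3). */
static const struct sketch_field_ref sketch_supported[] = {
	{ 4, 1 }, { 2, 3 }, { 0, 0 },
};

/* A flow using only a supported pair, and one using an unsupported pair. */
static const struct sketch_field_ref sketch_good_flow[] = {
	{ 2, 3 }, { 0, 0 },
};
static const struct sketch_field_ref sketch_bad_flow[] = {
	{ 3, 13 }, { 0, 0 },
};
```

Note that a check of this shape does not reject a flow that names the
same field twice; a stricter validator would need to track which
fields it has already seen.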

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c          |  517 +++++++++++++++++++++++++
 drivers/net/ethernet/rocker/rocker_pipeline.h |    3 
 2 files changed, 519 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 4c6787a..c40c58d 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3806,6 +3806,520 @@ static struct net_flow_hdr_node **rocker_get_hgraph(struct net_device *d)
 {
 	return rocker_header_nodes;
 }
+
+static int is_valid_net_flow_action_arg(struct net_flow_action *a, int id)
+{
+	struct net_flow_action_arg *args = a->args;
+	int i;
+
+	for (i = 0; args[i].type != NET_FLOW_ACTION_ARG_TYPE_NULL; i++) {
+		if (a->args[i].type == NET_FLOW_ACTION_ARG_TYPE_NULL ||
+		    args[i].type != a->args[i].type)
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int is_valid_net_flow_action(struct net_flow_action *a, int *actions)
+{
+	int i;
+
+	for (i = 0; actions[i]; i++) {
+		if (actions[i] == a->uid)
+			return is_valid_net_flow_action_arg(a, a->uid);
+	}
+	return -EINVAL;
+}
+
+static int is_valid_net_flow_match(struct net_flow_field_ref *f,
+				   struct net_flow_field_ref *fields)
+{
+	int i;
+
+	for (i = 0; fields[i].header; i++) {
+		if (f->header == fields[i].header &&
+		    f->field == fields[i].field)
+			return 0;
+	}
+
+	return -EINVAL;
+}
+
+int is_valid_net_flow(struct net_flow_table *table, struct net_flow_flow *flow)
+{
+	struct net_flow_field_ref *fields = table->matches;
+	int *actions = table->actions;
+	int i, err;
+
+	for (i = 0; flow->actions[i].uid; i++) {
+		err = is_valid_net_flow_action(&flow->actions[i], actions);
+		if (err)
+			return -EINVAL;
+	}
+
+	for (i = 0; flow->matches[i].header; i++) {
+		err = is_valid_net_flow_match(&flow->matches[i], fields);
+		if (err)
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+static u32 rocker_goto_value(u32 id)
+{
+	switch (id) {
+	case ROCKER_FLOW_TABLE_ID_INGRESS_PORT:
+		return ROCKER_OF_DPA_TABLE_ID_INGRESS_PORT;
+	case ROCKER_FLOW_TABLE_ID_VLAN:
+		return ROCKER_OF_DPA_TABLE_ID_VLAN;
+	case ROCKER_FLOW_TABLE_ID_TERMINATION_MAC:
+		return ROCKER_OF_DPA_TABLE_ID_TERMINATION_MAC;
+	case ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING:
+		return ROCKER_OF_DPA_TABLE_ID_UNICAST_ROUTING;
+	case ROCKER_FLOW_TABLE_ID_MULTICAST_ROUTING:
+		return ROCKER_OF_DPA_TABLE_ID_MULTICAST_ROUTING;
+	case ROCKER_FLOW_TABLE_ID_BRIDGING:
+		return ROCKER_OF_DPA_TABLE_ID_BRIDGING;
+	case ROCKER_FLOW_TABLE_ID_ACL_POLICY:
+		return ROCKER_OF_DPA_TABLE_ID_ACL_POLICY;
+	default:
+		return 0;
+	}
+}
+
+static int rocker_flow_set_ig_port(struct net_device *dev,
+				   struct net_flow_flow *flow)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	enum rocker_of_dpa_table_id goto_tbl;
+	u32 in_lport_mask = 0xffff0000;
+	u32 in_lport = 0;
+	int err, flags = 0;
+
+	err = is_valid_net_flow(&ingress_port_table, flow);
+	if (err)
+		return err;
+
+	/* The ingress port table supports only one field/mask/action. This
+	 * simplifies the key construction, and we can assume the values
+	 * are the correct types/mask/action by the valid check above. The
+	 * user could pass multiple matches/actions in a message with the
+	 * same field multiple times; currently the valid test does not
+	 * catch this and we just use the first one specified.
+	 */
+	in_lport = flow->matches[0].value_u32;
+	in_lport_mask = flow->matches[0].mask_u32;
+	goto_tbl = rocker_goto_value(flow->actions[0].args[0].value_u16);
+
+	err = rocker_flow_tbl_ig_port(rocker_port, flags,
+				      in_lport, in_lport_mask,
+				      goto_tbl);
+	return err;
+}
+
+static int rocker_flow_set_vlan(struct net_device *dev,
+				struct net_flow_flow *flow)
+{
+	enum rocker_of_dpa_table_id goto_tbl;
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	int i, err = 0, flags = 0;
+	u32 in_lport;
+	__be16 vlan_id, vlan_id_mask, new_vlan_id;
+	bool untagged, have_in_lport = false;
+
+	err = is_valid_net_flow(&vlan_table, flow);
+	if (err)
+		return err;
+
+	goto_tbl = ROCKER_OF_DPA_TABLE_ID_TERMINATION_MAC;
+
+	/* If user does not specify vid match default to any */
+	vlan_id = 1;
+	vlan_id_mask = 0;
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		switch (flow->matches[i].instance) {
+		case HEADER_INSTANCE_IN_LPORT:
+			in_lport = flow->matches[i].value_u32;
+			have_in_lport = true;
+			break;
+		case HEADER_INSTANCE_VLAN_OUTER:
+			if (flow->matches[i].field != HEADER_VLAN_VID)
+				break;
+
+			vlan_id = htons(flow->matches[i].value_u16);
+			vlan_id_mask = htons(flow->matches[i].mask_u16);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	/* If user does not specify a new vlan id use default vlan id */
+	new_vlan_id = rocker_port_vid_to_vlan(rocker_port, vlan_id, &untagged);
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
+
+		switch (flow->actions[i].uid) {
+		case ACTION_SET_GOTO_TABLE:
+			goto_tbl = rocker_goto_value(arg->value_u16);
+			break;
+		case ACTION_SET_VLAN_ID:
+			new_vlan_id = htons(arg->value_u16);
+			if (new_vlan_id)
+				untagged = false;
+			break;
+		}
+	}
+
+	if (!have_in_lport)
+		return -EINVAL;
+
+	err = rocker_flow_tbl_vlan(rocker_port, flags, in_lport,
+				   vlan_id, vlan_id_mask, goto_tbl,
+				   untagged, new_vlan_id);
+	return err;
+}
+
+static int rocker_flow_set_term_mac(struct net_device *dev,
+				    struct net_flow_flow *flow)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	__be16 vlan_id, vlan_id_mask, ethtype = 0;
+	const u8 *eth_dst, *eth_dst_mask;
+	u32 in_lport, in_lport_mask;
+	int i, err = 0, flags = 0;
+	bool copy_to_cpu;
+
+	eth_dst = NULL;
+	eth_dst_mask = NULL;
+
+	err = is_valid_net_flow(&term_mac_table, flow);
+	if (err)
+		return err;
+
+	/* If user does not specify vid match default to any */
+	vlan_id = rocker_port->internal_vlan_id;
+	vlan_id_mask = 0;
+
+	/* If user does not specify in_lport match default to any */
+	in_lport = rocker_port->lport;
+	in_lport_mask = 0;
+
+	/* If user does not specify a mac address match any */
+	eth_dst = rocker_port->dev->dev_addr;
+	eth_dst_mask = zero_mac;
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		switch (flow->matches[i].instance) {
+		case HEADER_INSTANCE_IN_LPORT:
+			in_lport = flow->matches[i].value_u32;
+			in_lport_mask = flow->matches[i].mask_u32;
+			break;
+		case HEADER_INSTANCE_VLAN_OUTER:
+			if (flow->matches[i].field != HEADER_VLAN_VID)
+				break;
+
+			vlan_id = htons(flow->matches[i].value_u16);
+			vlan_id_mask = htons(flow->matches[i].mask_u16);
+			break;
+		case HEADER_INSTANCE_ETHERNET:
+			switch (flow->matches[i].field) {
+			case HEADER_ETHERNET_DST_MAC:
+				eth_dst = (u8 *)&flow->matches[i].value_u64;
+				eth_dst_mask = (u8 *)&flow->matches[i].mask_u64;
+				break;
+			case HEADER_ETHERNET_ETHERTYPE:
+				ethtype = htons(flow->matches[i].value_u16);
+				break;
+			default:
+				return -EINVAL;
+			}
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	if (!ethtype)
+		return -EINVAL;
+
+	/* By default do not copy to cpu */
+	copy_to_cpu = false;
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		switch (flow->actions[i].uid) {
+		case ACTION_COPY_TO_CPU:
+			copy_to_cpu = true;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	err = rocker_flow_tbl_term_mac(rocker_port, in_lport, in_lport_mask,
+				       ethtype, eth_dst, eth_dst_mask,
+				       vlan_id, vlan_id_mask,
+				       copy_to_cpu, flags);
+	return err;
+}
+
+static int rocker_flow_set_ucast_routing(struct net_device *dev,
+					 struct net_flow_flow *flow)
+{
+	return -EOPNOTSUPP;
+}
+
+static int rocker_flow_set_mcast_routing(struct net_device *dev,
+					 struct net_flow_flow *flow)
+{
+	return -EOPNOTSUPP;
+}
+
+static int rocker_flow_set_bridge(struct net_device *dev,
+				  struct net_flow_flow *flow)
+{
+	enum rocker_of_dpa_table_id goto_tbl;
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	u32 in_lport, in_lport_mask, group_id, tunnel_id;
+	__be16 vlan_id, vlan_id_mask;
+	const u8 *eth_dst, *eth_dst_mask;
+	int i, err = 0, flags = 0;
+	bool copy_to_cpu;
+
+	err = is_valid_net_flow(&bridge_table, flow);
+	if (err)
+		return err;
+
+	goto_tbl = ROCKER_OF_DPA_TABLE_ID_ACL_POLICY;
+
+	/* If user does not specify vid match default to any */
+	vlan_id = rocker_port->internal_vlan_id;
+	vlan_id_mask = 0;
+
+	/* If user does not specify in_lport match default to any */
+	in_lport = rocker_port->lport;
+	in_lport_mask = 0;
+
+	/* If user does not specify a mac address match any */
+	eth_dst = rocker_port->dev->dev_addr;
+	eth_dst_mask = NULL;
+
+	/* No support for tunnel_id yet. */
+	tunnel_id = 0;
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		switch (flow->matches[i].instance) {
+		case HEADER_INSTANCE_IN_LPORT:
+			in_lport = flow->matches[i].value_u32;
+			in_lport_mask = flow->matches[i].mask_u32;
+			break;
+		case HEADER_INSTANCE_VLAN_OUTER:
+			if (flow->matches[i].field != HEADER_VLAN_VID)
+				break;
+
+			vlan_id = htons(flow->matches[i].value_u16);
+			vlan_id_mask = htons(flow->matches[i].mask_u16);
+			break;
+		case HEADER_INSTANCE_ETHERNET:
+			switch (flow->matches[i].field) {
+			case HEADER_ETHERNET_DST_MAC:
+				eth_dst = (u8 *)&flow->matches[i].value_u64;
+				eth_dst_mask = (u8 *)&flow->matches[i].mask_u64;
+				break;
+			default:
+				return -EINVAL;
+			}
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	/* By default do not copy to cpu and skip group assignment */
+	copy_to_cpu = false;
+	group_id = ROCKER_GROUP_NONE;
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
+
+		switch (flow->actions[i].uid) {
+		case ACTION_SET_GOTO_TABLE:
+			goto_tbl = rocker_goto_value(arg->value_u16);
+			break;
+		case ACTION_COPY_TO_CPU:
+			copy_to_cpu = true;
+			break;
+		case ACTION_SET_GROUP_ID:
+			group_id = arg->value_u32;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	/* Ignoring eth_dst_mask; it seems to cause an EINVAL return code */
+	err = rocker_flow_tbl_bridge(rocker_port, flags,
+				     eth_dst, eth_dst_mask,
+				     vlan_id, tunnel_id,
+				     goto_tbl, group_id, copy_to_cpu);
+	return err;
+}
+
+static int rocker_flow_set_acl(struct net_device *dev,
+			       struct net_flow_flow *flow)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	u32 in_lport, in_lport_mask, group_id, tunnel_id;
+	__be16 vlan_id, vlan_id_mask, ethtype = 0;
+	const u8 *eth_dst, *eth_src, *eth_dst_mask, *eth_src_mask;
+	u8 protocol, protocol_mask, dscp, dscp_mask;
+	int i, err = 0, flags = 0;
+
+	err = is_valid_net_flow(&acl_table, flow);
+	if (err)
+		return err;
+
+	/* If user does not specify vid match default to any */
+	vlan_id = rocker_port->internal_vlan_id;
+	vlan_id_mask = 0;
+
+	/* If user does not specify in_lport match default to any */
+	in_lport = rocker_port->lport;
+	in_lport_mask = 0;
+
+	/* If user does not specify a mac address match any */
+	eth_dst = rocker_port->dev->dev_addr;
+	eth_src = zero_mac;
+	eth_dst_mask = NULL;
+	eth_src_mask = NULL;
+
+	/* If user does not set protocol/dscp mask them out */
+	protocol = 0;
+	dscp = 0;
+	protocol_mask = 0;
+	dscp_mask = 0;
+
+	/* No support for tunnel_id yet. */
+	tunnel_id = 0;
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		switch (flow->matches[i].instance) {
+		case HEADER_INSTANCE_IN_LPORT:
+			in_lport = flow->matches[i].value_u32;
+			in_lport_mask = flow->matches[i].mask_u32;
+			break;
+		case HEADER_INSTANCE_VLAN_OUTER:
+			if (flow->matches[i].field != HEADER_VLAN_VID)
+				break;
+
+			vlan_id = htons(flow->matches[i].value_u16);
+			vlan_id_mask = htons(flow->matches[i].mask_u16);
+			break;
+		case HEADER_INSTANCE_ETHERNET:
+			switch (flow->matches[i].field) {
+			case HEADER_ETHERNET_SRC_MAC:
+				eth_src = (u8 *)&flow->matches[i].value_u64;
+				eth_src_mask = (u8 *)&flow->matches[i].mask_u64;
+				break;
+			case HEADER_ETHERNET_DST_MAC:
+				eth_dst = (u8 *)&flow->matches[i].value_u64;
+				eth_dst_mask = (u8 *)&flow->matches[i].mask_u64;
+				break;
+			case HEADER_ETHERNET_ETHERTYPE:
+				ethtype = htons(flow->matches[i].value_u16);
+				break;
+			default:
+				return -EINVAL;
+			}
+			break;
+		case HEADER_INSTANCE_IPV4:
+			switch (flow->matches[i].field) {
+			case HEADER_IPV4_PROTOCOL:
+				protocol = flow->matches[i].value_u8;
+				protocol_mask = flow->matches[i].mask_u8;
+				break;
+			case HEADER_IPV4_DSCP:
+				dscp = flow->matches[i].value_u8;
+				dscp_mask = flow->matches[i].mask_u8;
+				break;
+			default:
+				return -EINVAL;
+			}
+		default:
+			return -EINVAL;
+		}
+	}
+
+	/* By default do not copy to cpu and skip group assignment */
+	group_id = ROCKER_GROUP_NONE;
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		switch (flow->actions[i].uid) {
+		case ACTION_SET_GROUP_ID:
+			group_id = flow->actions[i].args[0].value_u32;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	err = rocker_flow_tbl_acl(rocker_port, flags,
+				  in_lport, in_lport_mask,
+				  eth_src, eth_src_mask,
+				  eth_dst, eth_dst_mask, ethtype,
+				  vlan_id, vlan_id_mask,
+				  protocol, protocol_mask,
+				  dscp, dscp_mask,
+				  group_id);
+	return err;
+}
+
+static int rocker_set_flows(struct net_device *dev,
+			    struct net_flow_flow *flow)
+{
+	int err = -EINVAL;
+
+	if (!flow->matches || !flow->actions)
+		return -EINVAL;
+
+	switch (flow->table_id) {
+	case ROCKER_FLOW_TABLE_ID_INGRESS_PORT:
+		err = rocker_flow_set_ig_port(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_VLAN:
+		err = rocker_flow_set_vlan(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_TERMINATION_MAC:
+		err = rocker_flow_set_term_mac(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING:
+		err = rocker_flow_set_ucast_routing(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_MULTICAST_ROUTING:
+		err = rocker_flow_set_mcast_routing(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_BRIDGING:
+		err = rocker_flow_set_bridge(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_ACL_POLICY:
+		err = rocker_flow_set_acl(dev, flow);
+		break;
+	default:
+		break;
+	}
+
+	return err;
+}
+
+static int rocker_del_flows(struct net_device *dev,
+			    struct net_flow_flow *flow)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 static const struct net_device_ops rocker_port_netdev_ops = {
@@ -3828,6 +4342,9 @@ static const struct net_device_ops rocker_port_netdev_ops = {
 	.ndo_flow_get_actions		= rocker_get_actions,
 	.ndo_flow_get_tbl_graph		= rocker_get_tgraph,
 	.ndo_flow_get_hdr_graph		= rocker_get_hgraph,
+
+	.ndo_flow_set_flows		= rocker_set_flows,
+	.ndo_flow_del_flows		= rocker_del_flows,
 #endif
 };
 
diff --git a/drivers/net/ethernet/rocker/rocker_pipeline.h b/drivers/net/ethernet/rocker/rocker_pipeline.h
index 9544339..701e139 100644
--- a/drivers/net/ethernet/rocker/rocker_pipeline.h
+++ b/drivers/net/ethernet/rocker/rocker_pipeline.h
@@ -527,6 +527,7 @@ enum rocker_flow_table_id_space {
 	ROCKER_FLOW_TABLE_ID_VLAN,
 	ROCKER_FLOW_TABLE_ID_TERMINATION_MAC,
 	ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING,
+	ROCKER_FLOW_TABLE_ID_MULTICAST_ROUTING,
 	ROCKER_FLOW_TABLE_ID_BRIDGING,
 	ROCKER_FLOW_TABLE_ID_ACL_POLICY,
 	ROCKER_FLOW_TABLE_NULL = 0,
@@ -588,7 +589,7 @@ struct net_flow_table acl_table = {
 
 struct net_flow_table null_table = {
 	.name = "",
-	.uid = 0,
+	.uid = ROCKER_FLOW_TABLE_NULL,
 	.source = 0,
 	.size = 0,
 	.matches = NULL,


* [net-next PATCH v1 06/11] net: rocker: add group_id slices and drop explicit goto
From: John Fastabend @ 2014-12-31 19:48 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

This adds the group tables for l3_unicast, l2_rewrite and l2. In
addition to adding the tables, we extend the metadata fields to
support three different group id lookups, one for each table, and
drop the more generic one used previously.

Finally, we can also drop the goto action as it is no longer used.
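
To make the three lookups concrete, here is a minimal sketch of how a
32-bit group id can carry a type in its upper bits so that the L3
unicast, L2 rewrite and L2 interface slices never collide. The bit
layout (type in bits 31:28, VLAN in bits 27:16, port or index in the low
bits), the type values, and every name prefixed with demo_ or DEMO_ are
assumptions for illustration only; the driver's own ROCKER_GROUP_*
macros are the authority.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical group types; the real values live in the rocker headers. */
enum demo_group_type {
	DEMO_GROUP_TYPE_L2_INTERFACE = 0,
	DEMO_GROUP_TYPE_L2_REWRITE = 1,
	DEMO_GROUP_TYPE_L3_UNICAST = 2,
};

#define DEMO_GROUP_TYPE_SHIFT	28
#define DEMO_GROUP_VLAN_SHIFT	16
#define DEMO_GROUP_VLAN_MASK	0x0fffu
#define DEMO_GROUP_PORT_MASK	0xffffu
#define DEMO_GROUP_INDEX_MASK	0x0fffffffu

/* Place the group type in the top four bits of the id. */
static uint32_t demo_group_type_bits(enum demo_group_type type)
{
	return (uint32_t)type << DEMO_GROUP_TYPE_SHIFT;
}

/* L3 unicast and L2 rewrite groups are addressed by a plain index. */
static uint32_t demo_group_l3_unicast(uint32_t index)
{
	return demo_group_type_bits(DEMO_GROUP_TYPE_L3_UNICAST) |
	       (index & DEMO_GROUP_INDEX_MASK);
}

static uint32_t demo_group_l2_rewrite(uint32_t index)
{
	return demo_group_type_bits(DEMO_GROUP_TYPE_L2_REWRITE) |
	       (index & DEMO_GROUP_INDEX_MASK);
}

/* L2 interface groups are addressed by (VLAN, port), host byte order. */
static uint32_t demo_group_l2_interface(uint16_t vlan_id, uint32_t port)
{
	assert(vlan_id <= DEMO_GROUP_VLAN_MASK);
	return demo_group_type_bits(DEMO_GROUP_TYPE_L2_INTERFACE) |
	       ((uint32_t)vlan_id << DEMO_GROUP_VLAN_SHIFT) |
	       (port & DEMO_GROUP_PORT_MASK);
}

/* Recover the slice a group id belongs to. */
static enum demo_group_type demo_group_type_of(uint32_t group_id)
{
	return (enum demo_group_type)(group_id >> DEMO_GROUP_TYPE_SHIFT);
}
```

With a typed id like this, index 5 in the L3 unicast slice and index 5
in the L2 rewrite slice are distinct keys, which is what lets each slice
be exposed as its own table.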

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c          |  192 +++++++++++++++++++-
 drivers/net/ethernet/rocker/rocker_pipeline.h |  235 ++++++++++++++++++-------
 2 files changed, 355 insertions(+), 72 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index c40c58d..8ce9933 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3964,9 +3964,6 @@ static int rocker_flow_set_vlan(struct net_device *dev,
 		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
 
 		switch (flow->actions[i].uid) {
-		case ACTION_SET_GOTO_TABLE:
-			goto_tbl = rocker_goto_value(arg->value_u16);
-			break;
 		case ACTION_SET_VLAN_ID:
 			new_vlan_id = htons(arg->value_u16);
 			if (new_vlan_id)
@@ -4147,14 +4144,11 @@ static int rocker_flow_set_bridge(struct net_device *dev,
 		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
 
 		switch (flow->actions[i].uid) {
-		case ACTION_SET_GOTO_TABLE:
-			goto_tbl = rocker_goto_value(arg->value_u16);
-			break;
 		case ACTION_COPY_TO_CPU:
 			copy_to_cpu = true;
 			break;
-		case ACTION_SET_GROUP_ID:
-			group_id = arg->value_u32;
+		case ACTION_SET_L3_UNICAST_GROUP_ID:
+			group_id = ROCKER_GROUP_L3_UNICAST(arg->value_u32);
 			break;
 		default:
 			return -EINVAL;
@@ -4258,9 +4252,11 @@ static int rocker_flow_set_acl(struct net_device *dev,
 	group_id = ROCKER_GROUP_NONE;
 
 	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
+
 		switch (flow->actions[i].uid) {
-		case ACTION_SET_GROUP_ID:
-			group_id = flow->actions[i].args[0].value_u32;
+		case ACTION_SET_L3_UNICAST_GROUP_ID:
+			group_id = ROCKER_GROUP_L3_UNICAST(arg->value_u32);
 			break;
 		default:
 			return -EINVAL;
@@ -4278,6 +4274,173 @@ static int rocker_flow_set_acl(struct net_device *dev,
 	return err;
 }
 
+static int rocker_flow_set_group_slice_l3_unicast(struct net_device *dev,
+						  struct net_flow_flow *flow)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	struct rocker_group_tbl_entry *entry;
+	int i, flags = 0, err = 0;
+
+	err = is_valid_net_flow(&group_slice_l3_unicast_table, flow);
+	if (err)
+		return err;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		struct net_flow_field_ref *r = &flow->matches[i];
+
+		switch (r->instance) {
+		case HEADER_INSTANCE_L3_UNICAST_GROUP_ID:
+			entry->group_id = ROCKER_GROUP_L3_UNICAST(r->value_u32);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
+
+		switch (flow->actions[i].uid) {
+		case ACTION_SET_ETH_SRC:
+			ether_addr_copy(entry->l3_unicast.eth_src,
+					(u8 *)&arg->value_u64);
+			break;
+		case ACTION_SET_ETH_DST:
+			ether_addr_copy(entry->l3_unicast.eth_dst,
+					(u8 *)&arg->value_u64);
+			break;
+		case ACTION_SET_VLAN_ID:
+			entry->l3_unicast.vlan_id = htons(arg->value_u16);
+			break;
+		case ACTION_CHECK_TTL_DROP:
+			entry->l3_unicast.ttl_check = true;
+			break;
+		case ACTION_SET_L2_REWRITE_GROUP_ID:
+			entry->l3_unicast.group_id =
+				ROCKER_GROUP_L2_REWRITE(arg->value_u32);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	return rocker_group_tbl_do(rocker_port, flags, entry);
+}
+
+static int rocker_flow_set_group_slice_l2_rewrite(struct net_device *dev,
+						  struct net_flow_flow *flow)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	struct rocker_group_tbl_entry *entry;
+	int i, flags = 0, err = 0;
+
+	err = is_valid_net_flow(&group_slice_l2_rewrite_table, flow);
+	if (err)
+		return err;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		struct net_flow_field_ref *r = &flow->matches[i];
+
+		switch (r->instance) {
+		case HEADER_INSTANCE_L2_REWRITE_GROUP_ID:
+			entry->group_id = ROCKER_GROUP_L2_REWRITE(r->value_u32);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		struct net_flow_action_arg *arg = &flow->actions[i].args[0];
+
+		switch (flow->actions[i].uid) {
+		case ACTION_SET_ETH_SRC:
+			ether_addr_copy(entry->l2_rewrite.eth_src,
+					(u8 *)&arg->value_u64);
+			break;
+		case ACTION_SET_ETH_DST:
+			ether_addr_copy(entry->l2_rewrite.eth_dst,
+					(u8 *)&arg->value_u64);
+			break;
+		case ACTION_SET_VLAN_ID:
+			entry->l2_rewrite.vlan_id = htons(arg->value_u16);
+			break;
+		case ACTION_SET_L2_GROUP_ID:
+			entry->l2_rewrite.group_id =
+				ROCKER_GROUP_L2_INTERFACE(arg->value_u32,
+							  rocker_port->lport);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	return rocker_group_tbl_do(rocker_port, flags, entry);
+}
+
+static int rocker_flow_set_group_slice_l2(struct net_device *dev,
+					  struct net_flow_flow *flow)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	struct rocker_group_tbl_entry *entry;
+	int i, flags = 0, err = 0;
+	u32 lport;
+
+	err = is_valid_net_flow(&group_slice_l2_table, flow);
+	if (err)
+		return err;
+
+	entry = kzalloc(sizeof(*entry), rocker_op_flags_gfp(flags));
+	if (!entry)
+		return -ENOMEM;
+
+	lport = rocker_port->lport;
+
+	/* Use the dev lport if we don't have a specified lport instance
+	 * from the user. We need to walk the list once before to extract
+	 * any lport attribute.
+	 */
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		switch (flow->matches[i].instance) {
+		case HEADER_INSTANCE_IN_LPORT:
+			lport = flow->matches[i].value_u32;
+		}
+	}
+
+	for (i = 0; flow->matches && flow->matches[i].instance; i++) {
+		struct net_flow_field_ref *r = &flow->matches[i];
+
+		switch (r->instance) {
+		case HEADER_INSTANCE_L2_GROUP_ID:
+			entry->group_id =
+				ROCKER_GROUP_L2_INTERFACE(r->value_u32, lport);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	for (i = 0; flow->actions && flow->actions[i].uid; i++) {
+		switch (flow->actions[i].uid) {
+		case ACTION_POP_VLAN:
+			entry->l2_interface.pop_vlan = true;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	return rocker_group_tbl_do(rocker_port, flags, entry);
+}
+
 static int rocker_set_flows(struct net_device *dev,
 			    struct net_flow_flow *flow)
 {
@@ -4308,6 +4471,15 @@ static int rocker_set_flows(struct net_device *dev,
 	case ROCKER_FLOW_TABLE_ID_ACL_POLICY:
 		err = rocker_flow_set_acl(dev, flow);
 		break;
+	case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST:
+		err = rocker_flow_set_group_slice_l3_unicast(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE:
+		err = rocker_flow_set_group_slice_l2_rewrite(dev, flow);
+		break;
+	case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2:
+		err = rocker_flow_set_group_slice_l2(dev, flow);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/ethernet/rocker/rocker_pipeline.h b/drivers/net/ethernet/rocker/rocker_pipeline.h
index 701e139..7e689c0 100644
--- a/drivers/net/ethernet/rocker/rocker_pipeline.h
+++ b/drivers/net/ethernet/rocker/rocker_pipeline.h
@@ -111,16 +111,21 @@ struct net_flow_header ipv4 = {
 
 #define HEADER_METADATA_IN_LPORT 1
 #define HEADER_METADATA_GOTO_TBL 2
-#define HEADER_METADATA_GROUP_ID 3
-struct net_flow_field metadata_fields[3] = {
+#define HEADER_METADATA_L3_UNICAST_GROUP_ID	3
+#define HEADER_METADATA_L2_REWRITE_GROUP_ID	4
+#define HEADER_METADATA_L2_GROUP_ID		5
+struct net_flow_field metadata_fields[5] = {
 	{ .name = "in_lport",
 	  .uid = HEADER_METADATA_IN_LPORT,
 	  .bitwidth = 32,},
-	{ .name = "goto_tbl",
-	  .uid = HEADER_METADATA_GOTO_TBL,
-	  .bitwidth = 16,},
-	{ .name = "group_id",
-	  .uid = HEADER_METADATA_GROUP_ID,
+	{ .name = "l3_unicast_group_id",
+	  .uid = HEADER_METADATA_L3_UNICAST_GROUP_ID,
+	  .bitwidth = 32,},
+	{ .name = "l2_rewrite_group_id",
+	  .uid = HEADER_METADATA_L2_REWRITE_GROUP_ID,
+	  .bitwidth = 32,},
+	{ .name = "l2_group_id",
+	  .uid = HEADER_METADATA_L2_GROUP_ID,
 	  .bitwidth = 32,},
 };
 
@@ -128,7 +133,7 @@ struct net_flow_field metadata_fields[3] = {
 struct net_flow_header metadata_t = {
 	.name = "metadata_t",
 	.uid = HEADER_METADATA,
-	.field_sz = 3,
+	.field_sz = 5,
 	.fields = metadata_fields,
 };
 
@@ -157,25 +162,6 @@ struct net_flow_action null_action = {
 	.name = "", .uid = 0, .args = NULL,
 };
 
-struct net_flow_action_arg set_goto_table_args[2] = {
-	{
-		.name = "table",
-		.type = NET_FLOW_ACTION_ARG_TYPE_U16,
-		.value_u16 = 0,
-	},
-	{
-		.name = "",
-		.type = NET_FLOW_ACTION_ARG_TYPE_NULL,
-	},
-};
-
-#define ACTION_SET_GOTO_TABLE 1
-struct net_flow_action set_goto_table = {
-	.name = "set_goto_table",
-	.uid = ACTION_SET_GOTO_TABLE,
-	.args = set_goto_table_args,
-};
-
 struct net_flow_action_arg set_vlan_id_args[2] = {
 	{
 		.name = "vlan_id",
@@ -188,7 +174,7 @@ struct net_flow_action_arg set_vlan_id_args[2] = {
 	},
 };
 
-#define ACTION_SET_VLAN_ID 2
+#define ACTION_SET_VLAN_ID 1
 struct net_flow_action set_vlan_id = {
 	.name = "set_vlan_id",
 	.uid = ACTION_SET_VLAN_ID,
@@ -196,7 +182,7 @@ struct net_flow_action set_vlan_id = {
 };
 
 /* TBD: what is the untagged bool about in vlan table */
-#define ACTION_COPY_TO_CPU 3
+#define ACTION_COPY_TO_CPU 2
 struct net_flow_action copy_to_cpu = {
 	.name = "copy_to_cpu",
 	.uid = ACTION_COPY_TO_CPU,
@@ -215,14 +201,28 @@ struct net_flow_action_arg set_group_id_args[2] = {
 	},
 };
 
-#define ACTION_SET_GROUP_ID 4
-struct net_flow_action set_group_id = {
-	.name = "set_group_id",
-	.uid = ACTION_SET_GROUP_ID,
+#define ACTION_SET_L3_UNICAST_GROUP_ID 3
+struct net_flow_action set_l3_unicast_group_id = {
+	.name = "set_l3_unicast_group_id",
+	.uid = ACTION_SET_L3_UNICAST_GROUP_ID,
 	.args = set_group_id_args,
 };
 
-#define ACTION_POP_VLAN 5
+#define ACTION_SET_L2_REWRITE_GROUP_ID 4
+struct net_flow_action set_l2_rewrite_group_id = {
+	.name = "set_l2_rewrite_group_id",
+	.uid = ACTION_SET_L2_REWRITE_GROUP_ID,
+	.args = set_group_id_args,
+};
+
+#define ACTION_SET_L2_GROUP_ID 5
+struct net_flow_action set_l2_group_id = {
+	.name = "set_l2_group_id",
+	.uid = ACTION_SET_L2_GROUP_ID,
+	.args = set_group_id_args,
+};
+
+#define ACTION_POP_VLAN 6
 struct net_flow_action pop_vlan = {
 	.name = "pop_vlan",
 	.uid = ACTION_POP_VLAN,
@@ -241,7 +241,7 @@ struct net_flow_action_arg set_eth_src_args[2] = {
 	},
 };
 
-#define ACTION_SET_ETH_SRC 6
+#define ACTION_SET_ETH_SRC 7
 struct net_flow_action set_eth_src = {
 	.name = "set_eth_src",
 	.uid = ACTION_SET_ETH_SRC,
@@ -260,7 +260,7 @@ struct net_flow_action_arg set_eth_dst_args[2] = {
 	},
 };
 
-#define ACTION_SET_ETH_DST 7
+#define ACTION_SET_ETH_DST 8
 struct net_flow_action set_eth_dst = {
 	.name = "set_eth_dst",
 	.uid = ACTION_SET_ETH_DST,
@@ -279,21 +279,30 @@ struct net_flow_action_arg set_out_port_args[2] = {
 	},
 };
 
-#define ACTION_SET_OUT_PORT 8
+#define ACTION_SET_OUT_PORT 9
 struct net_flow_action set_out_port = {
 	.name = "set_out_port",
 	.uid = ACTION_SET_OUT_PORT,
 	.args = set_out_port_args,
 };
 
-struct net_flow_action *rocker_action_list[8] = {
-	&set_goto_table,
+#define ACTION_CHECK_TTL_DROP 10
+struct net_flow_action check_ttl_drop = {
+	.name = "check_ttl_drop",
+	.uid = ACTION_CHECK_TTL_DROP,
+	.args = null_args,
+};
+
+struct net_flow_action *rocker_action_list[10] = {
 	&set_vlan_id,
 	&copy_to_cpu,
-	&set_group_id,
+	&set_l3_unicast_group_id,
+	&set_l2_rewrite_group_id,
+	&set_l2_group_id,
 	&pop_vlan,
 	&set_eth_src,
 	&set_eth_dst,
+	&check_ttl_drop,
 	&null_action,
 };
 
@@ -302,8 +311,9 @@ struct net_flow_action *rocker_action_list[8] = {
 #define HEADER_INSTANCE_VLAN_OUTER 2
 #define HEADER_INSTANCE_IPV4 3
 #define HEADER_INSTANCE_IN_LPORT 4
-#define HEADER_INSTANCE_GOTO_TABLE 5
-#define HEADER_INSTANCE_GROUP_ID 6
+#define HEADER_INSTANCE_L3_UNICAST_GROUP_ID 5
+#define HEADER_INSTANCE_L2_REWRITE_GROUP_ID 6
+#define HEADER_INSTANCE_L2_GROUP_ID 7
 
 struct net_flow_jump_table parse_ethernet[3] = {
 	{
@@ -390,29 +400,37 @@ struct net_flow_hdr_node in_lport_header_node = {
 	.jump = terminal_headers,
 };
 
-struct net_flow_hdr_node goto_table_header_node = {
-	.name = "goto_table",
-	.uid = HEADER_INSTANCE_GOTO_TABLE,
+struct net_flow_hdr_node l2_group_id_header_node = {
+	.name = "l2_group_id",
+	.uid = HEADER_INSTANCE_L2_GROUP_ID,
+	.hdrs = metadata_headers,
+	.jump = terminal_headers,
+};
+
+struct net_flow_hdr_node l2_rewrite_group_id_header_node = {
+	.name = "l2_rewrite_group_id",
+	.uid = HEADER_INSTANCE_L2_REWRITE_GROUP_ID,
 	.hdrs = metadata_headers,
 	.jump = terminal_headers,
 };
 
-struct net_flow_hdr_node group_id_header_node = {
-	.name = "group_id",
-	.uid = HEADER_INSTANCE_GROUP_ID,
+struct net_flow_hdr_node l3_unicast_group_id_header_node = {
+	.name = "l3_unicast_group_id",
+	.uid = HEADER_INSTANCE_L3_UNICAST_GROUP_ID,
 	.hdrs = metadata_headers,
 	.jump = terminal_headers,
 };
 
 struct net_flow_hdr_node null_header = {.name = "", .uid = 0,};
 
-struct net_flow_hdr_node *rocker_header_nodes[7] = {
+struct net_flow_hdr_node *rocker_header_nodes[] = {
 	&ethernet_header_node,
 	&vlan_header_node,
 	&ipv4_header_node,
 	&in_lport_header_node,
-	&goto_table_header_node,
-	&group_id_header_node,
+	&l3_unicast_group_id_header_node,
+	&l2_rewrite_group_id_header_node,
+	&l2_group_id_header_node,
 	&null_header,
 };
 
@@ -513,14 +531,46 @@ struct net_flow_field_ref matches_acl[8] = {
 	{ .instance = 0, .field = 0},
 };
 
-int actions_ig_port[2] = {ACTION_SET_GOTO_TABLE, 0};
-int actions_vlan[3] = {ACTION_SET_GOTO_TABLE, ACTION_SET_VLAN_ID, 0};
-int actions_term_mac[3] = {ACTION_SET_GOTO_TABLE, ACTION_COPY_TO_CPU, 0};
-int actions_ucast_routing[3] = {ACTION_SET_GOTO_TABLE, ACTION_SET_GROUP_ID, 0};
-int actions_bridge[4] = {ACTION_SET_GOTO_TABLE,
-			 ACTION_SET_GROUP_ID,
-			 ACTION_COPY_TO_CPU, 0};
-int actions_acl[2] = {ACTION_SET_GROUP_ID, 0};
+struct net_flow_field_ref matches_l3_unicast_group_slice[2] = {
+	{ .instance = HEADER_INSTANCE_L3_UNICAST_GROUP_ID,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_L3_UNICAST_GROUP_ID,
+	  .mask_type = NET_FLOW_MASK_TYPE_EXACT},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_l2_rewrite_group_slice[2] = {
+	{ .instance = HEADER_INSTANCE_L2_REWRITE_GROUP_ID,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_L2_REWRITE_GROUP_ID,
+	  .mask_type = NET_FLOW_MASK_TYPE_EXACT},
+	{ .instance = 0, .field = 0},
+};
+
+struct net_flow_field_ref matches_l2_group_slice[2] = {
+	{ .instance = HEADER_INSTANCE_L2_GROUP_ID,
+	  .header = HEADER_METADATA,
+	  .field = HEADER_METADATA_L2_GROUP_ID,
+	  .mask_type = NET_FLOW_MASK_TYPE_EXACT},
+	{ .instance = 0, .field = 0},
+};
+
+int actions_ig_port[] = {0};
+int actions_vlan[] = {ACTION_SET_VLAN_ID, 0};
+int actions_term_mac[] = {ACTION_COPY_TO_CPU, 0};
+int actions_ucast_routing[] = {ACTION_SET_L3_UNICAST_GROUP_ID, 0};
+int actions_bridge[] = {ACTION_SET_L2_GROUP_ID, ACTION_COPY_TO_CPU, 0};
+int actions_acl[] = {ACTION_SET_L3_UNICAST_GROUP_ID, 0};
+int actions_group_slice_l3_unicast[] = {ACTION_SET_ETH_SRC,
+					ACTION_SET_ETH_DST,
+					ACTION_SET_VLAN_ID,
+					ACTION_SET_L2_REWRITE_GROUP_ID,
+					ACTION_CHECK_TTL_DROP, 0};
+int actions_group_slice_l2_rewrite[] = {ACTION_SET_ETH_SRC,
+					ACTION_SET_ETH_DST,
+					ACTION_SET_VLAN_ID,
+					ACTION_SET_L2_GROUP_ID, 0};
+int actions_group_slice_l2[] = {ACTION_POP_VLAN, 0};
 
 enum rocker_flow_table_id_space {
 	ROCKER_FLOW_TABLE_ID_INGRESS_PORT = 1,
@@ -530,6 +580,9 @@ enum rocker_flow_table_id_space {
 	ROCKER_FLOW_TABLE_ID_MULTICAST_ROUTING,
 	ROCKER_FLOW_TABLE_ID_BRIDGING,
 	ROCKER_FLOW_TABLE_ID_ACL_POLICY,
+	ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST,
+	ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE,
+	ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2,
 	ROCKER_FLOW_TABLE_NULL = 0,
 };
 
@@ -587,6 +640,33 @@ struct net_flow_table acl_table = {
 	.actions = actions_acl,
 };
 
+struct net_flow_table group_slice_l3_unicast_table = {
+	.name = "group_slice_l3_unicast",
+	.uid = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST,
+	.source = 1,
+	.size = -1,
+	.matches = matches_l3_unicast_group_slice,
+	.actions = actions_group_slice_l3_unicast,
+};
+
+struct net_flow_table group_slice_l2_rewrite_table = {
+	.name = "group_slice_l2_rewrite",
+	.uid = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE,
+	.source = 1,
+	.size = -1,
+	.matches = matches_l2_rewrite_group_slice,
+	.actions = actions_group_slice_l2_rewrite,
+};
+
+struct net_flow_table group_slice_l2_table = {
+	.name = "group_slice_l2",
+	.uid = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2,
+	.source = 1,
+	.size = -1,
+	.matches = matches_l2_group_slice,
+	.actions = actions_group_slice_l2,
+};
+
 struct net_flow_table null_table = {
 	.name = "",
 	.uid = ROCKER_FLOW_TABLE_NULL,
@@ -596,13 +676,16 @@ struct net_flow_table null_table = {
 	.actions = NULL,
 };
 
-struct net_flow_table *rocker_table_list[7] = {
+struct net_flow_table *rocker_table_list[10] = {
 	&ingress_port_table,
 	&vlan_table,
 	&term_mac_table,
 	&ucast_routing_table,
 	&bridge_table,
 	&acl_table,
+	&group_slice_l3_unicast_table,
+	&group_slice_l2_rewrite_table,
+	&group_slice_l2_table,
 	&null_table,
 };
 
@@ -652,7 +735,8 @@ struct net_flow_tbl_node table_node_ucast_routing = {
 	.uid = ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING,
 	.jump = table_node_ucast_routing_next};
 
-struct net_flow_jump_table table_node_acl_next[1] = {
+struct net_flow_jump_table table_node_acl_next[2] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST},
 	{ .field = {0}, .node = 0},
 };
 
@@ -660,15 +744,42 @@ struct net_flow_tbl_node table_node_acl = {
 	.uid = ROCKER_FLOW_TABLE_ID_ACL_POLICY,
 	.jump = table_node_acl_next};
 
+struct net_flow_jump_table table_node_group_l3_unicast_next[1] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE},
+};
+
+struct net_flow_tbl_node table_node_group_l3_unicast = {
+	.uid = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST,
+	.jump = table_node_group_l3_unicast_next};
+
+struct net_flow_jump_table table_node_group_l2_rewrite_next[1] = {
+	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2},
+};
+
+struct net_flow_tbl_node table_node_group_l2_rewrite = {
+	.uid = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE,
+	.jump = table_node_group_l2_rewrite_next};
+
+struct net_flow_jump_table table_node_group_l2_next[1] = {
+	{ .field = {0}, .node = 0},
+};
+
+struct net_flow_tbl_node table_node_group_l2 = {
+	.uid = ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2,
+	.jump = table_node_group_l2_next};
+
 struct net_flow_tbl_node table_node_nil = {.uid = 0, .jump = NULL};
 
-struct net_flow_tbl_node *rocker_table_nodes[7] = {
+struct net_flow_tbl_node *rocker_table_nodes[10] = {
 	&table_node_ingress_port,
 	&table_node_vlan,
 	&table_node_term_mac,
 	&table_node_ucast_routing,
 	&table_node_bridge,
 	&table_node_acl,
+	&table_node_group_l3_unicast,
+	&table_node_group_l2_rewrite,
+	&table_node_group_l2,
 	&table_node_nil,
 };
 #endif /*_MY_PIPELINE_H*/


* [net-next PATCH v1 07/11] net: rocker: add multicast path to bridging
From: John Fastabend @ 2014-12-31 19:48 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Add a path in the table graph to send multicast packets to the bridge table.
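
As a rough illustration of how such a path is selected, the sketch
below walks a zero-terminated jump list and takes the first entry whose
masked destination MAC matches. It assumes the 0x1 value/mask pair in
the patch is meant to test the Ethernet group (multicast) bit, which is
the least-significant bit of the first octet; that reading, the per-octet
mask layout, and all demo_ names are assumptions, not the flow API's
actual types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_ALEN 6

enum demo_table {
	DEMO_TABLE_NONE = 0,
	DEMO_TABLE_UNICAST_ROUTING,
	DEMO_TABLE_BRIDGING,
};

/* Hypothetical jump entry: masked dst MAC match plus the next table. */
struct demo_jump {
	uint8_t value[DEMO_ETH_ALEN];
	uint8_t mask[DEMO_ETH_ALEN];
	enum demo_table node;
};

/* Multicast goes to bridging, everything else to unicast routing. */
static const struct demo_jump demo_term_mac_next[] = {
	{ .value = {0x01}, .mask = {0x01}, .node = DEMO_TABLE_BRIDGING },
	{ .node = DEMO_TABLE_UNICAST_ROUTING },	/* all-zero mask: match any */
	{ .node = DEMO_TABLE_NONE },		/* terminator */
};

/* Compare only the bits selected by the entry's mask. */
static int demo_jump_matches(const struct demo_jump *j, const uint8_t *dst)
{
	size_t i;

	for (i = 0; i < DEMO_ETH_ALEN; i++)
		if ((dst[i] & j->mask[i]) != (j->value[i] & j->mask[i]))
			return 0;
	return 1;
}

/* Return the first matching next table, or DEMO_TABLE_NONE. */
static enum demo_table demo_next_table(const struct demo_jump *jumps,
				       const uint8_t *dst)
{
	assert(jumps && dst);
	for (; jumps->node != DEMO_TABLE_NONE; jumps++)
		if (demo_jump_matches(jumps, dst))
			return jumps->node;
	return DEMO_TABLE_NONE;
}
```

Ordering matters in this sketch: the multicast entry has to come before
the catch-all entry, which mirrors the order used in the jump table in
the patch.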

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker_pipeline.h |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/rocker/rocker_pipeline.h b/drivers/net/ethernet/rocker/rocker_pipeline.h
index 7e689c0..0835bcc 100644
--- a/drivers/net/ethernet/rocker/rocker_pipeline.h
+++ b/drivers/net/ethernet/rocker/rocker_pipeline.h
@@ -708,7 +708,15 @@ struct net_flow_tbl_node table_node_vlan = {
 	.uid = ROCKER_FLOW_TABLE_ID_VLAN,
 	.jump = table_node_vlan_next};
 
-struct net_flow_jump_table table_node_term_mac_next[2] = {
+struct net_flow_jump_table table_node_term_mac_next[3] = {
+	{ .field = {.instance = HEADER_INSTANCE_ETHERNET,
+		    .header = HEADER_ETHERNET,
+		    .field = HEADER_ETHERNET_DST_MAC,
+		    .mask_type = NET_FLOW_MASK_TYPE_LPM,
+		    .type = NET_FLOW_FIELD_REF_ATTR_TYPE_U64,
+		    .value_u64 = (__u64)0x1,
+		    .mask_u64 = (__u64)0x1,
+	}, .node = ROCKER_FLOW_TABLE_ID_BRIDGING},
 	{ .field = {0}, .node = ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING},
 	{ .field = {0}, .node = 0},
 };


* [net-next PATCH v1 08/11] net: rocker: add get flow API operation
From: John Fastabend @ 2014-12-31 19:48 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Add operations to get flows. I wouldn't mind cleaning this code
up a bit, but my first attempt to do this used macros, which shortened
the code, but when I was done I decided it just made the code
unreadable and unmaintainable.

I might think about it a bit more, but this implementation, albeit
a bit long and repetitive, is easier to understand IMO.
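
For readers skimming the diff, the repeated pattern in each conversion
helper is roughly the following: count the key fields that are set,
allocate one extra zeroed slot as the terminator, then fill the slots in
order. This is a user-space sketch with hypothetical demo_ types and
calloc() standing in for kcalloc(); it is not the driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum {
	DEMO_INSTANCE_IN_LPORT = 1,
	DEMO_INSTANCE_VLAN = 2,
};

/* Hypothetical stand-ins for the hardware key and the API match entry. */
struct demo_key {
	uint32_t in_lport;
	uint16_t vlan_id;
	uint16_t vlan_id_mask;
};

struct demo_match {
	int instance;		/* 0 marks the end of the array */
	uint32_t value;
	uint32_t mask;
};

/* Build a zero-terminated match array from a key; caller frees it. */
static struct demo_match *demo_key_to_matches(const struct demo_key *key)
{
	struct demo_match *m;
	size_t cnt = 0, i = 0;

	assert(key);

	/* First pass: count the fields that are actually in use. */
	if (key->in_lport)
		cnt++;
	if (key->vlan_id)
		cnt++;

	/* One extra slot; calloc() zeroes it, so it is the terminator. */
	m = calloc(cnt + 1, sizeof(*m));
	if (!m)
		return NULL;

	/* Second pass: fill the slots in a fixed order. */
	if (key->in_lport) {
		m[i].instance = DEMO_INSTANCE_IN_LPORT;
		m[i].value = key->in_lport;
		m[i].mask = 0xffffffffu;
		i++;
	}
	if (key->vlan_id) {
		m[i].instance = DEMO_INSTANCE_VLAN;
		m[i].value = key->vlan_id;
		m[i].mask = key->vlan_id_mask;
		i++;
	}

	return m;
}

/* Count entries up to, but not including, the terminator. */
static size_t demo_match_count(const struct demo_match *m)
{
	size_t n = 0;

	while (m && m[n].instance)
		n++;
	return n;
}
```

Because the allocation is zeroed, no explicit memset of the last slot is
needed in this sketch; a reader walks the array until it sees an
instance of 0.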

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c |  819 ++++++++++++++++++++++++++++++++++
 1 file changed, 819 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 8ce9933..997beb9 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3884,6 +3884,12 @@ static u32 rocker_goto_value(u32 id)
 		return ROCKER_OF_DPA_TABLE_ID_BRIDGING;
 	case ROCKER_FLOW_TABLE_ID_ACL_POLICY:
 		return ROCKER_OF_DPA_TABLE_ID_ACL_POLICY;
+	case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST:
+		return ROCKER_OF_DPA_GROUP_TYPE_L3_UCAST;
+	case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE:
+		return ROCKER_OF_DPA_GROUP_TYPE_L2_REWRITE;
+	case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2:
+		return ROCKER_OF_DPA_GROUP_TYPE_L2_INTERFACE;
 	default:
 		return 0;
 	}
@@ -4492,6 +4498,818 @@ static int rocker_del_flows(struct net_device *dev,
 {
 	return -EOPNOTSUPP;
 }
+
+static int rocker_ig_port_to_flow(struct rocker_flow_tbl_key *key,
+				  struct net_flow_flow *flow)
+{
+	flow->matches = kcalloc(2, sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	flow->matches[0].instance = HEADER_INSTANCE_IN_LPORT;
+	flow->matches[0].header = HEADER_METADATA;
+	flow->matches[0].field = HEADER_METADATA_IN_LPORT;
+	flow->matches[0].mask_type = NET_FLOW_MASK_TYPE_LPM;
+	flow->matches[0].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+	flow->matches[0].value_u32 = key->ig_port.in_lport;
+	flow->matches[0].mask_u32 = key->ig_port.in_lport_mask;
+	memset(&flow->matches[1], 0, sizeof(flow->matches[1]));
+	return 0;
+}
+
+static int rocker_vlan_to_flow(struct rocker_flow_tbl_key *key,
+			       struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	if (key->vlan.in_lport)
+		cnt++;
+	if (key->vlan.vlan_id)
+		cnt++;
+
+	flow->matches = kcalloc((cnt + 1),
+				sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	cnt = 0;
+	if (key->vlan.in_lport) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_IN_LPORT;
+		flow->matches[cnt].header = HEADER_METADATA;
+		flow->matches[cnt].field = HEADER_METADATA_IN_LPORT;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+		flow->matches[cnt].value_u32 = key->vlan.in_lport;
+		cnt++;
+	}
+
+	if (key->vlan.vlan_id) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_VLAN_OUTER;
+		flow->matches[cnt].header = HEADER_VLAN;
+		flow->matches[cnt].field = HEADER_VLAN_VID;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 = ntohs(key->vlan.vlan_id);
+		flow->matches[cnt].mask_u16 = ntohs(key->vlan.vlan_id_mask);
+		cnt++;
+	}
+	memset(&flow->matches[cnt], 0, sizeof(flow->matches[cnt]));
+
+	flow->actions = kcalloc(2,
+				sizeof(struct net_flow_action),
+				GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].args = kcalloc(2, sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+	if (!flow->actions[0].args) {
+		kfree(flow->matches);
+		kfree(flow->actions);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].uid = ACTION_SET_VLAN_ID;
+	flow->actions[0].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U16;
+	flow->actions[0].args[0].value_u16 = ntohs(key->vlan.new_vlan_id);
+
+	memset(&flow->actions[1], 0, sizeof(flow->actions[1]));
+	memset(&flow->actions[0].args[1], 0,
+	       sizeof(struct net_flow_action_arg));
+
+	return 0;
+}
+
+static int rocker_term_to_flow(struct rocker_flow_tbl_key *key,
+			       struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	if (key->term_mac.in_lport)
+		cnt++;
+	if (key->term_mac.eth_type)
+		cnt++;
+	if (key->term_mac.eth_dst)
+		cnt++;
+	if (key->term_mac.vlan_id)
+		cnt++;
+
+	flow->matches = kcalloc((cnt + 1), sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	cnt = 0;
+	if (key->term_mac.in_lport) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_IN_LPORT;
+		flow->matches[cnt].header = HEADER_METADATA;
+		flow->matches[cnt].field = HEADER_METADATA_IN_LPORT;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+		flow->matches[cnt].value_u32 = key->term_mac.in_lport;
+		flow->matches[cnt].mask_u32 = key->term_mac.in_lport_mask;
+		cnt++;
+	}
+
+	if (key->term_mac.eth_type) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_ETHERTYPE;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 = ntohs(key->term_mac.eth_type);
+		cnt++;
+	}
+
+	if (key->term_mac.eth_dst) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_DST_MAC;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U64;
+		memcpy(&flow->matches[cnt].value_u64,
+		       key->term_mac.eth_dst, ETH_ALEN);
+		memcpy(&flow->matches[cnt].mask_u64,
+		       key->term_mac.eth_dst_mask, ETH_ALEN);
+		cnt++;
+	}
+
+	if (key->term_mac.vlan_id) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_VLAN_OUTER;
+		flow->matches[cnt].header = HEADER_VLAN;
+		flow->matches[cnt].field = HEADER_VLAN_VID;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 = ntohs(key->term_mac.vlan_id);
+		flow->matches[cnt].mask_u16 = ntohs(key->term_mac.vlan_id_mask);
+		cnt++;
+	}
+
+	memset(&flow->matches[cnt], 0, sizeof(flow->matches[cnt]));
+
+	flow->actions = kmalloc(2 * sizeof(struct net_flow_action), GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].args = NULL;
+	flow->actions[0].uid = ACTION_COPY_TO_CPU;
+	memset(&flow->actions[1], 0, sizeof(flow->actions[1]));
+
+	return 0;
+}
+
+static int rocker_ucast_to_flow(struct rocker_flow_tbl_key *key,
+				struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	if (key->ucast_routing.eth_type)
+		cnt++;
+	if (key->ucast_routing.dst4)
+		cnt++;
+
+	flow->matches = kcalloc((cnt + 1), sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	cnt = 0;
+
+	if (key->ucast_routing.eth_type) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_ETHERTYPE;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 =
+				ntohs(key->ucast_routing.eth_type);
+		cnt++;
+	}
+
+	if (key->ucast_routing.dst4) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_IPV4;
+		flow->matches[cnt].header = HEADER_IPV4;
+		flow->matches[cnt].field = HEADER_IPV4_DST_IP;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+		flow->matches[cnt].value_u32 = key->ucast_routing.dst4;
+		flow->matches[cnt].mask_u32 = key->ucast_routing.dst4_mask;
+		cnt++;
+	}
+
+	memset(&flow->matches[cnt], 0, sizeof(flow->matches[cnt]));
+
+	flow->actions = kmalloc(2 * sizeof(struct net_flow_action), GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].args = kcalloc(2, sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+	if (!flow->actions[0].args) {
+		kfree(flow->matches);
+		kfree(flow->actions);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].uid = ACTION_SET_L3_UNICAST_GROUP_ID;
+	flow->actions[0].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U32;
+	flow->actions[0].args[0].value_u32 = key->ucast_routing.group_id;
+
+	memset(&flow->actions[1], 0, sizeof(flow->actions[1]));
+	memset(&flow->actions[0].args[1], 0,
+	       sizeof(struct net_flow_action_arg));
+
+	return 0;
+}
+
+static int rocker_bridge_to_flow(struct rocker_flow_tbl_key *key,
+				 struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	if (key->bridge.eth_dst)
+		cnt++;
+	if (key->bridge.vlan_id)
+		cnt++;
+
+	flow->matches = kcalloc((cnt + 1), sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	cnt = 0;
+
+	if (key->bridge.eth_dst) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_DST_MAC;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U64;
+		memcpy(&flow->matches[cnt].value_u64,
+		       key->bridge.eth_dst, ETH_ALEN);
+		memcpy(&flow->matches[cnt].mask_u64,
+		       key->bridge.eth_dst_mask, ETH_ALEN);
+		cnt++;
+	}
+
+	if (key->bridge.vlan_id) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_VLAN_OUTER;
+		flow->matches[cnt].header = HEADER_VLAN;
+		flow->matches[cnt].field = HEADER_VLAN_VID;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 = ntohs(key->bridge.vlan_id);
+		cnt++;
+	}
+
+	memset(&flow->matches[cnt], 0, sizeof(flow->matches[cnt]));
+
+	cnt = 0;
+	if (key->bridge.group_id)
+		cnt++;
+	if (key->bridge.copy_to_cpu)
+		cnt++;
+
+	flow->actions = kcalloc((cnt + 1), sizeof(struct net_flow_action),
+				GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	cnt = 0;
+	if (key->bridge.group_id) {
+		flow->actions[cnt].args =
+				kcalloc(2,
+					sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+		if (!flow->actions[cnt].args) {
+			kfree(flow->matches);
+			kfree(flow->actions);
+			return -ENOMEM;
+		}
+
+		flow->actions[cnt].uid = ACTION_SET_L3_UNICAST_GROUP_ID;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U32;
+		flow->actions[cnt].args[0].value_u32 = key->bridge.group_id;
+		cnt++;
+	}
+
+	if (key->bridge.copy_to_cpu) {
+		flow->actions[cnt].uid = ACTION_COPY_TO_CPU;
+		flow->actions[cnt].args = NULL;
+		cnt++;
+	}
+
+	memset(&flow->actions[cnt], 0, sizeof(flow->actions[1]));
+	return 0;
+}
+
+static int rocker_acl_to_flow(struct rocker_flow_tbl_key *key,
+			      struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	if (key->acl.in_lport)
+		cnt++;
+	if (key->acl.eth_src)
+		cnt++;
+	if (key->acl.eth_dst)
+		cnt++;
+	if (key->acl.eth_type)
+		cnt++;
+	if (key->acl.vlan_id)
+		cnt++;
+	if (key->acl.ip_proto)
+		cnt++;
+	if (key->acl.ip_tos)
+		cnt++;
+
+	flow->matches = kcalloc((cnt + 1), sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	cnt = 0;
+
+	if (key->acl.in_lport) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_IN_LPORT;
+		flow->matches[cnt].header = HEADER_METADATA;
+		flow->matches[cnt].field = HEADER_METADATA_IN_LPORT;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+		flow->matches[cnt].value_u32 = key->acl.in_lport;
+		flow->matches[cnt].mask_u32 = key->acl.in_lport_mask;
+		cnt++;
+	}
+
+	if (key->acl.eth_src) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_SRC_MAC;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U64;
+		flow->matches[cnt].value_u64 = *key->acl.eth_src;
+		flow->matches[cnt].mask_u64 = *key->acl.eth_src_mask;
+		cnt++;
+	}
+
+	if (key->acl.eth_dst) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_DST_MAC;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U64;
+		memcpy(&flow->matches[cnt].value_u64,
+		       key->acl.eth_dst, ETH_ALEN);
+		memcpy(&flow->matches[cnt].mask_u64,
+		       key->acl.eth_dst_mask, ETH_ALEN);
+		cnt++;
+	}
+
+	if (key->acl.eth_type) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_ETHERNET;
+		flow->matches[cnt].header = HEADER_ETHERNET;
+		flow->matches[cnt].field = HEADER_ETHERNET_ETHERTYPE;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 = ntohs(key->acl.eth_type);
+		cnt++;
+	}
+
+	if (key->acl.vlan_id) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_VLAN_OUTER;
+		flow->matches[cnt].header = HEADER_VLAN;
+		flow->matches[cnt].field = HEADER_VLAN_VID;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U16;
+		flow->matches[cnt].value_u16 = ntohs(key->acl.vlan_id);
+		cnt++;
+	}
+
+	if (key->acl.ip_proto) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_IPV4;
+		flow->matches[cnt].header = HEADER_IPV4;
+		flow->matches[cnt].field = HEADER_IPV4_PROTOCOL;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U8;
+		flow->matches[cnt].value_u8 = key->acl.ip_proto;
+		flow->matches[cnt].mask_u8 = key->acl.ip_proto_mask;
+		cnt++;
+	}
+
+	if (key->acl.ip_tos) {
+		flow->matches[cnt].instance = HEADER_INSTANCE_IPV4;
+		flow->matches[cnt].header = HEADER_IPV4;
+		flow->matches[cnt].field = HEADER_IPV4_DSCP;
+		flow->matches[cnt].mask_type = NET_FLOW_MASK_TYPE_LPM;
+		flow->matches[cnt].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U8;
+		flow->matches[cnt].value_u8 = key->acl.ip_tos;
+		flow->matches[cnt].mask_u8 = key->acl.ip_tos_mask;
+		cnt++;
+	}
+
+	memset(&flow->matches[cnt], 0, sizeof(flow->matches[cnt]));
+
+	flow->actions = kcalloc(2,
+				sizeof(struct net_flow_action),
+				GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].args = kcalloc(2,
+					sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+	if (!flow->actions[0].args) {
+		kfree(flow->matches);
+		kfree(flow->actions);
+		return -ENOMEM;
+	}
+
+	flow->actions[0].uid = ACTION_SET_L3_UNICAST_GROUP_ID;
+	flow->actions[0].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U32;
+	flow->actions[0].args[0].value_u32 = key->acl.group_id;
+
+	memset(&flow->actions[0].args[1], 0,
+	       sizeof(struct net_flow_action_arg));
+	memset(&flow->actions[1], 0, sizeof(flow->actions[1]));
+	return 0;
+}
+
+static int rocker_l3_unicast_to_flow(struct rocker_group_tbl_entry *entry,
+				     struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	flow->matches = kcalloc(2, sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	flow->matches[0].instance = HEADER_INSTANCE_L3_UNICAST_GROUP_ID;
+	flow->matches[0].header = HEADER_METADATA;
+	flow->matches[0].field = HEADER_METADATA_L3_UNICAST_GROUP_ID;
+	flow->matches[0].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+	flow->matches[0].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+	flow->matches[0].value_u32 = ~ROCKER_GROUP_TYPE_MASK & entry->group_id;
+
+	memset(&flow->matches[1], 0, sizeof(flow->matches[cnt]));
+
+	if (entry->l3_unicast.eth_src)
+		cnt++;
+	if (entry->l3_unicast.eth_dst)
+		cnt++;
+	if (entry->l3_unicast.vlan_id)
+		cnt++;
+	if (entry->l3_unicast.ttl_check)
+		cnt++;
+	if (entry->l3_unicast.group_id)
+		cnt++;
+
+	flow->actions = kcalloc(cnt + 1, sizeof(struct net_flow_action),
+				GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	cnt = 0;
+
+	if (entry->l3_unicast.eth_src) {
+		flow->actions[cnt].args =
+				kcalloc(2,
+					sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_ETH_SRC;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U64;
+		ether_addr_copy(&flow->actions[cnt].args[0].value_u64,
+				entry->l3_unicast.eth_src);
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	if (entry->l3_unicast.eth_dst) {
+		flow->actions[cnt].args =
+			kcalloc(2,
+				sizeof(struct net_flow_action_arg),
+				GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_ETH_DST;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U64;
+		ether_addr_copy(&flow->actions[cnt].args[0].value_u64,
+				entry->l3_unicast.eth_dst);
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	if (entry->l3_unicast.vlan_id) {
+		flow->actions[cnt].args =
+				kcalloc(2,
+					sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_VLAN_ID;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U16;
+		flow->actions[cnt].args[0].value_u16 =
+					ntohs(entry->l3_unicast.vlan_id);
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	if (entry->l3_unicast.ttl_check) {
+		flow->actions[cnt].uid = ACTION_CHECK_TTL_DROP;
+		flow->actions[cnt].args = NULL;
+		cnt++;
+	}
+
+	if (entry->l3_unicast.group_id) {
+		flow->actions[cnt].args =
+				kcalloc(2,
+					sizeof(struct net_flow_action_arg),
+					GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_L2_GROUP_ID;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U32;
+		flow->actions[cnt].args[0].value_u32 =
+						entry->l3_unicast.group_id;
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	memset(&flow->actions[cnt], 0, sizeof(flow->actions[cnt]));
+	return 0;
+unwind_args:
+	kfree(flow->matches);
+	for (cnt--; cnt >= 0; cnt--)
+		kfree(flow->actions[cnt].args);
+	kfree(flow->actions);
+	return -ENOMEM;
+}
+
+static int rocker_l2_rewrite_to_flow(struct rocker_group_tbl_entry *entry,
+				     struct net_flow_flow *flow)
+{
+	int cnt = 0;
+
+	flow->matches = kcalloc(2, sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	flow->matches[0].instance = HEADER_INSTANCE_L2_REWRITE_GROUP_ID;
+	flow->matches[0].header = HEADER_METADATA;
+	flow->matches[0].field = HEADER_METADATA_L2_REWRITE_GROUP_ID;
+	flow->matches[0].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+	flow->matches[0].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+	flow->matches[0].value_u32 = ~ROCKER_GROUP_TYPE_MASK & entry->group_id;
+
+	memset(&flow->matches[1], 0, sizeof(flow->matches[cnt]));
+
+	if (entry->l2_rewrite.eth_src)
+		cnt++;
+	if (entry->l2_rewrite.eth_dst)
+		cnt++;
+	if (entry->l2_rewrite.vlan_id)
+		cnt++;
+	if (entry->l2_rewrite.group_id)
+		cnt++;
+
+	flow->actions = kcalloc(cnt + 1, sizeof(struct net_flow_action),
+				GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	cnt = 0;
+
+	if (entry->l2_rewrite.eth_src) {
+		flow->actions[cnt].args =
+			kmalloc(2 * sizeof(struct net_flow_action_arg),
+				GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_ETH_SRC;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U64;
+		ether_addr_copy(&flow->actions[cnt].args[0].value_u64,
+				entry->l2_rewrite.eth_src);
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	if (entry->l2_rewrite.eth_dst) {
+		flow->actions[cnt].args =
+			kmalloc(2 * sizeof(struct net_flow_action_arg),
+				GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_ETH_DST;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U64;
+		ether_addr_copy(&flow->actions[cnt].args[0].value_u64,
+				entry->l2_rewrite.eth_dst);
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	if (entry->l2_rewrite.vlan_id) {
+		flow->actions[cnt].args =
+			kmalloc(2 * sizeof(struct net_flow_action_arg),
+				GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_VLAN_ID;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U16;
+		flow->actions[cnt].args[0].value_u16 =
+					ntohs(entry->l2_rewrite.vlan_id);
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	if (entry->l2_rewrite.group_id) {
+		flow->actions[cnt].args =
+			kmalloc(2 * sizeof(struct net_flow_action_arg),
+				GFP_KERNEL);
+
+		if (!flow->actions[cnt].args)
+			goto unwind_args;
+
+		flow->actions[cnt].uid = ACTION_SET_L2_GROUP_ID;
+		flow->actions[cnt].args[0].type = NET_FLOW_ACTION_ARG_TYPE_U32;
+		flow->actions[cnt].args[0].value_u32 =
+			entry->l2_rewrite.group_id;
+		memset(&flow->actions[cnt].args[1], 0,
+		       sizeof(struct net_flow_action_arg));
+		cnt++;
+	}
+
+	memset(&flow->actions[cnt], 0, sizeof(flow->actions[cnt]));
+	return 0;
+unwind_args:
+	kfree(flow->matches);
+	for (cnt--; cnt >= 0; cnt--)
+		kfree(flow->actions[cnt].args);
+	kfree(flow->actions);
+	return -ENOMEM;
+}
+
+static int rocker_l2_interface_to_flow(struct rocker_group_tbl_entry *entry,
+				       struct net_flow_flow *flow)
+{
+	flow->matches = kmalloc(2 * sizeof(struct net_flow_field_ref),
+				GFP_KERNEL);
+	if (!flow->matches)
+		return -ENOMEM;
+
+	flow->matches[0].instance = HEADER_INSTANCE_L2_GROUP_ID;
+	flow->matches[0].header = HEADER_METADATA;
+	flow->matches[0].field = HEADER_METADATA_L2_GROUP_ID;
+	flow->matches[0].mask_type = NET_FLOW_MASK_TYPE_EXACT;
+	flow->matches[0].type = NET_FLOW_FIELD_REF_ATTR_TYPE_U32;
+	flow->matches[0].value_u32 = ~ROCKER_GROUP_TYPE_MASK & entry->group_id;
+
+	memset(&flow->matches[1], 0, sizeof(flow->matches[1]));
+
+	if (!entry->l2_interface.pop_vlan) {
+		flow->actions = NULL;
+		return 0;
+	}
+
+	flow->actions = kmalloc(2 * sizeof(struct net_flow_action), GFP_KERNEL);
+	if (!flow->actions) {
+		kfree(flow->matches);
+		return -ENOMEM;
+	}
+
+	if (entry->l2_interface.pop_vlan) {
+		flow->actions[0].uid = ACTION_POP_VLAN;
+		flow->actions[0].args = NULL;
+	}
+
+	memset(&flow->actions[1], 0, sizeof(flow->actions[1]));
+	return 0;
+}
+
+static int rocker_get_flows(struct sk_buff *skb, struct net_device *dev,
+			    int table, int min, int max)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	struct net_flow_flow flow;
+	struct rocker_flow_tbl_entry *entry;
+	struct rocker_group_tbl_entry *group;
+	struct hlist_node *tmp;
+	unsigned long flags;
+	int bkt, err;
+
+	spin_lock_irqsave(&rocker_port->rocker->flow_tbl_lock, flags);
+	hash_for_each_safe(rocker_port->rocker->flow_tbl,
+			   bkt, tmp, entry, entry) {
+		struct rocker_flow_tbl_key *key = &entry->key;
+
+		if (rocker_goto_value(table) != key->tbl_id)
+			continue;
+
+		flow.table_id = table;
+		flow.uid = entry->cookie;
+		flow.priority = key->priority;
+
+		switch (table) {
+		case ROCKER_FLOW_TABLE_ID_INGRESS_PORT:
+			err = rocker_ig_port_to_flow(key, &flow);
+			if (err)
+				return err;
+			break;
+		case ROCKER_FLOW_TABLE_ID_VLAN:
+			err = rocker_vlan_to_flow(key, &flow);
+			if (err)
+				return err;
+			break;
+		case ROCKER_FLOW_TABLE_ID_TERMINATION_MAC:
+			err = rocker_term_to_flow(key, &flow);
+			break;
+		case ROCKER_FLOW_TABLE_ID_UNICAST_ROUTING:
+			err = rocker_ucast_to_flow(key, &flow);
+			break;
+		case ROCKER_FLOW_TABLE_ID_BRIDGING:
+			err = rocker_bridge_to_flow(key, &flow);
+			break;
+		case ROCKER_FLOW_TABLE_ID_ACL_POLICY:
+			err = rocker_acl_to_flow(key, &flow);
+			break;
+		default:
+			continue;
+		}
+
+		net_flow_put_flow(skb, &flow);
+	}
+	spin_unlock_irqrestore(&rocker_port->rocker->flow_tbl_lock, flags);
+
+	spin_lock_irqsave(&rocker_port->rocker->group_tbl_lock, flags);
+	hash_for_each_safe(rocker_port->rocker->group_tbl,
+			   bkt, tmp, group, entry) {
+		if (rocker_goto_value(table) !=
+			ROCKER_GROUP_TYPE_GET(group->group_id))
+			continue;
+
+		flow.table_id = table;
+		flow.uid = group->group_id;
+		flow.priority = 1;
+
+		switch (table) {
+		case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L3_UNICAST:
+			err = rocker_l3_unicast_to_flow(group, &flow);
+			break;
+		case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2_REWRITE:
+			err = rocker_l2_rewrite_to_flow(group, &flow);
+			break;
+		case ROCKER_FLOW_TABLE_ID_GROUP_SLICE_L2:
+			err = rocker_l2_interface_to_flow(group, &flow);
+			break;
+		default:
+			continue;
+		}
+
+		net_flow_put_flow(skb, &flow);
+	}
+	spin_unlock_irqrestore(&rocker_port->rocker->group_tbl_lock, flags);
+
+	return 0;
+}
 #endif
 
 static const struct net_device_ops rocker_port_netdev_ops = {
@@ -4517,6 +5335,7 @@ static const struct net_device_ops rocker_port_netdev_ops = {
 
 	.ndo_flow_set_flows		= rocker_set_flows,
 	.ndo_flow_del_flows		= rocker_del_flows,
+	.ndo_flow_get_flows		= rocker_get_flows,
 #endif
 };
 

^ permalink raw reply related

* [net-next PATCH v1 09/11] net: rocker: add cookie to group acls and use flow_id to set cookie
From: John Fastabend @ 2014-12-31 19:49 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Rocker uses a cookie value to identify flows, but the flow API
already has a unique id for each flow. To simplify the translation,
add support for setting the cookie value through the internal rocker
flow API, and use the unique id in the cases where it is
available.

This patch extends the internal code paths to support the new
cookie value.
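
The cookie selection described above can be sketched in isolation as
follows. This is a simplified userspace illustration under stated
assumptions: the sketch_* structs and functions are hypothetical
stand-ins for the rocker table state and helpers, with next_cookie
playing the role of flow_tbl_next_cookie.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-switch table state. */
struct sketch_table {
	uint64_t next_cookie;	/* plays the role of flow_tbl_next_cookie */
};

/* Hypothetical stand-in for a flow table entry. */
struct sketch_entry {
	uint64_t cookie;	/* 0 means "no cookie chosen yet" */
};

/* Caller side: a non-zero flow_id from the flow API becomes the cookie. */
static void sketch_set_flow_id(struct sketch_entry *e, uint64_t flow_id)
{
	if (flow_id)
		e->cookie = flow_id;
}

/* Table side: fall back to the counter only when nothing was preset. */
static uint64_t sketch_assign_cookie(struct sketch_table *t,
				     struct sketch_entry *e)
{
	if (!e->cookie)
		e->cookie = t->next_cookie++;
	return e->cookie;
}
```

Internal callers that have no flow API id pass 0, so their entries
keep receiving auto-assigned cookies as before.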

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c |   64 ++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 997beb9..4d2d292 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -120,6 +120,7 @@ struct rocker_flow_tbl_entry {
 
 struct rocker_group_tbl_entry {
 	struct hlist_node entry;
+	u64 cookie;
 	u32 cmd;
 	u32 group_id; /* key */
 	u16 group_count;
@@ -2216,7 +2217,8 @@ static int rocker_flow_tbl_add(struct rocker_port *rocker_port,
 		kfree(match);
 	} else {
 		found = match;
-		found->cookie = rocker->flow_tbl_next_cookie++;
+		if (!found->cookie)
+			found->cookie = rocker->flow_tbl_next_cookie++;
 		hash_add(rocker->flow_tbl, &found->entry, found->key_crc32);
 		add_to_hw = true;
 	}
@@ -2294,7 +2296,7 @@ static int rocker_flow_tbl_do(struct rocker_port *rocker_port,
 		return rocker_flow_tbl_add(rocker_port, entry, nowait);
 }
 
-static int rocker_flow_tbl_ig_port(struct rocker_port *rocker_port,
+static int rocker_flow_tbl_ig_port(struct rocker_port *rocker_port, u64 flow_id,
 				   int flags, u32 in_lport, u32 in_lport_mask,
 				   enum rocker_of_dpa_table_id goto_tbl)
 {
@@ -2310,11 +2312,14 @@ static int rocker_flow_tbl_ig_port(struct rocker_port *rocker_port,
 	entry->key.ig_port.in_lport_mask = in_lport_mask;
 	entry->key.ig_port.goto_tbl = goto_tbl;
 
+	if (flow_id)
+		entry->cookie = flow_id;
+
 	return rocker_flow_tbl_do(rocker_port, flags, entry);
 }
 
 static int rocker_flow_tbl_vlan(struct rocker_port *rocker_port,
-				int flags, u32 in_lport,
+				int flags, u64 flow_id, u32 in_lport,
 				__be16 vlan_id, __be16 vlan_id_mask,
 				enum rocker_of_dpa_table_id goto_tbl,
 				bool untagged, __be16 new_vlan_id)
@@ -2335,10 +2340,14 @@ static int rocker_flow_tbl_vlan(struct rocker_port *rocker_port,
 	entry->key.vlan.untagged = untagged;
 	entry->key.vlan.new_vlan_id = new_vlan_id;
 
+	if (flow_id)
+		entry->cookie = flow_id;
+
 	return rocker_flow_tbl_do(rocker_port, flags, entry);
 }
 
 static int rocker_flow_tbl_term_mac(struct rocker_port *rocker_port,
+				    u64 flow_id,
 				    u32 in_lport, u32 in_lport_mask,
 				    __be16 eth_type, const u8 *eth_dst,
 				    const u8 *eth_dst_mask, __be16 vlan_id,
@@ -2371,11 +2380,14 @@ static int rocker_flow_tbl_term_mac(struct rocker_port *rocker_port,
 	entry->key.term_mac.vlan_id_mask = vlan_id_mask;
 	entry->key.term_mac.copy_to_cpu = copy_to_cpu;
 
+	if (flow_id)
+		entry->cookie = flow_id;
+
 	return rocker_flow_tbl_do(rocker_port, flags, entry);
 }
 
 static int rocker_flow_tbl_bridge(struct rocker_port *rocker_port,
-				  int flags,
+				  int flags, u64 flow_id,
 				  const u8 *eth_dst, const u8 *eth_dst_mask,
 				  __be16 vlan_id, u32 tunnel_id,
 				  enum rocker_of_dpa_table_id goto_tbl,
@@ -2425,11 +2437,14 @@ static int rocker_flow_tbl_bridge(struct rocker_port *rocker_port,
 	entry->key.bridge.group_id = group_id;
 	entry->key.bridge.copy_to_cpu = copy_to_cpu;
 
+	if (flow_id)
+		entry->cookie = flow_id;
+
 	return rocker_flow_tbl_do(rocker_port, flags, entry);
 }
 
 static int rocker_flow_tbl_acl(struct rocker_port *rocker_port,
-			       int flags, u32 in_lport,
+			       int flags, u64 flow_id, u32 in_lport,
 			       u32 in_lport_mask,
 			       const u8 *eth_src, const u8 *eth_src_mask,
 			       const u8 *eth_dst, const u8 *eth_dst_mask,
@@ -2477,6 +2492,9 @@ static int rocker_flow_tbl_acl(struct rocker_port *rocker_port,
 	entry->key.acl.ip_tos_mask = ip_tos_mask;
 	entry->key.acl.group_id = group_id;
 
+	if (flow_id)
+		entry->cookie = flow_id;
+
 	return rocker_flow_tbl_do(rocker_port, flags, entry);
 }
 
@@ -2587,7 +2605,7 @@ static int rocker_group_tbl_do(struct rocker_port *rocker_port,
 }
 
 static int rocker_group_l2_interface(struct rocker_port *rocker_port,
-				     int flags, __be16 vlan_id,
+				     int flags, u64 flow_id, __be16 vlan_id,
 				     u32 out_lport, int pop_vlan)
 {
 	struct rocker_group_tbl_entry *entry;
@@ -2598,6 +2616,7 @@ static int rocker_group_l2_interface(struct rocker_port *rocker_port,
 
 	entry->group_id = ROCKER_GROUP_L2_INTERFACE(vlan_id, out_lport);
 	entry->l2_interface.pop_vlan = pop_vlan;
+	entry->cookie = flow_id;
 
 	return rocker_group_tbl_do(rocker_port, flags, entry);
 }
@@ -2696,7 +2715,7 @@ static int rocker_port_vlan_l2_groups(struct rocker_port *rocker_port,
 	if (rocker_port->stp_state == BR_STATE_LEARNING ||
 	    rocker_port->stp_state == BR_STATE_FORWARDING) {
 		out_lport = rocker_port->lport;
-		err = rocker_group_l2_interface(rocker_port, flags,
+		err = rocker_group_l2_interface(rocker_port, flags, 0,
 						vlan_id, out_lport,
 						pop_vlan);
 		if (err) {
@@ -2722,7 +2741,7 @@ static int rocker_port_vlan_l2_groups(struct rocker_port *rocker_port,
 		return 0;
 
 	out_lport = 0;
-	err = rocker_group_l2_interface(rocker_port, flags,
+	err = rocker_group_l2_interface(rocker_port, flags, 0,
 					vlan_id, out_lport,
 					pop_vlan);
 	if (err) {
@@ -2796,7 +2815,7 @@ static int rocker_port_ctrl_vlan_acl(struct rocker_port *rocker_port,
 	u32 group_id = ROCKER_GROUP_L2_INTERFACE(vlan_id, out_lport);
 	int err;
 
-	err = rocker_flow_tbl_acl(rocker_port, flags,
+	err = rocker_flow_tbl_acl(rocker_port, flags, 0,
 				  in_lport, in_lport_mask,
 				  eth_src, eth_src_mask,
 				  ctrl->eth_dst, ctrl->eth_dst_mask,
@@ -2825,7 +2844,7 @@ static int rocker_port_ctrl_vlan_bridge(struct rocker_port *rocker_port,
 	if (!rocker_port_is_bridged(rocker_port))
 		return 0;
 
-	err = rocker_flow_tbl_bridge(rocker_port, flags,
+	err = rocker_flow_tbl_bridge(rocker_port, flags, 0,
 				     ctrl->eth_dst, ctrl->eth_dst_mask,
 				     vlan_id, tunnel_id,
 				     goto_tbl, group_id, ctrl->copy_to_cpu);
@@ -2847,7 +2866,7 @@ static int rocker_port_ctrl_vlan_term(struct rocker_port *rocker_port,
 	if (ntohs(vlan_id) == 0)
 		vlan_id = rocker_port->internal_vlan_id;
 
-	err = rocker_flow_tbl_term_mac(rocker_port,
+	err = rocker_flow_tbl_term_mac(rocker_port, 0,
 				       rocker_port->lport, in_lport_mask,
 				       ctrl->eth_type, ctrl->eth_dst,
 				       ctrl->eth_dst_mask, vlan_id,
@@ -2961,7 +2980,7 @@ static int rocker_port_vlan(struct rocker_port *rocker_port, int flags,
 		return err;
 	}
 
-	err = rocker_flow_tbl_vlan(rocker_port, flags,
+	err = rocker_flow_tbl_vlan(rocker_port, flags, 0,
 				   in_lport, vlan_id, vlan_id_mask,
 				   goto_tbl, untagged, internal_vlan_id);
 	if (err)
@@ -2986,7 +3005,7 @@ static int rocker_port_ig_tbl(struct rocker_port *rocker_port, int flags)
 	in_lport_mask = 0xffff0000;
 	goto_tbl = ROCKER_OF_DPA_TABLE_ID_VLAN;
 
-	err = rocker_flow_tbl_ig_port(rocker_port, flags,
+	err = rocker_flow_tbl_ig_port(rocker_port, 0, flags,
 				      in_lport, in_lport_mask,
 				      goto_tbl);
 	if (err)
@@ -3036,7 +3055,7 @@ static int rocker_port_fdb_learn(struct rocker_port *rocker_port,
 		group_id = ROCKER_GROUP_L2_INTERFACE(vlan_id, out_lport);
 
 	if (!(flags & ROCKER_OP_FLAG_REFRESH)) {
-		err = rocker_flow_tbl_bridge(rocker_port, flags, addr, NULL,
+		err = rocker_flow_tbl_bridge(rocker_port, flags, 0, addr, NULL,
 					     vlan_id, tunnel_id, goto_tbl,
 					     group_id, copy_to_cpu);
 		if (err)
@@ -3171,7 +3190,7 @@ static int rocker_port_router_mac(struct rocker_port *rocker_port,
 		vlan_id = rocker_port->internal_vlan_id;
 
 	eth_type = htons(ETH_P_IP);
-	err = rocker_flow_tbl_term_mac(rocker_port,
+	err = rocker_flow_tbl_term_mac(rocker_port, 0,
 				       rocker_port->lport, in_lport_mask,
 				       eth_type, rocker_port->dev->dev_addr,
 				       dst_mac_mask, vlan_id, vlan_id_mask,
@@ -3180,7 +3199,7 @@ static int rocker_port_router_mac(struct rocker_port *rocker_port,
 		return err;
 
 	eth_type = htons(ETH_P_IPV6);
-	err = rocker_flow_tbl_term_mac(rocker_port,
+	err = rocker_flow_tbl_term_mac(rocker_port, 0,
 				       rocker_port->lport, in_lport_mask,
 				       eth_type, rocker_port->dev->dev_addr,
 				       dst_mac_mask, vlan_id, vlan_id_mask,
@@ -3215,7 +3234,7 @@ static int rocker_port_fwding(struct rocker_port *rocker_port)
 			continue;
 		vlan_id = htons(vid);
 		pop_vlan = rocker_vlan_id_is_internal(vlan_id);
-		err = rocker_group_l2_interface(rocker_port, flags,
+		err = rocker_group_l2_interface(rocker_port, flags, 0,
 						vlan_id, out_lport,
 						pop_vlan);
 		if (err) {
@@ -3919,7 +3938,7 @@ static int rocker_flow_set_ig_port(struct net_device *dev,
 	in_lport_mask = flow->matches[0].mask_u32;
 	goto_tbl = rocker_goto_value(flow->actions[0].args[0].value_u16);
 
-	err = rocker_flow_tbl_ig_port(rocker_port, flags,
+	err = rocker_flow_tbl_ig_port(rocker_port, flags, 0,
 				      in_lport, in_lport_mask,
 				      goto_tbl);
 	return err;
@@ -3981,7 +4000,7 @@ static int rocker_flow_set_vlan(struct net_device *dev,
 	if (!have_in_lport)
 		return -EINVAL;
 
-	err = rocker_flow_tbl_vlan(rocker_port, flags, in_lport,
+	err = rocker_flow_tbl_vlan(rocker_port, flags, 0, in_lport,
 				   vlan_id, vlan_id_mask, goto_tbl,
 				   untagged, new_vlan_id);
 	return err;
@@ -4063,7 +4082,8 @@ static int rocker_flow_set_term_mac(struct net_device *dev,
 		}
 	}
 
-	err = rocker_flow_tbl_term_mac(rocker_port, in_lport, in_lport_mask,
+	err = rocker_flow_tbl_term_mac(rocker_port, 0,
+				       in_lport, in_lport_mask,
 				       ethtype, eth_dst, eth_dst_mask,
 				       vlan_id, vlan_id_mask,
 				       copy_to_cpu, flags);
@@ -4162,7 +4182,7 @@ static int rocker_flow_set_bridge(struct net_device *dev,
 	}
 
 	/* Ignoring eth_dst_mask it seems to cause a EINVAL return code */
-	err = rocker_flow_tbl_bridge(rocker_port, flags,
+	err = rocker_flow_tbl_bridge(rocker_port, flags, 0,
 				     eth_dst, eth_dst_mask,
 				     vlan_id, tunnel_id,
 				     goto_tbl, group_id, copy_to_cpu);
@@ -4269,7 +4289,7 @@ static int rocker_flow_set_acl(struct net_device *dev,
 		}
 	}
 
-	err = rocker_flow_tbl_acl(rocker_port, flags,
+	err = rocker_flow_tbl_acl(rocker_port, flags, 0,
 				  in_lport, in_lport_mask,
 				  eth_src, eth_src_mask,
 				  eth_dst, eth_dst_mask, ethtype,

^ permalink raw reply related

* [net-next PATCH v1 10/11] net: rocker: have flow api calls set cookie value
From: John Fastabend @ 2014-12-31 19:50 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 4d2d292..4ca95da 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3938,7 +3938,8 @@ static int rocker_flow_set_ig_port(struct net_device *dev,
 	in_lport_mask = flow->matches[0].mask_u32;
 	goto_tbl = rocker_goto_value(flow->actions[0].args[0].value_u16);
 
-	err = rocker_flow_tbl_ig_port(rocker_port, flags, 0,
+	err = rocker_flow_tbl_ig_port(rocker_port, flags,
+				      flow->uid,
 				      in_lport, in_lport_mask,
 				      goto_tbl);
 	return err;
@@ -4000,7 +4001,7 @@ static int rocker_flow_set_vlan(struct net_device *dev,
 	if (!have_in_lport)
 		return -EINVAL;
 
-	err = rocker_flow_tbl_vlan(rocker_port, flags, 0, in_lport,
+	err = rocker_flow_tbl_vlan(rocker_port, flags, flow->uid, in_lport,
 				   vlan_id, vlan_id_mask, goto_tbl,
 				   untagged, new_vlan_id);
 	return err;
@@ -4082,7 +4083,7 @@ static int rocker_flow_set_term_mac(struct net_device *dev,
 		}
 	}
 
-	err = rocker_flow_tbl_term_mac(rocker_port, 0,
+	err = rocker_flow_tbl_term_mac(rocker_port, flow->uid,
 				       in_lport, in_lport_mask,
 				       ethtype, eth_dst, eth_dst_mask,
 				       vlan_id, vlan_id_mask,
@@ -4182,7 +4183,7 @@ static int rocker_flow_set_bridge(struct net_device *dev,
 	}
 
 	/* Ignoring eth_dst_mask it seems to cause a EINVAL return code */
-	err = rocker_flow_tbl_bridge(rocker_port, flags, 0,
+	err = rocker_flow_tbl_bridge(rocker_port, flags, flow->uid,
 				     eth_dst, eth_dst_mask,
 				     vlan_id, tunnel_id,
 				     goto_tbl, group_id, copy_to_cpu);
@@ -4289,7 +4290,7 @@ static int rocker_flow_set_acl(struct net_device *dev,
 		}
 	}
 
-	err = rocker_flow_tbl_acl(rocker_port, flags, 0,
+	err = rocker_flow_tbl_acl(rocker_port, flags, flow->uid,
 				  in_lport, in_lport_mask,
 				  eth_src, eth_src_mask,
 				  eth_dst, eth_dst_mask, ethtype,
@@ -4354,6 +4355,8 @@ static int rocker_flow_set_group_slice_l3_unicast(struct net_device *dev,
 		}
 	}
 
+	entry->cookie = flow->uid;
+
 	return rocker_group_tbl_do(rocker_port, flags, entry);
 }
 
@@ -4409,6 +4412,8 @@ static int rocker_flow_set_group_slice_l2_rewrite(struct net_device *dev,
 		}
 	}
 
+	entry->cookie = flow->uid;
+
 	return rocker_group_tbl_do(rocker_port, flags, entry);
 }
 
@@ -4464,6 +4469,8 @@ static int rocker_flow_set_group_slice_l2(struct net_device *dev,
 		}
 	}
 
+	entry->cookie = flow->uid;
+
 	return rocker_group_tbl_do(rocker_port, flags, entry);
 }
 
@@ -5307,7 +5314,7 @@ static int rocker_get_flows(struct sk_buff *skb, struct net_device *dev,
 			continue;
 
 		flow.table_id = table;
-		flow.uid = group->group_id;
+		flow.uid = group->cookie;
 		flow.priority = 1;
 
 		switch (table) {

^ permalink raw reply related

* [net-next PATCH v1 11/11] net: rocker: implement delete flow routine
From: John Fastabend @ 2014-12-31 19:50 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/ethernet/rocker/rocker.c |   39 +++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 4ca95da..fb1e3eb 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -4523,7 +4523,44 @@ static int rocker_set_flows(struct net_device *dev,
 static int rocker_del_flows(struct net_device *dev,
 			    struct net_flow_flow *flow)
 {
-	return -EOPNOTSUPP;
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	struct rocker_flow_tbl_entry *entry;
+	struct rocker_group_tbl_entry *group;
+	struct hlist_node *tmp;
+	int bkt, err = -EEXIST;
+	unsigned long flags;
+
+	spin_lock_irqsave(&rocker_port->rocker->flow_tbl_lock, flags);
+	hash_for_each_safe(rocker_port->rocker->flow_tbl,
+			   bkt, tmp, entry, entry) {
+		if (rocker_goto_value(flow->table_id) != entry->key.tbl_id ||
+		    flow->uid != entry->cookie)
+			continue;
+
+		hash_del(&entry->entry);
+		err = 0;
+		break;
+	}
+	spin_unlock_irqrestore(&rocker_port->rocker->flow_tbl_lock, flags);
+
+	if (!err)
+		return err;
+
+	spin_lock_irqsave(&rocker_port->rocker->group_tbl_lock, flags);
+	hash_for_each_safe(rocker_port->rocker->group_tbl,
+			   bkt, tmp, group, entry) {
+		if (rocker_goto_value(flow->table_id) !=
+			ROCKER_GROUP_TYPE_GET(group->group_id) ||
+		    flow->uid != group->cookie)
+			continue;
+
+		hash_del(&group->entry);
+		err = 0;
+		break;
+	}
+	spin_unlock_irqrestore(&rocker_port->rocker->group_tbl_lock, flags);
+
+	return err;
 }
 
 static int rocker_ig_port_to_flow(struct rocker_flow_tbl_key *key,

^ permalink raw reply related

* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: John Fastabend @ 2014-12-31 20:10 UTC (permalink / raw)
  To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy
In-Reply-To: <20141231194544.31070.30335.stgit@nitbit.x32>

On 12/31/2014 11:45 AM, John Fastabend wrote:
> Currently, we do not have an interface to query hardware and learn
> the capabilities of the device. This makes it very difficult to use
> hardware flow tables.
>

Oops, I missed a few dev_put() calls, so this needs at least one new
revision. I'll wait a few days for feedback first, though.

[...]

> +
> +static int net_flow_cmd_get_actions(struct sk_buff *skb,
> +				    struct genl_info *info)
> +{
> +	struct net_flow_action **a;
> +	struct net_device *dev;
> +	struct sk_buff *msg;
> +
> +	dev = net_flow_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	if (!dev->netdev_ops->ndo_flow_get_actions) {
> +		dev_put(dev);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	a = dev->netdev_ops->ndo_flow_get_actions(dev);
> +	if (!a)

missing dev_put(dev) here.

> +		return -EBUSY;
> +
> +	msg = net_flow_build_actions_msg(a, dev,
> +					 info->snd_portid,
> +					 info->snd_seq,
> +					 NET_FLOW_TABLE_CMD_GET_ACTIONS);
> +	dev_put(dev);
> +
> +	if (IS_ERR(msg))
> +		return PTR_ERR(msg);
> +
> +	return genlmsg_reply(msg, info);
> +}
> +
> +static int net_flow_put_table(struct net_device *dev,
> +			      struct sk_buff *skb,
> +			      struct net_flow_table *t)
> +{
> +	struct nlattr *matches, *actions;
> +	int i;
> +
> +	if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) ||
> +	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) ||
> +	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) ||
> +	    nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size))
> +		return -EMSGSIZE;
> +
> +	matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES);
> +	if (!matches)
> +		return -EMSGSIZE;
> +
> +	for (i = 0; t->matches[i].instance; i++)
> +		nla_put(skb, NET_FLOW_FIELD_REF,
> +			sizeof(struct net_flow_field_ref),
> +			&t->matches[i]);

need to check the return codes here.

> +	nla_nest_end(skb, matches);
> +
> +	actions = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_ACTIONS);
> +	if (!actions)
> +		return -EMSGSIZE;
> +
> +	for (i = 0; t->actions[i]; i++) {
> +		if (nla_put_u32(skb,
> +				NET_FLOW_ACTION_ATTR_UID,
> +				t->actions[i])) {
> +			nla_nest_cancel(skb, actions);
> +			return -EMSGSIZE;
> +		}

remembered to do the check here though ;)

> +	}
> +	nla_nest_end(skb, actions);
> +
> +	return 0;
> +}
> +

[...]
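
The return-code point above can be sketched in plain userspace C. This is
only an illustration under stated assumptions: struct msg_buf, buf_put()
and buf_put_all() are hypothetical stand-ins for the skb and the nla_put()
family, not kernel APIs. Every put is checked, and the partially written
nest is rolled back on the first failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a small fixed-size byte buffer. */
struct msg_buf {
	unsigned char data[64];
	size_t len;
};

/* Append one payload; fails with -EMSGSIZE when there is no room left. */
static int buf_put(struct msg_buf *b, const void *p, size_t n)
{
	if (n > sizeof(b->data) - b->len)
		return -EMSGSIZE;
	memcpy(b->data + b->len, p, n);
	b->len += n;
	return 0;
}

/* Put every value; on the first failure, undo everything this call added. */
static int buf_put_all(struct msg_buf *b, const unsigned int *vals, size_t cnt)
{
	size_t nest_start = b->len;	/* plays the role of nla_nest_start() */
	size_t i;

	for (i = 0; i < cnt; i++) {
		if (buf_put(b, &vals[i], sizeof(vals[i]))) {
			b->len = nest_start;	/* plays the role of nla_nest_cancel() */
			return -EMSGSIZE;
		}
	}
	return 0;
}
```

A caller that ignores the return value of buf_put() would send a
truncated list without noticing; with the check, the message is either
complete or unchanged.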

> +
> +static struct sk_buff *net_flow_build_tables_msg(struct net_flow_table **t,
> +						 struct net_device *dev,
> +						 u32 portid, int seq, u8 cmd)
> +{
> +	struct genlmsghdr *hdr;
> +	struct sk_buff *skb;
> +	int err = -ENOBUFS;
> +
> +	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +	if (!skb)
> +		return ERR_PTR(-ENOBUFS);
> +
> +	hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
> +	if (!hdr)
> +		goto out;
> +
> +	if (nla_put_u32(skb,
> +			NET_FLOW_IDENTIFIER_TYPE,
> +			NET_FLOW_IDENTIFIER_IFINDEX) ||
> +	    nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
> +		err = -ENOBUFS;
> +		goto out;
> +	}
> +
> +	err = net_flow_put_tables(dev, skb, t);
> +	if (err < 0)
> +		goto out;
> +
> +	err = genlmsg_end(skb, hdr);
> +	if (err < 0)
> +		goto out;
> +
> +	return skb;
> +out:
> +	nlmsg_free(skb);
> +	return ERR_PTR(err);
> +}
> +
> +static int net_flow_cmd_get_tables(struct sk_buff *skb,
> +				   struct genl_info *info)
> +{
> +	struct net_flow_table **tables;
> +	struct net_device *dev;
> +	struct sk_buff *msg;
> +
> +	dev = net_flow_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	if (!dev->netdev_ops->ndo_flow_get_tables) {
> +		dev_put(dev);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	tables = dev->netdev_ops->ndo_flow_get_tables(dev);
> +	if (!tables) /* transient failure should always have some table */

need dev_put()

> +		return -EBUSY;
> +
> +	msg = net_flow_build_tables_msg(tables, dev,
> +					info->snd_portid,
> +					info->snd_seq,
> +					NET_FLOW_TABLE_CMD_GET_TABLES);
> +	dev_put(dev);
> +
> +	if (IS_ERR(msg))
> +		return PTR_ERR(msg);
> +
> +	return genlmsg_reply(msg, info);
> +}
> +

[...]

> +
> +static int net_flow_put_headers(struct sk_buff *skb,
> +				struct net_flow_header **headers)
> +{
> +	struct nlattr *nest, *hdr, *fields;
> +	struct net_flow_header *h;
> +	int i, err;
> +
> +	nest = nla_nest_start(skb, NET_FLOW_HEADERS);
> +	if (!nest)
> +		return -EMSGSIZE;
> +
> +	for (i = 0; headers[i]->uid; i++) {
> +		err = -EMSGSIZE;
> +		h = headers[i];
> +
> +		hdr = nla_nest_start(skb, NET_FLOW_HEADER);
> +		if (!hdr)
> +			goto hdr_put_failure;
> +
> +		if (nla_put_string(skb, NET_FLOW_HEADER_ATTR_NAME, h->name) ||
> +		    nla_put_u32(skb, NET_FLOW_HEADER_ATTR_UID, h->uid))
> +			goto attr_put_failure;
> +
> +		fields = nla_nest_start(skb, NET_FLOW_HEADER_ATTR_FIELDS);
> +		if (!fields)
> +			goto attr_put_failure;
> +
> +		err = net_flow_put_fields(skb, h);
> +		if (err)
> +			goto fields_put_failure;
> +
> +		nla_nest_end(skb, fields);
> +

This blank line can be removed; I don't think it adds much.

> +		nla_nest_end(skb, hdr);
> +	}
> +	nla_nest_end(skb, nest);
> +
> +	return 0;
> +fields_put_failure:
> +	nla_nest_cancel(skb, fields);
> +attr_put_failure:
> +	nla_nest_cancel(skb, hdr);
> +hdr_put_failure:
> +	nla_nest_cancel(skb, nest);
> +	return err;
> +}
> +

[...]

> +
> +static int net_flow_cmd_get_headers(struct sk_buff *skb,
> +				    struct genl_info *info)
> +{
> +	struct net_flow_header **h;
> +	struct net_device *dev;
> +	struct sk_buff *msg;
> +
> +	dev = net_flow_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	if (!dev->netdev_ops->ndo_flow_get_headers) {
> +		dev_put(dev);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	h = dev->netdev_ops->ndo_flow_get_headers(dev);
> +	if (!h)

dev_put again

> +		return -EBUSY;
> +
> +	msg = net_flow_build_headers_msg(h, dev,
> +					 info->snd_portid,
> +					 info->snd_seq,
> +					 NET_FLOW_TABLE_CMD_GET_HEADERS);
> +	dev_put(dev);
> +
> +	if (IS_ERR(msg))
> +		return PTR_ERR(msg);
> +
> +	return genlmsg_reply(msg, info);
> +}
> +

[...]

> +
> +static int net_flow_cmd_get_header_graph(struct sk_buff *skb,
> +					 struct genl_info *info)
> +{
> +	struct net_flow_hdr_node **h;
> +	struct net_device *dev;
> +	struct sk_buff *msg;
> +
> +	dev = net_flow_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	if (!dev->netdev_ops->ndo_flow_get_hdr_graph) {
> +		dev_put(dev);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	h = dev->netdev_ops->ndo_flow_get_hdr_graph(dev);
> +	if (!h)

Missing dev_put() again; it seems I copy/pasted the same template for each cmd.

> +		return -EBUSY;
> +
> +	msg = net_flow_build_header_graph_msg(h, dev,
> +					      info->snd_portid,
> +					      info->snd_seq,
> +					      NET_FLOW_TABLE_CMD_GET_HDR_GRAPH);
> +	dev_put(dev);
> +
> +	if (IS_ERR(msg))
> +		return PTR_ERR(msg);
> +
> +	return genlmsg_reply(msg, info);
> +}
> +

[...]

> +
> +static int net_flow_cmd_get_table_graph(struct sk_buff *skb,
> +					struct genl_info *info)
> +{
> +	struct net_flow_tbl_node **g;
> +	struct net_device *dev;
> +	struct sk_buff *msg;
> +
> +	dev = net_flow_get_dev(info);
> +	if (!dev)
> +		return -EINVAL;
> +
> +	if (!dev->netdev_ops->ndo_flow_get_tbl_graph) {
> +		dev_put(dev);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	g = dev->netdev_ops->ndo_flow_get_tbl_graph(dev);
> +	if (!g)

dev_put

> +		return -EBUSY;
> +

[...]
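
One way to avoid repeating the missing dev_put() in every handler is a
single exit path that drops the reference. The sketch below is a
userspace model, not the eventual fix: struct fake_dev, fake_dev_get(),
fake_dev_put() and fake_cmd_get_tables() are hypothetical names that
mimic the get/put reference counting of a net_device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device with a reference count. */
struct fake_dev {
	int refcnt;
	const int *tables;	/* NULL models a transient driver failure */
	int has_op;		/* models ndo_flow_get_tables being set */
};

/* Take a reference, like the lookup done by net_flow_get_dev(). */
static struct fake_dev *fake_dev_get(struct fake_dev *d)
{
	if (d)
		d->refcnt++;
	return d;
}

/* Drop a reference, like dev_put(). */
static void fake_dev_put(struct fake_dev *d)
{
	d->refcnt--;
}

static int fake_cmd_get_tables(struct fake_dev *candidate)
{
	struct fake_dev *dev = fake_dev_get(candidate);
	int err = 0;

	if (!dev)
		return -EINVAL;	/* no reference was taken, nothing to drop */

	if (!dev->has_op) {
		err = -EOPNOTSUPP;
		goto out;
	}
	if (!dev->tables) {
		err = -EBUSY;
		goto out;
	}
	/* The reply would be built here while the reference is held. */
out:
	fake_dev_put(dev);	/* every path after the lookup ends up here */
	return err;
}
```

With one label, a new early return cannot forget the put, which is the
mistake flagged in each of the handlers above.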


-- 
John Fastabend         Intel Corporation

^ permalink raw reply

* Re: [PATCH V3 for 3.19]  rtlwifi: Fix error when accessing unmapped memory in skb
From: Larry Finger @ 2014-12-31 21:10 UTC (permalink / raw)
  To: Eric Biggers; +Cc: kvalo, linux-wireless, netdev, Stable
In-Reply-To: <20141231050735.GA20639@zzz>

On 12/30/2014 11:07 PM, Eric Biggers wrote:
> On Tue, Dec 30, 2014 at 09:33:07PM -0600, Larry Finger wrote:
>> v3 - Unmap skb before trying to allocate a new one so as to not leak mapping.
>
> Looks good to me, although I'm not sure about the handling of DMA mapping errors
> (perhaps that's something that drivers typically don't even try to handle?).
> Anyway, the skb allocation issue appears to be resolved now.  I am running your
> patch with an extra hack to inject some occasional skb allocation failures, and
> I haven't noticed any problems except dropped packets.

The last time I saw any DMA mapping errors was with some early BCM43xx cards
that only had 20 bits of DMA addressing space. These Realtek devices have a full 
32 bits of addressing, thus any physical address in the first 4GB of RAM will be 
OK. I suppose that it might be possible to get a physical address outside this 
range for machines with a lot of RAM, but they are unlikely to have wifi interfaces.
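
The addressing argument can be written down as a small check. This is a
userspace sketch with hypothetical helper names (dma_mask() and
dma_addr_ok() are not driver or kernel functions); it only shows why a
device with 32 address bits can reach any buffer that lies entirely in
the first 4GB.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; meant for 1 <= bits <= 64. */
static uint64_t dma_mask(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* True when the whole buffer [addr, addr + len) is reachable under the mask. */
static int dma_addr_ok(uint64_t addr, uint64_t len, unsigned int bits)
{
	uint64_t mask = dma_mask(bits);

	if (len == 0 || addr > mask)
		return 0;
	/* Compare without computing addr + len, which could overflow. */
	return len - 1 <= mask - addr;
}
```

A buffer that starts below 4GB but ends above it fails the check, as does
any buffer that starts at or above 4GB.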

Thanks for the testing. The Realtek engineer told me that they are looking at 
this section, and may do a rewrite. I'm waiting to see what happens there before 
considering alternatives. If the number of packets dropped due to skb allocation 
failures is small, then the current code is likely OK.

Larry

^ permalink raw reply

* Re: [PATCH net] openvswitch: Consistently include VLAN header in flow and port stats.
From: Pravin Shelar @ 2014-12-31 21:12 UTC (permalink / raw)
  To: Ben Pfaff; +Cc: netdev, dev@openvswitch.org, Motonori Shindo
In-Reply-To: <1420044346-27957-1-git-send-email-blp@nicira.com>

On Wed, Dec 31, 2014 at 8:45 AM, Ben Pfaff <blp@nicira.com> wrote:
> Until now, when VLAN acceleration was in use, the bytes of the VLAN header
> were not included in port or flow byte counters.  They were however
> included when VLAN acceleration was not used.  This commit corrects the
> inconsistency, by always including the VLAN header in byte counters.
>
> Previous discussion at
> http://openvswitch.org/pipermail/dev/2014-December/049521.html
>
> Reported-by: Motonori Shindo <mshindo@vmware.com>
> Signed-off-by: Ben Pfaff <blp@nicira.com>

Looks good.

Acked-by: Pravin B Shelar <pshelar@nicira.com>

^ permalink raw reply

* Re: [PATCH net-next v2 1/2] bridge: new attribute and flags to represent vlan info lists and ranges
From: roopa @ 2014-12-31 21:17 UTC (permalink / raw)
  To: Jeremiah Mahler, netdev, shemminger, vyasevic, sfeldma, wkok
In-Reply-To: <20141231184855.GB2658@hudson.localdomain>

On 12/31/14, 10:48 AM, Jeremiah Mahler wrote:
> Roopa,
>
> On Wed, Dec 31, 2014 at 10:15:53AM -0800, roopa wrote:
>> On 12/31/14, 9:45 AM, Jeremiah Mahler wrote:
>>> Roopa,
>>>
>>> On Wed, Dec 31, 2014 at 08:48:52AM -0800, roopa@cumulusnetworks.com wrote:
>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>
>>>> This patch adds (as suggested by scott feldman),
>>>>          - new netlink attribute IFLA_BRIDGE_VLAN_INFO_LIST to represent
>>>>            vlan list
>>>>          - And bridge_vlan_info flags BRIDGE_VLAN_INFO_RANGE_START and
>>>>            BRIDGE_VLAN_INFO_RANGE_END to indicate start and end of vlan range
>>>>
>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>> ---
>>>>   include/uapi/linux/if_bridge.h |    4 ++++
>>>>   net/bridge/br_netlink.c        |    1 +
>>>>   2 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
>>>> index b03ee8f..fa468aa 100644
>>>> --- a/include/uapi/linux/if_bridge.h
>>>> +++ b/include/uapi/linux/if_bridge.h
>>>> @@ -112,12 +112,14 @@ struct __fdb_entry {
>>>>    *     [IFLA_BRIDGE_FLAGS]
>>>>    *     [IFLA_BRIDGE_MODE]
>>>>    *     [IFLA_BRIDGE_VLAN_INFO]
>>>> + *     [IFLA_BRIDGE_VLAN_INFO_LIST]
>>>>    * }
>>>>    */
>>>>   enum {
>>>>   	IFLA_BRIDGE_FLAGS,
>>>>   	IFLA_BRIDGE_MODE,
>>>>   	IFLA_BRIDGE_VLAN_INFO,
>>>> +	IFLA_BRIDGE_VLAN_INFO_LIST,
>>>>   	__IFLA_BRIDGE_MAX,
>>>>   };
>>>>   #define IFLA_BRIDGE_MAX (__IFLA_BRIDGE_MAX - 1)
>>>> @@ -125,6 +127,8 @@ enum {
>>>>   #define BRIDGE_VLAN_INFO_MASTER	(1<<0)	/* Operate on Bridge device as well */
>>>>   #define BRIDGE_VLAN_INFO_PVID	(1<<1)	/* VLAN is PVID, ingress untagged */
>>>>   #define BRIDGE_VLAN_INFO_UNTAGGED	(1<<2)	/* VLAN egresses untagged */
>>>> +#define BRIDGE_VLAN_INFO_RANGE_START	(1<<3) /* VLAN is start of vlan range */
>>>> +#define BRIDGE_VLAN_INFO_RANGE_END	(1<<4) /* VLAN is end of vlan range */
>>> You add these here but you don't use them until the next patch.
>>> If they were wrong a bisect would point to the next patch.
>>>
>>> I would add them in the next patch where you start to use them.
>> I thought it was ok to declare it first and use them in the next patch. Only
>> the other way around would be bad.
>>   I have submitted in a similar way before. If needed i will resubmit.
>>
>>
> Hmm.  I cannot see how the other way would be bad but maybe I am missing
> something.
Sorry, I did not mean that what you were suggesting would be bad. I was
just trying to say that using something first and declaring it later
would be bad (i.e. if my patches 1 and 2 were swapped). Otherwise I don't
see a problem.

I know that you are saying I should combine patches 1 and 2 into a
single patch. That is not a problem. If I need to respin for other
reasons, I will consider merging them as well, if that is a concern.

thanks.

>   Hopefully someone else has some insight.
>
>>
>>>>   struct bridge_vlan_info {
>>>>   	__u16 flags;
>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>> index 9f5eb55..492ef6a 100644
>>>> --- a/net/bridge/br_netlink.c
>>>> +++ b/net/bridge/br_netlink.c
>>>> @@ -223,6 +223,7 @@ static const struct nla_policy ifla_br_policy[IFLA_MAX+1] = {
>>>>   	[IFLA_BRIDGE_MODE]	= { .type = NLA_U16 },
>>>>   	[IFLA_BRIDGE_VLAN_INFO]	= { .type = NLA_BINARY,
>>>>   				    .len = sizeof(struct bridge_vlan_info), },
>>>> +	[IFLA_BRIDGE_VLAN_INFO_LIST] = { .type = NLA_NESTED, },
>>>>   };
>>>>   static int br_afspec(struct net_bridge *br,
>>>> -- 
>>>> 1.7.10.4
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH] TCP: Add support for TCP Stealth
From: Julian Kirsch @ 2014-12-31 21:54 UTC (permalink / raw)
  To: netdev; +Cc: Christian Grothoff, Jacob Appelbaum

[-- Attachment #1: Type: text/plain, Size: 2090 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

one year ago [0] we tried to convince you to add support for a new
socket option to the linux kernel. Equipped with an improved version of
our patch we're back to accomplish this task today. :-)

TCP Stealth is a modern variant of port knocking which borrows
techniques from network steganography to enable clients to authenticate
themselves towards a server on TCP level. You can find technical details
in an rfc draft we wrote earlier this year [1] and in my master's thesis
[2]. In summary, TCP Stealth derives authentication information from a
pre-shared secret and embeds it into the ISN sent along with the first
SYN from the client.

Our motivation is simple: During this year we gained hard evidence on
secret services actively port scanning the internets followed by
exploitation of your services using 0-day exploits [3, 4]. We don't want
our machines to be turned into relays from where they continue to
cascade their attacks. TCP Stealth makes port scanning more expensive by
a factor of 2^31 (on average).

A copy of this patch as well as patches for several user space
applications can be found on the project's home page [5].

All the best for the upcoming year,
Julian & Christian



[0] https://lkml.org/lkml/2013/12/10/1155
[1] https://datatracker.ietf.org/doc/draft-kirsch-ietf-tcp-stealth/
[2] https://gnunet.org/kirsch2014knock
[3]
http://www.heise.de/ct/artikel/NSA-GCHQ-The-HACIENDA-Program-for-Internet-Colonization-2292681.html
[4]
https://firstlook.org/theintercept/2014/12/13/belgacom-hack-gchq-inside-story/
[5] https://gnunet.org/knock
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJUpHCvAAoJENwkOWttRRA4g10IALbJZU9/5Gp8tVdpXqbkOIMp
Kz+yOMyYULqYeM8yguSBZjZLbaz/VAS7SNpQxKGU+W0aAXa22FsSfVoUU7wqp3NT
3EGRuPkMaJkQ66IP8MtX+6/hSeWSh78tEaIFWVjyutihPyQGz0LefFc66gm54X4T
s8IYW7jKFhNmmROu9CXLTxq4B5t2v+Evv/qWqotZqR1t3IbIUmZAiKrlkMRd7dtM
SaS5JwFeiObxn+0M/7javQCAhfgPXYEOU0QKAGY55MXcPAner/5PuExIZdOJ41R3
XD9tgoLGhHEiQkxj0/bP2cs3Cl5xfJl9t2iecVfTIR7PytaTJ/kFuE4gNgWEcTA=
=T6/C
-----END PGP SIGNATURE-----
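
The way the authentication value ends up in the ISN can be sketched as
follows. This is a simplified model under stated assumptions, not the
patch's code: the MD5 step and all byte-order handling are left out, the
four digest words are taken as given, and fold_digest(), stealth_isn()
and stealth_isn_matches() are hypothetical names. The 128-bit digest is
XOR-folded to 32 bits; when an integrity hash is in use it takes the low
16 bits, leaving the high 16 bits for authentication.

```c
#include <assert.h>
#include <stdint.h>

/* XOR-fold a 128-bit digest (four 32-bit words) into one 32-bit value. */
static uint32_t fold_digest(const uint32_t digest[4])
{
	return digest[0] ^ digest[1] ^ digest[2] ^ digest[3];
}

/* Build the ISN; an integrity hash, if used, replaces the low 16 bits. */
static uint32_t stealth_isn(const uint32_t digest[4], int use_integrity,
			    uint16_t integrity_hash)
{
	uint32_t isn = fold_digest(digest);

	if (use_integrity)
		isn = (isn & 0xFFFF0000u) | integrity_hash;
	return isn;
}

/* Receiver side: compare all 32 bits, or only the 16 authentication bits. */
static int stealth_isn_matches(uint32_t received, uint32_t expected,
			       int use_integrity)
{
	if (use_integrity)
		return (received >> 16) == (expected >> 16);
	return received == expected;
}
```

Without the integrity hash a blind guess has to hit all 32 bits, which is
where the figure of 2^31 attempts on average comes from.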

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: tcp_stealth_3.18.diff --]
[-- Type: text/x-patch; name="tcp_stealth_3.18.diff", Size: 19140 bytes --]

Signed-off-by: Julian Kirsch <kirschju@sec.in.tum.de>
diff -Nurp linux-3.18-rc3/include/linux/tcp.h linux-3.18-rc3-knock/include/linux/tcp.h
--- linux-3.18-rc3/include/linux/tcp.h	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/include/linux/tcp.h	2014-11-06 21:26:34.976017001 +0100
@@ -19,6 +19,7 @@
 
 
 #include <linux/skbuff.h>
+#include <linux/cryptohash.h>
 #include <net/sock.h>
 #include <net/inet_connection_sock.h>
 #include <net/inet_timewait_sock.h>
@@ -309,6 +310,21 @@ struct tcp_sock {
 	struct tcp_md5sig_info	__rcu *md5sig_info;
 #endif
 
+#ifdef CONFIG_TCP_STEALTH
+/* Stealth TCP socket configuration */
+	struct {
+		#define TCP_STEALTH_MODE_AUTH		BIT(0)
+		#define TCP_STEALTH_MODE_INTEGRITY	BIT(1)
+		#define TCP_STEALTH_MODE_INTEGRITY_LEN	BIT(2)
+		int mode;
+		u8 secret[MD5_MESSAGE_BYTES];
+		int integrity_len;
+		u16 integrity_hash;
+		struct skb_mstamp mstamp;
+		bool saw_tsval;
+	} stealth;
+#endif
+
 /* TCP fastopen related information */
 	struct tcp_fastopen_request *fastopen_req;
 	/* fastopen_rsk points to request_sock that resulted in this big
diff -Nurp linux-3.18-rc3/include/net/secure_seq.h linux-3.18-rc3-knock/include/net/secure_seq.h
--- linux-3.18-rc3/include/net/secure_seq.h	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/include/net/secure_seq.h	2014-11-06 21:26:34.976017001 +0100
@@ -14,5 +14,10 @@ u64 secure_dccp_sequence_number(__be32 s
 				__be16 sport, __be16 dport);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport);
+#ifdef CONFIG_TCP_STEALTH
+u32 tcp_stealth_do_auth(struct sock *sk, struct sk_buff *skb);
+u32 tcp_stealth_sequence_number(struct sock *sk, __be32 *daddr,
+				u32 daddr_size, __be16 dport);
+#endif
 
 #endif /* _NET_SECURE_SEQ */
diff -Nurp linux-3.18-rc3/include/net/tcp.h linux-3.18-rc3-knock/include/net/tcp.h
--- linux-3.18-rc3/include/net/tcp.h	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/include/net/tcp.h	2014-11-06 21:26:34.976017001 +0100
@@ -439,6 +439,12 @@ void tcp_parse_options(const struct sk_b
 		       struct tcp_options_received *opt_rx,
 		       int estab, struct tcp_fastopen_cookie *foc);
 const u8 *tcp_parse_md5sig_option(const struct tcphdr *th);
+#ifdef CONFIG_TCP_STEALTH
+const bool tcp_parse_tsval_option(u32 *tsval, const struct tcphdr *th);
+int tcp_stealth_integrity(u16 *hash, u8 *secret, u8 *payload, int len);
+#define be32_isn_to_be16_av(x)	(((__be16 *)&x)[0])
+#define be32_isn_to_be16_ih(x)	(((__be16 *)&x)[1])
+#endif
 
 /*
  *	TCP v4 functions exported for the inet6 API
diff -Nurp linux-3.18-rc3/include/uapi/linux/tcp.h linux-3.18-rc3-knock/include/uapi/linux/tcp.h
--- linux-3.18-rc3/include/uapi/linux/tcp.h	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/include/uapi/linux/tcp.h	2014-11-06 21:26:34.976017001 +0100
@@ -112,6 +112,9 @@ enum {
 #define TCP_FASTOPEN		23	/* Enable FastOpen on listeners */
 #define TCP_TIMESTAMP		24
 #define TCP_NOTSENT_LOWAT	25	/* limit number of unsent bytes in write queue */
+#define TCP_STEALTH		26
+#define TCP_STEALTH_INTEGRITY	27
+#define TCP_STEALTH_INTEGRITY_LEN	28
 
 struct tcp_repair_opt {
 	__u32	opt_code;
diff -Nurp linux-3.18-rc3/net/core/secure_seq.c linux-3.18-rc3-knock/net/core/secure_seq.c
--- linux-3.18-rc3/net/core/secure_seq.c	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/core/secure_seq.c	2014-11-24 14:31:20.227872751 +0100
@@ -8,7 +8,11 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
+#include <linux/socket.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
 
+#include <net/tcp.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
@@ -39,6 +43,103 @@ static u32 seq_scale(u32 seq)
 }
 #endif
 
+#ifdef CONFIG_TCP_STEALTH
+u32 tcp_stealth_sequence_number(struct sock *sk, __be32 *daddr,
+				u32 daddr_size, __be16 dport)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcp_md5sig_key *md5;
+
+	__u32 sec[MD5_MESSAGE_BYTES / sizeof(__u32)];
+	__u32 i;
+	__u32 tsval = 0;
+
+	__be32 iv[MD5_DIGEST_WORDS] = { 0 };
+	__be32 isn;
+
+	memcpy(iv, (const __u8 *)daddr,
+	       (daddr_size > sizeof(iv)) ? sizeof(iv) : daddr_size);
+
+#ifdef CONFIG_TCP_MD5SIG
+	md5 = tp->af_specific->md5_lookup(sk, sk);
+#else
+	md5 = NULL;
+#endif
+	if (likely(sysctl_tcp_timestamps && !md5) || tp->stealth.saw_tsval)
+		tsval = tp->stealth.mstamp.stamp_jiffies;
+
+	((__be16 *)iv)[2] ^= cpu_to_be16(tp->stealth.integrity_hash);
+	iv[2] ^= cpu_to_be32(tsval);
+	((__be16 *)iv)[6] ^= dport;
+
+	for (i = 0; i < MD5_DIGEST_WORDS; i++)
+		iv[i] = le32_to_cpu(iv[i]);
+	for (i = 0; i < MD5_MESSAGE_BYTES / sizeof(__le32); i++)
+		sec[i] = le32_to_cpu(((__le32 *)tp->stealth.secret)[i]);
+
+	md5_transform(iv, sec);
+
+	isn = cpu_to_be32(iv[0]) ^ cpu_to_be32(iv[1]) ^
+	      cpu_to_be32(iv[2]) ^ cpu_to_be32(iv[3]);
+
+	if (tp->stealth.mode & TCP_STEALTH_MODE_INTEGRITY)
+		be32_isn_to_be16_ih(isn) =
+			cpu_to_be16(tp->stealth.integrity_hash);
+
+	return be32_to_cpu(isn);
+}
+EXPORT_SYMBOL(tcp_stealth_sequence_number);
+
+u32 tcp_stealth_do_auth(struct sock *sk, struct sk_buff *skb)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcphdr *th = tcp_hdr(skb);
+	__be32 isn = th->seq;
+	__be32 hash;
+	__be32 *daddr;
+	u32 daddr_size;
+
+	tp->stealth.saw_tsval =
+		tcp_parse_tsval_option(&tp->stealth.mstamp.stamp_jiffies, th);
+
+	if (tp->stealth.mode & TCP_STEALTH_MODE_INTEGRITY_LEN)
+		tp->stealth.integrity_hash =
+			be16_to_cpu(be32_isn_to_be16_ih(isn));
+
+	switch (tp->inet_conn.icsk_inet.sk.sk_family) {
+#if IS_ENABLED(CONFIG_IPV6)
+	case PF_INET6:
+		daddr_size = sizeof(ipv6_hdr(skb)->daddr.s6_addr32);
+		daddr = ipv6_hdr(skb)->daddr.s6_addr32;
+	break;
+#endif
+	case PF_INET:
+		daddr_size = sizeof(ip_hdr(skb)->daddr);
+		daddr = &ip_hdr(skb)->daddr;
+	break;
+	default:
+		pr_err("TCP Stealth: Unknown network layer protocol, stop!\n");
+		return 1;
+	}
+
+	hash = tcp_stealth_sequence_number(sk, daddr, daddr_size, th->dest);
+	cpu_to_be32s(&hash);
+
+	if (tp->stealth.mode & TCP_STEALTH_MODE_AUTH &&
+	    tp->stealth.mode & TCP_STEALTH_MODE_INTEGRITY_LEN &&
+	    be32_isn_to_be16_av(isn) == be32_isn_to_be16_av(hash))
+		return 0;
+
+	if (tp->stealth.mode & TCP_STEALTH_MODE_AUTH &&
+	    !(tp->stealth.mode & TCP_STEALTH_MODE_INTEGRITY_LEN) &&
+	    isn == hash)
+		return 0;
+
+	return 1;
+}
+EXPORT_SYMBOL(tcp_stealth_do_auth);
+#endif
+
 #if IS_ENABLED(CONFIG_IPV6)
 __u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				   __be16 sport, __be16 dport)
diff -Nurp linux-3.18-rc3/net/ipv4/Kconfig linux-3.18-rc3-knock/net/ipv4/Kconfig
--- linux-3.18-rc3/net/ipv4/Kconfig	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/ipv4/Kconfig	2014-11-06 21:26:34.976017001 +0100
@@ -671,3 +671,13 @@ config TCP_MD5SIG
 	  on the Internet.
 
 	  If unsure, say N.
+
+config TCP_STEALTH
+	bool "TCP: Stealth TCP socket support"
+	default n
+	---help---
+	  This option enables support for stealth TCP sockets. If you do not
+	  know what this means, you do not need it.
+
+	  If unsure, say N.
+
diff -Nurp linux-3.18-rc3/net/ipv4/tcp.c linux-3.18-rc3-knock/net/ipv4/tcp.c
--- linux-3.18-rc3/net/ipv4/tcp.c	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/ipv4/tcp.c	2014-11-24 11:44:39.700059516 +0100
@@ -2329,6 +2329,43 @@ static int tcp_repair_options_est(struct
 	return 0;
 }
 
+#ifdef CONFIG_TCP_STEALTH
+int tcp_stealth_integrity(__be16 *hash, u8 *secret, u8 *payload, int len)
+{
+	struct scatterlist sg[2];
+	struct crypto_hash *tfm;
+	struct hash_desc desc;
+	__be16 h[MD5_DIGEST_WORDS * 2];
+	int i;
+	int err = 0;
+
+	tfm = crypto_alloc_hash("md5", 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(tfm)) {
+		/* PTR_ERR() is already negative; never free an ERR_PTR */
+		return PTR_ERR(tfm);
+	}
+	desc.tfm = tfm;
+	desc.flags = 0;
+
+	sg_init_table(sg, 2);
+	sg_set_buf(&sg[0], secret, MD5_MESSAGE_BYTES);
+	sg_set_buf(&sg[1], payload, len);
+
+	if (crypto_hash_digest(&desc, sg, MD5_MESSAGE_BYTES + len, (u8 *)h)) {
+		err = -EFAULT;
+		goto out;
+	}
+
+	*hash = be16_to_cpu(h[0]);
+	for (i = 1; i < MD5_DIGEST_WORDS * 2; i++)
+		*hash ^= be16_to_cpu(h[i]);
+
+out:
+	crypto_free_hash(tfm);
+	return err;
+}
+#endif
+
 /*
  *	Socket option code for TCP.
  */
@@ -2359,6 +2396,67 @@ static int do_tcp_setsockopt(struct sock
 		release_sock(sk);
 		return err;
 	}
+#ifdef CONFIG_TCP_STEALTH
+	case TCP_STEALTH: {
+		u8 secret[MD5_MESSAGE_BYTES] = { 0 };
+
+		val = copy_from_user(secret, optval,
+				     min_t(unsigned int, optlen,
+					   MD5_MESSAGE_BYTES));
+
+		if (val != 0)
+			return -EFAULT;
+
+		lock_sock(sk);
+		memcpy(tp->stealth.secret, secret, MD5_MESSAGE_BYTES);
+		tp->stealth.mode = TCP_STEALTH_MODE_AUTH;
+		tp->stealth.mstamp.v64 = 0;
+		tp->stealth.saw_tsval = false;
+		release_sock(sk);
+		return err;
+	}
+	case TCP_STEALTH_INTEGRITY: {
+		u8 *payload;
+
+		lock_sock(sk);
+
+		if (!(tp->stealth.mode & TCP_STEALTH_MODE_AUTH)) {
+			err = -EOPNOTSUPP;
+			goto stealth_integrity_out_1;
+		}
+
+		if (optlen < 1 || optlen > USHRT_MAX) {
+			err = -EINVAL;
+			goto stealth_integrity_out_1;
+		}
+
+		payload = vmalloc(optlen);
+		if (!payload) {
+			err = -ENOMEM;
+			goto stealth_integrity_out_1;
+		}
+
+		val = copy_from_user(payload, optval, optlen);
+		if (val != 0) {
+			err = -EFAULT;
+			goto stealth_integrity_out_2;
+		}
+
+		err = tcp_stealth_integrity(&tp->stealth.integrity_hash,
+					    tp->stealth.secret, payload,
+					    optlen);
+		if (err)
+			goto stealth_integrity_out_2;
+
+		tp->stealth.mode |= TCP_STEALTH_MODE_INTEGRITY;
+
+stealth_integrity_out_2:
+		vfree(payload);
+stealth_integrity_out_1:
+		release_sock(sk);
+		return err;
+	}
+#endif
 	default:
 		/* fallthru */
 		break;
@@ -2600,6 +2698,18 @@ static int do_tcp_setsockopt(struct sock
 		tp->notsent_lowat = val;
 		sk->sk_write_space(sk);
 		break;
+#ifdef CONFIG_TCP_STEALTH
+	case TCP_STEALTH_INTEGRITY_LEN:
+		if (!(tp->stealth.mode & TCP_STEALTH_MODE_AUTH)) {
+			err = -EOPNOTSUPP;
+		} else if (val < 1 || val > USHRT_MAX) {
+			err = -EINVAL;
+		} else {
+			tp->stealth.integrity_len = val;
+			tp->stealth.mode |= TCP_STEALTH_MODE_INTEGRITY_LEN;
+		}
+		break;
+#endif
 	default:
 		err = -ENOPROTOOPT;
 		break;
diff -Nurp linux-3.18-rc3/net/ipv4/tcp_input.c linux-3.18-rc3-knock/net/ipv4/tcp_input.c
--- linux-3.18-rc3/net/ipv4/tcp_input.c	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/ipv4/tcp_input.c	2014-11-06 21:26:34.976017001 +0100
@@ -77,6 +77,9 @@
 #include <linux/errqueue.h>
 
 int sysctl_tcp_timestamps __read_mostly = 1;
+#ifdef CONFIG_TCP_STEALTH
+EXPORT_SYMBOL(sysctl_tcp_timestamps);
+#endif
 int sysctl_tcp_window_scaling __read_mostly = 1;
 int sysctl_tcp_sack __read_mostly = 1;
 int sysctl_tcp_fack __read_mostly = 1;
@@ -3715,6 +3718,47 @@ static bool tcp_fast_parse_options(const
 	return true;
 }
 
+#ifdef CONFIG_TCP_STEALTH
+/* Parse only the TSVal field of the TCP Timestamp option header.
+ */
+const bool tcp_parse_tsval_option(u32 *tsval, const struct tcphdr *th)
+{
+	int length = (th->doff << 2) - sizeof(*th);
+	const u8 *ptr = (const u8 *)(th + 1);
+
+	/* Too little option space to hold a timestamp option: bail out early */
+	if (length < TCPOLEN_TIMESTAMP)
+		return false;
+
+	while (length > 0) {
+		int opcode = *ptr++;
+		int opsize;
+
+		switch (opcode) {
+		case TCPOPT_EOL:
+			return false;
+		case TCPOPT_NOP:
+			length--;
+			continue;
+		case TCPOPT_TIMESTAMP:
+			opsize = *ptr++;
+			if (opsize != TCPOLEN_TIMESTAMP || opsize > length)
+				return false;
+			*tsval = get_unaligned_be32(ptr);
+			return true;
+		default:
+			opsize = *ptr++;
+			if (opsize < 2 || opsize > length)
+				return false;
+		}
+		ptr += opsize - 2;
+		length -= opsize;
+	}
+	return false;
+}
+EXPORT_SYMBOL(tcp_parse_tsval_option);
+#endif
+
 #ifdef CONFIG_TCP_MD5SIG
 /*
  * Parse MD5 Signature option
@@ -4384,6 +4428,31 @@ err:
 	return -ENOMEM;
 }
 
+#ifdef CONFIG_TCP_STEALTH
+static int __tcp_stealth_integrity_check(struct sock *sk, struct sk_buff *skb)
+{
+	struct tcphdr *th = tcp_hdr(skb);
+	struct tcp_sock *tp = tcp_sk(sk);
+	u16 hash;
+	__be32 seq = cpu_to_be32(TCP_SKB_CB(skb)->seq - 1);
+	char *data = skb->data + th->doff * 4;
+	int len = skb->len - th->doff * 4;
+
+	if (len < tp->stealth.integrity_len)
+		return 1;
+
+	if (tcp_stealth_integrity(&hash, tp->stealth.secret, data,
+				  tp->stealth.integrity_len))
+		return 1;
+
+	if (be32_isn_to_be16_ih(seq) != cpu_to_be16(hash))
+		return 1;
+
+	tp->stealth.mode &= ~TCP_STEALTH_MODE_INTEGRITY_LEN;
+	return 0;
+}
+#endif
+
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -4393,6 +4462,14 @@ static void tcp_data_queue(struct sock *
 	if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq)
 		goto drop;
 
+#ifdef CONFIG_TCP_STEALTH
+	if (unlikely(tp->stealth.mode & TCP_STEALTH_MODE_INTEGRITY_LEN) &&
+	    __tcp_stealth_integrity_check(sk, skb)) {
+		tcp_reset(sk);
+		goto drop;
+	}
+#endif
+
 	skb_dst_drop(skb);
 	__skb_pull(skb, tcp_hdr(skb)->doff * 4);
 
@@ -5156,6 +5233,15 @@ void tcp_rcv_established(struct sock *sk
 			int eaten = 0;
 			bool fragstolen = false;
 
+#ifdef CONFIG_TCP_STEALTH
+			if (unlikely(tp->stealth.mode &
+				     TCP_STEALTH_MODE_INTEGRITY_LEN) &&
+			    __tcp_stealth_integrity_check(sk, skb)) {
+				tcp_reset(sk);
+				goto discard;
+			}
+#endif
+
 			if (tp->ucopy.task == current &&
 			    tp->copied_seq == tp->rcv_nxt &&
 			    len - tcp_header_len <= tp->ucopy.len &&
diff -Nurp linux-3.18-rc3/net/ipv4/tcp_ipv4.c linux-3.18-rc3-knock/net/ipv4/tcp_ipv4.c
--- linux-3.18-rc3/net/ipv4/tcp_ipv4.c	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/ipv4/tcp_ipv4.c	2014-11-06 21:26:34.976017001 +0100
@@ -75,6 +75,7 @@
 #include <net/secure_seq.h>
 #include <net/tcp_memcontrol.h>
 #include <net/busy_poll.h>
+#include <net/secure_seq.h>
 
 #include <linux/inet.h>
 #include <linux/ipv6.h>
@@ -235,6 +236,21 @@ int tcp_v4_connect(struct sock *sk, stru
 	sk->sk_gso_type = SKB_GSO_TCPV4;
 	sk_setup_caps(sk, &rt->dst);
 
+#ifdef CONFIG_TCP_STEALTH
+	/* If CONFIG_TCP_STEALTH is defined, we need to know the timestamp as
+	 * early as possible and thus move taking the snapshot of tcp_time_stamp
+	 * here.
+	 */
+	skb_mstamp_get(&tp->stealth.mstamp);
+
+	if (!tp->write_seq && likely(!tp->repair) &&
+	    unlikely(tp->stealth.mode & TCP_STEALTH_MODE_AUTH))
+		tp->write_seq = tcp_stealth_sequence_number(sk,
+					&inet->inet_daddr,
+					sizeof(inet->inet_daddr),
+					usin->sin_port);
+#endif
+
 	if (!tp->write_seq && likely(!tp->repair))
 		tp->write_seq = secure_tcp_sequence_number(inet->inet_saddr,
 							   inet->inet_daddr,
@@ -1423,6 +1439,8 @@ static struct sock *tcp_v4_hnd_req(struc
  */
 int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 {
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcphdr *th = tcp_hdr(skb);
 	struct sock *rsk;
 
 	if (sk->sk_state == TCP_ESTABLISHED) { /* Fast path */
@@ -1443,6 +1461,15 @@ int tcp_v4_do_rcv(struct sock *sk, struc
 	if (skb->len < tcp_hdrlen(skb) || tcp_checksum_complete(skb))
 		goto csum_err;
 
+#ifdef CONFIG_TCP_STEALTH
+	if (sk->sk_state == TCP_LISTEN && th->syn && !th->fin &&
+	    unlikely(tp->stealth.mode & TCP_STEALTH_MODE_AUTH) &&
+	    tcp_stealth_do_auth(sk, skb)) {
+		rsk = sk;
+		goto reset;
+	}
+#endif
+
 	if (sk->sk_state == TCP_LISTEN) {
 		struct sock *nsk = tcp_v4_hnd_req(sk, skb);
 		if (!nsk)
diff -Nurp linux-3.18-rc3/net/ipv4/tcp_output.c linux-3.18-rc3-knock/net/ipv4/tcp_output.c
--- linux-3.18-rc3/net/ipv4/tcp_output.c	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/ipv4/tcp_output.c	2014-11-24 14:29:25.380852760 +0100
@@ -915,6 +915,13 @@ static int tcp_transmit_skb(struct sock
 	tcb = TCP_SKB_CB(skb);
 	memset(&opts, 0, sizeof(opts));
 
+#ifdef CONFIG_TCP_STEALTH
+	if (unlikely(tcb->tcp_flags & TCPHDR_SYN &&
+		     tp->stealth.mode & TCP_STEALTH_MODE_AUTH)) {
+		skb->skb_mstamp = tp->stealth.mstamp;
+	}
+#endif
+
 	if (unlikely(tcb->tcp_flags & TCPHDR_SYN))
 		tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
 	else
@@ -3109,7 +3116,15 @@ int tcp_connect(struct sock *sk)
 	skb_reserve(buff, MAX_TCP_HEADER);
 
 	tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
+#ifdef CONFIG_TCP_STEALTH
+	/* The timestamp was already taken at the time the ISN was generated
+	 * as we need to know its value in the tcp_stealth_sequence_number()
+	 * function.
+	 */
+	tp->retrans_stamp = tp->stealth.mstamp.stamp_jiffies;
+#else
 	tp->retrans_stamp = tcp_time_stamp;
+#endif
 	tcp_connect_queue_skb(sk, buff);
 	tcp_ecn_send_syn(sk, buff);
 
diff -Nurp linux-3.18-rc3/net/ipv6/tcp_ipv6.c linux-3.18-rc3-knock/net/ipv6/tcp_ipv6.c
--- linux-3.18-rc3/net/ipv6/tcp_ipv6.c	2014-11-03 00:01:51.000000000 +0100
+++ linux-3.18-rc3-knock/net/ipv6/tcp_ipv6.c	2014-11-06 21:26:34.976017001 +0100
@@ -63,6 +63,7 @@
 #include <net/secure_seq.h>
 #include <net/tcp_memcontrol.h>
 #include <net/busy_poll.h>
+#include <net/secure_seq.h>
 
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -297,6 +298,21 @@ static int tcp_v6_connect(struct sock *s
 
 	ip6_set_txhash(sk);
 
+#ifdef CONFIG_TCP_STEALTH
+	/* If CONFIG_TCP_STEALTH is defined, we need to know the timestamp as
+	 * early as possible and thus move taking the snapshot of tcp_time_stamp
+	 * here.
+	 */
+	skb_mstamp_get(&tp->stealth.mstamp);
+
+	if (!tp->write_seq && likely(!tp->repair) &&
+	    unlikely(tp->stealth.mode & TCP_STEALTH_MODE_AUTH))
+		tp->write_seq = tcp_stealth_sequence_number(sk,
+					sk->sk_v6_daddr.s6_addr32,
+					sizeof(sk->sk_v6_daddr),
+					inet->inet_dport);
+#endif
+
 	if (!tp->write_seq && likely(!tp->repair))
 		tp->write_seq = secure_tcpv6_sequence_number(np->saddr.s6_addr32,
 							     sk->sk_v6_daddr.s6_addr32,
@@ -1251,7 +1267,8 @@ out:
 static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
-	struct tcp_sock *tp;
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcphdr *th = tcp_hdr(skb);
 	struct sk_buff *opt_skb = NULL;
 
 	/* Imagine: socket is IPv6. IPv4 packet arrives,
@@ -1310,6 +1327,13 @@ static int tcp_v6_do_rcv(struct sock *sk
 	if (skb->len < tcp_hdrlen(skb) || tcp_checksum_complete(skb))
 		goto csum_err;
 
+#ifdef CONFIG_TCP_STEALTH
+	if (sk->sk_state == TCP_LISTEN && th->syn && !th->fin &&
+	    tp->stealth.mode & TCP_STEALTH_MODE_AUTH &&
+	    tcp_stealth_do_auth(sk, skb))
+		goto reset;
+#endif
+
 	if (sk->sk_state == TCP_LISTEN) {
 		struct sock *nsk = tcp_v6_hnd_req(sk, skb);
 		if (!nsk)
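
For readers following the tcp.c hunk: the loop at the end of
tcp_stealth_integrity() folds the 16-byte MD5 digest into a single 16-bit
value by XORing its eight big-endian 16-bit words (MD5_DIGEST_WORDS is 4,
so MD5_DIGEST_WORDS * 2 is 8). A minimal userspace sketch of just that
folding step follows. fold_digest16() is a hypothetical helper name used
for illustration and is not part of the patch; the MD5 computation itself
is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 16-byte digest into 16 bits by XORing its eight big-endian
 * 16-bit words. Hypothetical userspace helper, not part of the patch.
 */
static uint16_t fold_digest16(const uint8_t digest[16])
{
	uint16_t folded = 0;
	size_t i;

	for (i = 0; i < 16; i += 2)
		folded ^= (uint16_t)((digest[i] << 8) | digest[i + 1]);

	return folded;
}
```

Building each word from explicit bytes keeps the result independent of
host endianness, which is what the be16_to_cpu() calls in the patch are
for.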

[-- Attachment #3: tcp_stealth_3.18.diff.sig --]
[-- Type: application/pgp-signature, Size: 287 bytes --]
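
The option walk in tcp_parse_tsval_option() can be tried outside the
kernel with a sketch along these lines. It is a userspace approximation,
not the patch's code: parse_tsval() and the OPT_* names are made up for
this example, and unlike the patch it checks that the length byte itself
lies inside the option block before reading it. The option kinds and the
timestamp length (8 and 10) are the standard TCP values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard TCP option kinds; the names are local to this sketch. */
enum { OPT_EOL = 0, OPT_NOP = 1, OPT_TIMESTAMP = 8, OPT_TIMESTAMP_LEN = 10 };

/* Walk a raw TCP option block and extract TSval if a timestamp
 * option is present. Returns false on EOL, malformed or missing option.
 */
static bool parse_tsval(const uint8_t *opt, size_t len, uint32_t *tsval)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opt[i];
		uint8_t size;

		if (kind == OPT_EOL)
			return false;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		/* All other options carry a length byte that counts
		 * the kind and length bytes as well as the data.
		 */
		if (len - i < 2)
			return false;
		size = opt[i + 1];
		if (size < 2 || size > len - i)
			return false;
		if (kind == OPT_TIMESTAMP) {
			if (size != OPT_TIMESTAMP_LEN)
				return false;
			/* TSval is the first 32-bit big-endian field */
			*tsval = ((uint32_t)opt[i + 2] << 24) |
				 ((uint32_t)opt[i + 3] << 16) |
				 ((uint32_t)opt[i + 4] << 8) |
				 (uint32_t)opt[i + 5];
			return true;
		}
		i += size;
	}
	return false;
}
```

Feeding it the usual SYN layout (MSS, SACK-permitted, timestamp) or a
NOP-padded timestamp returns the TSval; a truncated timestamp option or a
leading EOL returns false.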

^ permalink raw reply

* Re: [PATCH] net: ethernet: intel: i40e: i40e_fcoe.c:  Remove unused function
From: Jeff Kirsher @ 2014-12-31 23:15 UTC (permalink / raw)
  To: Rickard Strandqvist
  Cc: Jesse Brandeburg, Bruce Allan, Carolyn Wyborny, Don Skidmore,
	Greg Rose, Matthew Vick, John Ronciak, Mitch Williams, Linux NICS,
	e1000-devel, netdev, linux-kernel
In-Reply-To: <1420044537-21077-1-git-send-email-rickard_strandqvist@spectrumdigital.se>

[-- Attachment #1: Type: text/plain, Size: 603 bytes --]

On Wed, 2014-12-31 at 17:48 +0100, Rickard Strandqvist wrote:
> Remove the function i40e_rx_is_fip() that is not used anywhere.
> 
> This was partially found by using a static code analysis program
> called cppcheck.
> 
> Signed-off-by: Rickard Strandqvist
> <rickard_strandqvist@spectrumdigital.se>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_fcoe.c |    9 ---------
>  1 file changed, 9 deletions(-)

Thanks Rickard!  I thought I had some patches in my queue that started
to make use of that function, but come to find out, I don't... :-)

I will add your patch to my queue, thanks!

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* [PATCH 1/2] e1000e: Include clocksource.h to get CLOCKSOURCE_MASK.
From: David Miller @ 2014-12-31 23:33 UTC (permalink / raw)
  To: richardcochran; +Cc: netdev


Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e14fd85..2537d36a 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -43,6 +43,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/aer.h>
 #include <linux/prefetch.h>
+#include <linux/clocksource.h>
 
 #include "e1000.h"
 
-- 
2.1.0

^ permalink raw reply related

* [PATCH 2/2] igb_ptp: Include clocksource.h to get CLOCKSOURCE_MASK.
From: David Miller @ 2014-12-31 23:33 UTC (permalink / raw)
  To: richardcochran; +Cc: netdev


Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/intel/igb/igb_ptp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 1d27f2d..8baf3fd 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -19,6 +19,7 @@
 #include <linux/device.h>
 #include <linux/pci.h>
 #include <linux/ptp_classify.h>
+#include <linux/clocksource.h>
 
 #include "igb.h"
 
-- 
2.1.0

^ permalink raw reply related

* Re: [PATCH net] netlink: call cond_resched after broadcasting updates
From: David Miller @ 2014-12-31 23:38 UTC (permalink / raw)
  To: stephen; +Cc: netdev
In-Reply-To: <20141227095433.00333deb@urahara>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Sat, 27 Dec 2014 09:54:33 -0800

> When a netlink event is posted to a socket, the receiving process may be
> waiting to wake up. Reduce the latency by calling cond_resched() in this
> loop. This reduces the problems with missed events during a netlink
> storm, such as when a routing daemon does a mass update in response to
> a link transition.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

As mentioned by others, this is potentially invoked from software
interrupts generating netlink events (one example is ipv6) so we
can't try to conditionally sleep here.

^ permalink raw reply

* Re: [PATCH 1/2] e1000e: Include clocksource.h to get CLOCKSOURCE_MASK.
From: Jeff Kirsher @ 2014-12-31 23:42 UTC (permalink / raw)
  To: David Miller; +Cc: Richard Cochran, netdev
In-Reply-To: <20141231.183347.862533634176009078.davem@davemloft.net>

On Wed, Dec 31, 2014 at 3:33 PM, David Miller <davem@davemloft.net> wrote:
>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 1 +
>  1 file changed, 1 insertion(+)

^ permalink raw reply

* Re: [PATCH 2/2] igb_ptp: Include clocksource.h to get CLOCKSOURCE_MASK.
From: Jeff Kirsher @ 2014-12-31 23:43 UTC (permalink / raw)
  To: David Miller; +Cc: Richard Cochran, netdev
In-Reply-To: <20141231.183359.681102444156146233.davem@davemloft.net>

On Wed, Dec 31, 2014 at 3:33 PM, David Miller <davem@davemloft.net> wrote:
>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

> ---
>  drivers/net/ethernet/intel/igb/igb_ptp.c | 1 +
>  1 file changed, 1 insertion(+)
>

^ permalink raw reply
