From: Thomas Graf <tgraf@suug.ch>
To: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	John Fastabend <john.fastabend@gmail.com>,
	simon.horman@netronome.com, sfeldma@gmail.com,
	netdev@vger.kernel.org, davem@davemloft.net,
	gerlitz.or@gmail.com, andy@greyhouse.net, ast@plumgrid.com
Subject: Re: [net-next PATCH v3 00/12] Flow API
Date: Fri, 23 Jan 2015 12:28:38 +0000
Message-ID: <20150123122838.GI25797@casper.infradead.org>
In-Reply-To: <20150123113934.GD2065@nanopsycho.orion>

On 01/23/15 at 12:39pm, Jiri Pirko wrote:
> Maybe I did not express myself correctly. I do not care if this is
> exposed by rtnl or a separate genetlink. The issue still stands. And the
> issue is that the user has to use "the way A" to set up the sw datapath and
> "the way B" to set up the hw datapath. It would be preferable to have
> "the way X" which can be used to set up both sw and hw.
> 
> And I believe that could be achieved. Consider something like this:
> 
> - have cls_xflows tc classifier and act_xflows tc action as a wrapper
>   (or api) for John's work. With possibility for multiple backends. The
>   backend iface would look very similar to what John has now.
> - other tc clses and acts will implement xflows backend
> - openvswitch datapath will implement xflows backend
> - rocker switch will implement xflows backend
> - other drivers will implement xflows backend
> 
> Now if the user wants to manipulate any flow setting, he can just use
> cls_xflows and act_xflows to do that.
> 
> This is very rough, but I just wanted to draw the picture. This would
> provide single entry to flow world manipulation in kernel, no matter if
> sw or hw.

If I understand this correctly then you propose to make the decision on
whether to implement a flow in software or offload it to hardware in the
xflows classifier and action. I had exactly the same architecture in mind
initially when I first approached this and wanted to offload OVS
datapath flows transparently to hardware.

If you look at this from the existing tc world then that makes a lot
of sense, in particular if you only support a single flat table with
wildcard flows and no priorities.

If you want to support priorities it already gets complicated. If flow
A, B, C are offloaded to hardware and the user then inserts a new flow
D with higher priority that can't be offloaded you need to figure out
whether you have to remove any of A, B, C from the hardware tables again
based on whether D overlaps with A, B, or C. If you have to remove
any of them you then have to verify whether that removal forces you to
remove other already offloaded flows as well. It's certainly doable but
already adds considerable complexity to the kernel.
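
To make the cascade concrete, here is a minimal sketch of the eviction
check. It is not taken from any existing kernel or OVS code: the Flow
class, overlaps() and flows_to_evict() are hypothetical names, and it
assumes exact-match fields where an absent field is a wildcard and a
higher priority value wins.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    name: str
    priority: int  # assumption: higher value wins
    match: dict    # field -> exact value; an absent field is a wildcard


def overlaps(a, b):
    # Two wildcard matches can hit the same packet unless a field that
    # both of them constrain carries different values.
    return all(b.match[k] == v for k, v in a.match.items() if k in b.match)


def flows_to_evict(offloaded, new_sw_flow):
    # Hardware sees packets first, so an offloaded flow must not overlap
    # a higher-priority flow that lives in software. Every evicted flow
    # becomes a software flow itself, so the check repeats until no
    # further offloaded flow is affected.
    evicted = []
    pending = [new_sw_flow]
    remaining = list(offloaded)
    while pending:
        sw = pending.pop()
        keep = []
        for f in remaining:
            if f.priority < sw.priority and overlaps(f, sw):
                evicted.append(f)
                pending.append(f)
            else:
                keep.append(f)
        remaining = keep
    return evicted


# Hypothetical example: A, B, C and E are offloaded, D cannot be.
A = Flow("A", 10, {"ip_dst": "10.1.1.1"})
B = Flow("B", 5, {"ip_dst": "10.1.1.1", "tcp_dst": 80})
C = Flow("C", 20, {"ip_dst": "10.2.2.2"})
E = Flow("E", 8, {"ip_dst": "10.1.1.1", "tcp_dst": 22})
D = Flow("D", 15, {"tcp_dst": 80})
evicted_names = [f.name for f in flows_to_evict([A, B, C, E], D)]
```

In this example E does not overlap D at all, but it still has to leave
the hardware table because it overlaps A, which was evicted because of D.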

If you want to support multiple tables it gets even more complicated
because a flow in table 2 which can be offloaded might depend on a
flow in table 1 which can't be offloaded. You somehow need to track
that dependency and ensure that table 1 sends that flow to the CPU so
that the flow in table 2 sees it. The answer to this might be to
only support offload to a single table but that decreases the value
of the offload dramatically because the capabilities of each table are
very different.
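
A rough sketch of the dependency tracking this would need, under
simplifying assumptions: each flow depends on at most one flow in an
earlier table, the dependency graph has no cycles, and the TableFlow
class and place() function are hypothetical names invented for this
illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableFlow:
    name: str
    table: int
    offloadable: bool
    depends_on: Optional[str] = None  # name of the upstream flow, if any


def place(flows):
    # A flow may only go to hardware if it is offloadable itself and the
    # flow feeding it is in hardware too. Otherwise the packet has to be
    # sent to the CPU and the downstream flow must stay in software.
    by_name = {f.name: f for f in flows}
    placement = {}

    def resolve(name):
        if name not in placement:
            f = by_name[name]
            upstream_hw = f.depends_on is None or resolve(f.depends_on) == "hw"
            placement[name] = "hw" if f.offloadable and upstream_hw else "sw"
        return placement[name]

    for f in flows:
        resolve(f.name)
    return placement


# Hypothetical example: t2_a could be offloaded on its own, but the
# table 1 flow it depends on cannot.
example = place([
    TableFlow("t1_a", 1, offloadable=False),
    TableFlow("t2_a", 2, offloadable=True, depends_on="t1_a"),
    TableFlow("t1_b", 1, offloadable=True),
    TableFlow("t2_b", 2, offloadable=True, depends_on="t1_b"),
])
```

Even this toy version needs to see every table at once, which is the
point: the placement of t2_a cannot be decided by looking at table 2
alone.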

If you bring the full programmability of OVS into the picture you might
have a pipeline consisting of multiple tables like this:

 +-------+   +------+   +-----+   +-------+
 | Decap |-->| L2   |-->| L3  |-->| Encap |
 +-------+   +------+   +-----+   +-------+

Each table contains flows, and metadata registers plus header matches
are used to communicate between the tables. The pipeline builds a chain
of actions which may be executed at any point in the pipeline or at the
end. If you want to map such a software pipeline to a set of hardware
tables you need to have full visibility into this table structure at
the point where you make the offload decision. This means that all of
this complexity would have to move into xflows.

Another aspect is that you might want to split a flow X into a hardware
and software part, e.g. consider the following flow:

in_port=vxlan0,vni=10,ip_dst=10.1.1.1,actions=decap(),nfqueue(10),output(tap0)

The hardware might be capable of matching on the VXLAN VNI and IP dst,
and it might also be capable of decap. It obviously doesn't know about
netfilter queues. Ideally what you want is to split this into the
following flows:

Hardware table (offloaded):
in_port=vxlan0,vni=10,ip_dst=10.1.1.1,actions=decap(),metadata=1

Software table:
metadata=1,actions=nfqueue(10),output(tap0)

If the hardware capabilities are not exported to OVS then xflows would
need to encode such logic and xflows would need to be made aware of the
full software pipeline with all tables as you need to see all flows in
order to decide what to offload where.
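
For the single-flow case, the split itself could be sketched as follows.
This is only an illustration: split_flow(), the capability sets and the
string form of the actions are assumptions made up for this example, and
the metadata tag is modelled as one more entry in the hardware action
list.

```python
def action_name(action):
    # "nfqueue(10)" -> "nfqueue"
    return action.split("(")[0]


def split_flow(match, actions, hw_matches, hw_actions, metadata_id):
    # Returns (hw_flow, sw_flow); either side may be None. The hardware
    # part takes the longest prefix of actions the device supports and
    # tags the packet so the software part can pick it up again.
    if not set(match) <= hw_matches:
        return None, (dict(match), list(actions))
    n = 0
    while n < len(actions) and action_name(actions[n]) in hw_actions:
        n += 1
    if n == 0:
        return None, (dict(match), list(actions))
    if n == len(actions):
        return (dict(match), list(actions)), None
    hw = (dict(match), actions[:n] + [f"metadata={metadata_id}"])
    sw = ({"metadata": metadata_id}, actions[n:])
    return hw, sw


# The flow from the example above, with assumed hardware capabilities.
hw_flow, sw_flow = split_flow(
    {"in_port": "vxlan0", "vni": "10", "ip_dst": "10.1.1.1"},
    ["decap()", "nfqueue(10)", "output(tap0)"],
    hw_matches={"in_port", "vni", "ip_dst"},
    hw_actions={"decap"},
    metadata_id=1,
)
```

Note that this only works because the capability sets are passed in.
Whoever makes the split needs both the hardware capabilities and the
full flow in front of them.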

I would love to see a tc interface to John's flow API and I see
tremendous value in it, but I don't think it's the appropriate way to
offload OVS.
