From: Thomas Graf <tgraf@suug.ch>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Scott Feldman <sfeldma@cumulusnetworks.com>,
John Fastabend <john.fastabend@gmail.com>,
Jiri Pirko <jiri@resnulli.us>,
netdev@vger.kernel.org, davem@davemloft.net,
nhorman@tuxdriver.com, andy@greyhouse.net, dborkman@redhat.com,
ogerlitz@mellanox.com, jesse@nicira.com, pshelar@nicira.com,
azhou@nicira.com, ben@decadent.org.uk,
stephen@networkplumber.org, jeffrey.t.kirsher@intel.com,
vyasevic@redhat.com, xiyou.wangcong@gmail.com,
john.r.fastabend@intel.com, edumazet@google.com,
f.fainelli@gmail.com, roopa@cumulusnetworks.com,
linville@tuxdriver.com, dev@openvswitch.org, jasowang@redhat.com,
ebiederm@xmission.com, nicolas.dichtel@6wind.com,
ryazanov.s.a@gmail.com, buytenh@wantstofly.org,
aviadr@mellanox.com, nbd@openwrt.org,
alexei.starovoitov@gmail.com, Neil.Jerram@metaswitch.com,
ronye@mellanox.com
Subject: Re: [patch net-next RFC 10/12] openvswitch: add support for datapath hardware offload
Date: Sun, 24 Aug 2014 12:12:18 +0100
Message-ID: <20140824111218.GA32741@casper.infradead.org>
In-Reply-To: <53F9459B.2070801@mojatatu.com>
On 08/23/14 at 09:53pm, Jamal Hadi Salim wrote:
> On 08/22/14 18:53, Scott Feldman wrote:
>
> Ok, Scott - now i have looked at the patches on the plane and i am
> still not convinced ;->
>
> >The intent is to use openvswitch.ko’s struct sw_flow to program hardware via the
> >ndo_swdev_flow_* ops, but otherwise be independent of OVS. So the upper layer of
> >the driver is struct sw_flow and any module above the driver can construct a struct
> >sw_flow and push it down via ndo_swdev_flow_*. So your non-OVS use-case should be
> >handled. OVS is another use-case. struct sw_flow should not be OVS-aware, but
> >rather a generic flow match/action sufficient to offload the data plane to HW.
>
>
> There is a legitimate case to be made for offloading OVS but *not*
> a basis for making it the offload interface.
> My suggestion is to make all OVS stuff a separate patchset.
> This thing needs to stand alone without OVS and we dont need
> to confuse the two.
I get what you are saying but I don't see that to be the case here. I
don't see how this series proposes the OVS case as *the* interface.
It proposes *an* interface which in this case is flow based with mask
support to accommodate the typical ntuple filter API in HW. OVS happens
to be one of the easiest-to-use examples as a consumer because it
already provides a flat flow representation.
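To make "flow based with mask support" concrete, here is a minimal
sketch in plain C. The struct layout and the names sketch_flow_key and
sketch_flow_match are made up for illustration; this is not the
struct sw_flow from the series, only the general key/mask idea that
maps onto ntuple-style filters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat flow key (16 bytes, no implicit padding).
 * Not the layout used by openvswitch.ko. */
struct sketch_flow_key {
	uint32_t ip_src;
	uint32_t ip_dst;
	uint16_t tp_src;
	uint16_t tp_dst;
	uint8_t  ip_proto;
	uint8_t  pad[3];
};

/* A packet matches when every bit selected by the mask is equal in the
 * packet key and the rule key. A zero mask byte wildcards that byte. */
static bool sketch_flow_match(const struct sketch_flow_key *pkt,
			      const struct sketch_flow_key *key,
			      const struct sketch_flow_key *mask)
{
	const uint8_t *p = (const uint8_t *)pkt;
	const uint8_t *k = (const uint8_t *)key;
	const uint8_t *m = (const uint8_t *)mask;
	size_t i;

	for (i = 0; i < sizeof(*key); i++)
		if ((p[i] ^ k[i]) & m[i])
			return false;
	return true;
}
```

A rule that sets only the ip_dst, ip_proto and tp_dst mask fields
behaves like a wildcard filter on the remaining fields.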
That said, I already mentioned that I see a lot of value in having a
non-OVS API example ASAP and I will be glad to help John achieve
that.
> Having said that:
> I believe in starting simple - by solving the basic functions of
> L2/3 offload first because those are well understood and fundamental.
> There is the simplicity of those network functions and then the
> need to deal with tons of quirks that surround them....
> I think getting that right will help in understanding the issues and
> make this interface better. This is where i am going to focus my effort.
I thought this is exactly what is happening here. The flow key/mask
based API as proposed focuses on basic forwarding for L2-L4.
> Here's my view on flows in the patchset:
> What we need is ability to specify different types of classifiers.
> But leave L2 and 3 out of that - that should be part of the basic
> feature set.
>
> Your 15-tuple classifier should be one of those classifiers.
> This is because you *cannot possibly* have a universal classifier.
> The tc classifier/action API has got this part right. There is
> no ONE flow classifier but rather it has flexibility to add as many
> as you want.
Exactly, and I never saw Jiri claim that swdev_flow_insert() would be
the only offload capability exposed by the API. I see no reason why
it could not also provide swdev_offset_match_insert() or
swdev_ebpf_insert() for the 2*next generation HW. I don't think it
makes sense to focus entirely on finding a single common denominator
and channel everything through a single function to represent all the
different generic and less generic offload capabilities. I believe
that doing so will raise the minimal HW requirements barrier too
much. I think we should start somewhere, learn and evolve.
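As a rough sketch of what I mean by several entry points side by side:
each capability is an optional callback, and a device that lacks one
simply reports that it is unsupported. All names below (the sketch_
prefix, the ops struct, the demo driver) are hypothetical and are not
taken from the patchset.

```c
#include <errno.h>
#include <stddef.h>

/* Opaque rule types; their contents do not matter for this sketch. */
struct sketch_flow_rule;
struct sketch_offset_rule;

/* Hypothetical per-device ops table. A NULL callback means the
 * hardware does not offer that kind of offload. */
struct sketch_swdev_ops {
	int (*flow_insert)(void *priv, const struct sketch_flow_rule *r);
	int (*offset_match_insert)(void *priv,
				   const struct sketch_offset_rule *r);
};

static int sketch_flow_insert(const struct sketch_swdev_ops *ops,
			      void *priv, const struct sketch_flow_rule *r)
{
	if (!ops || !ops->flow_insert)
		return -EOPNOTSUPP;
	return ops->flow_insert(priv, r);
}

static int sketch_offset_match_insert(const struct sketch_swdev_ops *ops,
				      void *priv,
				      const struct sketch_offset_rule *r)
{
	if (!ops || !ops->offset_match_insert)
		return -EOPNOTSUPP;
	return ops->offset_match_insert(priv, r);
}

/* Demo "driver" that only knows key/mask flows; it just counts
 * how many rules it was asked to install. */
static int demo_flow_insert(void *priv, const struct sketch_flow_rule *r)
{
	(void)r;
	(*(int *)priv)++;
	return 0;
}
```

A simple NIC would fill in only flow_insert, and a consumer asking for
an offset match gets -EOPNOTSUPP and falls back to software.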
> IOW:
> I should be able to specify a classifier that matches the
> definition of the openflow thing you are using. But then i should also
> be able to create one based on 32 bit value/masks, one that classifies
> strings, one that classifies metadata, my own pigeon observer
> classifier etc. And be able to attach them in combinations
> to select different things within the packet and act differently.
So essentially what you are saying is that the tc interface
(in particular cls and act) could be used as an API to achieve offloads.
Yes! I thought this was very clear and a given. I don't think that it
makes sense to force every offload API consumer through the tc interface
though. This comes back to my statements in a previous email. I don't
think we should require that all the offload decision complexity *has*
to live in the kernel. Quagga, nft, or OVS should be given an API to
influence this more directly (with the hardware complexity properly
abstracted). In-kernel users such as bridge, l3 (especially rules),
and tc itself could be handled through a cls/act derived API internally.
> Lets pick an example of the u32 classifier (or i could pick nftables).
> Using your scheme i have to incur penalties to translating u32 to your
> classifier and only achieve basic functionality; and now in addition
> i cant do 90% of my u32 features. And u32 is very implementable
> in hardware.
I don't fully understand the last claim. Given the specific ntuple
capabilities of a lot of hardware out there (let's assume a typical
5-tuple capability with N capacity for exact matches and M capacity for
wildcard matches), supporting a generic u32 offset-len-mask is not
trivial at all and I don't see how you can get around converting the
generic offset into an ntuple filter *at some point* to verify whether
the HW can fulfill the generic offset match request or not. Could you
share what kind of HW you regard as a minimal requirement to base the
offload API on? Personally I'm highly interested in the existing limited
tuple filters and flow directors of NICs already available and their
successors. I think that the code that Jiri proposes and what John is
planning to do makes a lot of sense in that context.
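To illustrate the conversion step I am talking about, here is a
simplified sketch. It assumes IPv4 without options (so the L4 ports sit
at byte offset 20), a selector made of one 32-bit word at a byte offset
from the start of the IP header, and a device that can only match a
5-tuple. The struct and function names are hypothetical; this is not the
cls_u32 data structure or any driver API.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical u32-style selector. val and mask are written the way the
 * big-endian word reads, i.e. the first byte on the wire is in the top
 * 8 bits. */
struct sketch_u32_sel {
	uint32_t off;
	uint32_t val;
	uint32_t mask;
};

/* Hypothetical 5-tuple filter with a mask per field. */
struct sketch_tuple5 {
	uint32_t src, src_mask;
	uint32_t dst, dst_mask;
	uint16_t sport, sport_mask;
	uint16_t dport, dport_mask;
	uint8_t  proto, proto_mask;
};

/* Fold one selector into the 5-tuple, or report that the hardware
 * cannot express it. */
static int sketch_u32_to_tuple5(const struct sketch_u32_sel *s,
				struct sketch_tuple5 *t)
{
	switch (s->off) {
	case 8:		/* TTL | protocol | header checksum */
		if (s->mask & ~UINT32_C(0x00ff0000))
			return -EOPNOTSUPP;	/* TTL/checksum not matchable */
		t->proto_mask = (uint8_t)(s->mask >> 16);
		t->proto = (uint8_t)(s->val >> 16) & t->proto_mask;
		return 0;
	case 12:	/* source address */
		t->src_mask = s->mask;
		t->src = s->val & s->mask;
		return 0;
	case 16:	/* destination address */
		t->dst_mask = s->mask;
		t->dst = s->val & s->mask;
		return 0;
	case 20:	/* source port | destination port */
		t->sport_mask = (uint16_t)(s->mask >> 16);
		t->sport = (uint16_t)((s->val & s->mask) >> 16);
		t->dport_mask = (uint16_t)(s->mask & 0xffff);
		t->dport = (uint16_t)(s->val & s->mask & 0xffff);
		return 0;
	default:	/* anything else is outside the 5-tuple */
		return -EOPNOTSUPP;
	}
}
```

Even in this toy form a match on the TTL byte or on any payload offset
has to be rejected, which is the kind of check I don't see how to avoid.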