From: John Fastabend <john.r.fastabend@intel.com>
To: Shrijeet Mukherjee <shrijeet@gmail.com>, Thomas Graf <tgraf@suug.ch>
Cc: Jiri Pirko <jiri@resnulli.us>,
	Simon Horman <simon.horman@netronome.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"nhorman@tuxdriver.com" <nhorman@tuxdriver.com>,
	"andy@greyhouse.net" <andy@greyhouse.net>,
	"dborkman@redhat.com" <dborkman@redhat.com>,
	"ogerlitz@mellanox.com" <ogerlitz@mellanox.com>,
	"jesse@nicira.com" <jesse@nicira.com>,
	"jpettit@nicira.com" <jpettit@nicira.com>,
	"joestringer@nicira.com" <joestringer@nicira.com>,
	"jhs@mojatatu.com" <jhs@mojatatu.com>,
	"sfeldma@gmail.com" <sfeldma@gmail.com>,
	"f.fainelli@gmail.com" <f.fainelli@gmail.com>,
	"roopa@cumulusnetworks.com" <roopa@cumulusnetworks.com>,
	"linville@tuxdriver.com" <linville@tuxdriver.com>,
	"gospo@cumulusnetworks.com" <gospo@cumulusnetworks.com>,
	"bcrl@kvack.org" <bcrl@kvack.org>
Subject: Re: Flows! Offload them.
Date: Thu, 26 Feb 2015 07:39:49 -0800
Message-ID: <54EF3E45.3070103@intel.com>
In-Reply-To: <CAGpadYGrjfkZqe0k7D05+cy3pY=1hXZtQqtV0J-8ogU80K7BUQ@mail.gmail.com>

On 02/26/2015 07:25 AM, Shrijeet Mukherjee wrote:
>     However, for certain datacenter server use cases we actually have the
>     full user intent in user space as we configure all of the kernel
>     subsystems from a single central management agent running locally
>     on the server (OpenStack, Kubernetes, Mesos, ...), i.e. we do know
>     exactly what the user wants on the system as a whole. This intent is
>     then split into small configuration pieces to configure iptables, tc,
>     routes on multiple net namespaces (for example to implement VRF).
> 
>     E.g. a VRF in software would make use of net namespaces which hold
>     tenant specific ACLs, routes and QoS settings. A separate action
>     would fwd packets to the namespace. Easy and straightforward in
>     software. OTOH, the hardware, capable of implementing the ACLs,
>     would also need to know about the tc action which selected the
>     namespace when attempting to offload the ACL, as it would otherwise
>     apply ACLs to the wrong packets.
> 
> 
> This is a new angle that I believe we have talked around in the context of user space policy, but not really considered.
> 
> So the issue is: what if you have a classifier and forward action that points to a device which the element doing the classification does not have access to, right?
> 
> This problem obliquely showed up in the context of route table entries not in the "external" table but present in the software tables as well.
> 
> Maybe the scheme requires an implicit "send to software" device which then diverts traffic to the right place? Would creating an implicit, un-offloaded device address these concerns?

So I think there is a relatively simple solution for this, assuming
I read the description correctly: namely, a packet ingresses the
nic/switch and you want it to land in a namespace.

Today we support offloaded macvlans and SR-IOV. What I would expect
is that the user creates a set of macvlans that are "offloaded"; this
just means they are bound to a set of hardware queues and do not go
through the normal receive path. Then assigning these to a namespace
is the same as for any other netdev.

Hardware has an action to forward to a "VSI" (virtual station interface),
which matches on a packet and forwards it to either a VF or a set of
queues bound to a macvlan. Or you can do the forwarding using
standards-based protocols such as EVB (edge virtual bridging).

So it's a simple set of steps with the flow API:

	1. create macvlan with dfwd_offload set
	2. push netdev into namespace
	3. add flow rule to match traffic and send to VSI
		./flow -i ethx set_rule match xyz action fwd_vsi 3

The VSI# is reported by ip link today; it's a bit clumsy, so that
interface could be cleaned up.
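
For illustration, the three steps might be scripted roughly as below. This
is only a sketch under assumptions: it assumes the dfwd offload is turned
on through the ethtool "l2-fwd-offload" feature on the lower device, and
the interface name "mvlan0" and namespace name "tenant1" are made up. The
./flow line is copied from step 3 with "xyz" left as the placeholder it is.
By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch only: prints each command unless RUN=1 is set in the environment.
# Device, macvlan and namespace names are hypothetical placeholders.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

offload_to_namespace() {
    lower=$1
    mvlan=$2
    netns=$3
    vsi=$4

    # 1. create macvlan with dfwd offload set (assumed: ethtool feature
    #    l2-fwd-offload on the lower device enables it)
    run ethtool -K "$lower" l2-fwd-offload on
    run ip link add link "$lower" name "$mvlan" type macvlan

    # 2. push netdev into namespace
    run ip netns add "$netns"
    run ip link set "$mvlan" netns "$netns"

    # 3. add flow rule to match traffic and send to VSI
    #    ("xyz" is the unspecified match from the steps above)
    run ./flow -i "$lower" set_rule match xyz action fwd_vsi "$vsi"
}

offload_to_namespace ethx mvlan0 tenant1 3
```

The VSI number passed as the last argument would have to be read from
ip link first, which is the clumsy part mentioned above.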

Here is a case where trying to map this onto a 'tc' action in software
is a bit awkward and convolutes what is really a simple operation.
Anyway, this is not really an "offload" in the sense that you're taking
something that used to run in software and moving it 1:1 into hardware.
Adding SR-IOV/VMDQ support requires new constructs. By the way, if you
don't like my "flow" tool and you want to move it onto "tc", that could
be done as well, but the steps are the same.

.John
