From: John Fastabend <john.fastabend@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: therbert@google.com, davidch@broadcom.com,
simon.horman@netronome.com, dev@openvswitch.org,
netdev@vger.kernel.org, pablo@netfilter.org
Subject: Re: [ovs-dev] OVS Offload Decision Proposal
Date: Wed, 04 Mar 2015 23:39:01 -0800
Message-ID: <54F80815.5030208@gmail.com>
In-Reply-To: <20150305.014257.974664546228241067.davem@davemloft.net>
On 03/04/2015 10:42 PM, David Miller wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Wed, 4 Mar 2015 21:20:41 -0800
>
>> On Wed, Mar 4, 2015 at 9:00 PM, David Miller <davem@davemloft.net> wrote:
>>> From: John Fastabend <john.fastabend@gmail.com>
>>> Date: Wed, 04 Mar 2015 17:54:54 -0800
>>>
>>>> I think a set operation _is_ necessary for OVS and other
>>>> applications that run in user space.
>>>
>>> It's necessary for the kernel to internally manage the chip
>>> flow resources.
>>>
>>> Full stop.
>>>
>>> It's not being exported to userspace. That is exactly the kind
>>> of open ended, outside the model, crap we're trying to avoid
>>> by putting everything into the kernel where we have consistent
>>> mechanisms, well understood behaviors, and rules.
>>
>> David,
>>
>> Just to make sure everyone is on the same page... this discussion has
>> been about where the policy of offload is implemented, not just who is
>> actually sending config bits to the device. The question is who gets
>> to decide how to best divvy up the finite resources of the device and
>> network amongst various requestors. Is this what you're referring to?
>
> I'm talking about only the kernel being able to make ->set() calls
> through the flow manager API to the device.
>
> Resource control is the kernel's job.
>
> You cannot delegate this crap between ipv4 routing in the kernel,
> L2 bridging in the kernel, and some user space crap. It's simply
> not going to happen.

The intent was to reserve space in the tables for l2, l3, user space,
and whatever else is needed. This reservation needs to come from the
administrator because even the kernel doesn't know how much of my
table space I want to reserve for l2 vs l3 vs tc vs ... The sizing
of each of these tables will depend on the use case. If I'm provisioning
l3 networks I may want to create a large l3 table and no 'tc' table.
If I'm building a firewall box I might want a small l3 table and a
large 'tc' table. Also, depending on how wide I want my matches in the
'tc' case, I may consume more or less resources in the hardware.
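
To make the sizing argument concrete, here is a rough sketch in Python. It is a toy model, not kernel code: the names Reservation and partition, the idea of counting cost in abstract "slices", and all of the numbers are assumptions made up for illustration. It only shows that two administrator profiles can divide the same device very differently, and that wider matches eat more of the budget.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    name: str          # table name, e.g. "l2", "l3", "tc"
    entries: int       # number of flow entries requested
    match_slices: int  # hypothetical: wider matches use more slices per entry

    def cost(self) -> int:
        # Hardware cost in abstract slices: entries times match width.
        return self.entries * self.match_slices


def partition(capacity: int, requests: list[Reservation]) -> dict[str, int]:
    """Return slices granted per table, or raise if the profile does not fit."""
    total = sum(r.cost() for r in requests)
    if total > capacity:
        raise ValueError(f"profile needs {total} slices, device has {capacity}")
    return {r.name: r.cost() for r in requests}


# Two administrator profiles for the same hypothetical 16384-slice device.
l3_router = [Reservation("l2", 2048, 1), Reservation("l3", 12288, 1)]
firewall = [Reservation("l3", 1024, 1), Reservation("tc", 4096, 3)]

print(partition(16384, l3_router))
print(partition(16384, firewall))
```

In the firewall profile the 'tc' table holds only 4096 entries, but because each entry is assumed to be three slices wide it takes as much room as the 12288-entry l3 table does in the router profile.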

Once the reservation of resources occurs, we wouldn't let user space
arbitrarily write to any table, but only to tables that have been
explicitly reserved for user space to write to.
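
A minimal sketch of that write check, again in Python and again a toy model rather than a proposed interface. FlowTables, reserve, set_flow, the "kernel"/"user" owner labels, and the rule strings are all hypothetical names chosen for illustration. How kernel callers should be treated is left open here; the sketch only restricts user space.

```python
class FlowTables:
    """Toy model of hardware tables with a per-table writer reservation."""

    def __init__(self):
        self._owner = {}  # table name -> "kernel" or "user"
        self._flows = {}  # table name -> list of installed flow rules

    def reserve(self, table, owner):
        # Done once by the administrator, before any flows are installed.
        if owner not in ("kernel", "user"):
            raise ValueError(f"unknown owner: {owner}")
        if table in self._owner:
            raise ValueError(f"table already reserved: {table}")
        self._owner[table] = owner
        self._flows[table] = []

    def set_flow(self, table, rule, caller):
        # Reject writes to tables that were never reserved.
        if table not in self._owner:
            raise KeyError(table)
        # User space may only write to tables reserved for user space.
        if caller == "user" and self._owner[table] != "user":
            raise PermissionError(f"{table} is not reserved for user space")
        self._flows[table].append(rule)

    def flows(self, table):
        # Return a copy so callers cannot bypass set_flow().
        return list(self._flows[table])


tables = FlowTables()
tables.reserve("l3", "kernel")
tables.reserve("ovs", "user")

tables.set_flow("ovs", "match in_port=1 action output:2", caller="user")
try:
    tables.set_flow("l3", "match dst=10.0.0.0/8 action drop", caller="user")
except PermissionError as err:
    print("rejected:", err)
```

The point of the sketch is that the check sits in one place, next to the reservation data, so a user space write can never land in a table it was not given.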

Even without the user space piece we need this reservation when
the table space for l2, l3, etc. is shared. Otherwise driver writers
end up making a best guess for you, or end up delivering driver flavours
based on firmware, and you can only hope the driver writer guessed
something that is close to your network.
>
> All of the delegation of the hardware resource must occur in the
> kernel. Because only the kernel has a full view of all of the
> resources and how each and every subsystem needs to use it.
>
So I'm going to ask... even if we restrict the set() using the above
scheme to only work on pre-defined tables, do you still see an issue
with it? I might be missing the point, but I could similarly drive the
set() calls through 'tc' via a new filter, call it xflow.
.John
--
John Fastabend Intel Corporation