From: John Fastabend <john.r.fastabend@intel.com>
To: jhs@mojatatu.com
Cc: jamal <hadi@cyberus.ca>,
Stephen Hemminger <shemminger@vyatta.com>,
bhutchings@solarflare.com, roprabhu@cisco.com,
netdev@vger.kernel.org, mst@redhat.com, chrisw@redhat.com,
davem@davemloft.net, gregory.v.rose@intel.com,
kvm@vger.kernel.org, sri@us.ibm.com
Subject: Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware
Date: Tue, 14 Feb 2012 10:57:04 -0800
Message-ID: <4F3AAE80.4040609@intel.com>
In-Reply-To: <1329225526.2806.34.camel@mojatatu>
On 2/14/2012 5:18 AM, jamal wrote:
> On Mon, 2012-02-13 at 07:13 -0800, John Fastabend wrote:
>
>> The use case here is multiple VFs but the same solution should work with
>> multiple PFs as well. FDB controls should be independent of how the ports
>> are exposed VFs, PFs, VMDQ/queue pairs, macvlan, etc.
>
> Makes sense.
>
>> With events and ADD/DEL/GET FDB controls we can solve both cases. This also
>> solves Roopa's case with macvlan where she wants to add additional addresses
>> to macvlan ports.
>
> Not familiar with that issue - I'll prowl the list.

Roopa was likely on the right track here:

http://patchwork.ozlabs.org/patch/123064/

But I think the proper syntax is to use the existing PF_BRIDGE:RTM_XXX
netlink messages and, if possible, to drive this without extending ndo_ops.
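
For illustration, a PF_BRIDGE:RTM_NEWNEIGH request carrying a single FDB
entry might be laid out as in the sketch below. It only packs the bytes and
sends nothing. The numeric constants are the values from the Linux uapi
headers; the helper names, the ifindex passed in, and the choice of NUD_NOARP
to mean a "static" entry are assumptions made for the example, not something
settled in this thread.

```python
import struct

# Values from <linux/netlink.h>, <linux/rtnetlink.h>, <linux/neighbour.h>
# and <linux/socket.h>.
RTM_NEWNEIGH = 28
PF_BRIDGE = 7
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NDA_LLADDR = 2
NUD_NOARP = 0x40  # assumption: used here to mark the entry as "static"

NLMSG_HDRLEN = 16  # sizeof(struct nlmsghdr)


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes (hypothetical helper)."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError("expected a 6-octet MAC address: %r" % mac)
    return bytes(int(p, 16) for p in parts)


def build_fdb_add(mac, ifindex, seq=1):
    """Pack nlmsghdr + ndmsg + one NDA_LLADDR attribute (hypothetical helper)."""
    lladdr = mac_to_bytes(mac)

    # struct rtattr { u16 rta_len; u16 rta_type; } followed by the payload,
    # padded to a 4-byte boundary. rta_len excludes the padding.
    attr = struct.pack("=HH", 4 + len(lladdr), NDA_LLADDR) + lladdr
    attr += b"\0" * (-len(attr) % 4)

    # struct ndmsg { u8 family; u8 pad1; u16 pad2; s32 ifindex;
    #                u16 state; u8 flags; u8 type; }
    ndm = struct.pack("=BBHiHBB", PF_BRIDGE, 0, 0, ifindex, NUD_NOARP, 0, 0)

    # struct nlmsghdr { u32 len; u16 type; u16 flags; u32 seq; u32 pid; }
    flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL
    total = NLMSG_HDRLEN + len(ndm) + len(attr)
    hdr = struct.pack("=IHHII", total, RTM_NEWNEIGH, flags, seq, 0)
    return hdr + ndm + attr
```

Calling build_fdb_add("52:e5:62:7b:57:88", ifindex) with the ifindex of the
target port gives the request body. The same layout would apply whether the
port belongs to a SW bridge or to an embedded bridge, which is the point of
reusing the existing message type.
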
An ideal user space interaction IMHO would look like,
[root@jf-dev1-dcblab iproute2]# ./br/br fdb add 52:e5:62:7b:57:88 dev veth10
[root@jf-dev1-dcblab iproute2]# ./br/br fdb
port mac addr flags
veth2 36:a6:35:9b:96:c4 local
veth4 aa:54:b0:7b:42:ef local
veth0 2a:e8:5c:95:6c:1b local
veth6 6e:26:d5:43:a3:36 local
veth0 f2:c1:39:76:6a:fb
veth8 4e:35:16:af:87:13 local
veth10 52:e5:62:7b:57:88 static
veth10 aa:a9:35:21:15:c4 local
[root@jf-dev1-dcblab iproute2]# ./br/br fdb add dev eth3 to 52:e5:62:7b:57:88
RTNETLINK answers: Invalid argument

This is using Stephen's br tool. The first command adds an FDB entry to the SW
bridge. If the same tool could also be used to add entries to the embedded
bridge, I think that would be the best case, so there would be no RTNETLINK
error on the second command. Embedded FDB entries could then be dumped the
same way, so I get a complete view of my FDB setup across multiple SW bridges
and embedded bridges.

I don't think br is part of iproute2 yet; I just pulled it out of some RFC,
but it works reasonably well and is intuitive enough.
>
>> Yes it should flood here, unless it's acting as a 802.1Qbg VEB or VEPA.
>
> Ok. So there is a toggle somewhere which controls how flooding should
> happen.
>
Yes. The hardware has a bit to support this, which is currently not exposed
to user space. That's a case where we have 'yet another knob' that needs
a clean solution. This causes real bugs today when users try to use
macvlan devices in VEPA mode on top of SR-IOV. By the way, these modes are
all part of the 802.1Qbg spec, which people actually want to use with Linux,
so a good clean solution is probably needed.
>>
>> Maybe not. But the kernel already has the needed signals with one extra
>> hook we can save running a daemon in user space. Maybe that's not a great
>> argument to add kernel code though.
>
> You make a reasonable argument to have it in the kernel but i think we
> win more if we separate the control. So while i empathize, I am hoping
> that you'd go with the path that is hard to travel ;->
>
>> The PF_BRIDGE:RTM_GETNEIGH,RTM_NEWNEIGH,RTM_DELNEIGH are registered in the
>> br_netlink_init() path.
>
> Hrm - hadn't paid attention to that before. Nasty.
> The bridge seems to be hard-coding policy on station movement, no?
> This is a good example of the qualms i have on adding things to the
> kernel;->
> I may not want to auto update a MAC address moving ports as part of
> some policy i have. I can go and add YAK (Yet Another Knob) - but where
> is the line drawn?
>

I have no problem with drawing the line here and trying to implement something
over PF_BRIDGE:RTM_XXX nlmsgs. I'll work with Roopa and see if we can come
up with something in the next couple of days.

W.r.t. VEPA/VEB and flooding behavior, we could probably have a bit to indicate
whether the port is a flooding port or not. Then users could build any sort of
forwarding table they wanted, OR we could just drive it through a notifier
(ndo_ops?) in the macvlan path, which does VEPA today.
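
To make the per-port flooding bit concrete, here is a toy model of the
forwarding decision it implies. It is only a sketch: the function name, the
port names, and the dict/set representation are made up for the example and
say nothing about how the hardware or the bridge code stores this state.

```python
def egress_ports(fdb, flood_ports, ingress, dst_mac):
    """Return the set of ports a frame would be sent to (toy model).

    fdb         -- dict mapping destination MAC -> port name
    flood_ports -- set of ports that have the "flooding" bit set
    ingress     -- port the frame arrived on
    dst_mac     -- destination MAC of the frame
    """
    port = fdb.get(dst_mac)
    if port is not None:
        # Known destination: unicast to that port, but never send the
        # frame back out of the port it arrived on.
        return set() if port == ingress else {port}
    # Unknown destination: flood, but only to ports marked as flooding
    # ports, again excluding the ingress port.
    return {p for p in flood_ports if p != ingress}
```

With every port marked as a flooding port this behaves like an ordinary
learning bridge on an FDB miss. Marking only the uplink as a flooding port
keeps unknown-destination traffic off the other ports, which is the kind of
policy a user could then build on top of the ADD/DEL/GET FDB controls.
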
OK I'll try to write some actual code now that can be critiqued.
> cheers,
> jamal
>
>