From: John Fastabend <john.r.fastabend@intel.com>
To: jhs@mojatatu.com
Cc: jamal <hadi@cyberus.ca>,
Stephen Hemminger <shemminger@vyatta.com>,
bhutchings@solarflare.com, roprabhu@cisco.com,
netdev@vger.kernel.org, mst@redhat.com, chrisw@redhat.com,
davem@davemloft.net, gregory.v.rose@intel.com,
kvm@vger.kernel.org, sri@us.ibm.com
Subject: Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware
Date: Thu, 09 Feb 2012 20:14:52 -0800 [thread overview]
Message-ID: <4F3499BC.8020609@intel.com> (raw)
In-Reply-To: <4F347D96.2020806@intel.com>
On 2/9/2012 6:14 PM, John Fastabend wrote:
> On 2/9/2012 1:11 PM, jamal wrote:
>> On Thu, 2012-02-09 at 09:52 -0800, John Fastabend wrote:
>>
>>>>> By netlink_notifier do you mean adding a notifier_block and using atomic_notifier_call_chain()
>>>>> probably in rtnl_notify()? Then drivers could register with the notifier chain with
>>>>> atomic_notifier_chain_register() and receive the events correctly. Or did I miss
>>>>> some notifier chain that already exists?
>>>>
>>>> Yes. that is what I mean. The callbacks you need may or may not already be present.
>>
>> I'll go one step further.
>> This stuff shouldnt be in the kernel at all.
>> The disadvantage is you need a user space app to update the hardware.
>> i.e, the same mechanism should be usable for either a switch embedded
>> in a NIC or a standalone hardware switch (with/out the s/ware bridge
>> presence)
>>
>> cheers,
>> jamal
>>
>
> Hi Jamal,
>
> The user space app in this case would listen for FDB updates to the SW
> bridge and then mirror them at the embedded NIC. In this case it seems
> easier to just add a notifier chain and let the kernel keep these in
> sync. Otherwise we need a daemon in user space to replicate these.
>
> On the other hand if you could make the same RTM_NEWNEIGH, RTM_DELNEIGH,
> and RTM_GETNEIGH work for the bridge, embedded bridge, and macvlan you
> would have one common interface to drive these. But the bridge already
> has this protocol/msgtype so that would require either some demux or
> new protocol/msgtype pairs to be created.
>
> Let me think on it. I'm tempted by the simplicity of adding notifier
> hooks though.
>
> .John
>
Actually because the bridge is adding/removing fdb entries dynamically
maybe its best this gets done in kernel. Here's the example case,
---------- ---------
| ethx.y | <---- E | veth0 | <--- A
---------- ---------
| |
| |
| |
| --------------
| | SW Bridge | <--- B
| --------------
| |
| |
| ---------
| | eth0 | <--- C
| ---------
| |
-----------------------------------
| embedded switch | <--- D
-----------------------------------
|
|
G
With the flow by letters above hope this is not too difficult to follow.
(A) veth0 a virtual device transmits packet destined for ethx.y
(B) SW bridge receives frames and updates FDB flooding to C
(C) eth0 the PF in this case sends the frame to the HW backed by the
embedded bridge
(D) The HW embedded switch has a static entry for ethx.y and forwards
the frame to the VF or if its a broadcast frame also floods it to
the wire and ethx.y
(E) ethx.y receives the frame and generates a response to the dest mac of
veth0
Now here is the potential issue,
(G) The frame transmitted from ethx.y with the destination address of
veth0 but the embedded switch is not a learning switch. If the FDB
update is done in user space its possible (likely?) that the FDB
entry for veth0 has not been added to the embedded switch yet. Now
we either have to flood the frame which is not horrible but not
ideal or worse if the embedded switch does not support flooding send
it to the wire and veth0 never receives it. If the SW bridge pushes
the FDB update down into the embedded switch the address is for
sure in the embedded switches forwarding tables and the switching
works as expected.
So to handle this case correctly its probably best IMHO to use a notifier
hook. Having a RTM_GETNEIGH for the embedded switch implemented though
would be nice for dumping the FDB of the embedded switch and SET/DEL
could be used to configure the FDB when its not being driven by the SW
switch. Of course we should try to be minimalists here.
.John
next prev parent reply other threads:[~2012-02-10 4:14 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-09 3:22 [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware John Fastabend
2012-02-09 3:22 ` [RFC PATCH v0 2/2] ixgbe: add NETIF_F_HW_FDB to supported flags John Fastabend
2012-02-09 4:36 ` [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware Stephen Hemminger
2012-02-09 17:36 ` John Fastabend
2012-02-09 17:40 ` Stephen Hemminger
2012-02-09 17:52 ` John Fastabend
2012-02-09 21:11 ` jamal
2012-02-10 2:14 ` John Fastabend
2012-02-10 4:14 ` John Fastabend [this message]
2012-02-10 15:18 ` jamal
2012-02-10 16:39 ` Stephen Hemminger
2012-02-13 13:54 ` jamal
2012-02-13 15:13 ` John Fastabend
2012-02-14 13:18 ` jamal
2012-02-14 18:57 ` John Fastabend
2012-02-14 19:05 ` Stephen Hemminger
2012-02-14 19:08 ` John Fastabend
2012-02-15 14:10 ` Jamal Hadi Salim
2012-02-16 1:26 ` John Fastabend
2012-02-17 14:28 ` jamal
2012-02-17 17:10 ` John Fastabend
2012-02-18 12:41 ` jamal
2012-02-29 4:40 ` John Fastabend
2012-02-29 5:14 ` John Fastabend
2012-02-29 13:57 ` Jamal Hadi Salim
2012-02-29 13:56 ` Jamal Hadi Salim
2012-02-29 17:25 ` John Fastabend
2012-02-29 17:52 ` Stephen Hemminger
2012-02-29 18:19 ` John Fastabend
2012-03-01 13:36 ` Jamal Hadi Salim
2012-03-01 22:17 ` John Fastabend
2012-03-02 13:20 ` jamal
2012-03-05 17:00 ` Lennert Buytenhek
2012-03-01 13:24 ` Jamal Hadi Salim
2012-03-01 14:14 ` Michael S. Tsirkin
2012-03-01 22:10 ` John Fastabend
2012-03-05 16:53 ` Lennert Buytenhek
2012-03-06 3:45 ` John Fastabend
2012-03-06 14:15 ` Lennert Buytenhek
2012-03-06 13:42 ` jamal
2012-03-06 14:09 ` Lennert Buytenhek
2012-03-07 14:11 ` Jamal Hadi Salim
2012-03-12 8:48 ` Lennert Buytenhek
2012-03-13 13:52 ` Jamal Hadi Salim
2012-02-16 3:58 ` Ben Hutchings
2012-02-16 19:18 ` Shradha Shah
2012-02-17 14:37 ` jamal
2012-02-10 13:45 ` Roopa Prabhu
2012-02-09 18:14 ` Sridhar Samudrala
2012-02-09 20:30 ` John Fastabend
2012-02-10 0:39 ` Sridhar Samudrala
2012-02-10 0:51 ` John Fastabend
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F3499BC.8020609@intel.com \
--to=john.r.fastabend@intel.com \
--cc=bhutchings@solarflare.com \
--cc=chrisw@redhat.com \
--cc=davem@davemloft.net \
--cc=gregory.v.rose@intel.com \
--cc=hadi@cyberus.ca \
--cc=jhs@mojatatu.com \
--cc=kvm@vger.kernel.org \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=roprabhu@cisco.com \
--cc=shemminger@vyatta.com \
--cc=sri@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).