* HW bridging support using notifiers?
From: Florian Fainelli @ 2014-10-03 1:48 UTC
To: netdev
Cc: davem, jiri, stephen, andy, tgraf, nbd, john.r.fastabend, edumazet, vyasevic, buytenh, sfeldma

Hi all,

I am taking a look at adding HW bridging support to DSA, in a way that's
usable outside of DSA.

Lennert's approach in 2008 [1] looks conceptually good to me. As he
noted, it uses a bunch of new ndo's, which is not only limiting to one
ndo implementer per struct net_device, but also is mostly consuming the
information from the bridge layer, while the ndo is an action.

So here's what I am up to now:

- use the NETDEV_JOIN notifier to discover when a bridge port is added
- use the NETDEV_LEAVE notifier, still need to verify this does not
break netconsole as indicated in net/bridge/br_if.c
- use the NETDEV_CHANGEINFODATA notifier to notify about STP state changes

Now, this raises a bunch of questions:

- we would need a getter to return the stp state of a given network
device when called with NETDEV_CHANGEINFODATA, is that acceptable? This
would be the first function exported by the bridge layer to expose
internal data

NB: this also raises the question of the race condition and locking
within br_set_stp_state() and when the network devices notifier callback
runs

- or do we need a new network device notifier accepting an opaque
pointer which could provide us with the data we want, something like
this: call_netdevices_notifier_data(NETDEV_CHANGEINFODATA, dev, info),
where info would be a structure/union telling what this data is about

Let me know what you think, thanks!

[1]: http://patchwork.ozlabs.org/patch/16578/
--
Florian
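To make the second question concrete, here is a minimal userspace sketch of a notifier call that carries a typed payload, in the spirit of the proposed call_netdevices_notifier_data(). Everything in it is a hypothetical stand-in: `notify_with_data`, `struct change_info`, `struct mock_netdev`, and the constant values are invented for illustration and are not kernel definitions.

```c
#include <stddef.h>

/* Hypothetical event code standing in for NETDEV_CHANGEINFODATA. */
enum { EV_CHANGEINFODATA = 1 };

/* Hypothetical discriminator: tells the listener what the payload describes. */
enum info_kind { INFO_NONE = 0, INFO_STP_STATE };

/* Tagged union: "a structure/union telling what this data is about". */
struct change_info {
	enum info_kind kind;
	union {
		int stp_state;
	} u;
};

/* Mock device; hw_stp_state is what the pretend hardware was last told. */
struct mock_netdev {
	const char *name;
	int hw_stp_state;
};

typedef int (*notifier_fn)(unsigned long event, struct mock_netdev *dev,
			   const struct change_info *info);

/* Listener: a switch driver that only reacts to STP state payloads. */
int switch_driver_event(unsigned long event, struct mock_netdev *dev,
			const struct change_info *info)
{
	if (event != EV_CHANGEINFODATA || info == NULL)
		return 0;
	if (info->kind == INFO_STP_STATE)
		dev->hw_stp_state = info->u.stp_state;
	return 0;
}

/* Stand-in for the proposed call: walk the chain, stop on the first error. */
int notify_with_data(const notifier_fn *chain, size_t n, unsigned long event,
		     struct mock_netdev *dev, const struct change_info *info)
{
	for (size_t i = 0; i < n; i++) {
		int ret = chain[i](event, dev, info);

		if (ret)
			return ret;
	}
	return 0;
}
```

With this shape the listener never calls back into the bridge to fetch the state, which sidesteps the getter question; whether it also helps with the locking concern around br_set_stp_state() would depend on where the real call is made.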
* Re: HW bridging support using notifiers?
From: Scott Feldman @ 2014-10-03 5:13 UTC
To: Florian Fainelli
Cc: netdev, davem, jiri, stephen, andy, tgraf, nbd, john.r.fastabend, edumazet, vyasevic, buytenh

On Oct 2, 2014, at 6:48 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:

> Hi all,
>
> I am taking a look at adding HW bridging support to DSA, in a way that's
> usable outside of DSA.
>
> Lennert's approach in 2008 [1] looks conceptually good to me. As he
> noted, it uses a bunch of new ndo's, which is not only limiting to one
> ndo implementer per struct net_device, but also is mostly consuming the
> information from the bridge layer, while the ndo is an action.
>
> So here's what I am up to now:
>
> - use the NETDEV_JOIN notifier to discover when a bridge port is added
> - use the NETDEV_LEAVE notifier, still need to verify this does not
> break netconsole as indicated in net/bridge/br_if.c

I can’t find a NETDEV_LEAVE...is this new?

For this, rocker is using netdev_notifier and watching for NETDEV_CHANGEUPPER. On CHANGEUPPER, use netdev_master_upper_dev_get(dev) to get master. If master is set and master's rtnl_link_ops->kind is “bridge”, then dev port is in bridge; otherwise port is not in bridge. Of course, master could be “openvswitch”, if the driver swings both ways.

Would this approach work for DSA?

> - use the NETDEV_CHANGEINFODATA notifier to notify about STP state changes
>
> Now, this raises a bunch of questions:
>
> - we would need a getter to return the stp state of a given network
> device when called with NETDEV_CHANGEINFODATA, is that acceptable? This
> would be the first function exported by the bridge layer to expose
> internal data
>
> NB: this also raises the question of the race condition and locking
> within br_set_stp_state() and when the network devices notifier callback
> runs
>
> - or do we need a new network device notifier accepting an opaque
> pointer which could provide us with the data we want, something like
> this: call_netdevices_notifier_data(NETDEV_CHANGEINFODATA, dev, info),
> where info would be a structure/union telling what this data is about
>

We need STP state change notification for rocker also, so we can install/remove STP/ARP filters on port state change. Netdev_notifier would work. I was also thinking about using Jiri’s ndo_swdev_flow_install/remove to install/remove the STP filters on the port, rather than using netdev_notifier. In other words, does the HW bridge driver need to know the STP state or can it be dumb (stateless) and told when to accept STP BPDUs, or not, using swdev_flow construct:

    dst_mac 01:80:c2:00:00:00 lsap 0x4242 in_port <port ifindex> actions output <br ifindex>

and later when reaching LEARNING state:

    dst_mac ff:ff:ff:ff:ff:ff eth_type ARP in_port <port ifindex> actions output <br ifindex>

and finally when reaching FORWARDING state, the learned/static bridge fdbs:

    dst_mac <neigh_mac> in_port <br ifindex> actions output <port ifindex>

So the driver doesn’t really know what STP is; it just installs/removes port filters when told to, using the common ndo_swdev_flow API. The smarts stay in the bridge driver.

> Let me know what you think, thanks!
>
> [1]: http://patchwork.ozlabs.org/patch/16578/
> --
> Florian

-scott
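The CHANGEUPPER check described in this message can be sketched in userspace C as follows. The structures are minimal mocks that only carry the fields the check reads; `struct upper_dev`, `struct mock_link_ops`, `enum port_upper` and `classify_upper` are hypothetical names, and the `master` field stands in for what netdev_master_upper_dev_get() would return in the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Mock of the one rtnl_link_ops field the check looks at. */
struct mock_link_ops {
	const char *kind;
};

/* Mock net_device; 'master' models the result of the master lookup. */
struct upper_dev {
	const char *name;
	const struct mock_link_ops *rtnl_link_ops;
	struct upper_dev *master;
};

enum port_upper { UPPER_NONE, UPPER_BRIDGE, UPPER_OTHER };

/*
 * Classify a port after a CHANGEUPPER-style event: no master means the
 * port is standalone, a master of kind "bridge" means it is bridged, and
 * any other master (for example "openvswitch") is reported separately.
 */
enum port_upper classify_upper(const struct upper_dev *dev)
{
	const struct upper_dev *master = dev->master;

	if (master == NULL)
		return UPPER_NONE;
	if (master->rtnl_link_ops != NULL &&
	    master->rtnl_link_ops->kind != NULL &&
	    strcmp(master->rtnl_link_ops->kind, "bridge") == 0)
		return UPPER_BRIDGE;
	return UPPER_OTHER;
}
```

A driver that also offloads Open vSwitch would branch on the UPPER_OTHER case by comparing the kind string again; that part is left out of the sketch.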
* Re: HW bridging support using notifiers?
From: Jiri Pirko @ 2014-10-03 7:53 UTC
To: Scott Feldman
Cc: Florian Fainelli, netdev, davem, stephen, andy, tgraf, nbd, john.r.fastabend, edumazet, vyasevic, buytenh

Fri, Oct 03, 2014 at 07:13:35AM CEST, sfeldma@cumulusnetworks.com wrote:
>
>On Oct 2, 2014, at 6:48 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>> Hi all,
>>
>> I am taking a look at adding HW bridging support to DSA, in a way that's
>> usable outside of DSA.
>>
>> Lennert's approach in 2008 [1] looks conceptually good to me. As he
>> noted, it uses a bunch of new ndo's, which is not only limiting to one
>> ndo implementer per struct net_device, but also is mostly consuming the
>> information from the bridge layer, while the ndo is an action.
>>
>> So here's what I am up to now:
>>
>> - use the NETDEV_JOIN notifier to discover when a bridge port is added
>> - use the NETDEV_LEAVE notifier, still need to verify this does not
>> break netconsole as indicated in net/bridge/br_if.c
>
>I can’t find a NETDEV_LEAVE...is this new?
>
>For this, rocker is using netdev_notifier and watching for NETDEV_CHANGEUPPER. On CHANGEUPPER, use netdev_master_upper_dev_get(dev) to get master. If master is set and master's rtnl_link_ops->kind is “bridge”, then dev port is in bridge; otherwise port is not in bridge. Of course, master could be “openvswitch”, if the driver swings both ways.

I agree that NETDEV_CHANGEUPPER should be used for this.

>
>Would this approach work for DSA?
>
>> - use the NETDEV_CHANGEINFODATA notifier to notify about STP state changes
>>
>> Now, this raises a bunch of questions:
>>
>> - we would need a getter to return the stp state of a given network
>> device when called with NETDEV_CHANGEINFODATA, is that acceptable? This
>> would be the first function exported by the bridge layer to expose
>> internal data
>>
>> NB: this also raises the question of the race condition and locking
>> within br_set_stp_state() and when the network devices notifier callback
>> runs
>>
>> - or do we need a new network device notifier accepting an opaque
>> pointer which could provide us with the data we want, something like
>> this: call_netdevices_notifier_data(NETDEV_CHANGEINFODATA, dev, info),
>> where info would be a structure/union telling what this data is about
>>
>
>We need STP state change notification for rocker also, so we can install/remove STP/ARP filters on port state change. Netdev_notifier would work. I was also thinking about using Jiri’s ndo_swdev_flow_install/remove to install/remove the STP filters on the port, rather than using netdev_notifier. In other words, does the HW bridge driver need to know the STP state or can it be dumb (stateless) and told when to accept STP BPDUs, or not, using swdev_flow construct:
>
>    dst_mac 01:80:c2:00:00:00 lsap 0x4242 in_port <port ifindex> actions output <br ifindex>
>
>and later when reaching LEARNING state:
>
>    dst_mac ff:ff:ff:ff:ff:ff eth_type ARP in_port <port ifindex> actions output <br ifindex>
>
>and finally when reaching FORWARDING state, the learned/static bridge fdbs:
>
>    dst_mac <neigh_mac> in_port <br ifindex> actions output <port ifindex>
>
>So the driver doesn’t really know what STP is; it just installs/removes port filters when told to, using the common ndo_swdev_flow API. The smarts stay in the bridge driver.
>
>
>> Let me know what you think, thanks!
>>
>> [1]: http://patchwork.ozlabs.org/patch/16578/
>> --
>> Florian
>
>
>-scott
* Re: HW bridging support using notifiers?
From: Benjamin LaHaise @ 2014-10-03 14:22 UTC
To: Florian Fainelli
Cc: netdev, davem, jiri, stephen, andy, tgraf, nbd, john.r.fastabend, edumazet, vyasevic, buytenh, sfeldma, Jamal Hadi Salim

Hi Florian et al,

On Thu, Oct 02, 2014 at 06:48:57PM -0700, Florian Fainelli wrote:
> Hi all,
>
> I am taking a look at adding HW bridging support to DSA, in a way that's
> usable outside of DSA.

I've been working on support for the RTL8366S switch, and our work is
directly overlapping here. I actually have something that is working at
configuring port and tag based vlans on the RTL8366S. I'll try to clean
up the code to post something for discussion over the next couple of days.

> Lennert's approach in 2008 [1] looks conceptually good to me. As he
> noted, it uses a bunch of new ndo's, which is not only limiting to one
> ndo implementer per struct net_device, but also is mostly consuming the
> information from the bridge layer, while the ndo is an action.

I think having ndo implementer methods for hardware switch offloads makes
more sense. Such a scheme is needed in order to implement the stacking of
devices that is required in order to transparently handle configuration of
vlans on switch ports where the 8021q device has to pass on the vlan tag
to the switch device. The ndo methods do perform an action of causing the
switch to be configured to match the bridge config. Additionally, they
can be used to veto changes that cannot be offloaded to hardware -- this
(configurable) behaviour is desired by some users of these APIs who wish
to be made aware when a particular configuration is not supported by the
underlying hardware.

> So here's what I am up to now:
>
> - use the NETDEV_JOIN notifier to discover when a bridge port is added
> - use the NETDEV_LEAVE notifier, still need to verify this does not
> break netconsole as indicated in net/bridge/br_if.c
> - use the NETDEV_CHANGEINFODATA notifier to notify about STP state changes

To me, notifiers are the wrong model for join and leave. Implementing
stacking on top of notifiers is somewhat more complicated. Here are the
ndo methods I've implemented so far which are sufficient for basic config
of the RTL8366S. They're fairly similar to those in [1].

+	int	(*ndo_join_bridge)(struct net_bridge *bridge,
+				   struct net_device *dev,
+				   int *switch_nr,
+				   int *switch_port_nr,
+				   int vlan);
+	int	(*ndo_leave_bridge)(struct net_bridge *bridge,
+				    struct net_device *dev,
+				    int switch_nr,
+				    int switch_port_nr,
+				    int vlan);
+	int	(*ndo_flood_xmit)(struct switch_info *dev,
+				  struct sk_buff *skb,
+				  u64 port_mask);

There are a couple of important points here. In the case of joining and
leaving a bridge, the bridge needs to be provided with information it can
use to identify switch ports. This is needed in order to offload the
flooding of packets to multiple ports, as otherwise the Linux bridge code
doesn't have any way to figure out which packets can be merged into a
single transmit via the ndo_flood_xmit() method.

> Now, this raises a bunch of questions:
>
> - we would need a getter to return the stp state of a given network
> device when called with NETDEV_CHANGEINFODATA, is that acceptable? This
> would be the first function exported by the bridge layer to expose
> internal data

I have yet to dig into STP, so I'll refrain from commenting on these parts
for now.

> NB: this also raises the question of the race condition and locking
> within br_set_stp_state() and when the network devices notifier callback
> runs
>
> - or do we need a new network device notifier accepting an opaque
> pointer which could provide us with the data we want, something like
> this: call_netdevices_notifier_data(NETDEV_CHANGEINFODATA, dev, info),
> where info would be a structure/union telling what this data is about
>
> Let me know what you think, thanks!
>
> [1]: http://patchwork.ozlabs.org/patch/16578/

Thanks for the pointer to this. Cheers!

		-ben

> --
> Florian

--
"Thought is the essence of where you are now."
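A userspace mock may help show how the join and leave hooks in this message hand switch identifiers back to the bridge. The parameter lists follow the ndo_join_bridge()/ndo_leave_bridge() prototypes quoted in the message; the mock types, the `mock_` function names and the specific error codes are assumptions made for this sketch, not part of the posted patch.

```c
#include <errno.h>
#include <stddef.h>

/* Mock bridge and port objects; only what the sketch needs. */
struct mock_bridge {
	int id;
};

struct mock_port {
	int switch_nr;
	int switch_port_nr;
	const struct mock_bridge *bridge;	/* NULL while not bridged */
};

/*
 * Join: record the membership and report the switch and port numbers
 * through the out-parameters so the caller can later group ports that
 * sit on the same switch. Refusing with -EBUSY when the port already
 * belongs to another bridge is an assumption of this sketch.
 */
int mock_join_bridge(struct mock_bridge *br, struct mock_port *port,
		     int *switch_nr, int *switch_port_nr, int vlan)
{
	(void)vlan;	/* VLAN handling is out of scope here */

	if (port->bridge != NULL && port->bridge != br)
		return -EBUSY;
	port->bridge = br;
	*switch_nr = port->switch_nr;
	*switch_port_nr = port->switch_port_nr;
	return 0;
}

/* Leave: the caller passes back the identifiers it was given on join. */
int mock_leave_bridge(struct mock_bridge *br, struct mock_port *port,
		      int switch_nr, int switch_port_nr, int vlan)
{
	(void)vlan;

	if (port->bridge != br || switch_nr != port->switch_nr ||
	    switch_port_nr != port->switch_port_nr)
		return -EINVAL;
	port->bridge = NULL;
	return 0;
}
```

Because both hooks return an int, a driver can refuse a request it cannot offload, which is the veto behaviour the message argues notifiers make harder to express.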
* Re: HW bridging support using notifiers?
From: Florian Fainelli @ 2014-10-03 19:06 UTC
To: Benjamin LaHaise
Cc: netdev, davem, jiri, stephen, andy, tgraf, nbd, john.r.fastabend, edumazet, vyasevic, buytenh, sfeldma, Jamal Hadi Salim

Hi Benjamin,

On 10/03/2014 07:22 AM, Benjamin LaHaise wrote:
> Hi Florian et al,
>
> On Thu, Oct 02, 2014 at 06:48:57PM -0700, Florian Fainelli wrote:
>> Hi all,
>>
>> I am taking a look at adding HW bridging support to DSA, in a way that's
>> usable outside of DSA.
>
> I've been working on support for the RTL8366S switch, and our work is
> directly overlapping here. I actually have something that is working at
> configuring port and tag based vlans on the RTL8366S. I'll try to clean
> up the code to post something for discussion over the next couple of days.

Cool, please do!

>
>> Lennert's approach in 2008 [1] looks conceptually good to me. As he
>> noted, it uses a bunch of new ndo's, which is not only limiting to one
>> ndo implementer per struct net_device, but also is mostly consuming the
>> information from the bridge layer, while the ndo is an action.
>
> I think having ndo implementer methods for hardware switch offloads makes
> more sense. Such a scheme is needed in order to implement the stacking of
> devices that is required in order to transparently handle configuration of
> vlans on switch ports where the 8021q device has to pass on the vlan tag
> to the switch device. The ndo methods do perform an action of causing the
> switch to be configured to match the bridge config. Additionally, they
> can be used to veto changes that cannot be offloaded to hardware -- this
> (configurable) behaviour is desired by some users of these APIs who wish
> to be made aware when a particular configuration is not supported by the
> underlying hardware.

Humm, that's a fair point, so not only would we want new NDOs, but we'd
also need to specify the return values (invalid, no space etc...).

As far as bridging alone is concerned (not including VLANs for now), I
don't think there are restrictions in terms of what the hardware can do,
since we mostly tell it to "group" N-ports together.

For VLANs, there should be a way for the switch driver to tell whether
that's supported or not.

>
>> So here's what I am up to now:
>>
>> - use the NETDEV_JOIN notifier to discover when a bridge port is added
>> - use the NETDEV_LEAVE notifier, still need to verify this does not
>> break netconsole as indicated in net/bridge/br_if.c
>> - use the NETDEV_CHANGEINFODATA notifier to notify about STP state changes
>
> To me, notifiers are the wrong model for join and leave. Implementing
> stacking on top of notifiers is somewhat more complicated. Here are the
> ndo methods I've implemented so far which are sufficient for basic config
> of the RTL8366S. They're fairly similar to those in [1].
>
> +	int	(*ndo_join_bridge)(struct net_bridge *bridge,
> +				   struct net_device *dev,
> +				   int *switch_nr,
> +				   int *switch_port_nr,
> +				   int vlan);
> +	int	(*ndo_leave_bridge)(struct net_bridge *bridge,
> +				    struct net_device *dev,
> +				    int switch_nr,
> +				    int switch_port_nr,
> +				    int vlan);
> +	int	(*ndo_flood_xmit)(struct switch_info *dev,
> +				  struct sk_buff *skb,
> +				  u64 port_mask);

I don't think the switch_port_nr belongs here; this is something that
should be resolved within the implementer of these ndo's, whether that
is DSA, or Jiri's switchdev, since the net_device argument should be
linked to both the switch port number, and the switch number.

>
> There are a couple of important points here. In the case of joining and
> leaving a bridge, the bridge needs to be provided with information it can
> use to identify switch ports. This is needed in order to offload the
> flooding of packets to multiple ports, as otherwise the Linux bridge code
> doesn't have any way to figure out which packets can be merged into a
> single transmit via the ndo_flood_xmit() method.

I am not exactly sure yet how ndo_flood_xmit() fits in the picture here,
but it might be optional based on how the switch has been configured, I
presume?

>
>> Now, this raises a bunch of questions:
>>
>> - we would need a getter to return the stp state of a given network
>> device when called with NETDEV_CHANGEINFODATA, is that acceptable? This
>> would be the first function exported by the bridge layer to expose
>> internal data
>
> I have yet to dig into STP, so I'll refrain from commenting on these parts
> for now.

The idea is to use the Linux STP result and apply it to the bridge
port/switch port members directly, since switches (at least those from
Marvell and Broadcom) have that semantic built into their hardware logic.

>
>> NB: this also raises the question of the race condition and locking
>> within br_set_stp_state() and when the network devices notifier callback
>> runs
>
>> - or do we need a new network device notifier accepting an opaque
>> pointer which could provide us with the data we want, something like
>> this: call_netdevices_notifier_data(NETDEV_CHANGEINFODATA, dev, info),
>> where info would be a structure/union telling what this data is about
>>
>> Let me know what you think, thanks!
>>
>> [1]: http://patchwork.ozlabs.org/patch/16578/
>
> Thanks for the pointer to this. Cheers!
>
> 		-ben
>
>> --
>> Florian
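The point about specifying return values ("invalid, no space etc...") can be illustrated with a small userspace mock of a VLAN offload hook. This is only a sketch under stated assumptions: `mock_vlan_add`, `struct mock_switch` and the table size of 4 are invented for the example, and the mapping of conditions to -EOPNOTSUPP, -EINVAL and -ENOSPC is one plausible convention, not something the thread settled on.

```c
#include <errno.h>
#include <stdbool.h>

#define MOCK_MAX_HW_VLANS 4	/* hypothetical hardware table size */

struct mock_switch {
	bool vlan_capable;		/* can the hardware offload VLANs at all? */
	int vlans[MOCK_MAX_HW_VLANS];	/* VIDs currently programmed */
	int nr_vlans;
};

/*
 * Try to offload one VLAN. Each refusal gets a distinct errno so the
 * caller can tell "not supported" from "bad request" from "table full".
 */
int mock_vlan_add(struct mock_switch *sw, int vid)
{
	if (!sw->vlan_capable)
		return -EOPNOTSUPP;
	if (vid < 1 || vid > 4094)	/* VIDs 0 and 4095 are reserved */
		return -EINVAL;
	for (int i = 0; i < sw->nr_vlans; i++)
		if (sw->vlans[i] == vid)
			return 0;	/* already programmed */
	if (sw->nr_vlans == MOCK_MAX_HW_VLANS)
		return -ENOSPC;
	sw->vlans[sw->nr_vlans++] = vid;
	return 0;
}
```

A caller could treat -EOPNOTSUPP as "fall back to software" and the other two as hard failures to report to the user, though that policy choice is also an assumption here.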
* Re: HW bridging support using notifiers?
From: Benjamin LaHaise @ 2014-10-03 19:42 UTC
To: Florian Fainelli
Cc: netdev, davem, jiri, stephen, andy, tgraf, nbd, john.r.fastabend, edumazet, vyasevic, buytenh, sfeldma, Jamal Hadi Salim

On Fri, Oct 03, 2014 at 12:06:44PM -0700, Florian Fainelli wrote:
> Hi Benjamin,
>
> On 10/03/2014 07:22 AM, Benjamin LaHaise wrote:
> > Hi Florian et al,
> >
> > On Thu, Oct 02, 2014 at 06:48:57PM -0700, Florian Fainelli wrote:
> >> Hi all,
> >>
> >> I am taking a look at adding HW bridging support to DSA, in a way that's
> >> usable outside of DSA.
> >
> > I've been working on support for the RTL8366S switch, and our work is
> > directly overlapping here. I actually have something that is working at
> > configuring port and tag based vlans on the RTL8366S. I'll try to clean
> > up the code to post something for discussion over the next couple of days.
>
> Cool, please do!
>
> >
> >> Lennert's approach in 2008 [1] looks conceptually good to me. As he
> >> noted, it uses a bunch of new ndo's, which is not only limiting to one
> >> ndo implementer per struct net_device, but also is mostly consuming the
> >> information from the bridge layer, while the ndo is an action.
> >
> > I think having ndo implementer methods for hardware switch offloads makes
> > more sense. Such a scheme is needed in order to implement the stacking of
> > devices that is required in order to transparently handle configuration of
> > vlans on switch ports where the 8021q device has to pass on the vlan tag
> > to the switch device. The ndo methods do perform an action of causing the
> > switch to be configured to match the bridge config. Additionally, they
> > can be used to veto changes that cannot be offloaded to hardware -- this
> > (configurable) behaviour is desired by some users of these APIs who wish
> > to be made aware when a particular configuration is not supported by the
> > underlying hardware.
>
> Humm, that's a fair point, so not only would we want new NDOs, but we'd
> also need to specify the return values (invalid, no space etc...).
>
> As far as bridging alone is concerned (not including VLANs for now), I
> don't think there are restrictions in terms of what the hardware can do,
> since we mostly tell it to "group" N-ports together.
>
> For VLANs, there should be a way for the switch driver to tell whether
> that's supported or not.

What the hardware can support varies widely. For example, the RTL8366S
happens to support a total of 8 FDBs in hardware, which, given how the Linux
bridge works, implies a total of at most 8 VLANs. However, it can use more
VLANs if they share overlapping FDBs (which Linux doesn't support). There
are also features like VLAN remapping, q-in-q support... We're going to
have to do a fair amount of work to learn about all these quirks of hardware
features that need to be identified and reported.

> >
> >> So here's what I am up to now:
> >>
> >> - use the NETDEV_JOIN notifier to discover when a bridge port is added
> >> - use the NETDEV_LEAVE notifier, still need to verify this does not
> >> break netconsole as indicated in net/bridge/br_if.c
> >> - use the NETDEV_CHANGEINFODATA notifier to notify about STP state changes
> >
> > To me, notifiers are the wrong model for join and leave. Implementing
> > stacking on top of notifiers is somewhat more complicated. Here are the
> > ndo methods I've implemented so far which are sufficient for basic config
> > of the RTL8366S. They're fairly similar to those in [1].
> >
> > +	int	(*ndo_join_bridge)(struct net_bridge *bridge,
> > +				   struct net_device *dev,
> > +				   int *switch_nr,
> > +				   int *switch_port_nr,
> > +				   int vlan);
> > +	int	(*ndo_leave_bridge)(struct net_bridge *bridge,
> > +				    struct net_device *dev,
> > +				    int switch_nr,
> > +				    int switch_port_nr,
> > +				    int vlan);
> > +	int	(*ndo_flood_xmit)(struct switch_info *dev,
> > +				  struct sk_buff *skb,
> > +				  u64 port_mask);
>
> I don't think the switch_port_nr belongs here; this is something that
> should be resolved within the implementer of these ndo's, whether that
> is DSA, or Jiri's switchdev, since the net_device argument should be
> linked to both the switch port number, and the switch number.

The switch_port_nr is absolutely required for flood offloading. (more below)

> >
> > There are a couple of important points here. In the case of joining and
> > leaving a bridge, the bridge needs to be provided with information it can
> > use to identify switch ports. This is needed in order to offload the
> > flooding of packets to multiple ports, as otherwise the Linux bridge code
> > doesn't have any way to figure out which packets can be merged into a
> > single transmit via the ndo_flood_xmit() method.
>
> I am not exactly sure yet how ndo_flood_xmit() fits in the picture here,
> but it might be optional based on how the switch has been configured, I
> presume?

ndo_flood_xmit() is a method that sends a single packet to a bitmask of the
ports attached to the switch. This is quite useful for saving bandwidth on
the CPU port of a switch when sending out broadcast packets, and, more
importantly, multicast packets. The bits in that bitmask correspond to the
switch_port_nr reported by ndo_join_bridge(). I modified the Linux bridge
code to group ports attached to the same switch together, using the
switch_nr to identify that ports are on the same switch, and to collapse
flooding to multiple ports into a single call of ndo_flood_xmit().

The RTL8366S has support for this feature (that's why I implemented it), and
I'm pretty sure other switches do as well -- at the very least I know that
one of the Marvell switches I was exposed to in the past had this
capability, but I don't recall the precise details of the interface since I
wasn't directly involved in the coding for that driver. I'm sure there are
other hardware features we'll have to come up with a model for. Cheers,

		-ben
--
"Thought is the essence of where you are now."
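The grouping step described in this message, collapsing a flood to several ports on one switch into a single ndo_flood_xmit() call with a port bitmask, can be sketched in userspace C. The names `struct flood_port`, `build_flood_masks` and `MAX_SWITCHES` are hypothetical, and the limit of 8 switches is arbitrary; only the idea of one u64 mask per switch_nr, with bit positions taken from switch_port_nr, comes from the message.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SWITCHES 8	/* arbitrary bound for this sketch */

/* Identifiers a port would have reported when it joined the bridge. */
struct flood_port {
	int switch_nr;
	int switch_port_nr;
};

/*
 * Fold a list of destination ports into one 64-bit mask per switch.
 * Returns how many switches ended up with a non-empty mask, i.e. how
 * many flood transmit calls would be issued instead of one per port.
 * Ports with out-of-range identifiers are skipped; a real caller would
 * transmit to those individually.
 */
int build_flood_masks(const struct flood_port *ports, size_t n,
		      uint64_t masks[MAX_SWITCHES])
{
	int calls = 0;

	for (size_t i = 0; i < MAX_SWITCHES; i++)
		masks[i] = 0;

	for (size_t i = 0; i < n; i++) {
		const struct flood_port *p = &ports[i];

		if (p->switch_nr < 0 || p->switch_nr >= MAX_SWITCHES ||
		    p->switch_port_nr < 0 || p->switch_port_nr >= 64)
			continue;
		if (masks[p->switch_nr] == 0)
			calls++;
		masks[p->switch_nr] |= (uint64_t)1 << p->switch_port_nr;
	}
	return calls;
}
```

For four destination ports of which three sit on switch 0, the function reports two calls rather than four, which is where the saving on the CPU port would come from.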