* [RFC PATCH 0/4] switch device: offload policy attributes
@ 2014-11-21 22:49 roopa
2014-11-24 10:18 ` Scott Feldman
0 siblings, 1 reply; 9+ messages in thread
From: roopa @ 2014-11-21 22:49 UTC (permalink / raw)
To: jiri, sfeldma, jhs, bcrl, tgraf, john.fastabend, stephen,
linville, nhorman, nicolas.dichtel, vyasevic, f.fainelli, buytenh,
aviadr
Cc: netdev, davem, shrijeet, gospo, Roopa Prabhu
From: Roopa Prabhu <roopa@cumulusnetworks.com>
This series aims at introducing new policy attibutes/flags to enable
selective offloading of kernel network objects.
This is in the context of supporting switch devices in the linux kernel.
Assumption:
- All kernel network objects (routes, neighs, bridges, bonds, vxlans)
can be offloaded (This is true today with a few exceptions maybe)
policy points:
- By default all objects exist in software (kernel)
- Per object flag to add/del/show in kernel, hardware or both
- System global option to turn on/off offloads for all network objects.
This is for systems who want to turn offloading on for all network objects
by default. us :). Apps dont need to know about offloading in this
model. (TBD)
Patches are based on jiri/sfeldma's rocker work.
Apologize for the incomplete and untested code. This is a sample patch
to get some initial feedback.
Roopa Prabhu (4):
rtnetlink: new flag NLM_F_HW_OFFLOAD to indicate kernel object
offload to hardware
netdev: new feature flag NETIF_F_HW_OFFLOAD to indicate netdev object
offload to hardware
swdevice: new generic op to set bridge port attr
bridge: make hw offload conditional on bridge and bridge port offload
flags
include/linux/netdev_features.h | 1 +
include/net/switchdev.h | 8 ++++++-
include/uapi/linux/netlink.h | 2 ++
net/bridge/br_netlink.c | 50 +++++++++++++++++++++++++++++++--------
net/bridge/br_private.h | 2 ++
net/bridge/br_stp.c | 9 ++++---
net/bridge/br_stp_if.c | 8 +++++--
net/core/rtnetlink.c | 7 ++++++
net/switchdev/switchdev.c | 17 +++++++++++++
9 files changed, 88 insertions(+), 16 deletions(-)
--
1.7.10.4
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-21 22:49 [RFC PATCH 0/4] switch device: offload policy attributes roopa @ 2014-11-24 10:18 ` Scott Feldman 2014-11-24 13:24 ` Jamal Hadi Salim 2014-11-24 14:55 ` Roopa Prabhu 0 siblings, 2 replies; 9+ messages in thread From: Scott Feldman @ 2014-11-24 10:18 UTC (permalink / raw) To: Roopa Prabhu Cc: Jiří Pírko, Jamal Hadi Salim, Benjamin LaHaise, Thomas Graf, john.fastabend, stephen, John Linville, nhorman, Nicolas Dichtel, vyasevic, Florian Fainelli, buytenh, Aviad Raveh, Netdev, David S. Miller, shrijeet, Andy Gospodarek Hi Roopa, I have a patch pending against Jiri's v2 that's uses existing ndo_bridge_setlink/getlink to push policy settings down to port driver for controlling HW offload. I had to make a few tweaks, but for the most part setlink/getlink already has the master/self semantics so users can set policy flags on bridge's SW version of the port (master) or on the offloaded version of the port (self). I added the new hwmode option "swdev" to the existing "vepa"|"veb" choices. When you specify hwmode, SELF is set and the port driver's setlink get's called. Did you look at setlink/getlink? It looks like the kernel and iproute2 where going down this route of using setlink/getlink for SELF policy, so I'm wondering if we need more? On FDB entries, using master/self semantics that exist, it's clear which are owned by offloaded device and which are owned by bridge. The one missing annotation was a flag indicating FDB entry in bridge was synced from device. And a policy flag to turn on/off syncing from the device. The policy flag is just another IFLA_BRPORT flags passed with setlink/getlink. The setlink/getlink patch will go out in v3 once I finish testing it and push it to Jiri. Hopefully tomorrow. -scott On Fri, Nov 21, 2014 at 12:49 PM, <roopa@cumulusnetworks.com> wrote: > From: Roopa Prabhu <roopa@cumulusnetworks.com> > > > This series aims at introducing new policy attibutes/flags to enable > selective offloading of kernel network objects. > This is in the context of supporting switch devices in the linux kernel. > > Assumption: > - All kernel network objects (routes, neighs, bridges, bonds, vxlans) > can be offloaded (This is true today with a few exceptions maybe) > > policy points: > - By default all objects exist in software (kernel) > - Per object flag to add/del/show in kernel, hardware or both > - System global option to turn on/off offloads for all network objects. > This is for systems who want to turn offloading on for all network objects > by default. us :). Apps dont need to know about offloading in this > model. (TBD) > > Patches are based on jiri/sfeldma's rocker work. > > Apologize for the incomplete and untested code. This is a sample patch > to get some initial feedback. > > Roopa Prabhu (4): > rtnetlink: new flag NLM_F_HW_OFFLOAD to indicate kernel object > offload to hardware > netdev: new feature flag NETIF_F_HW_OFFLOAD to indicate netdev object > offload to hardware > swdevice: new generic op to set bridge port attr > bridge: make hw offload conditional on bridge and bridge port offload > flags > > include/linux/netdev_features.h | 1 + > include/net/switchdev.h | 8 ++++++- > include/uapi/linux/netlink.h | 2 ++ > net/bridge/br_netlink.c | 50 +++++++++++++++++++++++++++++++-------- > net/bridge/br_private.h | 2 ++ > net/bridge/br_stp.c | 9 ++++--- > net/bridge/br_stp_if.c | 8 +++++-- > net/core/rtnetlink.c | 7 ++++++ > net/switchdev/switchdev.c | 17 +++++++++++++ > 9 files changed, 88 insertions(+), 16 deletions(-) > > -- > 1.7.10.4 > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-24 10:18 ` Scott Feldman @ 2014-11-24 13:24 ` Jamal Hadi Salim 2014-11-24 14:55 ` Roopa Prabhu 1 sibling, 0 replies; 9+ messages in thread From: Jamal Hadi Salim @ 2014-11-24 13:24 UTC (permalink / raw) To: Scott Feldman, Roopa Prabhu Cc: Jiří Pírko, Benjamin LaHaise, Thomas Graf, john.fastabend, stephen, John Linville, nhorman, Nicolas Dichtel, vyasevic, Florian Fainelli, buytenh, Aviad Raveh, Netdev, David S. Miller, shrijeet, Andy Gospodarek One of the challenges of master/self in the bridge is it may have morphed a little bit from its original goal given that unicast addresses on the device are now considered part of that equation. Vlad? Perhaps it is reasonable to consider new flags. cheers, jamal On 11/24/14 05:18, Scott Feldman wrote: > Hi Roopa, > > I have a patch pending against Jiri's v2 that's uses existing > ndo_bridge_setlink/getlink to push policy settings down to port driver > for controlling HW offload. I had to make a few tweaks, but for the > most part setlink/getlink already has the master/self semantics so > users can set policy flags on bridge's SW version of the port (master) > or on the offloaded version of the port (self). I added the new > hwmode option "swdev" to the existing "vepa"|"veb" choices. When you > specify hwmode, SELF is set and the port driver's setlink get's > called. Did you look at setlink/getlink? It looks like the kernel > and iproute2 where going down this route of using setlink/getlink for > SELF policy, so I'm wondering if we need more? > > On FDB entries, using master/self semantics that exist, it's clear > which are owned by offloaded device and which are owned by bridge. > The one missing annotation was a flag indicating FDB entry in bridge > was synced from device. And a policy flag to turn on/off syncing from > the device. The policy flag is just another IFLA_BRPORT flags passed > with setlink/getlink. > > The setlink/getlink patch will go out in v3 once I finish testing it > and push it to Jiri. Hopefully tomorrow. > > -scott > > On Fri, Nov 21, 2014 at 12:49 PM, <roopa@cumulusnetworks.com> wrote: >> From: Roopa Prabhu <roopa@cumulusnetworks.com> >> >> >> This series aims at introducing new policy attibutes/flags to enable >> selective offloading of kernel network objects. >> This is in the context of supporting switch devices in the linux kernel. >> >> Assumption: >> - All kernel network objects (routes, neighs, bridges, bonds, vxlans) >> can be offloaded (This is true today with a few exceptions maybe) >> >> policy points: >> - By default all objects exist in software (kernel) >> - Per object flag to add/del/show in kernel, hardware or both >> - System global option to turn on/off offloads for all network objects. >> This is for systems who want to turn offloading on for all network objects >> by default. us :). Apps dont need to know about offloading in this >> model. (TBD) >> >> Patches are based on jiri/sfeldma's rocker work. >> >> Apologize for the incomplete and untested code. This is a sample patch >> to get some initial feedback. >> >> Roopa Prabhu (4): >> rtnetlink: new flag NLM_F_HW_OFFLOAD to indicate kernel object >> offload to hardware >> netdev: new feature flag NETIF_F_HW_OFFLOAD to indicate netdev object >> offload to hardware >> swdevice: new generic op to set bridge port attr >> bridge: make hw offload conditional on bridge and bridge port offload >> flags >> >> include/linux/netdev_features.h | 1 + >> include/net/switchdev.h | 8 ++++++- >> include/uapi/linux/netlink.h | 2 ++ >> net/bridge/br_netlink.c | 50 +++++++++++++++++++++++++++++++-------- >> net/bridge/br_private.h | 2 ++ >> net/bridge/br_stp.c | 9 ++++--- >> net/bridge/br_stp_if.c | 8 +++++-- >> net/core/rtnetlink.c | 7 ++++++ >> net/switchdev/switchdev.c | 17 +++++++++++++ >> 9 files changed, 88 insertions(+), 16 deletions(-) >> >> -- >> 1.7.10.4 >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-24 10:18 ` Scott Feldman 2014-11-24 13:24 ` Jamal Hadi Salim @ 2014-11-24 14:55 ` Roopa Prabhu 2014-11-24 20:48 ` Scott Feldman 1 sibling, 1 reply; 9+ messages in thread From: Roopa Prabhu @ 2014-11-24 14:55 UTC (permalink / raw) To: Scott Feldman Cc: Jiří Pírko, Jamal Hadi Salim, Benjamin LaHaise, Thomas Graf, john.fastabend, stephen, John Linville, nhorman, Nicolas Dichtel, vyasevic, Florian Fainelli, buytenh, Aviad Raveh, Netdev, David S. Miller, shrijeet, Andy Gospodarek On 11/24/14, 2:18 AM, Scott Feldman wrote: > Hi Roopa, > > I have a patch pending against Jiri's v2 that's uses existing > ndo_bridge_setlink/getlink to push policy settings down to port driver > for controlling HW offload. I had to make a few tweaks, but for the > most part setlink/getlink already has the master/self semantics so > users can set policy flags on bridge's SW version of the port (master) > or on the offloaded version of the port (self). > I added the new > hwmode option "swdev" to the existing "vepa"|"veb" choices. When you > specify hwmode, SELF is set and the port driver's setlink get's > called. Did you look at setlink/getlink? It looks like the kernel > and iproute2 where going down this route of using setlink/getlink for > SELF policy, so I'm wondering if we need more? If i understand you correctly, this will mean the port driver implements the setlink/getlink with the bridge port flags and attributes. And, it also means the port driver will parse the netlink msg instead of the bridge driver parsing all attributes and then calling offloads. But, in cases where we want bridge port flags and attributes set both in hw and sw, the user (iproute2) will have to set both MASTER and SELF flags, and the netlink parsing will now happen in two places, the bridge driver and the bridge port driver ? For bridge stp state updates, the bridge driver will call the ports->setlink ndo op ? (We should probably rename the ndo_bridge_setlink to ndo_setlink) > > On FDB entries, using master/self semantics that exist, it's clear > which are owned by offloaded device and which are owned by bridge. > The one missing annotation was a flag indicating FDB entry in bridge > was synced from device. And a policy flag to turn on/off syncing from > the device. The policy flag is just another IFLA_BRPORT flags passed > with setlink/getlink. > > The setlink/getlink patch will go out in v3 once I finish testing it > and push it to Jiri. Hopefully tomorrow. In my patches, I used newlink..., but in most cases all attributes set via newlink can be used with setlink and hence getlink. So, i think we are close here. But, Oh wait, i am talking about rtnl_link_ops ->newlink/changelink/getlink/dellink. I did not realize there was a parallel ndo op to this. But thats probably because, rtnl_link_ops can be used only for logical devices. But, i see bridge driver implementing both the rtnl_link_ops changelink and ndo op setlink. hmm..seems like a lot of duplication. Will look closely some more. Coming back to my series, i was trying to get a common set of flags for all netdevs (bridge, bond, vxlans so far), and hence the common flag in 'struct ifiinfomsg'. But, I am all for using existing infrastructure if it fits well. For the fdb offloads, the NTF_SELF and NTF_MASTER is in 'struct ndmsg->ndm_flags', which is also what i was proposing. So, ack there. For the bridge netdev, using the flag in IFLA_BRIDGE_FLAGS, is also ok. I will look at your patches when they are out. Thanks, Roopa On Fri, Nov 21, 2014 at 12:49 PM, <roopa@cumulusnetworks.com> wrote: >> From: Roopa Prabhu <roopa@cumulusnetworks.com> >> >> >> This series aims at introducing new policy attibutes/flags to enable >> selective offloading of kernel network objects. >> This is in the context of supporting switch devices in the linux kernel. >> >> Assumption: >> - All kernel network objects (routes, neighs, bridges, bonds, vxlans) >> can be offloaded (This is true today with a few exceptions maybe) >> >> policy points: >> - By default all objects exist in software (kernel) >> - Per object flag to add/del/show in kernel, hardware or both >> - System global option to turn on/off offloads for all network objects. >> This is for systems who want to turn offloading on for all network objects >> by default. us :). Apps dont need to know about offloading in this >> model. (TBD) >> >> Patches are based on jiri/sfeldma's rocker work. >> >> Apologize for the incomplete and untested code. This is a sample patch >> to get some initial feedback. >> >> Roopa Prabhu (4): >> rtnetlink: new flag NLM_F_HW_OFFLOAD to indicate kernel object >> offload to hardware >> netdev: new feature flag NETIF_F_HW_OFFLOAD to indicate netdev object >> offload to hardware >> swdevice: new generic op to set bridge port attr >> bridge: make hw offload conditional on bridge and bridge port offload >> flags >> >> include/linux/netdev_features.h | 1 + >> include/net/switchdev.h | 8 ++++++- >> include/uapi/linux/netlink.h | 2 ++ >> net/bridge/br_netlink.c | 50 +++++++++++++++++++++++++++++++-------- >> net/bridge/br_private.h | 2 ++ >> net/bridge/br_stp.c | 9 ++++--- >> net/bridge/br_stp_if.c | 8 +++++-- >> net/core/rtnetlink.c | 7 ++++++ >> net/switchdev/switchdev.c | 17 +++++++++++++ >> 9 files changed, 88 insertions(+), 16 deletions(-) >> >> -- >> 1.7.10.4 >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-24 14:55 ` Roopa Prabhu @ 2014-11-24 20:48 ` Scott Feldman 2014-11-24 23:00 ` Roopa Prabhu 2014-11-25 3:10 ` Ronen Arad 0 siblings, 2 replies; 9+ messages in thread From: Scott Feldman @ 2014-11-24 20:48 UTC (permalink / raw) To: Roopa Prabhu Cc: Jiří Pírko, Jamal Hadi Salim, Benjamin LaHaise, Thomas Graf, john.fastabend, stephen, John Linville, nhorman, Nicolas Dichtel, vyasevic, Florian Fainelli, buytenh, Aviad Raveh, Netdev, David S. Miller, shrijeet, Andy Gospodarek On Mon, Nov 24, 2014 at 4:55 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote: > On 11/24/14, 2:18 AM, Scott Feldman wrote: >> >> Hi Roopa, >> >> I have a patch pending against Jiri's v2 that's uses existing >> ndo_bridge_setlink/getlink to push policy settings down to port driver >> for controlling HW offload. I had to make a few tweaks, but for the >> most part setlink/getlink already has the master/self semantics so >> users can set policy flags on bridge's SW version of the port (master) >> or on the offloaded version of the port (self). >> I added the new >> hwmode option "swdev" to the existing "vepa"|"veb" choices. When you >> specify hwmode, SELF is set and the port driver's setlink get's >> called. Did you look at setlink/getlink? It looks like the kernel >> and iproute2 where going down this route of using setlink/getlink for >> SELF policy, so I'm wondering if we need more? > > If i understand you correctly, this will mean the port driver implements the > setlink/getlink with the bridge port flags and attributes. And, it also > means > the port driver will parse the netlink msg instead of the bridge driver > parsing all attributes and then calling offloads. Yes, exactly, the port driver parses/fills bridge_setlink/getlink netlink msg to set/get port settings. It's basically a passthru for rtnl_bridge_setlink/getlink handling RTM_GETLNK/RTM_SETLINK on PF_BRIDGE. > But, in cases where we want bridge port flags and attributes set both in hw > and sw, > the user (iproute2) will have to set both MASTER and SELF flags, and the > netlink parsing will now happen in two places, the bridge driver and the > bridge port > driver ? Yes, that's correct. That seems to be the intent of the current design based on the existing code. I had to do minor changes to get brport flags passed back up in getlink, but for the most part the kernel code and iproute2 code where already setup to support this dual master/self model. As an example, let's consider the flags IFLA_BRPORT_LEARNING. User can set learning on/off on bridge's port (master) and can independently set learning on/off on port driver (self) in HW. So for typical swdev setups, the port driver would default to learning ON and the bridge would default to learning ON. The user would probably want to turn learning OFF on the bridge, but all combinations of ON/OFF are available. A better example is IFLA_BRPORT_FLOOD. If we had a single flag that applied to both master and self, we'd run into trouble in the flooding ON case. We don't want both the HW and bridge driver flooding as this would result in duplicate egress pkts. If we turn flooding OFF, then neither HW or SW flood and our bridge is broken. So we want SW bridge (master) flooding OFF and port driver (self) have flooding ON, for the optimized HW-offload case. > For bridge stp state updates, the bridge driver will call the ports->setlink > ndo op ? No, we added a new ndo op to communicate STP state changes to port driver. STP state transitions don't really fit into the setlink/getlink model, although you reminded me there was a request for a policy flag to turn STP pushes down to port driver ON/OFF. > (We should probably rename the ndo_bridge_setlink to ndo_setlink) The name seems correct as they're specific to RTM_GETLNK/SETLNK for PF_BRIDGE. >> >> On FDB entries, using master/self semantics that exist, it's clear >> which are owned by offloaded device and which are owned by bridge. >> The one missing annotation was a flag indicating FDB entry in bridge >> was synced from device. And a policy flag to turn on/off syncing from >> the device. The policy flag is just another IFLA_BRPORT flags passed >> with setlink/getlink. >> >> The setlink/getlink patch will go out in v3 once I finish testing it >> and push it to Jiri. Hopefully tomorrow. > > > In my patches, I used newlink..., but in most cases all attributes set via > newlink can be > used with setlink and hence getlink. So, i think we are close here. > > But, Oh wait, i am talking about rtnl_link_ops > ->newlink/changelink/getlink/dellink. I did not > realize there was a parallel ndo op to this. But thats probably because, > rtnl_link_ops can be > used only for logical devices. But, i see bridge driver implementing both > the > rtnl_link_ops changelink and ndo op setlink. hmm..seems like a lot of > duplication. > Will look closely some more. > > > Coming back to my series, i was trying to get a common set of flags for all > netdevs > (bridge, bond, vxlans so far), and hence the common flag in 'struct > ifiinfomsg'. > > But, I am all for using existing infrastructure if it fits well. Jiri updated his net-next-rocker tree with my latest but hasn't pushed out v3 yet. You can see for bridging at least we now have the policy flag support for master/self without much heavy lifting in kernel or iproute2 code. It seems like the right move at this time, with low chance of regressions or breaking backward compat. > For the fdb offloads, the NTF_SELF and NTF_MASTER is in 'struct > ndmsg->ndm_flags', which is also > what i was proposing. So, ack there. Ya, for FDB, the master/self model works slick and support is already there so we should use it. > For the bridge netdev, using the flag in IFLA_BRIDGE_FLAGS, is also ok. Agreed. > I will look at your patches when they are out. v3 is imminent. You can pull net-next-rocker now to look it over. -scott ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-24 20:48 ` Scott Feldman @ 2014-11-24 23:00 ` Roopa Prabhu 2014-11-25 6:39 ` John Fastabend 2014-11-25 3:10 ` Ronen Arad 1 sibling, 1 reply; 9+ messages in thread From: Roopa Prabhu @ 2014-11-24 23:00 UTC (permalink / raw) To: Scott Feldman Cc: Jiří Pírko, Jamal Hadi Salim, Benjamin LaHaise, Thomas Graf, john.fastabend, stephen, John Linville, nhorman, Nicolas Dichtel, vyasevic, Florian Fainelli, buytenh, Aviad Raveh, Netdev, David S. Miller, Shrijeet Mukherjee, Andy Gospodarek On 11/24/14, 12:48 PM, Scott Feldman wrote: > On Mon, Nov 24, 2014 at 4:55 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote: >> On 11/24/14, 2:18 AM, Scott Feldman wrote: >>> Hi Roopa, >>> >>> I have a patch pending against Jiri's v2 that's uses existing >>> ndo_bridge_setlink/getlink to push policy settings down to port driver >>> for controlling HW offload. I had to make a few tweaks, but for the >>> most part setlink/getlink already has the master/self semantics so >>> users can set policy flags on bridge's SW version of the port (master) >>> or on the offloaded version of the port (self). >>> I added the new >>> hwmode option "swdev" to the existing "vepa"|"veb" choices. When you >>> specify hwmode, SELF is set and the port driver's setlink get's >>> called. Did you look at setlink/getlink? It looks like the kernel >>> and iproute2 where going down this route of using setlink/getlink for >>> SELF policy, so I'm wondering if we need more? >> If i understand you correctly, this will mean the port driver implements the >> setlink/getlink with the bridge port flags and attributes. And, it also >> means >> the port driver will parse the netlink msg instead of the bridge driver >> parsing all attributes and then calling offloads. > Yes, exactly, the port driver parses/fills bridge_setlink/getlink > netlink msg to set/get port settings. It's basically a passthru for > rtnl_bridge_setlink/getlink handling RTM_GETLNK/RTM_SETLINK on > PF_BRIDGE. This will duplicate msg parsing code and validation code in the kernel bridge driver and in all the port drivers. If this is the path we plan to go down on,...it will also mean we will do the same thing for bonds, vxlans and maybe others in the future. In the mode where kernel and hw state need to remain in sync, I am also concerned about how we handle failure cases. I think we will have cases where the kernel will set the attribute and hw will fail. In which case can we rollback ? >> But, in cases where we want bridge port flags and attributes set both in hw >> and sw, >> the user (iproute2) will have to set both MASTER and SELF flags, and the >> netlink parsing will now happen in two places, the bridge driver and the >> bridge port >> driver ? > Yes, that's correct. That seems to be the intent of the current > design based on the existing code. I had to do minor changes to get > brport flags passed back up in getlink, but for the most part the > kernel code and iproute2 code where already setup to support this dual > master/self model. As an example, let's consider the flags > IFLA_BRPORT_LEARNING. User can set learning on/off on bridge's port > (master) and can independently set learning on/off on port driver > (self) in HW. So for typical swdev setups, the port driver would > default to learning ON and the bridge would default to learning ON. > The user would probably want to turn learning OFF on the bridge, but > all combinations of ON/OFF are available. A better example is > IFLA_BRPORT_FLOOD. If we had a single flag that applied to both > master and self, we'd run into trouble in the flooding ON case. We > don't want both the HW and bridge driver flooding as this would result > in duplicate egress pkts. If we turn flooding OFF, then neither HW or > SW flood and our bridge is broken. So we want SW bridge (master) > flooding OFF and port driver (self) have flooding ON, for the > optimized HW-offload case. > >> For bridge stp state updates, the bridge driver will call the ports->setlink >> ndo op ? > No, we added a new ndo op to communicate STP state changes to port > driver. STP state transitions don't really fit into the > setlink/getlink model, although you reminded me there was a request > for a policy flag to turn STP pushes down to port driver ON/OFF. > >> (We should probably rename the ndo_bridge_setlink to ndo_setlink) > The name seems correct as they're specific to RTM_GETLNK/SETLNK for PF_BRIDGE. But, SETLINK/GETLINK Is general across all link objects. It can be bonds and vxlan interfaces too. > >>> On FDB entries, using master/self semantics that exist, it's clear >>> which are owned by offloaded device and which are owned by bridge. >>> The one missing annotation was a flag indicating FDB entry in bridge >>> was synced from device. And a policy flag to turn on/off syncing from >>> the device. The policy flag is just another IFLA_BRPORT flags passed >>> with setlink/getlink. >>> >>> The setlink/getlink patch will go out in v3 once I finish testing it >>> and push it to Jiri. Hopefully tomorrow. >> >> In my patches, I used newlink..., but in most cases all attributes set via >> newlink can be >> used with setlink and hence getlink. So, i think we are close here. >> >> But, Oh wait, i am talking about rtnl_link_ops >> ->newlink/changelink/getlink/dellink. I did not >> realize there was a parallel ndo op to this. But thats probably because, >> rtnl_link_ops can be >> used only for logical devices. But, i see bridge driver implementing both >> the >> rtnl_link_ops changelink and ndo op setlink. hmm..seems like a lot of >> duplication. >> Will look closely some more. >> >> >> Coming back to my series, i was trying to get a common set of flags for all >> netdevs >> (bridge, bond, vxlans so far), and hence the common flag in 'struct >> ifiinfomsg'. >> >> But, I am all for using existing infrastructure if it fits well. > Jiri updated his net-next-rocker tree with my latest but hasn't pushed > out v3 yet. You can see for bridging at least we now have the policy > flag support for master/self without much heavy lifting in kernel or > iproute2 code. It seems like the right move at this time, with low > chance of regressions or breaking backward compat. > >> For the fdb offloads, the NTF_SELF and NTF_MASTER is in 'struct >> ndmsg->ndm_flags', which is also >> what i was proposing. So, ack there. > Ya, for FDB, the master/self model works slick and support is already > there so we should use it. > >> For the bridge netdev, using the flag in IFLA_BRIDGE_FLAGS, is also ok. > Agreed. > >> I will look at your patches when they are out. > v3 is imminent. You can pull net-next-rocker now to look it over. > > ok, thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-24 23:00 ` Roopa Prabhu @ 2014-11-25 6:39 ` John Fastabend 0 siblings, 0 replies; 9+ messages in thread From: John Fastabend @ 2014-11-25 6:39 UTC (permalink / raw) To: Roopa Prabhu Cc: Scott Feldman, Jiří Pírko, Jamal Hadi Salim, Benjamin LaHaise, Thomas Graf, stephen, John Linville, nhorman, Nicolas Dichtel, vyasevic, Florian Fainelli, buytenh, Aviad Raveh, Netdev, David S. Miller, Shrijeet Mukherjee, Andy Gospodarek On 11/24/2014 03:00 PM, Roopa Prabhu wrote: > On 11/24/14, 12:48 PM, Scott Feldman wrote: >> On Mon, Nov 24, 2014 at 4:55 AM, Roopa Prabhu >> <roopa@cumulusnetworks.com> wrote: >>> On 11/24/14, 2:18 AM, Scott Feldman wrote: >>>> Hi Roopa, >>>> >>>> I have a patch pending against Jiri's v2 that's uses existing >>>> ndo_bridge_setlink/getlink to push policy settings down to port driver >>>> for controlling HW offload. I had to make a few tweaks, but for the >>>> most part setlink/getlink already has the master/self semantics so >>>> users can set policy flags on bridge's SW version of the port (master) >>>> or on the offloaded version of the port (self). >>>> I added the new >>>> hwmode option "swdev" to the existing "vepa"|"veb" choices. When you >>>> specify hwmode, SELF is set and the port driver's setlink get's >>>> called. Did you look at setlink/getlink? It looks like the kernel >>>> and iproute2 where going down this route of using setlink/getlink for >>>> SELF policy, so I'm wondering if we need more? >>> If i understand you correctly, this will mean the port driver >>> implements the >>> setlink/getlink with the bridge port flags and attributes. And, it also >>> means >>> the port driver will parse the netlink msg instead of the bridge driver >>> parsing all attributes and then calling offloads. >> Yes, exactly, the port driver parses/fills bridge_setlink/getlink >> netlink msg to set/get port settings. It's basically a passthru for >> rtnl_bridge_setlink/getlink handling RTM_GETLNK/RTM_SETLINK on >> PF_BRIDGE. > This will duplicate msg parsing code and validation code in the kernel > bridge driver and in all the port drivers. > If this is the path we plan to go down on,...it will also mean we will > do the same thing > for bonds, vxlans and maybe others in the future. > Where it makes sense we can do the msg parsing in a common location once right? The fdb hooks and setlink/getlink handlers should probably be cleaned up for this as well. Or we can build a set of utility functions _dflt_ handlers that drivers can call to do the parsing. At least these are not exposed to the user API so we can iterate over this as needed and as more attributes are needed. > In the mode where kernel and hw state need to remain in sync, > I am also concerned about how we handle failure cases. > > I think we will have cases where the kernel will set the attribute and > hw will fail. In which case can we rollback ? In the fdb cases we set a flag to indicate how far we got in the command and then punt it back to user space where it can be rolled back. We don't have any sort of reserve and commit or rollback on error behaviour. It seems like a "nice" feature but perhaps the first iteration can just return how far it got and let user space handle the rollback. At least this is how I planned to handle flow tables and flow updates as Simon pointed out. [...] -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-24 20:48 ` Scott Feldman 2014-11-24 23:00 ` Roopa Prabhu @ 2014-11-25 3:10 ` Ronen Arad 2014-11-25 6:43 ` John Fastabend 1 sibling, 1 reply; 9+ messages in thread From: Ronen Arad @ 2014-11-25 3:10 UTC (permalink / raw) To: netdev Scott Feldman <sfeldma <at> gmail.com> writes: > > On Mon, Nov 24, 2014 at 4:55 AM, Roopa Prabhu <roopa <at> cumulusnetworks.com> wrote: > > On 11/24/14, 2:18 AM, Scott Feldman wrote: > >> > >> Hi Roopa, > >> > >> I have a patch pending against Jiri's v2 that's uses existing > >> ndo_bridge_setlink/getlink to push policy settings down to port driver > >> for controlling HW offload. I had to make a few tweaks, but for the > >> most part setlink/getlink already has the master/self semantics so > >> users can set policy flags on bridge's SW version of the port (master) > >> or on the offloaded version of the port (self). > >> I added the new > >> hwmode option "swdev" to the existing "vepa"|"veb" choices. When you > >> specify hwmode, SELF is set and the port driver's setlink get's > >> called. Did you look at setlink/getlink? It looks like the kernel > >> and iproute2 where going down this route of using setlink/getlink for > >> SELF policy, so I'm wondering if we need more? > > > > If i understand you correctly, this will mean the port driver implements the > > setlink/getlink with the bridge port flags and attributes. And, it also > > means > > the port driver will parse the netlink msg instead of the bridge driver > > parsing all attributes and then calling offloads. > > Yes, exactly, the port driver parses/fills bridge_setlink/getlink > netlink msg to set/get port settings. It's basically a passthru for > rtnl_bridge_setlink/getlink handling RTM_GETLNK/RTM_SETLINK on > PF_BRIDGE. > > > But, in cases where we want bridge port flags and attributes set both in hw > > and sw, > > the user (iproute2) will have to set both MASTER and SELF flags, and the > > netlink parsing will now happen in two places, the bridge driver and the > > bridge port > > driver ? > How does VLAN filtering is expected to work with an HW-offloaded bridge? When MASTER flag is set by iproute2 in the netlink message (or no flag is set), br_setlink - the ndo_bridge_setlink function of the port's master bridge is called. It takes care of setting the VLAN policy in the net_port_vlans of the net_bridge_port. During this sequence the ndo_vlan_rx_add_vid function of the port dev is called. However, it only gets the vid. It does not have access to the flags (PVID, UNTAGGED) and those are not set in the net_port_vlans struct at that time. Giving the port driver access to the VLAN policy requires both MASTER and SLAVE flags to be set in the IFLA_BRIDGE_FLAGS attribute. Is the end user is expected to set both flags in the "bridge vlan add" command (Failing to do that will prevent the bridge and HW from being in sync)? Even with both MASTER and SLAVE flags being set in a single netlink request, the HW and bridge could get out of sync if adding the VLAN policy to the port driver would fail after it was added to the bridge netdev. Is there any mechanism for keeping both software and HW in-sync? -Ronen ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 0/4] switch device: offload policy attributes 2014-11-25 3:10 ` Ronen Arad @ 2014-11-25 6:43 ` John Fastabend 0 siblings, 0 replies; 9+ messages in thread From: John Fastabend @ 2014-11-25 6:43 UTC (permalink / raw) To: Ronen Arad; +Cc: netdev On 11/24/2014 07:10 PM, Ronen Arad wrote: > Scott Feldman <sfeldma <at> gmail.com> writes: > >> >> On Mon, Nov 24, 2014 at 4:55 AM, Roopa Prabhu <roopa <at> > cumulusnetworks.com> wrote: >>> On 11/24/14, 2:18 AM, Scott Feldman wrote: >>>> >>>> Hi Roopa, >>>> >>>> I have a patch pending against Jiri's v2 that's uses existing >>>> ndo_bridge_setlink/getlink to push policy settings down to port driver >>>> for controlling HW offload. I had to make a few tweaks, but for the >>>> most part setlink/getlink already has the master/self semantics so >>>> users can set policy flags on bridge's SW version of the port (master) >>>> or on the offloaded version of the port (self). >>>> I added the new >>>> hwmode option "swdev" to the existing "vepa"|"veb" choices. When you >>>> specify hwmode, SELF is set and the port driver's setlink get's >>>> called. Did you look at setlink/getlink? It looks like the kernel >>>> and iproute2 where going down this route of using setlink/getlink for >>>> SELF policy, so I'm wondering if we need more? >>> >>> If i understand you correctly, this will mean the port driver implements > the >>> setlink/getlink with the bridge port flags and attributes. And, it also >>> means >>> the port driver will parse the netlink msg instead of the bridge driver >>> parsing all attributes and then calling offloads. >> >> Yes, exactly, the port driver parses/fills bridge_setlink/getlink >> netlink msg to set/get port settings. It's basically a passthru for >> rtnl_bridge_setlink/getlink handling RTM_GETLNK/RTM_SETLINK on >> PF_BRIDGE. >> >>> But, in cases where we want bridge port flags and attributes set both in > hw >>> and sw, >>> the user (iproute2) will have to set both MASTER and SELF flags, and the >>> netlink parsing will now happen in two places, the bridge driver and the >>> bridge port >>> driver ? >> > > How does VLAN filtering is expected to work with an HW-offloaded bridge? > > When MASTER flag is set by iproute2 in the netlink message (or no flag is > set), br_setlink - the ndo_bridge_setlink function of the port's master > bridge is called. It takes care of setting the VLAN policy in the > net_port_vlans of the net_bridge_port. During this sequence the > ndo_vlan_rx_add_vid function of the port dev is called. However, it only gets > the vid. It does not have access to the flags (PVID, UNTAGGED) and those are > not set in the net_port_vlans struct at that time. > > Giving the port driver access to the VLAN policy requires both MASTER and > SLAVE flags to be set in the IFLA_BRIDGE_FLAGS attribute. > Is the end user is expected to set both flags in the "bridge vlan add" > command (Failing to do that will prevent the bridge and HW from being in > sync)? > Correct. My expectation is if the user wants the software and hardware tables to be in sync they should set both SELF and MASTER. If the user sets only one of these then it indicates the user _wants_ to have hw/sw out of sync. > Even with both MASTER and SLAVE flags being set in a single netlink request, > the HW and bridge could get out of sync if adding the VLAN policy to the port > driver would fail after it was added to the bridge netdev. > Is there any mechanism for keeping both software and HW in-sync? > Yep, my proposal to Roopa would be to abort the transaction and send back an indication of where the failure occurred. This seems to me to be the easiest option. We could implement a rollback mechanism but I think that can be done in a second series if its needed. > -Ronen > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-11-25 6:44 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-11-21 22:49 [RFC PATCH 0/4] switch device: offload policy attributes roopa 2014-11-24 10:18 ` Scott Feldman 2014-11-24 13:24 ` Jamal Hadi Salim 2014-11-24 14:55 ` Roopa Prabhu 2014-11-24 20:48 ` Scott Feldman 2014-11-24 23:00 ` Roopa Prabhu 2014-11-25 6:39 ` John Fastabend 2014-11-25 3:10 ` Ronen Arad 2014-11-25 6:43 ` John Fastabend
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).