* bonding directly changes underlying device address @ 2014-05-13 11:06 Or Gerlitz 2014-05-13 11:55 ` Jiri Pirko 0 siblings, 1 reply; 9+ messages in thread From: Or Gerlitz @ 2014-05-13 11:06 UTC (permalink / raw) To: Jay Vosburgh, Veaceslav Falico; +Cc: Eyal Perry, netdev, Noa Osherovich Hi Jay, Veaceslav I see now that alb_set_slave_mac_addr directly changes the underlying device mac address without calling dev_set_mac_address when running in TLB mode. Is that on purpose? if yes, can you explain why? I suspect this can lead to funny (or actually sad) bugs in networking drivers, can we avoid that? Or. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-13 11:06 bonding directly changes underlying device address Or Gerlitz @ 2014-05-13 11:55 ` Jiri Pirko 2014-05-13 16:30 ` Jay Vosburgh 0 siblings, 1 reply; 9+ messages in thread From: Jiri Pirko @ 2014-05-13 11:55 UTC (permalink / raw) To: Or Gerlitz Cc: Jay Vosburgh, Veaceslav Falico, Eyal Perry, netdev, Noa Osherovich Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote: >Hi Jay, Veaceslav > >I see now that alb_set_slave_mac_addr directly changes the >underlying device mac address without calling dev_set_mac_address >when running in TLB mode. Is that on purpose? if yes, can you explain >why? I believe that is a bug. dev_addr change should be always done using dev_set_mac_address. > >I suspect this can lead to funny (or actually sad) bugs in networking >drivers, can we avoid that? > >Or. >-- >To unsubscribe from this list: send the line "unsubscribe netdev" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-13 11:55 ` Jiri Pirko @ 2014-05-13 16:30 ` Jay Vosburgh 2014-05-13 17:38 ` Or Gerlitz 0 siblings, 1 reply; 9+ messages in thread From: Jay Vosburgh @ 2014-05-13 16:30 UTC (permalink / raw) To: Jiri Pirko Cc: Or Gerlitz, Veaceslav Falico, Eyal Perry, netdev, Noa Osherovich Jiri Pirko <jiri@resnulli.us> wrote: >Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote: >>Hi Jay, Veaceslav >> >>I see now that alb_set_slave_mac_addr directly changes the >>underlying device mac address without calling dev_set_mac_address >>when running in TLB mode. Is that on purpose? if yes, can you explain >>why? > >I believe that is a bug. dev_addr change should be always done using >dev_set_mac_address. It may or may not be a bug today, but it's done this way on purpose to work around the limitations of the system when the tlb mode was implemented. The tlb mode wants to have each slave send with source MAC set to a particular MAC assigned to the slave (which may or may not be the slave's actual MAC address). It does not need for slaves other than the active slave to actually receive any packets (tlb only has load balancing for TX, RX all goes to one slave), so programming the MAC into the device was left out on purpose. The MAC change was done this way because, when the code was originally written, many (perhaps most) devices / drivers could not change their MAC address while open; it was common practice to program the MAC in the device open function, and nowhere else. This method permits the tlb mode to do the MAC juggling on the TX side without requiring the slave device be able to change its MAC while open (as the alb mode requires). >>I suspect this can lead to funny (or actually sad) bugs in networking >>drivers, can we avoid that? Changing this would also remove support for the tlb mode from any device that cannot change their MAC while open. I don't know how many devices that is (it's probably a small number nowadays), but if it's not zero then an assessment on losing that support has to be made. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-13 16:30 ` Jay Vosburgh @ 2014-05-13 17:38 ` Or Gerlitz 2014-05-14 8:01 ` Veaceslav Falico 0 siblings, 1 reply; 9+ messages in thread From: Or Gerlitz @ 2014-05-13 17:38 UTC (permalink / raw) To: Jay Vosburgh Cc: Jiri Pirko, Or Gerlitz, Veaceslav Falico, Eyal Perry, netdev, Noa Osherovich On Tue, May 13, 2014 at 7:30 PM, Jay Vosburgh <jay.vosburgh@canonical.com> wrote: >>Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote: [...] >>>I suspect this can lead to funny (or actually sad) bugs in networking >>>drivers, can we avoid that? > Changing this would also remove support for the tlb mode from > any device that cannot change their MAC while open. I don't know how > many devices that is (it's probably a small number nowadays), but if > it's not zero then an assessment on losing that support has to be made. Focusing on the last part of your reply, are you OK with a patch that branches on whether that device supports ndo_set_mac_address and if this is the case we will call dev_set_mac_address, else do the current practice of setting dev_addr directly? Or. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-13 17:38 ` Or Gerlitz @ 2014-05-14 8:01 ` Veaceslav Falico 2014-05-14 8:08 ` Or Gerlitz 2014-05-15 1:38 ` Jay Vosburgh 0 siblings, 2 replies; 9+ messages in thread From: Veaceslav Falico @ 2014-05-14 8:01 UTC (permalink / raw) To: Or Gerlitz Cc: Jay Vosburgh, Jiri Pirko, Or Gerlitz, Eyal Perry, netdev, Noa Osherovich On Tue, May 13, 2014 at 08:38:01PM +0300, Or Gerlitz wrote: >On Tue, May 13, 2014 at 7:30 PM, Jay Vosburgh ><jay.vosburgh@canonical.com> wrote: >>>Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote: > >[...] > >>>>I suspect this can lead to funny (or actually sad) bugs in networking >>>>drivers, can we avoid that? > >> Changing this would also remove support for the tlb mode from >> any device that cannot change their MAC while open. I don't know how >> many devices that is (it's probably a small number nowadays), but if >> it's not zero then an assessment on losing that support has to be made. > >Focusing on the last part of your reply, are you OK with a patch that >branches on whether that device supports ndo_set_mac_address and if >this is the case we will call dev_set_mac_address, else do the current >practice of setting dev_addr directly? I'd actually just drop support for non-ndo_set_mac_address NICs, as it'll unify the RLB and TLB logic, and anyways dev_set_mac_address() is used in SIOCSIFHWADDR and rtnl ops, and thus, if the NIC doesn't support it, it can't change its mac address at all, and using it in *LB modes makes little sense. Other way of doing this would be to just move the dev_addr to the slave struct, instead of using the net_device's dev_addr, cause in tlb we don't usually need to set the NIC mac address (except when there's mac filtering, I guess), but only need to set the packet's source mac. This way we'll omit quite costy mac address setting (as, on some NICs, it resets the whole NIC and takes seconds), still maintain compatibility with older NICs and won't mess with NICs ->dev_addr. Anyway, I'll let Jay decide. > >Or. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-14 8:01 ` Veaceslav Falico @ 2014-05-14 8:08 ` Or Gerlitz 2014-05-15 1:38 ` Jay Vosburgh 1 sibling, 0 replies; 9+ messages in thread From: Or Gerlitz @ 2014-05-14 8:08 UTC (permalink / raw) To: Veaceslav Falico, Jay Vosburgh Cc: Or Gerlitz, Jiri Pirko, Eyal Perry, netdev, Noa Osherovich On 14/05/2014 11:01, Veaceslav Falico wrote: > > I'd actually just drop support for non-ndo_set_mac_address NICs, as it'll > unify the RLB and TLB logic, and anyways dev_set_mac_address() is used in > SIOCSIFHWADDR and rtnl ops, and thus, if the NIC doesn't support it, it > can't change its mac address at all, and using it in *LB modes makes > little sense. Avoid touching the slave device address directly and acting through dev_set_mac_address() sounds good to me. Still, FWIW && just to make sure we see the whole picture, we noted that the practiceof manually touching the slave device address is done in another TLB code locationbesides alb_set_slave_mac_addr(), namely in alb_set_mac_address(). > > Other way of doing this would be to just move the dev_addr to the slave > struct, instead of using the net_device's dev_addr, cause in tlb we don't > usually need to set the NIC mac address (except when there's mac > filtering, > I guess), but only need to set the packet's source mac. This way we'll > omit > quite costy mac address setting (as, on some NICs, it resets the whole > NIC > and takes seconds), still maintain compatibility with older NICs and > won't > mess with NICs ->dev_addr. Interesting, so if Jay is OK with this design, any chance one of you can come up with the proper patch to make that happen? Or. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-14 8:01 ` Veaceslav Falico 2014-05-14 8:08 ` Or Gerlitz @ 2014-05-15 1:38 ` Jay Vosburgh 2014-05-15 4:55 ` Veaceslav Falico 1 sibling, 1 reply; 9+ messages in thread From: Jay Vosburgh @ 2014-05-15 1:38 UTC (permalink / raw) To: Veaceslav Falico Cc: Or Gerlitz, Jiri Pirko, Or Gerlitz, Eyal Perry, netdev, Noa Osherovich Veaceslav Falico <vfalico@gmail.com> wrote: >On Tue, May 13, 2014 at 08:38:01PM +0300, Or Gerlitz wrote: >>On Tue, May 13, 2014 at 7:30 PM, Jay Vosburgh >><jay.vosburgh@canonical.com> wrote: >>>>Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote: >> >>[...] >> >>>>>I suspect this can lead to funny (or actually sad) bugs in networking >>>>>drivers, can we avoid that? >> >>> Changing this would also remove support for the tlb mode from >>> any device that cannot change their MAC while open. I don't know how >>> many devices that is (it's probably a small number nowadays), but if >>> it's not zero then an assessment on losing that support has to be made. >> >>Focusing on the last part of your reply, are you OK with a patch that >>branches on whether that device supports ndo_set_mac_address and if >>this is the case we will call dev_set_mac_address, else do the current >>practice of setting dev_addr directly? > >I'd actually just drop support for non-ndo_set_mac_address NICs, as it'll >unify the RLB and TLB logic, and anyways dev_set_mac_address() is used in >SIOCSIFHWADDR and rtnl ops, and thus, if the NIC doesn't support it, it >can't change its mac address at all, and using it in *LB modes makes little >sense. It's not that the device drivers lack ndo_set_mac_address, it's that the MAC can only be changed while the device is down. The hardware is programmed with the MAC only when the device transitions from down to up. I looked, and a large number (~140) of device drivers use eth_mac_addr as the ndo_set_mac_address function and do not set IFF_LIVE_ADDR_CHANGE in priv_flags. All eth_mac_addr does is copy the MAC address into dev->dev_addr after checking that the device isn't up. The actual programming of the hardware happens in the device's ndo_open function. These are the devices that the odd balance-tlb dev_addr MAC address handling is meant for. Now, just skimming the list of those drivers, it looks like the vast majority are old 10/100 only chipsets. Probably a good chunk of them cannot actually be used today due to a lack of hardware. A few are funky things that nobody is likely to run bonding on (ibmveth, for example). A few, though, are gigabit chipsets and appear to have relatively recent work being done on them. I suppose it's possible that there are users on 10/100 chipsets as well. >Other way of doing this would be to just move the dev_addr to the slave >struct, instead of using the net_device's dev_addr, cause in tlb we don't >usually need to set the NIC mac address (except when there's mac filtering, >I guess), but only need to set the packet's source mac. This way we'll omit >quite costy mac address setting (as, on some NICs, it resets the whole NIC >and takes seconds), still maintain compatibility with older NICs and won't >mess with NICs ->dev_addr. By this you mean to, basically, stash the MAC in the struct slave somewhere instead of in dev->dev_addr, and then in bond_tlb_xmit change the Ethernet destination? Without having actually tried it, I can't think of a reason why this wouldn't work. That's really all that's happening now, except the stash is dev_addr, and the Ethernet destination is set for us by dev_hard_header. So, in principle, I don't see any problems with this. >Anyway, I'll let Jay decide. Well, my general thought here is that, yes, this is icky, and it's even been behind a bug not too long ago, when some device went down then up and programmed the dev_addr into its hardware and ended up with a duplicate MAC address on the network. So, I'm not automatically against changing this to make it "better." On the other hand, I do not feel that I have a good grasp as to how many users are out there that rely on this property of balance-tlb (that the device need be able to not change its MAC while open). Therefore, I'm not in favor of a change that would break that support. So, I'd say "No" to conversion to dev_set_mac_address() and "show me the code" to the "stash it in the slave" plan. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-15 1:38 ` Jay Vosburgh @ 2014-05-15 4:55 ` Veaceslav Falico 2014-05-15 12:37 ` Or Gerlitz 0 siblings, 1 reply; 9+ messages in thread From: Veaceslav Falico @ 2014-05-15 4:55 UTC (permalink / raw) To: Jay Vosburgh Cc: Or Gerlitz, Jiri Pirko, Or Gerlitz, Eyal Perry, netdev, Noa Osherovich On Wed, May 14, 2014 at 06:38:45PM -0700, Jay Vosburgh wrote: ...snip... > So, I'd say "No" to conversion to dev_set_mac_address() and >"show me the code" to the "stash it in the slave" plan. Ok, got it, I will try to move it to struct slave. Thank you! > > -J > >--- > -Jay Vosburgh, jay.vosburgh@canonical.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bonding directly changes underlying device address 2014-05-15 4:55 ` Veaceslav Falico @ 2014-05-15 12:37 ` Or Gerlitz 0 siblings, 0 replies; 9+ messages in thread From: Or Gerlitz @ 2014-05-15 12:37 UTC (permalink / raw) To: Veaceslav Falico Cc: Jay Vosburgh, Jiri Pirko, Or Gerlitz, Eyal Perry, netdev, Noa Osherovich On Thu, May 15, 2014 at 7:55 AM, Veaceslav Falico <vfalico@gmail.com> wrote: > On Wed, May 14, 2014 at 06:38:45PM -0700, Jay Vosburgh wrote: > ...snip... > >> So, I'd say "No" to conversion to dev_set_mac_address() and >> "show me the code" to the "stash it in the slave" plan. > > > Ok, got it, I will try to move it to struct slave. Good, and following that we'll see if/what remains on our mlx4_en/any driver plate to fix under the TLB use case (hopefully none..) ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-05-15 12:37 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-05-13 11:06 bonding directly changes underlying device address Or Gerlitz 2014-05-13 11:55 ` Jiri Pirko 2014-05-13 16:30 ` Jay Vosburgh 2014-05-13 17:38 ` Or Gerlitz 2014-05-14 8:01 ` Veaceslav Falico 2014-05-14 8:08 ` Or Gerlitz 2014-05-15 1:38 ` Jay Vosburgh 2014-05-15 4:55 ` Veaceslav Falico 2014-05-15 12:37 ` Or Gerlitz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).