From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: Veaceslav Falico <vfalico@gmail.com>
Cc: Or Gerlitz <or.gerlitz@gmail.com>, Jiri Pirko <jiri@resnulli.us>,
Or Gerlitz <ogerlitz@mellanox.com>,
Eyal Perry <eyalpe@mellanox.com>, netdev <netdev@vger.kernel.org>,
Noa Osherovich <noaos@mellanox.com>
Subject: Re: bonding directly changes underlying device address
Date: Wed, 14 May 2014 18:38:45 -0700 [thread overview]
Message-ID: <5046.1400117925@localhost.localdomain> (raw)
In-Reply-To: <20140514080114.GA30960@mikrodark.usersys.redhat.com>
Veaceslav Falico <vfalico@gmail.com> wrote:
>On Tue, May 13, 2014 at 08:38:01PM +0300, Or Gerlitz wrote:
>>On Tue, May 13, 2014 at 7:30 PM, Jay Vosburgh
>><jay.vosburgh@canonical.com> wrote:
>>>>Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote:
>>
>>[...]
>>
>>>>>I suspect this can lead to funny (or actually sad) bugs in networking
>>>>>drivers, can we avoid that?
>>
>>> Changing this would also remove support for the tlb mode from
>>> any device that cannot change their MAC while open. I don't know how
>>> many devices that is (it's probably a small number nowadays), but if
>>> it's not zero then an assessment on losing that support has to be made.
>>
>>Focusing on the last part of your reply, are you OK with a patch that
>>branches on whether that device supports ndo_set_mac_address and if
>>this is the case we will call dev_set_mac_address, else do the current
>>practice of setting dev_addr directly?
>
>I'd actually just drop support for non-ndo_set_mac_address NICs, as it'll
>unify the RLB and TLB logic, and anyways dev_set_mac_address() is used in
>SIOCSIFHWADDR and rtnl ops, and thus, if the NIC doesn't support it, it
>can't change its mac address at all, and using it in *LB modes makes little
>sense.
It's not that the device drivers lack ndo_set_mac_address, it's
that the MAC can only be changed while the device is down. The hardware
is programmed with the MAC only when the device transitions from down to
up.
I looked, and a large number (~140) of device drivers use
eth_mac_addr as the ndo_set_mac_address function and do not set
IFF_LIVE_ADDR_CHANGE in priv_flags. All eth_mac_addr does is copy the
MAC address into dev->dev_addr after checking that the device isn't up.
The actual programming of the hardware happens in the device's ndo_open
function. These are the devices that the odd balance-tlb dev_addr MAC
address handling is meant for.
Now, just skimming the list of those drivers, it looks like the
vast majority are old 10/100 only chipsets. Probably a good chunk of
them cannot actually be used today due to a lack of hardware. A few are
funky things that nobody is likely to run bonding on (ibmveth, for
example). A few, though, are gigabit chipsets and appear to have
relatively recent work being done on them. I suppose it's possible that
there are users on 10/100 chipsets as well.
>Other way of doing this would be to just move the dev_addr to the slave
>struct, instead of using the net_device's dev_addr, cause in tlb we don't
>usually need to set the NIC mac address (except when there's mac filtering,
>I guess), but only need to set the packet's source mac. This way we'll omit
>quite costy mac address setting (as, on some NICs, it resets the whole NIC
>and takes seconds), still maintain compatibility with older NICs and won't
>mess with NICs ->dev_addr.
By this you mean to, basically, stash the MAC in the struct
slave somewhere instead of in dev->dev_addr, and then in bond_tlb_xmit
change the Ethernet destination? Without having actually tried it, I
can't think of a reason why this wouldn't work. That's really all
that's happening now, except the stash is dev_addr, and the Ethernet
destination is set for us by dev_hard_header. So, in principle, I don't
see any problems with this.
>Anyway, I'll let Jay decide.
Well, my general thought here is that, yes, this is icky, and
it's even been behind a bug not too long ago, when some device went down
then up and programmed the dev_addr into its hardware and ended up with
a duplicate MAC address on the network. So, I'm not automatically
against changing this to make it "better."
On the other hand, I do not feel that I have a good grasp as to
how many users are out there that rely on this property of balance-tlb
(that the device need be able to not change its MAC while open).
Therefore, I'm not in favor of a change that would break that support.
So, I'd say "No" to conversion to dev_set_mac_address() and
"show me the code" to the "stash it in the slave" plan.
-J
---
-Jay Vosburgh, jay.vosburgh@canonical.com
next prev parent reply other threads:[~2014-05-15 1:38 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-13 11:06 bonding directly changes underlying device address Or Gerlitz
2014-05-13 11:55 ` Jiri Pirko
2014-05-13 16:30 ` Jay Vosburgh
2014-05-13 17:38 ` Or Gerlitz
2014-05-14 8:01 ` Veaceslav Falico
2014-05-14 8:08 ` Or Gerlitz
2014-05-15 1:38 ` Jay Vosburgh [this message]
2014-05-15 4:55 ` Veaceslav Falico
2014-05-15 12:37 ` Or Gerlitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5046.1400117925@localhost.localdomain \
--to=jay.vosburgh@canonical.com \
--cc=eyalpe@mellanox.com \
--cc=jiri@resnulli.us \
--cc=netdev@vger.kernel.org \
--cc=noaos@mellanox.com \
--cc=ogerlitz@mellanox.com \
--cc=or.gerlitz@gmail.com \
--cc=vfalico@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.