From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: Veaceslav Falico <vfalico@gmail.com>
Cc: Or Gerlitz <or.gerlitz@gmail.com>, Jiri Pirko <jiri@resnulli.us>,
Or Gerlitz <ogerlitz@mellanox.com>,
Eyal Perry <eyalpe@mellanox.com>, netdev <netdev@vger.kernel.org>,
Noa Osherovich <noaos@mellanox.com>
Subject: Re: bonding directly changes underlying device address
Date: Wed, 14 May 2014 18:38:45 -0700 [thread overview]
Message-ID: <5046.1400117925@localhost.localdomain> (raw)
In-Reply-To: <20140514080114.GA30960@mikrodark.usersys.redhat.com>
Veaceslav Falico <vfalico@gmail.com> wrote:
>On Tue, May 13, 2014 at 08:38:01PM +0300, Or Gerlitz wrote:
>>On Tue, May 13, 2014 at 7:30 PM, Jay Vosburgh
>><jay.vosburgh@canonical.com> wrote:
>>>>Tue, May 13, 2014 at 01:06:56PM CEST, ogerlitz@mellanox.com wrote:
>>
>>[...]
>>
>>>>>I suspect this can lead to funny (or actually sad) bugs in networking
>>>>>drivers, can we avoid that?
>>
>>> Changing this would also remove support for the tlb mode from
>>> any device that cannot change their MAC while open. I don't know how
>>> many devices that is (it's probably a small number nowadays), but if
>>> it's not zero then an assessment on losing that support has to be made.
>>
>>Focusing on the last part of your reply, are you OK with a patch that
>>branches on whether that device supports ndo_set_mac_address and if
>>this is the case we will call dev_set_mac_address, else do the current
>>practice of setting dev_addr directly?
>
>I'd actually just drop support for non-ndo_set_mac_address NICs, as it'll
>unify the RLB and TLB logic, and anyways dev_set_mac_address() is used in
>SIOCSIFHWADDR and rtnl ops, and thus, if the NIC doesn't support it, it
>can't change its mac address at all, and using it in *LB modes makes little
>sense.
It's not that the device drivers lack ndo_set_mac_address, it's
that the MAC can only be changed while the device is down. The hardware
is programmed with the MAC only when the device transitions from down to
up.
I looked, and a large number (~140) of device drivers use
eth_mac_addr as the ndo_set_mac_address function and do not set
IFF_LIVE_ADDR_CHANGE in priv_flags. All eth_mac_addr does is copy the
MAC address into dev->dev_addr after checking that the device isn't up.
The actual programming of the hardware happens in the device's ndo_open
function. These are the devices that the odd balance-tlb dev_addr MAC
address handling is meant for.
Now, just skimming the list of those drivers, it looks like the
vast majority are old 10/100 only chipsets. Probably a good chunk of
them cannot actually be used today due to a lack of hardware. A few are
funky things that nobody is likely to run bonding on (ibmveth, for
example). A few, though, are gigabit chipsets and appear to have
relatively recent work being done on them. I suppose it's possible that
there are users on 10/100 chipsets as well.
>Other way of doing this would be to just move the dev_addr to the slave
>struct, instead of using the net_device's dev_addr, cause in tlb we don't
>usually need to set the NIC mac address (except when there's mac filtering,
>I guess), but only need to set the packet's source mac. This way we'll omit
>quite costy mac address setting (as, on some NICs, it resets the whole NIC
>and takes seconds), still maintain compatibility with older NICs and won't
>mess with NICs ->dev_addr.
By this you mean to, basically, stash the MAC in the struct
slave somewhere instead of in dev->dev_addr, and then in bond_tlb_xmit
change the Ethernet destination? Without having actually tried it, I
can't think of a reason why this wouldn't work. That's really all
that's happening now, except the stash is dev_addr, and the Ethernet
destination is set for us by dev_hard_header. So, in principle, I don't
see any problems with this.
>Anyway, I'll let Jay decide.
Well, my general thought here is that, yes, this is icky, and
it's even been behind a bug not too long ago, when some device went down
then up and programmed the dev_addr into its hardware and ended up with
a duplicate MAC address on the network. So, I'm not automatically
against changing this to make it "better."
On the other hand, I do not feel that I have a good grasp as to
how many users are out there that rely on this property of balance-tlb
(that the device need be able to not change its MAC while open).
Therefore, I'm not in favor of a change that would break that support.
So, I'd say "No" to conversion to dev_set_mac_address() and
"show me the code" to the "stash it in the slave" plan.
-J
---
-Jay Vosburgh, jay.vosburgh@canonical.com
next prev parent reply other threads:[~2014-05-15 1:38 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-13 11:06 bonding directly changes underlying device address Or Gerlitz
2014-05-13 11:55 ` Jiri Pirko
2014-05-13 16:30 ` Jay Vosburgh
2014-05-13 17:38 ` Or Gerlitz
2014-05-14 8:01 ` Veaceslav Falico
2014-05-14 8:08 ` Or Gerlitz
2014-05-15 1:38 ` Jay Vosburgh [this message]
2014-05-15 4:55 ` Veaceslav Falico
2014-05-15 12:37 ` Or Gerlitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5046.1400117925@localhost.localdomain \
--to=jay.vosburgh@canonical.com \
--cc=eyalpe@mellanox.com \
--cc=jiri@resnulli.us \
--cc=netdev@vger.kernel.org \
--cc=noaos@mellanox.com \
--cc=ogerlitz@mellanox.com \
--cc=or.gerlitz@gmail.com \
--cc=vfalico@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).