From: Hangbin Liu <liuhangbin@gmail.com>
To: Jay Vosburgh <jv@jvosburgh.net>
Cc: netdev@vger.kernel.org
Subject: Re: [Question]: should we consider arp missed max during bond_ab_arp_probe()?
Date: Fri, 8 Nov 2024 02:51:52 +0000
Message-ID: <Zy18yA6kNmlCl6eQ@fedora>
In-Reply-To: <316685.1731029549@famine>

On Thu, Nov 07, 2024 at 05:32:29PM -0800, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> >Hi Jay,
> >
> >Our QE reported that, when there is no active slave during
> >bond_ab_arp_probe(), the slaves send the arp probe message one by one. This
> >will flap the switch's mac table quickly, sometimes even make the switch stop
> >learning mac address. So should we consider the arp missed max during
> >bond_ab_arp_probe()? i.e. each slave has more chances to send probe messages
> >before switch to another slave. What do you think?
>
> Well, "quickly" here depends entirely on what the value of
> arp_interval is. It's been quite a while since I looked into the
> details of this particular behavior, but at the time I didn't see the
> switches I had issue flap warnings. If memory serves, I usually tested
> with arp_interval in the realm of 100ms, with anywhere from 2 to 6
> interfaces in the bond.
>
> What settings are you using for the bond, and what model of
> switch exhibits the behavior you describe?

In our network, we have a Cisco 9364 switch, which by default disables MAC
learning for 120 seconds if it sees 6 MAC moves within 30 seconds [1].

>
> That said, the intent of the current implementation is to cycle
> through the interfaces in the bond relatively quickly when no interfaces
> are up, under the theory that such behavior finds an available interface
> in the minimum time.
>
> I'm not necessarily opposed to having each probe "step," so to
> speak, perform multiple ARP probe checks. However, I wonder if this is
> a complicated workaround for not wanting to change a configuration
> setting on a switch, and it would only make things better by chance
> (i.e., that the probes just happen to now take long enough to not run
> afoul of the switch's time limit for some flap parameter).

For Cisco Nexus 9300-X switches, the `mac-move policy` has been supported
since Cisco NX-OS Release 10.3(1)F, which was released on August 19, 2022.
So there is an option to disable or modify the MAC move policy. But switches
that can't be updated to this version will be affected, unless the user
changes arp_interval to a large number.
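
As a rough sketch of how large that number would need to be: the arithmetic
below assumes that every probe step makes the switch register one MAC move
(the bond MAC shows up on a different port each arp_interval), and that the
switch trips as soon as it counts 6 moves inside a 30 second window. Both
are assumptions on my side, and the helper names are made up for
illustration; they are not part of the bonding driver.

```python
def mac_moves_in_window(arp_interval_ms: int, window_s: int) -> int:
    """Upper bound on MAC moves seen in window_s seconds.

    Assumption: one MAC move per arp_interval while no slave is active.
    """
    return (window_s * 1000) // arp_interval_ms


def min_arp_interval_ms(max_moves: int, window_s: int) -> int:
    """Smallest whole-millisecond arp_interval that keeps the number of
    moves per window strictly below max_moves.

    We need floor(window_ms / interval) < max_moves, which holds exactly
    when interval > window_ms / max_moves.
    """
    return (window_s * 1000) // max_moves + 1


if __name__ == "__main__":
    # 100ms is the interval Jay mentioned testing with.
    print(mac_moves_in_window(100, 30))   # 300
    # Threshold from the switch default quoted above: 6 moves in 30 seconds.
    print(min_arp_interval_ms(6, 30))     # 5001
```

Under these assumptions arp_interval would have to be a little over 5
seconds, compared with roughly 300 moves per window at 100ms. The switch may
count moves differently, so this is only an estimate.
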

As there is a workaround (changing either the switch configuration or
arp_interval), I don't have a strong intention to change the bonding
behavior. I will do it or drop it based on your decision.

[1] https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/104x/config-guides/cisco-nexus-9000-series-nx-os-system-management-configuration-guide-release-104x/m-configuring-mac-move.html

Thanks
Hangbin