* [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
@ 2023-04-28 7:36 Hangbin Liu
2023-04-28 16:06 ` Jay Vosburgh
0 siblings, 1 reply; 11+ messages in thread
From: Hangbin Liu @ 2023-04-28 7:36 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev
Hi Jay,
A user reported a bonding issue: if we put an active-backup bond on top of an
802.3ad bond interface and the 802.3ad bond's speed/duplex changes
dynamically, the upper bonding interface's speed/duplex is not updated at
the same time.

This does not seem easy to fix, since we update the speed/duplex only
when there is a failover (except in 802.3ad mode) or a slave netdev change.
But the lower bonding interface doesn't trigger a netdev change when its speed
changes, because ethtool gets the bonding speed via
bond_ethtool_get_link_ksettings(), which does not affect the bonding
interface itself.

Here is a reproducer:
```
#!/bin/bash
s_ns="s"
c_ns="c"
ip netns del ${c_ns} &> /dev/null
ip netns del ${s_ns} &> /dev/null
sleep 1
ip netns add ${c_ns}
ip netns add ${s_ns}
ip -n ${c_ns} link add bond0 type bond mode 802.3ad miimon 100
ip -n ${s_ns} link add bond0 type bond mode 802.3ad miimon 100
ip -n ${s_ns} link add bond1 type bond mode active-backup miimon 100
for i in $(seq 0 2); do
    ip -n ${c_ns} link add eth${i} type veth peer name eth${i} netns ${s_ns}
    [ $i -eq 2 ] && break
    ip -n ${c_ns} link set eth${i} master bond0
    ip -n ${s_ns} link set eth${i} master bond0
done
ip -n ${c_ns} link set eth2 up
ip -n ${c_ns} link set bond0 up
ip -n ${s_ns} link set bond0 master bond1
ip -n ${s_ns} link set bond1 up
sleep 5
ip netns exec ${s_ns} ethtool bond0 | grep Speed
ip netns exec ${s_ns} ethtool bond1 | grep Speed
```
When you run the reproducer directly, you will see:
# ./bond_topo_lacp.sh
Speed: 20000Mb/s
Speed: 10000Mb/s
So do you have any thoughts about how to fix it?
Thanks
Hangbin
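
The mismatch reported above can also be checked mechanically. The following is only a sketch: the helper names parse_speed and speeds_match are made up for illustration, and it assumes ethtool prints the speed in the "Speed: <N>Mb/s" form shown in the output above.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the reproducer above).

# Extract the numeric speed in Mb/s from `ethtool <dev>` output on stdin.
# Prints nothing when no "Speed: <N>Mb/s" line is present.
parse_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*\([0-9][0-9]*\)Mb\/s.*$/\1/p'
}

# Succeed only if both captured ethtool outputs report the same known speed.
# The arguments are the captured outputs, so no root access is needed here.
speeds_match() {
    local lower upper
    lower="$(printf '%s\n' "$1" | parse_speed)"
    upper="$(printf '%s\n' "$2" | parse_speed)"
    [ -n "${lower}" ] && [ "${lower}" = "${upper}" ]
}

# Usage with the reproducer's namespace (requires root):
#   lower_out="$(ip netns exec s ethtool bond0)"
#   upper_out="$(ip netns exec s ethtool bond1)"
#   speeds_match "${lower_out}" "${upper_out}" || echo "speed mismatch"
```

Passing the captured text instead of device names keeps the comparison separate from the privileged ethtool calls.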
* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-04-28 7:36 [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad Hangbin Liu
@ 2023-04-28 16:06 ` Jay Vosburgh
2023-05-08 9:26 ` Hangbin Liu
0 siblings, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2023-04-28 16:06 UTC (permalink / raw)
To: Hangbin Liu; +Cc: netdev

Hangbin Liu <liuhangbin@gmail.com> wrote:

>A user reported a bonding issue: if we put an active-backup bond on top of an
>802.3ad bond interface and the 802.3ad bond's speed/duplex changes
>dynamically, the upper bonding interface's speed/duplex is not updated at
>the same time.
>
>This does not seem easy to fix, since we update the speed/duplex only
>when there is a failover (except in 802.3ad mode) or a slave netdev change.
>But the lower bonding interface doesn't trigger a netdev change when its speed
>changes, because ethtool gets the bonding speed via
>bond_ethtool_get_link_ksettings(), which does not affect the bonding
>interface itself.

Well, this gets back into the intermittent discussion on whether
being able to nest bonds is useful, and thus whether it should be
allowed. It's at best a niche use case (I don't recall the example
configurations ever being anything other than 802.3ad under
active-backup), and was broken for a number of years without much
uproar.

In this particular case, nesting two LACP (802.3ad) bonds inside
an active-backup bond provides no functional benefit as far as I'm aware
(maybe gratuitous ARP?), as 802.3ad mode will correctly handle switching
between multiple aggregators. The "ad_select" option provides a few
choices on the criteria for choosing the active aggregator.

Is there a reason the user in your case doesn't use 802.3ad mode
directly?

>Here is a reproducer:
>
>```
>#!/bin/bash
>s_ns="s"
>c_ns="c"
>
>ip netns del ${c_ns} &> /dev/null
>ip netns del ${s_ns} &> /dev/null
>sleep 1
>ip netns add ${c_ns}
>ip netns add ${s_ns}
>
>ip -n ${c_ns} link add bond0 type bond mode 802.3ad miimon 100
>ip -n ${s_ns} link add bond0 type bond mode 802.3ad miimon 100
>ip -n ${s_ns} link add bond1 type bond mode active-backup miimon 100
>
>for i in $(seq 0 2); do
>    ip -n ${c_ns} link add eth${i} type veth peer name eth${i} netns ${s_ns}
>    [ $i -eq 2 ] && break
>    ip -n ${c_ns} link set eth${i} master bond0
>    ip -n ${s_ns} link set eth${i} master bond0
>done
>
>ip -n ${c_ns} link set eth2 up
>ip -n ${c_ns} link set bond0 up
>
>ip -n ${s_ns} link set bond0 master bond1
>ip -n ${s_ns} link set bond1 up
>
>sleep 5
>
>ip netns exec ${s_ns} ethtool bond0 | grep Speed
>ip netns exec ${s_ns} ethtool bond1 | grep Speed
>```
>
>When you run the reproducer directly, you will see:
># ./bond_topo_lacp.sh
>        Speed: 20000Mb/s
>        Speed: 10000Mb/s
>
>So do you have any thoughts about how to fix it?

Maybe it's time to disable nesting of bonds, update the
documentation to note that it's disabled and that 802.3ad mode is smart
enough to do multiple aggregators, and then see if anyone has some other
use case and complains.

In the past, I've been against doing this, but only because it
might break existing configurations. If nested configurations are going
to misbehave and require complicated shenanigans to fix, then perhaps
it's time to push users into a configuration that works without the
nesting.

The only thing I can think of that active-backup over 802.3ad
gets is the gratuitous ARP / NS on failover. If that's the key feature
for nesting, then I'd rather add the grat ARP to 802.3ad aggregator
selection and disable nesting.

-J

---
-Jay Vosburgh, jay.vosburgh@canonical.com

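
As an illustration of the non-nested alternative discussed above, here is a minimal sketch of a single 802.3ad bond that sets "ad_select" explicitly. The interface names, the choice of "bandwidth", and the run/setup_lacp_bond helpers are assumptions for illustration. By default the script only prints the commands; setting DRY_RUN=0 as root would execute them.

```shell
#!/bin/bash
# Sketch only: one 802.3ad bond instead of nested bonds.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_lacp_bond <bond> <port>...  (hypothetical helper)
setup_lacp_bond() {
    local bond="$1" port
    shift
    # ad_select accepts stable (the default), bandwidth, or count.
    run ip link add "${bond}" type bond mode 802.3ad miimon 100 ad_select bandwidth
    for port in "$@"; do
        # The bonding driver refuses to enslave an interface that is up.
        run ip link set "${port}" down
        run ip link set "${port}" master "${bond}"
    done
    run ip link set "${bond}" up
}

# Placeholder interface names.
setup_lacp_bond bond0 eth0 eth1 eth2
```

Whether "bandwidth" or "count" is the better policy depends on the topology; the value here is only an example.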
* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-04-28 16:06 ` Jay Vosburgh
@ 2023-05-08 9:26 ` Hangbin Liu
2023-05-08 18:32 ` Jay Vosburgh
0 siblings, 1 reply; 11+ messages in thread
From: Hangbin Liu @ 2023-05-08 9:26 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev

On Fri, Apr 28, 2023 at 09:06:40AM -0700, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> >A user reported a bonding issue: if we put an active-backup bond on top of an
> >802.3ad bond interface and the 802.3ad bond's speed/duplex changes
> >dynamically, the upper bonding interface's speed/duplex is not updated at
> >the same time.
> >
> >This does not seem easy to fix, since we update the speed/duplex only
> >when there is a failover (except in 802.3ad mode) or a slave netdev change.
> >But the lower bonding interface doesn't trigger a netdev change when its speed
> >changes, because ethtool gets the bonding speed via
> >bond_ethtool_get_link_ksettings(), which does not affect the bonding
> >interface itself.
>
> Well, this gets back into the intermittent discussion on whether
> being able to nest bonds is useful, and thus whether it should be
> allowed. It's at best a niche use case (I don't recall the example
> configurations ever being anything other than 802.3ad under
> active-backup), and was broken for a number of years without much
> uproar.
>
> In this particular case, nesting two LACP (802.3ad) bonds inside
> an active-backup bond provides no functional benefit as far as I'm aware
> (maybe gratuitous ARP?), as 802.3ad mode will correctly handle switching
> between multiple aggregators. The "ad_select" option provides a few
> choices on the criteria for choosing the active aggregator.
>
> Is there a reason the user in your case doesn't use 802.3ad mode
> directly?

Hi Jay,

I just got back from holiday and re-read your reply. The user doesn't add 2 LACP
bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
an active-backup bond. This seems reasonable, e.g. the LACP bond is connected
to one switch and the normal NIC to another switch.

What do you think?

Thanks
Hangbin

* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-08 9:26 ` Hangbin Liu
@ 2023-05-08 18:32 ` Jay Vosburgh
2023-05-09 3:16 ` Hangbin Liu
2023-05-10 7:50 ` Hangbin Liu
0 siblings, 2 replies; 11+ messages in thread
From: Jay Vosburgh @ 2023-05-08 18:32 UTC (permalink / raw)
To: Hangbin Liu; +Cc: netdev

Hangbin Liu <liuhangbin@gmail.com> wrote:

>On Fri, Apr 28, 2023 at 09:06:40AM -0700, Jay Vosburgh wrote:
>> Hangbin Liu <liuhangbin@gmail.com> wrote:
>>
>> >A user reported a bonding issue: if we put an active-backup bond on top of an
>> >802.3ad bond interface and the 802.3ad bond's speed/duplex changes
>> >dynamically, the upper bonding interface's speed/duplex is not updated at
>> >the same time.
>> >
>> >This does not seem easy to fix, since we update the speed/duplex only
>> >when there is a failover (except in 802.3ad mode) or a slave netdev change.
>> >But the lower bonding interface doesn't trigger a netdev change when its speed
>> >changes, because ethtool gets the bonding speed via
>> >bond_ethtool_get_link_ksettings(), which does not affect the bonding
>> >interface itself.
>>
>> Well, this gets back into the intermittent discussion on whether
>> being able to nest bonds is useful, and thus whether it should be
>> allowed. It's at best a niche use case (I don't recall the example
>> configurations ever being anything other than 802.3ad under
>> active-backup), and was broken for a number of years without much
>> uproar.
>>
>> In this particular case, nesting two LACP (802.3ad) bonds inside
>> an active-backup bond provides no functional benefit as far as I'm aware
>> (maybe gratuitous ARP?), as 802.3ad mode will correctly handle switching
>> between multiple aggregators. The "ad_select" option provides a few
>> choices on the criteria for choosing the active aggregator.
>>
>> Is there a reason the user in your case doesn't use 802.3ad mode
>> directly?
>
>Hi Jay,
>
>I just got back from holiday and re-read your reply. The user doesn't add 2 LACP
>bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
>an active-backup bond. This seems reasonable, e.g. the LACP bond is connected
>to one switch and the normal NIC to another switch.
>
>What do you think?

That case should work fine without the active-backup. LACP has
a concept of an "individual" port, which (in this context) would be the
"normal NIC," presuming that that means its link peer isn't running
LACP.

If all of the ports (N that are LACP to a single switch, plus 1
that's the non-LACP "normal NIC") were attached to a single bond, it
would create one aggregator with the LACP enabled ports, and then a
separate aggregator for the individual port that's not. The aggregator
selection logic prefers the LACP enabled aggregator over the individual
port aggregator. The precise criteria are in the commentary within
ad_agg_selection_test().

-J

---
-Jay Vosburgh, jay.vosburgh@canonical.com

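
The aggregator split described above can be observed from userspace. The sketch below parses the text of /proc/net/bonding/<bond> and prints which aggregator each port ended up in; the helper names list_aggregators and active_aggregator are hypothetical, and the parsing assumes the usual "Slave Interface:" and "Aggregator ID:" lines of an 802.3ad bond.

```shell
#!/bin/bash
# Hypothetical helpers; both read /proc/net/bonding/<bond> text on stdin.

# Print "<slave> <aggregator id>" for every port of the bond.
list_aggregators() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        # Per-port "Aggregator ID" lines follow a "Slave Interface" line;
        # the one in the "Active Aggregator Info" block precedes all slaves.
        /Aggregator ID:/ && slave != "" { print slave, $NF }
    '
}

# Print the ID of the currently active aggregator.
active_aggregator() {
    awk '
        /^Active Aggregator Info:/ { seen = 1; next }
        seen && /Aggregator ID:/   { print $NF; exit }
    '
}

# Usage on a live system:
#   list_aggregators < /proc/net/bonding/bond0
#   active_aggregator < /proc/net/bonding/bond0
```

With N LACP ports plus one individual port in the same bond, the LACP ports would be expected to share one aggregator ID and the individual port to show a different one.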
* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-08 18:32 ` Jay Vosburgh
@ 2023-05-09 3:16 ` Hangbin Liu
2023-05-10 7:50 ` Hangbin Liu
1 sibling, 0 replies; 11+ messages in thread
From: Hangbin Liu @ 2023-05-09 3:16 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev

On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> >Hi Jay,
> >
> >I just got back from holiday and re-read your reply. The user doesn't add 2 LACP
> >bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
> >an active-backup bond. This seems reasonable, e.g. the LACP bond is connected
> >to one switch and the normal NIC to another switch.
> >
> >What do you think?
>
> That case should work fine without the active-backup. LACP has
> a concept of an "individual" port, which (in this context) would be the
> "normal NIC," presuming that that means its link peer isn't running
> LACP.
>
> If all of the ports (N that are LACP to a single switch, plus 1
> that's the non-LACP "normal NIC") were attached to a single bond, it
> would create one aggregator with the LACP enabled ports, and then a
> separate aggregator for the individual port that's not. The aggregator
> selection logic prefers the LACP enabled aggregator over the individual
> port aggregator. The precise criteria are in the commentary within
> ad_agg_selection_test().
>

Thanks for your explanation. I didn't know this before. Now I have learned.

Regards
Hangbin

* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-08 18:32 ` Jay Vosburgh
2023-05-09 3:16 ` Hangbin Liu
@ 2023-05-10 7:50 ` Hangbin Liu
2023-05-10 16:57 ` Andrew J. Schorr
1 sibling, 1 reply; 11+ messages in thread
From: Hangbin Liu @ 2023-05-10 7:50 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev, Andrew Schorr

On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> >Hi Jay,
> >
> >I just got back from holiday and re-read your reply. The user doesn't add 2 LACP
> >bonds inside an active-backup bond. He adds 1 LACP bond and 1 normal NIC into
> >an active-backup bond. This seems reasonable, e.g. the LACP bond is connected
> >to one switch and the normal NIC to another switch.
> >
> >What do you think?
>
> That case should work fine without the active-backup. LACP has
> a concept of an "individual" port, which (in this context) would be the
> "normal NIC," presuming that that means its link peer isn't running
> LACP.
>
> If all of the ports (N that are LACP to a single switch, plus 1
> that's the non-LACP "normal NIC") were attached to a single bond, it
> would create one aggregator with the LACP enabled ports, and then a
> separate aggregator for the individual port that's not. The aggregator
> selection logic prefers the LACP enabled aggregator over the individual
> port aggregator. The precise criteria are in the commentary within
> ad_agg_selection_test().
>

cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
use arp_ip_target to ensure that the target network is reachable...

Hangbin

* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-10 7:50 ` Hangbin Liu
@ 2023-05-10 16:57 ` Andrew J. Schorr
2023-05-10 17:14 ` Andrew J. Schorr
0 siblings, 1 reply; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-10 16:57 UTC (permalink / raw)
To: Hangbin Liu; +Cc: Jay Vosburgh, netdev

Hi Hangbin & Jay,

On Wed, May 10, 2023 at 03:50:34PM +0800, Hangbin Liu wrote:
> On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> > That case should work fine without the active-backup. LACP has
> > a concept of an "individual" port, which (in this context) would be the
> > "normal NIC," presuming that that means its link peer isn't running
> > LACP.
> >
> > If all of the ports (N that are LACP to a single switch, plus 1
> > that's the non-LACP "normal NIC") were attached to a single bond, it
> > would create one aggregator with the LACP enabled ports, and then a
> > separate aggregator for the individual port that's not. The aggregator
> > selection logic prefers the LACP enabled aggregator over the individual
> > port aggregator. The precise criteria are in the commentary within
> > ad_agg_selection_test().
> >
>
> cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
> use arp_ip_target to ensure that the target network is reachable...

That's correct. I prefer the ARP monitoring to ensure that the needed
connectivity is actually there instead of relying on MII monitoring.

I also confess that I was unaware of the possibility of using an individual
port inside an 802.3ad bond without having to stick that individual port into a
port-channel group with LACP enabled. I want to avoid enabling LACP on that
link because I'd like to be able to PXE boot over it, not to mention the switch
configuration hassle. Is that individual port configuration without LACP
detected automatically by the kernel, or do I need to configure something to do
that? I see the logic in drivers/net/bonding/bond_3ad.c to set is_individual,
but it appears to depend on whether duplex is enabled. At that point, I got
lost, since I see duplex mentioned only in ad_user_port_key, and that seems to
be a property of the bond master, not the slaves. Is there any documentation of
how this configuration works?

But in any case, I still prefer active-backup on top of 802.3ad so that I can
have the ARP monitoring.

If it's too much trouble to get the top-level bond to report duplex/speed
correctly when the underlying bond speed changes, then I think it would
be an improvement to set duplex/speed to N/A (or -1) for a bond of
bonds configuration instead of potentially having incorrect information.
I imagine such a fix might be much easier than updating dynamically
when the lower-level 802.3ad bond changes speed.

Best regards,
Andy

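
For reference, the nested setup with ARP monitoring described above might look roughly like the sketch below. Everything specific in it is an assumption: the interface names, the 1000 ms arp_interval, the helper names, and the arp_ip_target, which is a documentation-range placeholder address. By default the script only prints the commands; setting DRY_RUN=0 as root would execute them.

```shell
#!/bin/bash
# Sketch only: active-backup bond with ARP monitoring on top of an
# existing 802.3ad bond plus one standalone NIC.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_arp_monitored_bond <upper> <lacp-bond> <backup-nic> <target-ip>
setup_arp_monitored_bond() {
    local upper="$1" lower="$2" backup="$3" target="$4"
    # arp_interval is in milliseconds; miimon is left unset because the
    # upper bond relies on ARP monitoring instead of MII monitoring.
    run ip link add "${upper}" type bond mode active-backup \
        arp_interval 1000 arp_ip_target "${target}"
    # Interfaces have to be down before they can be enslaved.
    run ip link set "${lower}" down
    run ip link set "${backup}" down
    run ip link set "${lower}" master "${upper}"
    run ip link set "${backup}" master "${upper}"
    run ip link set "${upper}" up
}

# Placeholder names and address.
setup_arp_monitored_bond bond1 bond0 eth2 192.0.2.1
```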
* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-10 16:57 ` Andrew J. Schorr
@ 2023-05-10 17:14 ` Andrew J. Schorr
2023-05-12 1:38 ` Jay Vosburgh
0 siblings, 1 reply; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-10 17:14 UTC (permalink / raw)
To: Hangbin Liu; +Cc: Jay Vosburgh, netdev

Sorry -- resending from a different email address to fix a problem
with gmail rejecting it.

On Wed, May 10, 2023 at 12:57:38PM -0400, Andrew J. Schorr wrote:
> Hi Hangbin & Jay,
>
> On Wed, May 10, 2023 at 03:50:34PM +0800, Hangbin Liu wrote:
> > On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
> > > That case should work fine without the active-backup. LACP has
> > > a concept of an "individual" port, which (in this context) would be the
> > > "normal NIC," presuming that that means its link peer isn't running
> > > LACP.
> > >
> > > If all of the ports (N that are LACP to a single switch, plus 1
> > > that's the non-LACP "normal NIC") were attached to a single bond, it
> > > would create one aggregator with the LACP enabled ports, and then a
> > > separate aggregator for the individual port that's not. The aggregator
> > > selection logic prefers the LACP enabled aggregator over the individual
> > > port aggregator. The precise criteria are in the commentary within
> > > ad_agg_selection_test().
> > >
> >
> > cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
> > use arp_ip_target to ensure that the target network is reachable...
>
> That's correct. I prefer the ARP monitoring to ensure that the needed
> connectivity is actually there instead of relying on MII monitoring.
>
> I also confess that I was unaware of the possibility of using an individual
> port inside an 802.3ad bond without having to stick that individual port into a
> port-channel group with LACP enabled. I want to avoid enabling LACP on that
> link because I'd like to be able to PXE boot over it, not to mention the switch
> configuration hassle. Is that individual port configuration without LACP
> detected automatically by the kernel, or do I need to configure something to do
> that? I see the logic in drivers/net/bonding/bond_3ad.c to set is_individual,
> but it appears to depend on whether duplex is enabled. At that point, I got
> lost, since I see duplex mentioned only in ad_user_port_key, and that seems to
> be a property of the bond master, not the slaves. Is there any documentation of
> how this configuration works?
>
> But in any case, I still prefer active-backup on top of 802.3ad so that I can
> have the ARP monitoring.
>
> If it's too much trouble to get the top-level bond to report duplex/speed
> correctly when the underlying bond speed changes, then I think it would
> be an improvement to set duplex/speed to N/A (or -1) for a bond of
> bonds configuration instead of potentially having incorrect information.
> I imagine such a fix might be much easier than updating dynamically
> when the lower-level 802.3ad bond changes speed.
>
> Best regards,
> Andy

* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-10 17:14 ` Andrew J. Schorr
@ 2023-05-12 1:38 ` Jay Vosburgh
2023-05-12 14:44 ` Andrew J. Schorr
0 siblings, 1 reply; 11+ messages in thread
From: Jay Vosburgh @ 2023-05-12 1:38 UTC (permalink / raw)
To: Andrew J. Schorr; +Cc: Hangbin Liu, netdev

Andrew J. Schorr <aschorr@telemetry-investments.com> wrote:

>Sorry -- resending from a different email address to fix a problem
>with gmail rejecting it.
>
>On Wed, May 10, 2023 at 12:57:38PM -0400, Andrew J. Schorr wrote:
>> Hi Hangbin & Jay,
>>
>> On Wed, May 10, 2023 at 03:50:34PM +0800, Hangbin Liu wrote:
>> > On Mon, May 08, 2023 at 11:32:16AM -0700, Jay Vosburgh wrote:
>> > > That case should work fine without the active-backup. LACP has
>> > > a concept of an "individual" port, which (in this context) would be the
>> > > "normal NIC," presuming that that means its link peer isn't running
>> > > LACP.
>> > >
>> > > If all of the ports (N that are LACP to a single switch, plus 1
>> > > that's the non-LACP "normal NIC") were attached to a single bond, it
>> > > would create one aggregator with the LACP enabled ports, and then a
>> > > separate aggregator for the individual port that's not. The aggregator
>> > > selection logic prefers the LACP enabled aggregator over the individual
>> > > port aggregator. The precise criteria are in the commentary within
>> > > ad_agg_selection_test().
>> > >
>> >
>> > cc Andrew. He adds an active-backup bond over the LACP bond because he wants to
>> > use arp_ip_target to ensure that the target network is reachable...
>>
>> That's correct. I prefer the ARP monitoring to ensure that the needed
>> connectivity is actually there instead of relying on MII monitoring.
>>
>> I also confess that I was unaware of the possibility of using an individual
>> port inside an 802.3ad bond without having to stick that individual port into a
>> port-channel group with LACP enabled. I want to avoid enabling LACP on that
>> link because I'd like to be able to PXE boot over it, not to mention the switch
>> configuration hassle. Is that individual port configuration without LACP
>> detected automatically by the kernel, or do I need to configure something to do
>> that? I see the logic in drivers/net/bonding/bond_3ad.c to set is_individual,
>> but it appears to depend on whether duplex is enabled. At that point, I got
>> lost, since I see duplex mentioned only in ad_user_port_key, and that seems to
>> be a property of the bond master, not the slaves. Is there any documentation of
>> how this configuration works?

The individual port behavior is part of the LACP standard (IEEE
802.1AX, recent editions call this "Solitary"), and is done
automatically by the kernel. One of the reasons for it is to permit
exactly the situation you mention: to enable PXE or "fallback"
communication to work even if LACP negotiation fails or is not
configured or implemented at one end. This is called out explicitly in
802.1AX, 6.1.1.j.

The duplex test is only part of the "individual" logic; it comes
up because LACP negotiation requires the peers to be point-to-point
links, i.e., full duplex (IEEE 802.1AX-2014, 6.4.8). That's the norm
for most everything now, but historically a port in half duplex could be
on a multiple access topology, e.g., 802.3 CSMA/CD 10BASE2 on a coax
cable, which is incompatible with LACP aggregation. This situation
doesn't come up a lot these days.

The important part of the "individual" logic is whether or not
the port successfully completes LACP negotiation with a link partner.
If not, the port is an individual port, which acts essentially like an
aggregator with just one port in it. This is separate from
"is_individual" in the bonding code, and happens in
ad_port_selection_logic(), after the comment "check if current
aggregator suits us". "is_individual" is one element of this test; the
remaining tests compare the various keys and whether the partner MAC
address has been populated.

As far as documentation goes, the bonding docs[0] describe some
of the parameters, but don't describe the specifics of bonding's
ability to manage multiple aggregators; I should write that up, since
this comes up periodically. The IEEE standard (to which the bonding
implementation conforms) describes how the whole system works, but
doesn't really have a simple overview.

[0] https://www.kernel.org/doc/Documentation/networking/bonding.rst

>> But in any case, I still prefer active-backup on top of 802.3ad so that I can
>> have the ARP monitoring.
>>
>> If it's too much trouble to get the top-level bond to report duplex/speed
>> correctly when the underlying bond speed changes, then I think it would
>> be an improvement to set duplex/speed to N/A (or -1) for a bond of
>> bonds configuration instead of potentially having incorrect information.
>> I imagine such a fix might be much easier than updating dynamically
>> when the lower-level 802.3ad bond changes speed.

I'll have to give this some thought. The best long term
solution would be to decouple the link monitoring stuff from the mode,
and thus allow ARP and MII in a wider variety of modes. I've prototyped
that out in the past, along with changing the MII monitor to respond to
carrier state changes in real time instead of polling, and it's fairly
complicated.

In any event, this does sound like a valid use case for nesting
the bonds, so simply disabling that facility seems to be off the table.

-J

---
-Jay Vosburgh, jay.vosburgh@canonical.com

* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-12 1:38 ` Jay Vosburgh
@ 2023-05-12 14:44 ` Andrew J. Schorr
2023-05-16 15:11 ` Andrew J. Schorr
0 siblings, 1 reply; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-12 14:44 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: Hangbin Liu, netdev

Hi Jay,

On Thu, May 11, 2023 at 06:38:48PM -0700, Jay Vosburgh wrote:
> The individual port behavior is part of the LACP standard (IEEE
> 802.1AX, recent editions call this "Solitary"), and is done
> automatically by the kernel. One of the reasons for it is to permit
> exactly the situation you mention: to enable PXE or "fallback"
> communication to work even if LACP negotiation fails or is not
> configured or implemented at one end. This is called out explicitly in
> 802.1AX, 6.1.1.j.
>
> The duplex test is only part of the "individual" logic; it comes
> up because LACP negotiation requires the peers to be point-to-point
> links, i.e., full duplex (IEEE 802.1AX-2014, 6.4.8). That's the norm
> for most everything now, but historically a port in half duplex could be
> on a multiple access topology, e.g., 802.3 CSMA/CD 10BASE2 on a coax
> cable, which is incompatible with LACP aggregation. This situation
> doesn't come up a lot these days.
>
> The important part of the "individual" logic is whether or not
> the port successfully completes LACP negotiation with a link partner.
> If not, the port is an individual port, which acts essentially like an
> aggregator with just one port in it. This is separate from
> "is_individual" in the bonding code, and happens in
> ad_port_selection_logic(), after the comment "check if current
> aggregator suits us". "is_individual" is one element of this test; the
> remaining tests compare the various keys and whether the partner MAC
> address has been populated.

OK. So it sounds like this should just work automatically with no
configuration required to identify which slaves are running in individual
mode. Thanks for clarifying.

> As far as documentation goes, the bonding docs[0] describe some
> of the parameters, but don't describe the specifics of bonding's
> ability to manage multiple aggregators; I should write that up, since
> this comes up periodically. The IEEE standard (to which the bonding
> implementation conforms) describes how the whole system works, but
> doesn't really have a simple overview.
>
> [0] https://www.kernel.org/doc/Documentation/networking/bonding.rst

I noticed the parameters related to this and did do some Google searching to
learn about having multiple aggregators, but as you say, it would be helpful
to have a few more clues about how this works in the Bonding Howto, as well
as a mention of this individual port capability.

> I'll have to give this some thought. The best long term
> solution would be to decouple the link monitoring stuff from the mode,
> and thus allow ARP and MII in a wider variety of modes. I've prototyped
> that out in the past, along with changing the MII monitor to respond to
> carrier state changes in real time instead of polling, and it's fairly
> complicated.
>
> In any event, this does sound like a valid use case for nesting
> the bonds, so simply disabling that facility seems to be off the table.

OK, great. Then I'll stick with this config for now, even though
NetworkManager has some brain damage in this area, since it tries to bring up
both bonds before the MAC addresses have gotten sorted out, which can leave
everything with a random MAC address. I've managed to kludge a solution to
this by setting ONBOOT=no for the active-backup bond, which convinces
NetworkManager to start it a bit later and somehow fixes the race condition.

Regards,
Andy

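
For readers unfamiliar with the ONBOOT=no workaround mentioned above, a hypothetical ifcfg-style fragment for the upper bond might look like the sketch below. Only ONBOOT=no comes from the message; the file name, device name, bonding options, and target address are illustrative assumptions, and the exact keys honoured depend on the distribution and NetworkManager version.

```shell
# Hypothetical /etc/sysconfig/network-scripts/ifcfg-bond1 (ifcfg-rh style).
# Names and option values are placeholders, not the poster's actual config.
DEVICE=bond1
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=192.0.2.1"
# The workaround from the message above: keep the active-backup bond out
# of the initial boot-time activation.
ONBOOT=no
```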
* Re: [Issue] Bonding can't show correct speed if lower interface is bond 802.3ad
2023-05-12 14:44 ` Andrew J. Schorr
@ 2023-05-16 15:11 ` Andrew J. Schorr
0 siblings, 0 replies; 11+ messages in thread
From: Andrew J. Schorr @ 2023-05-16 15:11 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: Hangbin Liu, netdev

Hi,

On Fri, May 12, 2023 at 10:44:01AM -0400, Andrew J. Schorr wrote:
> OK. So it sounds like this should just work automatically with no
> configuration required to identify which slaves are running in individual
> mode. Thanks for clarifying.

Just to follow up on this -- for test purposes, I booted the system with
the 802.3ad bond containing both the 20 Gbps port-channel and the individual
1 Gbps port on the other switch, and it worked as expected. The only drawback
to this configuration is the lack of ARP monitoring, so I will stick with the
active-backup bond on top of the 802.3ad bond.

Regards,
Andy