[PATCHv3 net 0/3] bonding: fix 802.3ad churn machine and port state issues

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCHv3 net 0/3] bonding: fix 802.3ad churn machine and port state issues
@ 2026-02-26 12:53 Hangbin Liu
  2026-02-26 12:53 ` [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port Hangbin Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Hangbin Liu @ 2026-02-26 12:53 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Hangbin Liu

This series fixes two issues in the bonding 802.3ad implementation
related to port state management and churn detection:

1. When disabling a port, we need to set AD_RX_PORT_DISABLED to ensure
   proper state machine transitions, preventing ports from getting stuck
   in AD_RX_CURRENT state.

2. The ad_churn_machine implementation is restructured to follow IEEE
   802.1AX-2014 specifications correctly. The current implementation has
   several issues: it doesn't transition to "none" state immediately when
   synchronization is achieved, and can get stuck in churned state in
   multi-aggregator scenarios.

3. Selftests are enhanced to validate both mux state machine and churn
   state logic under aggregator selection and failover scenarios.

These changes ensure proper LACP state machine behavior and fix issues
where ports could remain in incorrect states during aggregator failover.

v3: re-post to net, no code change (Jakub)
v2:
  * set AD_RX_PORT_DISABLED only in ad_agg_selection_logic to avoid side effect. (Paolo Abeni)
  * remove actor_churn as it can only be true when the state is ACTOR_CHURN (Paolo Abeni)
  * remove AD_PORT_CHURNED since we don't need it anywhere (Paolo Abeni)
  * I didn't add new helper for ad_churn_machine() as it looks not help much.
  * https://lore.kernel.org/netdev/20260114064921.57686-1-liuhangbin@gmail.com
v1: https://lore.kernel.org/netdev/20251124043310.34073-1-liuhangbin@gmail.com

Hangbin Liu (3):
  bonding: set AD_RX_PORT_DISABLED when disabling a port
  bonding: restructure ad_churn_machine
  selftests: bonding: add mux and churn state testing

 drivers/net/bonding/bond_3ad.c                | 97 ++++++++++++++-----
 .../selftests/drivers/net/bonding/Makefile    |  2 +-
 ...nd_lacp_prio.sh => bond_lacp_ad_select.sh} | 73 ++++++++++++++
 3 files changed, 146 insertions(+), 26 deletions(-)
 rename tools/testing/selftests/drivers/net/bonding/{bond_lacp_prio.sh => bond_lacp_ad_select.sh} (64%)

-- 
2.50.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-26 12:53 [PATCHv3 net 0/3] bonding: fix 802.3ad churn machine and port state issues Hangbin Liu
@ 2026-02-26 12:53 ` Hangbin Liu
  2026-02-27  1:16   ` Jay Vosburgh
  2026-02-26 12:53 ` [PATCHv3 net 2/3] bonding: restructure ad_churn_machine Hangbin Liu
  2026-02-26 12:53 ` [PATCHv3 net 3/3] selftests: bonding: add mux and churn state testing Hangbin Liu
  2 siblings, 1 reply; 14+ messages in thread
From: Hangbin Liu @ 2026-02-26 12:53 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Hangbin Liu

When disabling a port’s collecting and distributing states, updating only
rx_disabled is not sufficient. We also need to set AD_RX_PORT_DISABLED
so that the rx_machine transitions into the AD_RX_EXPIRED state.

One example is in ad_agg_selection_logic(): when a new aggregator is
selected and old active aggregator is disabled, if AD_RX_PORT_DISABLED is
not set, the disabled port may remain stuck in AD_RX_CURRENT due to
continuing to receive partner LACP messages.

The __disable_port() called by ad_disable_collecting_distributing()
does not have this issue, since its caller also clears the
collecting/distributing bits.

The __disable_port() called by bond_3ad_bind_slave() should also be fine,
as the RX state machine is re-initialized to AD_RX_INITIALIZE.

Let's fix this only in ad_agg_selection_logic() to reduce the chances of
unintended side effects.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 drivers/net/bonding/bond_3ad.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index af7f74cfdc08..c47f6a69fd2a 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -1932,6 +1932,7 @@ static void ad_agg_selection_logic(struct aggregator *agg,
 		if (active) {
 			for (port = active->lag_ports; port;
 			     port = port->next_port_in_aggregator) {
+				port->sm_rx_state = AD_RX_PORT_DISABLED;
 				__disable_port(port);
 			}
 		}
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 net 2/3] bonding: restructure ad_churn_machine
  2026-02-26 12:53 [PATCHv3 net 0/3] bonding: fix 802.3ad churn machine and port state issues Hangbin Liu
  2026-02-26 12:53 ` [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port Hangbin Liu
@ 2026-02-26 12:53 ` Hangbin Liu
  2026-02-27  0:36   ` Jay Vosburgh
  2026-02-26 12:53 ` [PATCHv3 net 3/3] selftests: bonding: add mux and churn state testing Hangbin Liu
  2 siblings, 1 reply; 14+ messages in thread
From: Hangbin Liu @ 2026-02-26 12:53 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Hangbin Liu,
	Liang Li

The current ad_churn_machine implementation only transitions the
actor/partner churn state to churned or none after the churn timer expires.
However, IEEE 802.1AX-2014 specifies that a port should enter the none
state immediately once the actor’s port state enters synchronization.

Another issue is that if the churn timer expires while the churn machine is
not in the monitor state (e.g. already in churn), the state may remain
stuck indefinitely with no further transitions. This becomes visible in
multi-aggregator scenarios. For example:

Ports 1 and 2 are in aggregator 1 (active)
Ports 3 and 4 are in aggregator 2 (backup)

Ports 1 and 2 should be in none
Ports 3 and 4 should be in churned

If a failover occurs due to port 2 link down/up, aggregator 2 becomes active.
Under the current implementation, the resulting states may look like:

agg 1 (backup): port 1 -> none, port 2 -> churned
agg 2 (active): ports 3,4 keep in churned.

The root cause is that ad_churn_machine() only clears the
AD_PORT_CHURNED flag and starts a timer. When a churned port becomes active,
its RX state becomes AD_RX_CURRENT, preventing the churn flag from being set
again, leaving no way to retrigger the timer. Fixing this solely in
ad_rx_machine() is insufficient.

This patch rewrites ad_churn_machine according to IEEE 802.1AX-2014
(Figures 6-23 and 6-24), ensuring correct churn detection, state transitions,
and timer behavior. With new implementation, there is no need to set
AD_PORT_CHURNED in ad_rx_machine().

Fixes: 14c9551a32eb ("bonding: Implement port churn-machine (AD standard 43.4.17).")
Reported-by: Liang Li <liali@redhat.com>
Tested-by: Liang Li <liali@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 drivers/net/bonding/bond_3ad.c | 96 +++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 25 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c47f6a69fd2a..68258d61fd1c 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -44,7 +44,6 @@
 #define AD_PORT_STANDBY         0x80
 #define AD_PORT_SELECTED        0x100
 #define AD_PORT_MOVED           0x200
-#define AD_PORT_CHURNED         (AD_PORT_ACTOR_CHURN | AD_PORT_PARTNER_CHURN)
 
 /* Port Key definitions
  * key is determined according to the link speed, duplex and
@@ -1254,7 +1253,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 	/* first, check if port was reinitialized */
 	if (port->sm_vars & AD_PORT_BEGIN) {
 		port->sm_rx_state = AD_RX_INITIALIZE;
-		port->sm_vars |= AD_PORT_CHURNED;
 	/* check if port is not enabled */
 	} else if (!(port->sm_vars & AD_PORT_BEGIN) && !port->is_enabled)
 		port->sm_rx_state = AD_RX_PORT_DISABLED;
@@ -1262,8 +1260,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 	else if (lacpdu && ((port->sm_rx_state == AD_RX_EXPIRED) ||
 		 (port->sm_rx_state == AD_RX_DEFAULTED) ||
 		 (port->sm_rx_state == AD_RX_CURRENT))) {
-		if (port->sm_rx_state != AD_RX_CURRENT)
-			port->sm_vars |= AD_PORT_CHURNED;
 		port->sm_rx_timer_counter = 0;
 		port->sm_rx_state = AD_RX_CURRENT;
 	} else {
@@ -1347,7 +1343,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 			port->partner_oper.port_state |= LACP_STATE_LACP_TIMEOUT;
 			port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(AD_SHORT_TIMEOUT));
 			port->actor_oper_port_state |= LACP_STATE_EXPIRED;
-			port->sm_vars |= AD_PORT_CHURNED;
 			break;
 		case AD_RX_DEFAULTED:
 			__update_default_selected(port);
@@ -1379,11 +1374,41 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
  * ad_churn_machine - handle port churn's state machine
  * @port: the port we're looking at
  *
+ * IEEE 802.1AX-2014 Figure 6-23 - Actor Churn Detection machine state diagram
+ *
+ *                                                     BEGIN || (! port_enabled)
+ *                                                               |
+ *                                      (3)                (1)   v
+ *   +----------------------+     ActorPort.Sync     +-------------------------+
+ *   |    NO_ACTOR_CHURN    | <--------------------- |   ACTOR_CHURN_MONITOR   |
+ *   |======================|                        |=========================|
+ *   | actor_churn = FALSE; |    ! ActorPort.Sync    | actor_churn = FALSE;    |
+ *   |                      | ---------------------> | Start actor_churn_timer |
+ *   +----------------------+           (4)          +-------------------------+
+ *             ^                                                 |
+ *             |                                                 |
+ *             |                                      actor_churn_timer expired
+ *             |                                                 |
+ *       ActorPort.Sync                                          |  (2)
+ *             |              +--------------------+             |
+ *        (3)  |              |   ACTOR_CHURN      |             |
+ *             |              |====================|             |
+ *             +------------- | actor_churn = True | <-----------+
+ *                            |                    |
+ *                            +--------------------+
+ *
+ * Similar for the Figure 6-24 - Partner Churn Detection machine state diagram
+ *
+ * We don’t need to check actor_churn, because it can only be true when the
+ * state is ACTOR_CHURN.
  */
 static void ad_churn_machine(struct port *port)
 {
-	if (port->sm_vars & AD_PORT_CHURNED) {
-		port->sm_vars &= ~AD_PORT_CHURNED;
+	bool partner_synced = port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION;
+	bool actor_synced = port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION;
+
+	/* ---- 1. begin or port not enabled ---- */
+	if ((port->sm_vars & AD_PORT_BEGIN) || !port->is_enabled) {
 		port->sm_churn_actor_state = AD_CHURN_MONITOR;
 		port->sm_churn_partner_state = AD_CHURN_MONITOR;
 		port->sm_churn_actor_timer_counter =
@@ -1392,25 +1417,46 @@ static void ad_churn_machine(struct port *port)
 			 __ad_timer_to_ticks(AD_PARTNER_CHURN_TIMER, 0);
 		return;
 	}
-	if (port->sm_churn_actor_timer_counter &&
-	    !(--port->sm_churn_actor_timer_counter) &&
-	    port->sm_churn_actor_state == AD_CHURN_MONITOR) {
-		if (port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION) {
-			port->sm_churn_actor_state = AD_NO_CHURN;
-		} else {
-			port->churn_actor_count++;
-			port->sm_churn_actor_state = AD_CHURN;
-		}
+
+	if (port->sm_churn_actor_timer_counter)
+		port->sm_churn_actor_timer_counter--;
+
+	if (port->sm_churn_partner_timer_counter)
+		port->sm_churn_partner_timer_counter--;
+
+	/* ---- 2. timer expired, enter CHURN ---- */
+	if (port->sm_churn_actor_state == AD_CHURN_MONITOR &&
+	    !port->sm_churn_actor_timer_counter) {
+		port->sm_churn_actor_state = AD_CHURN;
+		port->churn_actor_count++;
 	}
-	if (port->sm_churn_partner_timer_counter &&
-	    !(--port->sm_churn_partner_timer_counter) &&
-	    port->sm_churn_partner_state == AD_CHURN_MONITOR) {
-		if (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) {
-			port->sm_churn_partner_state = AD_NO_CHURN;
-		} else {
-			port->churn_partner_count++;
-			port->sm_churn_partner_state = AD_CHURN;
-		}
+
+	if (port->sm_churn_partner_state == AD_CHURN_MONITOR &&
+	    !port->sm_churn_partner_timer_counter) {
+		port->sm_churn_partner_state = AD_CHURN;
+		port->churn_partner_count++;
+	}
+
+	/* ---- 3. CHURN_MONITOR/CHURN + sync -> NO_CHURN ---- */
+	if ((port->sm_churn_actor_state == AD_CHURN_MONITOR ||
+	     port->sm_churn_actor_state == AD_CHURN) && actor_synced)
+		port->sm_churn_actor_state = AD_NO_CHURN;
+
+	if ((port->sm_churn_partner_state == AD_CHURN_MONITOR ||
+	     port->sm_churn_partner_state == AD_CHURN) && partner_synced)
+		port->sm_churn_partner_state = AD_NO_CHURN;
+
+	/* ---- 4. NO_CHURN + !sync -> MONITOR ---- */
+	if (port->sm_churn_actor_state == AD_NO_CHURN && !actor_synced) {
+		port->sm_churn_actor_state = AD_CHURN_MONITOR;
+		port->sm_churn_actor_timer_counter =
+			__ad_timer_to_ticks(AD_ACTOR_CHURN_TIMER, 0);
+	}
+
+	if (port->sm_churn_partner_state == AD_NO_CHURN && !partner_synced) {
+		port->sm_churn_partner_state = AD_CHURN_MONITOR;
+		port->sm_churn_partner_timer_counter =
+			__ad_timer_to_ticks(AD_PARTNER_CHURN_TIMER, 0);
 	}
 }
 
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 net 3/3] selftests: bonding: add mux and churn state testing
  2026-02-26 12:53 [PATCHv3 net 0/3] bonding: fix 802.3ad churn machine and port state issues Hangbin Liu
  2026-02-26 12:53 ` [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port Hangbin Liu
  2026-02-26 12:53 ` [PATCHv3 net 2/3] bonding: restructure ad_churn_machine Hangbin Liu
@ 2026-02-26 12:53 ` Hangbin Liu
  2 siblings, 0 replies; 14+ messages in thread
From: Hangbin Liu @ 2026-02-26 12:53 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Hangbin Liu

Rename the current LACP priority test to LACP ad_select testing, and
extend it to include validation of the actor state machine and churn
state logic. The updated tests verify that both the mux state machine and
the churn state machine behave correctly under aggregator selection and
failover scenarios.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 .../selftests/drivers/net/bonding/Makefile    |  2 +-
 ...nd_lacp_prio.sh => bond_lacp_ad_select.sh} | 73 +++++++++++++++++++
 2 files changed, 74 insertions(+), 1 deletion(-)
 rename tools/testing/selftests/drivers/net/bonding/{bond_lacp_prio.sh => bond_lacp_ad_select.sh} (64%)

diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile
index 6c5c60adb5e8..e7bddfbf0f7a 100644
--- a/tools/testing/selftests/drivers/net/bonding/Makefile
+++ b/tools/testing/selftests/drivers/net/bonding/Makefile
@@ -7,7 +7,7 @@ TEST_PROGS := \
 	bond-eth-type-change.sh \
 	bond-lladdr-target.sh \
 	bond_ipsec_offload.sh \
-	bond_lacp_prio.sh \
+	bond_lacp_ad_select.sh \
 	bond_macvlan_ipvlan.sh \
 	bond_options.sh \
 	bond_passive_lacp.sh \
diff --git a/tools/testing/selftests/drivers/net/bonding/bond_lacp_prio.sh b/tools/testing/selftests/drivers/net/bonding/bond_lacp_ad_select.sh
similarity index 64%
rename from tools/testing/selftests/drivers/net/bonding/bond_lacp_prio.sh
rename to tools/testing/selftests/drivers/net/bonding/bond_lacp_ad_select.sh
index a483d505c6a8..9f0b3de4f55c 100755
--- a/tools/testing/selftests/drivers/net/bonding/bond_lacp_prio.sh
+++ b/tools/testing/selftests/drivers/net/bonding/bond_lacp_ad_select.sh
@@ -89,6 +89,65 @@ test_agg_reselect()
 		RET=1
 }
 
+is_distributing()
+{
+	ip -j -n "$c_ns" -d link show "$1" \
+		| jq -e '.[].linkinfo.info_slave_data.ad_actor_oper_port_state_str | index("distributing")' > /dev/null
+}
+
+get_churn_state()
+{
+	local slave=$1
+	# shellcheck disable=SC2016
+	ip netns exec "$c_ns" awk -v s="$slave" '
+	$0 ~ "Slave Interface: " s {found=1}
+	found && /Actor Churn State:/ { print $4; exit }
+	' /proc/net/bonding/bond0
+}
+
+check_slave_state()
+{
+	local state=$1
+	local slave_0=$2
+	local slave_1=$3
+	local churn_state
+	RET=0
+
+	s0_agg_id=$(cmd_jq "ip -n ${c_ns} -d -j link show $slave_0" \
+		".[].linkinfo.info_slave_data.ad_aggregator_id")
+	s1_agg_id=$(cmd_jq "ip -n ${c_ns} -d -j link show $slave_1" \
+		".[].linkinfo.info_slave_data.ad_aggregator_id")
+	if [ "${s0_agg_id}" -ne "${s1_agg_id}" ]; then
+		log_info "$state: $slave_0 $slave_1 agg ids are different"
+		RET=1
+	fi
+
+	for s in "$slave_0" "$slave_1"; do
+		churn_state=$(get_churn_state "$s")
+		if [ "$state" = "active" ]; then
+			if ! is_distributing "$s"; then
+				log_info "$state: $s is not in distributing state"
+				RET=1
+			fi
+			if [ "$churn_state" != "none" ]; then
+				log_info "$state: $s churn state $churn_state"
+				RET=1
+			fi
+		else
+			# Backup state, should be in churned and not distributing
+			if is_distributing "$s"; then
+				log_info "$state: $s is in distributing state"
+				RET=1
+			fi
+			if [ "$churn_state" != "churned" ]; then
+				log_info "$state: $s churn state $churn_state"
+				# shellcheck disable=SC2034
+				RET=1
+			fi
+		fi
+	done
+}
+
 trap cleanup_all_ns EXIT
 setup_ns c_ns s_ns b_ns
 setup_links
@@ -98,11 +157,25 @@ log_test "bond 802.3ad" "actor_port_prio setting"
 
 test_agg_reselect eth0
 log_test "bond 802.3ad" "actor_port_prio select"
+# sleep for a while to make sure the mux state machine has completed.
+sleep 10
+check_slave_state active eth0 eth1
+log_test "bond 802.3ad" "active state/churn checking"
+# wait for churn timer expired, need a bit longer as we restart eth1
+sleep 55
+check_slave_state backup eth2 eth3
+log_test "bond 802.3ad" "backup state/churn checking"
 
 # Change the actor port prio and re-test
 ip -n "${c_ns}" link set eth0 type bond_slave actor_port_prio 10
 ip -n "${c_ns}" link set eth2 type bond_slave actor_port_prio 1000
 test_agg_reselect eth2
 log_test "bond 802.3ad" "actor_port_prio switch"
+sleep 10
+check_slave_state active eth2 eth3
+log_test "bond 802.3ad" "active state/churn checking"
+sleep 55
+check_slave_state backup eth0 eth1
+log_test "bond 802.3ad" "backup state/churn checking"
 
 exit "${EXIT_STATUS}"
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 2/3] bonding: restructure ad_churn_machine
  2026-02-26 12:53 ` [PATCHv3 net 2/3] bonding: restructure ad_churn_machine Hangbin Liu
@ 2026-02-27  0:36   ` Jay Vosburgh
  2026-02-27  0:52     ` Hangbin Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2026-02-27  0:36 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Liang Li

Hangbin Liu <liuhangbin@gmail.com> wrote:

>The current ad_churn_machine implementation only transitions the
>actor/partner churn state to churned or none after the churn timer expires.
>However, IEEE 802.1AX-2014 specifies that a port should enter the none
>state immediately once the actor’s port state enters synchronization.
>
>Another issue is that if the churn timer expires while the churn machine is
>not in the monitor state (e.g. already in churn), the state may remain
>stuck indefinitely with no further transitions. This becomes visible in
>multi-aggregator scenarios. For example:
>
>Ports 1 and 2 are in aggregator 1 (active)
>Ports 3 and 4 are in aggregator 2 (backup)
>
>Ports 1 and 2 should be in none
>Ports 3 and 4 should be in churned
>
>If a failover occurs due to port 2 link down/up, aggregator 2 becomes active.
>Under the current implementation, the resulting states may look like:
>
>agg 1 (backup): port 1 -> none, port 2 -> churned
>agg 2 (active): ports 3,4 keep in churned.
>
>The root cause is that ad_churn_machine() only clears the
>AD_PORT_CHURNED flag and starts a timer. When a churned port becomes active,
>its RX state becomes AD_RX_CURRENT, preventing the churn flag from being set
>again, leaving no way to retrigger the timer. Fixing this solely in
>ad_rx_machine() is insufficient.
>
>This patch rewrites ad_churn_machine according to IEEE 802.1AX-2014
>(Figures 6-23 and 6-24), ensuring correct churn detection, state transitions,
>and timer behavior. With new implementation, there is no need to set
>AD_PORT_CHURNED in ad_rx_machine().
>
>Fixes: 14c9551a32eb ("bonding: Implement port churn-machine (AD standard 43.4.17).")
>Reported-by: Liang Li <liali@redhat.com>
>Tested-by: Liang Li <liali@redhat.com>
>Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

	I missed this last time it was posted, but reading it now I
think the functional change looks good, but I question the usefulness of
including the 25 line ASCII art version of the state diagram.

	The standard is publicly available, so a comment saying that the
state machine logic conforms to IEEE 802.1AX-2014 figures 6-23 and 6-24
should be sufficient.  Anyone seriously checking the code against the
standard will need to read the relevant text, so they'll be looking it
up anyway.

	-J

>---
> drivers/net/bonding/bond_3ad.c | 96 +++++++++++++++++++++++++---------
> 1 file changed, 71 insertions(+), 25 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index c47f6a69fd2a..68258d61fd1c 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -44,7 +44,6 @@
> #define AD_PORT_STANDBY         0x80
> #define AD_PORT_SELECTED        0x100
> #define AD_PORT_MOVED           0x200
>-#define AD_PORT_CHURNED         (AD_PORT_ACTOR_CHURN | AD_PORT_PARTNER_CHURN)
> 
> /* Port Key definitions
>  * key is determined according to the link speed, duplex and
>@@ -1254,7 +1253,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
> 	/* first, check if port was reinitialized */
> 	if (port->sm_vars & AD_PORT_BEGIN) {
> 		port->sm_rx_state = AD_RX_INITIALIZE;
>-		port->sm_vars |= AD_PORT_CHURNED;
> 	/* check if port is not enabled */
> 	} else if (!(port->sm_vars & AD_PORT_BEGIN) && !port->is_enabled)
> 		port->sm_rx_state = AD_RX_PORT_DISABLED;
>@@ -1262,8 +1260,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
> 	else if (lacpdu && ((port->sm_rx_state == AD_RX_EXPIRED) ||
> 		 (port->sm_rx_state == AD_RX_DEFAULTED) ||
> 		 (port->sm_rx_state == AD_RX_CURRENT))) {
>-		if (port->sm_rx_state != AD_RX_CURRENT)
>-			port->sm_vars |= AD_PORT_CHURNED;
> 		port->sm_rx_timer_counter = 0;
> 		port->sm_rx_state = AD_RX_CURRENT;
> 	} else {
>@@ -1347,7 +1343,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
> 			port->partner_oper.port_state |= LACP_STATE_LACP_TIMEOUT;
> 			port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(AD_SHORT_TIMEOUT));
> 			port->actor_oper_port_state |= LACP_STATE_EXPIRED;
>-			port->sm_vars |= AD_PORT_CHURNED;
> 			break;
> 		case AD_RX_DEFAULTED:
> 			__update_default_selected(port);
>@@ -1379,11 +1374,41 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
>  * ad_churn_machine - handle port churn's state machine
>  * @port: the port we're looking at
>  *
>+ * IEEE 802.1AX-2014 Figure 6-23 - Actor Churn Detection machine state diagram
>+ *
>+ *                                                     BEGIN || (! port_enabled)
>+ *                                                               |
>+ *                                      (3)                (1)   v
>+ *   +----------------------+     ActorPort.Sync     +-------------------------+
>+ *   |    NO_ACTOR_CHURN    | <--------------------- |   ACTOR_CHURN_MONITOR   |
>+ *   |======================|                        |=========================|
>+ *   | actor_churn = FALSE; |    ! ActorPort.Sync    | actor_churn = FALSE;    |
>+ *   |                      | ---------------------> | Start actor_churn_timer |
>+ *   +----------------------+           (4)          +-------------------------+
>+ *             ^                                                 |
>+ *             |                                                 |
>+ *             |                                      actor_churn_timer expired
>+ *             |                                                 |
>+ *       ActorPort.Sync                                          |  (2)
>+ *             |              +--------------------+             |
>+ *        (3)  |              |   ACTOR_CHURN      |             |
>+ *             |              |====================|             |
>+ *             +------------- | actor_churn = True | <-----------+
>+ *                            |                    |
>+ *                            +--------------------+
>+ *
>+ * Similar for the Figure 6-24 - Partner Churn Detection machine state diagram
>+ *
>+ * We don’t need to check actor_churn, because it can only be true when the
>+ * state is ACTOR_CHURN.
>  */
> static void ad_churn_machine(struct port *port)
> {
>-	if (port->sm_vars & AD_PORT_CHURNED) {
>-		port->sm_vars &= ~AD_PORT_CHURNED;
>+	bool partner_synced = port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION;
>+	bool actor_synced = port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION;
>+
>+	/* ---- 1. begin or port not enabled ---- */
>+	if ((port->sm_vars & AD_PORT_BEGIN) || !port->is_enabled) {
> 		port->sm_churn_actor_state = AD_CHURN_MONITOR;
> 		port->sm_churn_partner_state = AD_CHURN_MONITOR;
> 		port->sm_churn_actor_timer_counter =
>@@ -1392,25 +1417,46 @@ static void ad_churn_machine(struct port *port)
> 			 __ad_timer_to_ticks(AD_PARTNER_CHURN_TIMER, 0);
> 		return;
> 	}
>-	if (port->sm_churn_actor_timer_counter &&
>-	    !(--port->sm_churn_actor_timer_counter) &&
>-	    port->sm_churn_actor_state == AD_CHURN_MONITOR) {
>-		if (port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION) {
>-			port->sm_churn_actor_state = AD_NO_CHURN;
>-		} else {
>-			port->churn_actor_count++;
>-			port->sm_churn_actor_state = AD_CHURN;
>-		}
>+
>+	if (port->sm_churn_actor_timer_counter)
>+		port->sm_churn_actor_timer_counter--;
>+
>+	if (port->sm_churn_partner_timer_counter)
>+		port->sm_churn_partner_timer_counter--;
>+
>+	/* ---- 2. timer expired, enter CHURN ---- */
>+	if (port->sm_churn_actor_state == AD_CHURN_MONITOR &&
>+	    !port->sm_churn_actor_timer_counter) {
>+		port->sm_churn_actor_state = AD_CHURN;
>+		port->churn_actor_count++;
> 	}
>-	if (port->sm_churn_partner_timer_counter &&
>-	    !(--port->sm_churn_partner_timer_counter) &&
>-	    port->sm_churn_partner_state == AD_CHURN_MONITOR) {
>-		if (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) {
>-			port->sm_churn_partner_state = AD_NO_CHURN;
>-		} else {
>-			port->churn_partner_count++;
>-			port->sm_churn_partner_state = AD_CHURN;
>-		}
>+
>+	if (port->sm_churn_partner_state == AD_CHURN_MONITOR &&
>+	    !port->sm_churn_partner_timer_counter) {
>+		port->sm_churn_partner_state = AD_CHURN;
>+		port->churn_partner_count++;
>+	}
>+
>+	/* ---- 3. CHURN_MONITOR/CHURN + sync -> NO_CHURN ---- */
>+	if ((port->sm_churn_actor_state == AD_CHURN_MONITOR ||
>+	     port->sm_churn_actor_state == AD_CHURN) && actor_synced)
>+		port->sm_churn_actor_state = AD_NO_CHURN;
>+
>+	if ((port->sm_churn_partner_state == AD_CHURN_MONITOR ||
>+	     port->sm_churn_partner_state == AD_CHURN) && partner_synced)
>+		port->sm_churn_partner_state = AD_NO_CHURN;
>+
>+	/* ---- 4. NO_CHURN + !sync -> MONITOR ---- */
>+	if (port->sm_churn_actor_state == AD_NO_CHURN && !actor_synced) {
>+		port->sm_churn_actor_state = AD_CHURN_MONITOR;
>+		port->sm_churn_actor_timer_counter =
>+			__ad_timer_to_ticks(AD_ACTOR_CHURN_TIMER, 0);
>+	}
>+
>+	if (port->sm_churn_partner_state == AD_NO_CHURN && !partner_synced) {
>+		port->sm_churn_partner_state = AD_CHURN_MONITOR;
>+		port->sm_churn_partner_timer_counter =
>+			__ad_timer_to_ticks(AD_PARTNER_CHURN_TIMER, 0);
> 	}
> }
> 
>-- 
>2.50.1
>

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 2/3] bonding: restructure ad_churn_machine
  2026-02-27  0:36   ` Jay Vosburgh
@ 2026-02-27  0:52     ` Hangbin Liu
  2026-02-27  1:42       ` Jay Vosburgh
  0 siblings, 1 reply; 14+ messages in thread
From: Hangbin Liu @ 2026-02-27  0:52 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Liang Li

On Thu, Feb 26, 2026 at 04:36:46PM -0800, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
> 
> >The current ad_churn_machine implementation only transitions the
> >actor/partner churn state to churned or none after the churn timer expires.
> >However, IEEE 802.1AX-2014 specifies that a port should enter the none
> >state immediately once the actor’s port state enters synchronization.
> >
> >Another issue is that if the churn timer expires while the churn machine is
> >not in the monitor state (e.g. already in churn), the state may remain
> >stuck indefinitely with no further transitions. This becomes visible in
> >multi-aggregator scenarios. For example:
> >
> >Ports 1 and 2 are in aggregator 1 (active)
> >Ports 3 and 4 are in aggregator 2 (backup)
> >
> >Ports 1 and 2 should be in none
> >Ports 3 and 4 should be in churned
> >
> >If a failover occurs due to port 2 link down/up, aggregator 2 becomes active.
> >Under the current implementation, the resulting states may look like:
> >
> >agg 1 (backup): port 1 -> none, port 2 -> churned
> >agg 2 (active): ports 3,4 keep in churned.
> >
> >The root cause is that ad_churn_machine() only clears the
> >AD_PORT_CHURNED flag and starts a timer. When a churned port becomes active,
> >its RX state becomes AD_RX_CURRENT, preventing the churn flag from being set
> >again, leaving no way to retrigger the timer. Fixing this solely in
> >ad_rx_machine() is insufficient.
> >
> >This patch rewrites ad_churn_machine according to IEEE 802.1AX-2014
> >(Figures 6-23 and 6-24), ensuring correct churn detection, state transitions,
> >and timer behavior. With new implementation, there is no need to set
> >AD_PORT_CHURNED in ad_rx_machine().
> >
> >Fixes: 14c9551a32eb ("bonding: Implement port churn-machine (AD standard 43.4.17).")
> >Reported-by: Liang Li <liali@redhat.com>
> >Tested-by: Liang Li <liali@redhat.com>
> >Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> 
> 	I missed this last time it was posted, but reading it now I
> think the functional change looks good, but I question the usefulness of
> including the 25 line ASCII art version of the state diagram.
> 
> 	The standard is publicly available, so a comment saying that the
> state machine logic conforms to IEEE 802.1AX-2014 figures 6-23 and 6-24
> should be sufficient.  Anyone seriously checking the code against the
> standard will need to read the relevant text, so they'll be looking it
> up anyway.

I added it here to help new readers and reviewers understand the logic
quickly. If you think there’s no need to include it in the code, maybe
we can move it to the commit description?

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-26 12:53 ` [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port Hangbin Liu
@ 2026-02-27  1:16   ` Jay Vosburgh
  2026-02-27  2:31     ` Hangbin Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2026-02-27  1:16 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest

Hangbin Liu <liuhangbin@gmail.com> wrote:

>When disabling a port’s collecting and distributing states, updating only
>rx_disabled is not sufficient. We also need to set AD_RX_PORT_DISABLED
>so that the rx_machine transitions into the AD_RX_EXPIRED state.
>
>One example is in ad_agg_selection_logic(): when a new aggregator is
>selected and old active aggregator is disabled, if AD_RX_PORT_DISABLED is
>not set, the disabled port may remain stuck in AD_RX_CURRENT due to
>continuing to receive partner LACP messages.

	I'm not sure I'm seeing the problem here, is there an actual
misbehavior being fixed here?  The port is receiving LACPDUs, and from
the receive state machine point of view (Figure 6-18) there's no issue.
The "port_enabled" variable (6.4.7) also informs the state machine
behavior, but that's not the same as what's changed by bonding's
__disable_port function.

	Where I'm going with this is that, when multiple aggregator
support was originally implemented, the theory was to keep aggregators
other than the active agg in a state such that they could be put into
service immediately, without having to do LACPDU exchanges in order to
transition into the appropriate state.  A hot standby, basically,
analogous to an active-backup mode backup interface with link state up.

	I haven't tested this in some time, though, so my question is
whether this change affects the failover time when an active aggregator
is de-selected in favor of another aggregator.  By "failover time," I
mean how long transmission and/or reception are interrupted when
changing from one aggregator to another.  I presume that if aggregator
failover ater this change requires LACPDU exchanges, etc, it will take
longer to fail over.

	-J


>The __disable_port() called by ad_disable_collecting_distributing()
>does not have this issue, since its caller also clears the
>collecting/distributing bits.
>
>The __disable_port() called by bond_3ad_bind_slave() should also be fine,
>as the RX state machine is re-initialized to AD_RX_INITIALIZE.
>
>Let's fix this only in ad_agg_selection_logic() to reduce the chances of
>unintended side effects.
>
>Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
>---
> drivers/net/bonding/bond_3ad.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index af7f74cfdc08..c47f6a69fd2a 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -1932,6 +1932,7 @@ static void ad_agg_selection_logic(struct aggregator *agg,
> 		if (active) {
> 			for (port = active->lag_ports; port;
> 			     port = port->next_port_in_aggregator) {
>+				port->sm_rx_state = AD_RX_PORT_DISABLED;
> 				__disable_port(port);
> 			}
> 		}
>-- 
>2.50.1
>

---
	-Jay Vosburgh, jv@jvosburgh.net


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 2/3] bonding: restructure ad_churn_machine
  2026-02-27  0:52     ` Hangbin Liu
@ 2026-02-27  1:42       ` Jay Vosburgh
  2026-02-27  2:36         ` Hangbin Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2026-02-27  1:42 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Liang Li

Hangbin Liu <liuhangbin@gmail.com> wrote:

>On Thu, Feb 26, 2026 at 04:36:46PM -0800, Jay Vosburgh wrote:
>> Hangbin Liu <liuhangbin@gmail.com> wrote:
>> 
>> >The current ad_churn_machine implementation only transitions the
>> >actor/partner churn state to churned or none after the churn timer expires.
>> >However, IEEE 802.1AX-2014 specifies that a port should enter the none
>> >state immediately once the actor’s port state enters synchronization.
>> >
>> >Another issue is that if the churn timer expires while the churn machine is
>> >not in the monitor state (e.g. already in churn), the state may remain
>> >stuck indefinitely with no further transitions. This becomes visible in
>> >multi-aggregator scenarios. For example:
>> >
>> >Ports 1 and 2 are in aggregator 1 (active)
>> >Ports 3 and 4 are in aggregator 2 (backup)
>> >
>> >Ports 1 and 2 should be in none
>> >Ports 3 and 4 should be in churned
>> >
>> >If a failover occurs due to port 2 link down/up, aggregator 2 becomes active.
>> >Under the current implementation, the resulting states may look like:
>> >
>> >agg 1 (backup): port 1 -> none, port 2 -> churned
>> >agg 2 (active): ports 3,4 keep in churned.
>> >
>> >The root cause is that ad_churn_machine() only clears the
>> >AD_PORT_CHURNED flag and starts a timer. When a churned port becomes active,
>> >its RX state becomes AD_RX_CURRENT, preventing the churn flag from being set
>> >again, leaving no way to retrigger the timer. Fixing this solely in
>> >ad_rx_machine() is insufficient.
>> >
>> >This patch rewrites ad_churn_machine according to IEEE 802.1AX-2014
>> >(Figures 6-23 and 6-24), ensuring correct churn detection, state transitions,
>> >and timer behavior. With new implementation, there is no need to set
>> >AD_PORT_CHURNED in ad_rx_machine().
>> >
>> >Fixes: 14c9551a32eb ("bonding: Implement port churn-machine (AD standard 43.4.17).")
>> >Reported-by: Liang Li <liali@redhat.com>
>> >Tested-by: Liang Li <liali@redhat.com>
>> >Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
>> 
>> 	I missed this last time it was posted, but reading it now I
>> think the functional change looks good, but I question the usefulness of
>> including the 25 line ASCII art version of the state diagram.
>> 
>> 	The standard is publicly available, so a comment saying that the
>> state machine logic conforms to IEEE 802.1AX-2014 figures 6-23 and 6-24
>> should be sufficient.  Anyone seriously checking the code against the
>> standard will need to read the relevant text, so they'll be looking it
>> up anyway.
>
>I added it here to help new readers and reviewers understand the logic
>quickly. If you think there’s no need to include it in the code, maybe
>we can move it to the commit description?

	Personally, I'd leave it out, and just put a reference.

	The churn machine isn't that critical, even the standards
committee didn't like it and removed it from the 2020 edition of
802.1AX.

	Still, whatever we provide should work in accordance with the
2014 standard we're nominally conforming to, so functionally the patch
looks fine to me.

	-J

---
	-Jay Vosburgh, jv@jvosburgh.net


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-27  1:16   ` Jay Vosburgh
@ 2026-02-27  2:31     ` Hangbin Liu
  2026-02-27  4:14       ` Hangbin Liu
  2026-02-27  4:42       ` Jay Vosburgh
  0 siblings, 2 replies; 14+ messages in thread
From: Hangbin Liu @ 2026-02-27  2:31 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest

On Thu, Feb 26, 2026 at 05:16:55PM -0800, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
> 
> >When disabling a port’s collecting and distributing states, updating only
> >rx_disabled is not sufficient. We also need to set AD_RX_PORT_DISABLED
> >so that the rx_machine transitions into the AD_RX_EXPIRED state.
> >
> >One example is in ad_agg_selection_logic(): when a new aggregator is
> >selected and old active aggregator is disabled, if AD_RX_PORT_DISABLED is
> >not set, the disabled port may remain stuck in AD_RX_CURRENT due to
> >continuing to receive partner LACP messages.
> 
> 	I'm not sure I'm seeing the problem here, is there an actual
> misbehavior being fixed here?  The port is receiving LACPDUs, and from
> the receive state machine point of view (Figure 6-18) there's no issue.
> The "port_enabled" variable (6.4.7) also informs the state machine
> behavior, but that's not the same as what's changed by bonding's
> __disable_port function.

Yes, the reason I do it here is we select another aggregator and called
__disable_port() for the old one. If we don't update sm_rx_state, the port
will be keep in collecting/distributing state, and the partner will also
keep in the c/d state.

Here we entered a logical paradox, on one hand we want to disable the port,
on the other hand we keep the port in collecting/distributing state.

> 
> 	Where I'm going with this is that, when multiple aggregator
> support was originally implemented, the theory was to keep aggregators
> other than the active agg in a state such that they could be put into
> service immediately, without having to do LACPDU exchanges in order to
> transition into the appropriate state.  A hot standby, basically,
> analogous to an active-backup mode backup interface with link state up.

This sounds good. But without LACPDU exchange, the hot standby actor and
partner should be in collecting/distributing state. What should we do when
partner start send packets to us?
> 
> 	I haven't tested this in some time, though, so my question is
> whether this change affects the failover time when an active aggregator
> is de-selected in favor of another aggregator.  By "failover time," I
> mean how long transmission and/or reception are interrupted when
> changing from one aggregator to another.  I presume that if aggregator
> failover ater this change requires LACPDU exchanges, etc, it will take
> longer to fail over.

I haven't tested it yet. I think the failover time should be in 1 second.
Let me do some testing today.

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 2/3] bonding: restructure ad_churn_machine
  2026-02-27  1:42       ` Jay Vosburgh
@ 2026-02-27  2:36         ` Hangbin Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Hangbin Liu @ 2026-02-27  2:36 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest, Liang Li

On Thu, Feb 26, 2026 at 05:42:13PM -0800, Jay Vosburgh wrote:
> >I added it here to help new readers and reviewers understand the logic
> >quickly. If you think there’s no need to include it in the code, maybe
> >we can move it to the commit description?
> 
> 	Personally, I'd leave it out, and just put a reference.
> 
> 	The churn machine isn't that critical, even the standards
> committee didn't like it and removed it from the 2020 edition of
> 802.1AX.

Oh, I didn't notice this. I posted a patch[1] to net-next to export the
churn state via netlink recently. Should I revert this patch?

> 
> 	Still, whatever we provide should work in accordance with the
> 2014 standard we're nominally conforming to, so functionally the patch
> looks fine to me.

Got it, I will remove it.

[1] http://lore.kernel.org/netdev/20260224020215.6012-1-liuhangbin@gmail.com

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-27  2:31     ` Hangbin Liu
@ 2026-02-27  4:14       ` Hangbin Liu
  2026-02-27  4:42       ` Jay Vosburgh
  1 sibling, 0 replies; 14+ messages in thread
From: Hangbin Liu @ 2026-02-27  4:14 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest

On Fri, Feb 27, 2026 at 02:31:05AM +0000, Hangbin Liu wrote:
> > 	I haven't tested this in some time, though, so my question is
> > whether this change affects the failover time when an active aggregator
> > is de-selected in favor of another aggregator.  By "failover time," I
> > mean how long transmission and/or reception are interrupted when
> > changing from one aggregator to another.  I presume that if aggregator
> > failover ater this change requires LACPDU exchanges, etc, it will take
> > longer to fail over.
> 
> I haven't tested it yet. I think the failover time should be in 1 second.
> Let me do some testing today.

I did a test and the failover takes about 200ms with the environment in patch 03.

Here is the full log

Code: the timer count starts after the old active port link down.
```
ip -n "${c_ns}" link set eth1 down
date +'%F %T.%3N'
ip -n ${c_ns} -d link show eth2
while ! ip -n ${c_ns} -d link show eth2 | grep -q distributing; do
        sleep 0.01
done
date +'%F %T.%3N'
ip -n ${c_ns} -d link show eth2
```

Log:
2026-02-26 22:59:54.334   <-- The time when eth1 link down

5: eth2@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 12:40:54:81:d3:80 brd ff:ff:ff:ff:ff:ff link-netns b_ns-PKIXVg promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr 26:10:46:58:22:e4 queue_id 0 prio 0 ad_aggregator_id 2 ad_actor_oper_port_state 7 ad_actor_oper_port_state_str <active,short_timeout,aggregating> ad_partner_oper_port_state 15 ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync> actor_port_prio 1000 addrgenmode eui64 numtxqueues 4 numrxqueues 4 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536

2026-02-26 22:59:54.529   <--- The time when eth2 enter collecting,distributing state

5: eth2@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 12:40:54:81:d3:80 brd ff:ff:ff:ff:ff:ff link-netns b_ns-PKIXVg promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 26:10:46:58:22:e4 queue_id 0 prio 0 ad_aggregator_id 2 ad_actor_oper_port_state 63 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state 63 ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 1000 addrgenmode eui64 numtxqueues 4 numrxqueues 4 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-27  2:31     ` Hangbin Liu
  2026-02-27  4:14       ` Hangbin Liu
@ 2026-02-27  4:42       ` Jay Vosburgh
  2026-02-27  6:21         ` Hangbin Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2026-02-27  4:42 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest

Hangbin Liu <liuhangbin@gmail.com> wrote:

>On Thu, Feb 26, 2026 at 05:16:55PM -0800, Jay Vosburgh wrote:
>> Hangbin Liu <liuhangbin@gmail.com> wrote:
>> 
>> >When disabling a port’s collecting and distributing states, updating only
>> >rx_disabled is not sufficient. We also need to set AD_RX_PORT_DISABLED
>> >so that the rx_machine transitions into the AD_RX_EXPIRED state.
>> >
>> >One example is in ad_agg_selection_logic(): when a new aggregator is
>> >selected and old active aggregator is disabled, if AD_RX_PORT_DISABLED is
>> >not set, the disabled port may remain stuck in AD_RX_CURRENT due to
>> >continuing to receive partner LACP messages.
>> 
>> 	I'm not sure I'm seeing the problem here, is there an actual
>> misbehavior being fixed here?  The port is receiving LACPDUs, and from
>> the receive state machine point of view (Figure 6-18) there's no issue.
>> The "port_enabled" variable (6.4.7) also informs the state machine
>> behavior, but that's not the same as what's changed by bonding's
>> __disable_port function.
>
>Yes, the reason I do it here is we select another aggregator and called
>__disable_port() for the old one. If we don't update sm_rx_state, the port
>will be keep in collecting/distributing state, and the partner will also
>keep in the c/d state.
>
>Here we entered a logical paradox, on one hand we want to disable the port,
>on the other hand we keep the port in collecting/distributing state.

	"disable" the port here really means from bonding's perspective,
so, generally equivalent to the backup interface of an active-backup
mode bond.

	Such a backup interface is typically carrier up and able to send
or receive packets.  The peer generally won't send packets to the backup
interface, however, as no traffic is sent from the backup, and the MAC
for the bond uses a different interface, so no forwarding entries will
direct to the backup interface.

	There are a couple of special cases, like LLDP, that are handled
as an exception, but in general, if a peer does send packets to the
backup interface (due to a switch flood, for example), they're dropped.

>> 	Where I'm going with this is that, when multiple aggregator
>> support was originally implemented, the theory was to keep aggregators
>> other than the active agg in a state such that they could be put into
>> service immediately, without having to do LACPDU exchanges in order to
>> transition into the appropriate state.  A hot standby, basically,
>> analogous to an active-backup mode backup interface with link state up.
>
>This sounds good. But without LACPDU exchange, the hot standby actor and
>partner should be in collecting/distributing state. What should we do when
>partner start send packets to us?

	Did you mean "should not be in c/d state" above?  I.e., without
LACPDU exchage, ... not in c/d state?

	Regardless, as above, the situation is generally equivalent to a
backup interface in active-backup mode: incoming traffic that isn't a
special case is dropped.  Normal traffic (bearing the bond source MAC)
isn't sent, as that would update the peer's forwarding table.

	Nothing in the standard prohibits us from having multiple
aggregators in c/d state simultaneously.  A configuration with two
separate bonds, each with interfaces successfully aggregated together
with their respective peers, wherein those two bonds are placed into a
third bond in active-backup mode is essentially the same thing as what
we're discussing.

	-J

>> 	I haven't tested this in some time, though, so my question is
>> whether this change affects the failover time when an active aggregator
>> is de-selected in favor of another aggregator.  By "failover time," I
>> mean how long transmission and/or reception are interrupted when
>> changing from one aggregator to another.  I presume that if aggregator
>> failover ater this change requires LACPDU exchanges, etc, it will take
>> longer to fail over.
>
>I haven't tested it yet. I think the failover time should be in 1 second.
>Let me do some testing today.
>
>Thanks
>Hangbin

---
	-Jay Vosburgh, jv@jvosburgh.net


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-27  4:42       ` Jay Vosburgh
@ 2026-02-27  6:21         ` Hangbin Liu
  2026-03-10  3:01           ` Hangbin Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Hangbin Liu @ 2026-02-27  6:21 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest

On Thu, Feb 26, 2026 at 08:42:27PM -0800, Jay Vosburgh wrote:
> >> 	I'm not sure I'm seeing the problem here, is there an actual
> >> misbehavior being fixed here?  The port is receiving LACPDUs, and from
> >> the receive state machine point of view (Figure 6-18) there's no issue.
> >> The "port_enabled" variable (6.4.7) also informs the state machine
> >> behavior, but that's not the same as what's changed by bonding's
> >> __disable_port function.
> >
> >Yes, the reason I do it here is we select another aggregator and called
> >__disable_port() for the old one. If we don't update sm_rx_state, the port
> >will be keep in collecting/distributing state, and the partner will also
> >keep in the c/d state.
> >
> >Here we entered a logical paradox, on one hand we want to disable the port,
> >on the other hand we keep the port in collecting/distributing state.
> 
> 	"disable" the port here really means from bonding's perspective,
> so, generally equivalent to the backup interface of an active-backup
> mode bond.

Oh, got it.

> 
> 	Such a backup interface is typically carrier up and able to send
> or receive packets.  The peer generally won't send packets to the backup
> interface, however, as no traffic is sent from the backup, and the MAC
> for the bond uses a different interface, so no forwarding entries will
> direct to the backup interface.
> 
> 	There are a couple of special cases, like LLDP, that are handled
> as an exception, but in general, if a peer does send packets to the
> backup interface (due to a switch flood, for example), they're dropped.

OK, this makes sense to me.

> 
> >> 	Where I'm going with this is that, when multiple aggregator
> >> support was originally implemented, the theory was to keep aggregators
> >> other than the active agg in a state such that they could be put into
> >> service immediately, without having to do LACPDU exchanges in order to
> >> transition into the appropriate state.  A hot standby, basically,
> >> analogous to an active-backup mode backup interface with link state up.
> >
> >This sounds good. But without LACPDU exchange, the hot standby actor and
                         ^^ I mean with LACPDU exchange..
> >partner should be in collecting/distributing state. What should we do when
> >partner start send packets to us?
> 
> 	Did you mean "should not be in c/d state" above?  I.e., without
> LACPDU exchage, ... not in c/d state?
> 
> 	Regardless, as above, the situation is generally equivalent to a
> backup interface in active-backup mode: incoming traffic that isn't a
> special case is dropped.  Normal traffic (bearing the bond source MAC)
> isn't sent, as that would update the peer's forwarding table.
> 
> 	Nothing in the standard prohibits us from having multiple
> aggregators in c/d state simultaneously.  A configuration with two
> separate bonds, each with interfaces successfully aggregated together
> with their respective peers, wherein those two bonds are placed into a
> third bond in active-backup mode is essentially the same thing as what
> we're discussing.

In theory this looks good. But in fact, when we do failover and set the
previous active port to disabled via
  - __disable_port(port)
    - slave->rx_disabled = 1

This will stop the failover port back to c/d state. For example, in my
testing (see details in patch 03), we have 4 ports, eth0, eth1, eth2, eth3.
eth0 and eth1 are agg1, eth2 and eth3 are agg2. If we do failover on eth1,
when eth1 come up, the final state will be:

3: eth0@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 10

4: eth1@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync> actor_port_prio 255

5: eth2@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    bond_slave state ACTIVE ad_aggregator_id 2 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 1000

6: eth3@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    bond_slave state ACTIVE ad_aggregator_id 2 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 255

7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    bond mode 802.3ad actor_port_prio ad_aggregator 2

So you can see the eth0 state is c/d, while eth1 state is active, aggregating.
Do you think it's a correct state?

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
  2026-02-27  6:21         ` Hangbin Liu
@ 2026-03-10  3:01           ` Hangbin Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Hangbin Liu @ 2026-03-10  3:01 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel, linux-kselftest

Hi Jay,

Any comments for this?

Thanks
Hangbin
On Fri, Feb 27, 2026 at 06:21:20AM +0000, Hangbin Liu wrote:
> > 	Such a backup interface is typically carrier up and able to send
> > or receive packets.  The peer generally won't send packets to the backup
> > interface, however, as no traffic is sent from the backup, and the MAC
> > for the bond uses a different interface, so no forwarding entries will
> > direct to the backup interface.
> > 
> > 	There are a couple of special cases, like LLDP, that are handled
> > as an exception, but in general, if a peer does send packets to the
> > backup interface (due to a switch flood, for example), they're dropped.
> 
> OK, this makes sense to me.
> 
> > 
> > >> 	Where I'm going with this is that, when multiple aggregator
> > >> support was originally implemented, the theory was to keep aggregators
> > >> other than the active agg in a state such that they could be put into
> > >> service immediately, without having to do LACPDU exchanges in order to
> > >> transition into the appropriate state.  A hot standby, basically,
> > >> analogous to an active-backup mode backup interface with link state up.
> > >
> > >This sounds good. But without LACPDU exchange, the hot standby actor and
>                          ^^ I mean with LACPDU exchange..
> > >partner should be in collecting/distributing state. What should we do when
> > >partner start send packets to us?
> > 
> > 	Did you mean "should not be in c/d state" above?  I.e., without
> > LACPDU exchage, ... not in c/d state?
> > 
> > 	Regardless, as above, the situation is generally equivalent to a
> > backup interface in active-backup mode: incoming traffic that isn't a
> > special case is dropped.  Normal traffic (bearing the bond source MAC)
> > isn't sent, as that would update the peer's forwarding table.
> > 
> > 	Nothing in the standard prohibits us from having multiple
> > aggregators in c/d state simultaneously.  A configuration with two
> > separate bonds, each with interfaces successfully aggregated together
> > with their respective peers, wherein those two bonds are placed into a
> > third bond in active-backup mode is essentially the same thing as what
> > we're discussing.
> 
> In theory this looks good. But in fact, when we do failover and set the
> previous active port to disabled via
>   - __disable_port(port)
>     - slave->rx_disabled = 1
> 
> This will stop the failover port back to c/d state. For example, in my
> testing (see details in patch 03), we have 4 ports, eth0, eth1, eth2, eth3.
> eth0 and eth1 are agg1, eth2 and eth3 are agg2. If we do failover on eth1,
> when eth1 come up, the final state will be:
> 
> 3: eth0@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
>     bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 10
> 
> 4: eth1@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
>     bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync> actor_port_prio 255
> 
> 5: eth2@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
>     bond_slave state ACTIVE ad_aggregator_id 2 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 1000
> 
> 6: eth3@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
>     bond_slave state ACTIVE ad_aggregator_id 2 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 255
> 
> 7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
>     bond mode 802.3ad actor_port_prio ad_aggregator 2
> 
> So you can see the eth0 state is c/d, while eth1 state is active, aggregating.
> Do you think it's a correct state?
> 
> Thanks
> Hangbin

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-03-10  3:02 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-26 12:53 [PATCHv3 net 0/3] bonding: fix 802.3ad churn machine and port state issues Hangbin Liu
2026-02-26 12:53 ` [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port Hangbin Liu
2026-02-27  1:16   ` Jay Vosburgh
2026-02-27  2:31     ` Hangbin Liu
2026-02-27  4:14       ` Hangbin Liu
2026-02-27  4:42       ` Jay Vosburgh
2026-02-27  6:21         ` Hangbin Liu
2026-03-10  3:01           ` Hangbin Liu
2026-02-26 12:53 ` [PATCHv3 net 2/3] bonding: restructure ad_churn_machine Hangbin Liu
2026-02-27  0:36   ` Jay Vosburgh
2026-02-27  0:52     ` Hangbin Liu
2026-02-27  1:42       ` Jay Vosburgh
2026-02-27  2:36         ` Hangbin Liu
2026-02-26 12:53 ` [PATCHv3 net 3/3] selftests: bonding: add mux and churn state testing Hangbin Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox