From: Hangbin Liu
To: netdev@vger.kernel.org
Cc: Jay Vosburgh, Andrew Lunn, "David S. Miller", Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Nikolay Aleksandrov,
	Mahesh Bandewar, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, Hangbin Liu, Liang Li
Subject: [PATCHv3 net 2/3] bonding: restructure ad_churn_machine
Date: Thu, 26 Feb 2026 12:53:29 +0000
Message-ID: <20260226125331.28147-3-liuhangbin@gmail.com>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20260226125331.28147-1-liuhangbin@gmail.com>
References: <20260226125331.28147-1-liuhangbin@gmail.com>
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current ad_churn_machine implementation only transitions the
actor/partner churn state to churned or none after the churn timer expires.
However, IEEE 802.1AX-2014 specifies that a port should enter the none
state immediately once the actor’s port state enters synchronization.

Another issue is that if the churn timer expires while the churn machine is
not in the monitor state (e.g. already in churn), the state may remain
stuck indefinitely with no further transitions. This becomes visible in
multi-aggregator scenarios. For example:

Ports 1 and 2 are in aggregator 1 (active)
Ports 3 and 4 are in aggregator 2 (backup)

Ports 1 and 2 should be in none
Ports 3 and 4 should be in churned

If a failover occurs due to port 2 link down/up, aggregator 2 becomes active.
Under the current implementation, the resulting states may look like:

agg 1 (backup): port 1 -> none, port 2 -> churned
agg 2 (active): ports 3,4 stay in churned.

The root cause is that ad_churn_machine() only clears the
AD_PORT_CHURNED flag and starts a timer. When a churned port becomes active,
its RX state becomes AD_RX_CURRENT, preventing the churn flag from being set
again, leaving no way to retrigger the timer. Fixing this solely in
ad_rx_machine() is insufficient.

This patch rewrites ad_churn_machine according to IEEE 802.1AX-2014
(Figures 6-23 and 6-24), ensuring correct churn detection, state transitions,
and timer behavior. With the new implementation, there is no need to set
AD_PORT_CHURNED in ad_rx_machine().

Fixes: 14c9551a32eb ("bonding: Implement port churn-machine (AD standard 43.4.17).")
Reported-by: Liang Li
Tested-by: Liang Li
Signed-off-by: Hangbin Liu
---
 drivers/net/bonding/bond_3ad.c | 96 +++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 25 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c47f6a69fd2a..68258d61fd1c 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -44,7 +44,6 @@
 #define AD_PORT_STANDBY 0x80
 #define AD_PORT_SELECTED 0x100
 #define AD_PORT_MOVED 0x200
-#define AD_PORT_CHURNED (AD_PORT_ACTOR_CHURN | AD_PORT_PARTNER_CHURN)
 
 /* Port Key definitions
  * key is determined according to the link speed, duplex and
@@ -1254,7 +1253,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 	/* first, check if port was reinitialized */
 	if (port->sm_vars & AD_PORT_BEGIN) {
 		port->sm_rx_state = AD_RX_INITIALIZE;
-		port->sm_vars |= AD_PORT_CHURNED;
 	/* check if port is not enabled */
 	} else if (!(port->sm_vars & AD_PORT_BEGIN) && !port->is_enabled)
 		port->sm_rx_state = AD_RX_PORT_DISABLED;
@@ -1262,8 +1260,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 	else if (lacpdu && ((port->sm_rx_state == AD_RX_EXPIRED) ||
 		 (port->sm_rx_state == AD_RX_DEFAULTED) ||
 		 (port->sm_rx_state == AD_RX_CURRENT))) {
-		if (port->sm_rx_state != AD_RX_CURRENT)
-			port->sm_vars |= AD_PORT_CHURNED;
 		port->sm_rx_timer_counter = 0;
 		port->sm_rx_state = AD_RX_CURRENT;
 	} else {
@@ -1347,7 +1343,6 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 			port->partner_oper.port_state |= LACP_STATE_LACP_TIMEOUT;
 			port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(AD_SHORT_TIMEOUT));
 			port->actor_oper_port_state |= LACP_STATE_EXPIRED;
-			port->sm_vars |= AD_PORT_CHURNED;
 			break;
 		case AD_RX_DEFAULTED:
 			__update_default_selected(port);
@@ -1379,11 +1374,41 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
  * ad_churn_machine - handle port churn's state machine
  * @port: the port we're looking at
  *
+ * IEEE 802.1AX-2014 Figure 6-23 - Actor Churn Detection machine state diagram
+ *
+ *                                                    BEGIN || (! port_enabled)
+ *                                                                |
+ *                                     (3)                    (1) v
+ *   +----------------------+     ActorPort.Sync     +-------------------------+
+ *   |    NO_ACTOR_CHURN    | <--------------------- |   ACTOR_CHURN_MONITOR   |
+ *   |======================|                        |=========================|
+ *   | actor_churn = FALSE; |    ! ActorPort.Sync    | actor_churn = FALSE;    |
+ *   |                      | ---------------------> | Start actor_churn_timer |
+ *   +----------------------+          (4)           +-------------------------+
+ *              ^                                                 |
+ *              |                                                 |
+ *              |                                     actor_churn_timer expired
+ *              |                                                 |
+ *       ActorPort.Sync                                           |  (2)
+ *              |              +--------------------+             |
+ *         (3)  |              |    ACTOR_CHURN     |             |
+ *              |              |====================|             |
+ *              +------------- | actor_churn = True | <-----------+
+ *                             |                    |
+ *                             +--------------------+
+ *
+ * Similar for the Figure 6-24 - Partner Churn Detection machine state diagram
+ *
+ * We don’t need to check actor_churn, because it can only be true when the
+ * state is ACTOR_CHURN.
  */
 static void ad_churn_machine(struct port *port)
 {
-	if (port->sm_vars & AD_PORT_CHURNED) {
-		port->sm_vars &= ~AD_PORT_CHURNED;
+	bool partner_synced = port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION;
+	bool actor_synced = port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION;
+
+	/* ---- 1. begin or port not enabled ---- */
+	if ((port->sm_vars & AD_PORT_BEGIN) || !port->is_enabled) {
 		port->sm_churn_actor_state = AD_CHURN_MONITOR;
 		port->sm_churn_partner_state = AD_CHURN_MONITOR;
 		port->sm_churn_actor_timer_counter =
@@ -1392,25 +1417,46 @@ static void ad_churn_machine(struct port *port)
 			__ad_timer_to_ticks(AD_PARTNER_CHURN_TIMER, 0);
 		return;
 	}
-	if (port->sm_churn_actor_timer_counter &&
-	    !(--port->sm_churn_actor_timer_counter) &&
-	    port->sm_churn_actor_state == AD_CHURN_MONITOR) {
-		if (port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION) {
-			port->sm_churn_actor_state = AD_NO_CHURN;
-		} else {
-			port->churn_actor_count++;
-			port->sm_churn_actor_state = AD_CHURN;
-		}
+
+	if (port->sm_churn_actor_timer_counter)
+		port->sm_churn_actor_timer_counter--;
+
+	if (port->sm_churn_partner_timer_counter)
+		port->sm_churn_partner_timer_counter--;
+
+	/* ---- 2. timer expired, enter CHURN ---- */
+	if (port->sm_churn_actor_state == AD_CHURN_MONITOR &&
+	    !port->sm_churn_actor_timer_counter) {
+		port->sm_churn_actor_state = AD_CHURN;
+		port->churn_actor_count++;
 	}
-	if (port->sm_churn_partner_timer_counter &&
-	    !(--port->sm_churn_partner_timer_counter) &&
-	    port->sm_churn_partner_state == AD_CHURN_MONITOR) {
-		if (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) {
-			port->sm_churn_partner_state = AD_NO_CHURN;
-		} else {
-			port->churn_partner_count++;
-			port->sm_churn_partner_state = AD_CHURN;
-		}
+
+	if (port->sm_churn_partner_state == AD_CHURN_MONITOR &&
+	    !port->sm_churn_partner_timer_counter) {
+		port->sm_churn_partner_state = AD_CHURN;
+		port->churn_partner_count++;
+	}
+
+	/* ---- 3. CHURN_MONITOR/CHURN + sync -> NO_CHURN ---- */
+	if ((port->sm_churn_actor_state == AD_CHURN_MONITOR ||
+	     port->sm_churn_actor_state == AD_CHURN) && actor_synced)
+		port->sm_churn_actor_state = AD_NO_CHURN;
+
+	if ((port->sm_churn_partner_state == AD_CHURN_MONITOR ||
+	     port->sm_churn_partner_state == AD_CHURN) && partner_synced)
+		port->sm_churn_partner_state = AD_NO_CHURN;
+
+	/* ---- 4. NO_CHURN + !sync -> MONITOR ---- */
+	if (port->sm_churn_actor_state == AD_NO_CHURN && !actor_synced) {
+		port->sm_churn_actor_state = AD_CHURN_MONITOR;
+		port->sm_churn_actor_timer_counter =
+			__ad_timer_to_ticks(AD_ACTOR_CHURN_TIMER, 0);
+	}
+
+	if (port->sm_churn_partner_state == AD_NO_CHURN && !partner_synced) {
+		port->sm_churn_partner_state = AD_CHURN_MONITOR;
+		port->sm_churn_partner_timer_counter =
+			__ad_timer_to_ticks(AD_PARTNER_CHURN_TIMER, 0);
 	}
 }

-- 
2.50.1
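
For reasoning about the four numbered transitions outside the kernel, the
following is a minimal userspace sketch of a single churn detection machine.
It is not part of the patch: struct churn_sm, churn_tick() and CHURN_TICKS are
hypothetical names that do not exist in bond_3ad.c, and the timer length is an
arbitrary small value picked for illustration. The step order is meant to
mirror the new ad_churn_machine().

```c
#include <stdbool.h>

/* Hypothetical userspace model of one churn detection machine.
 * None of these names exist in drivers/net/bonding/bond_3ad.c.
 */
enum churn_state { CHURN_NONE, CHURN_MONITOR, CHURN_CHURNED };

#define CHURN_TICKS 3	/* assumed timer length, for illustration only */

struct churn_sm {
	enum churn_state state;
	unsigned int timer;	/* ticks left, 0 means expired */
	unsigned int count;	/* how many times CHURNED was entered */
};

/* Advance the machine by one tick, following steps 1-4 of the patch. */
static void churn_tick(struct churn_sm *sm, bool begin_or_disabled, bool synced)
{
	/* 1. BEGIN or port not enabled: restart monitoring */
	if (begin_or_disabled) {
		sm->state = CHURN_MONITOR;
		sm->timer = CHURN_TICKS;
		return;
	}

	if (sm->timer)
		sm->timer--;

	/* 2. timer expired while monitoring: enter churn */
	if (sm->state == CHURN_MONITOR && !sm->timer) {
		sm->state = CHURN_CHURNED;
		sm->count++;
	}

	/* 3. sync seen while monitoring or churned: no churn */
	if (sm->state != CHURN_NONE && synced)
		sm->state = CHURN_NONE;

	/* 4. sync lost: monitor again with a fresh timer */
	if (sm->state == CHURN_NONE && !synced) {
		sm->state = CHURN_MONITOR;
		sm->timer = CHURN_TICKS;
	}
}
```

In this model, a port that never reaches sync moves from monitor to churned
once CHURN_TICKS ticks have elapsed, and a churned port leaves that state on
the first tick where sync is set. The second case corresponds to the stuck
state described in the commit message.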