* [PATCH net 0/2] bonding: 3ad: fix carrier state with no valid slaves
@ 2026-03-16 13:18 Louis Scalbert
2026-03-16 13:18 ` [PATCH net 1/2] bonding: 3ad: fix carrier when " Louis Scalbert
2026-03-16 13:18 ` [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down Louis Scalbert
0 siblings, 2 replies; 7+ messages in thread
From: Louis Scalbert @ 2026-03-16 13:18 UTC (permalink / raw)
To: netdev
Cc: jandrew+netdev, jv, edumazet, kuba, pabeni, fbl, andy, shemminger,
Louis Scalbert
Hi everyone,
This series addresses a blackholing issue and a subsequent link-flapping
issue in the 802.3ad bonding driver when dealing with inactive slaves
and the `min_links` parameter.
Currently, when an 802.3ad bond has no slaves in the collecting
/ distributing state, the bonding master still reports its carrier as up.
This causes upper-layer protocols to consider the interface operational,
resulting in silently dropped traffic.
Patch 1 fixes the core issue by ensuring the carrier is only asserted when
at least `min_links` slaves are genuinely in a valid state (collecting/
distributing, or collecting only if coupled_control is disabled).
Patch 2 fixes a side effect of the first patch. Tightening the carrier
logic exposes a state persistence bug: when a physical link goes down,
the LACP collecting/distributing flags remain set. When the link returns,
the interface briefly hallucinates that it is ready, bounces the carrier
up, and then drops it again once LACP renegotiation starts. Unsetting the
SELECTED flag when the link goes down forces the state machine through
DETACHED, clearing the stale flags and preventing the flap.
Louis Scalbert (2):
bonding: 3ad: fix carrier when no valid slaves
bonding: 3ad: fix mux port state on oper down
drivers/net/bonding/bond_3ad.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net 1/2] bonding: 3ad: fix carrier when no valid slaves
2026-03-16 13:18 [PATCH net 0/2] bonding: 3ad: fix carrier state with no valid slaves Louis Scalbert
@ 2026-03-16 13:18 ` Louis Scalbert
2026-03-19 1:14 ` Jay Vosburgh
2026-03-16 13:18 ` [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down Louis Scalbert
1 sibling, 1 reply; 7+ messages in thread
From: Louis Scalbert @ 2026-03-16 13:18 UTC (permalink / raw)
To: netdev
Cc: jandrew+netdev, jv, edumazet, kuba, pabeni, fbl, andy, shemminger,
Louis Scalbert
When an 802.3ad (LACP) bonding interface has no slaves in the
collecting/distributing state, the bonding master may still report
carrier as up. In this situation, no slave is actually able to transmit
or receive traffic.
As a result, upper-layer daemons consider the interface operational
while traffic is effectively blackholed.
Fix this by asserting carrier only when at least 'min_links' slaves are
in the collecting/distributing state (or collecting only if the
coupled_control default behavior is disabled).
Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
---
drivers/net/bonding/bond_3ad.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index af7f74cfdc08..7d7661972c5e 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -745,6 +745,22 @@ static void __set_agg_ports_ready(struct aggregator *aggregator, int val)
}
}
+static int __agg_valid_ports(struct aggregator *agg)
+{
+ struct port *port;
+ int valid = 0;
+
+ for (port = agg->lag_ports; port;
+ port = port->next_port_in_aggregator) {
+ if (port->actor_oper_port_state & LACP_STATE_COLLECTING &&
+ (!port->slave->bond->params.coupled_control ||
+ port->actor_oper_port_state & LACP_STATE_DISTRIBUTING))
+ valid++;
+ }
+
+ return valid;
+}
+
static int __agg_active_ports(struct aggregator *agg)
{
struct port *port;
@@ -2120,6 +2136,7 @@ static void ad_enable_collecting_distributing(struct port *port,
port->actor_port_number,
port->aggregator->aggregator_identifier);
__enable_port(port);
+ bond_3ad_set_carrier(port->slave->bond);
/* Slave array needs update */
*update_slave_arr = true;
/* Should notify peers if possible */
@@ -2141,6 +2158,7 @@ static void ad_disable_collecting_distributing(struct port *port,
port->actor_port_number,
port->aggregator->aggregator_identifier);
__disable_port(port);
+ bond_3ad_set_carrier(port->slave->bond);
/* Slave array needs an update */
*update_slave_arr = true;
}
@@ -2819,8 +2837,8 @@ int bond_3ad_set_carrier(struct bonding *bond)
}
active = __get_active_agg(&(SLAVE_AD_INFO(first_slave)->aggregator));
if (active) {
- /* are enough slaves available to consider link up? */
- if (__agg_active_ports(active) < bond->params.min_links) {
+ /* are enough slaves in collecting (and distributing) state to consider link up? */
+ if (__agg_valid_ports(active) < bond->params.min_links) {
if (netif_carrier_ok(bond->dev)) {
netif_carrier_off(bond->dev);
goto out;
--
2.39.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down
2026-03-16 13:18 [PATCH net 0/2] bonding: 3ad: fix carrier state with no valid slaves Louis Scalbert
2026-03-16 13:18 ` [PATCH net 1/2] bonding: 3ad: fix carrier when " Louis Scalbert
@ 2026-03-16 13:18 ` Louis Scalbert
2026-03-19 1:40 ` Jay Vosburgh
1 sibling, 1 reply; 7+ messages in thread
From: Louis Scalbert @ 2026-03-16 13:18 UTC (permalink / raw)
To: netdev
Cc: jandrew+netdev, jv, edumazet, kuba, pabeni, fbl, andy, shemminger,
Louis Scalbert
When the bonding interface has carrier down due to the absence of
valid slaves and a slave transitions from down to up, the bonding
interface briefly goes carrier up, then down again, and finally up
once LACP negotiates collecting and distributing on the port.
The interface should not transition to carrier up until LACP
negotiation is complete.
This happens because the actor and partner port states remain in
collecting (and distributing) when the port goes down. When the port
comes back up, it temporarily remains in this state until LACP
renegotiation occurs.
Previously this was mostly cosmetic, but since the bonding carrier
state now depends on the LACP negotiation state, it causes the
interface to flap.
Fix this by unsetting the SELECTED flag when a port goes down so that
the mux state machine transitions through ATTACHED and DETACHED,
which clears the actor collecting and distributing flags.
Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
---
drivers/net/bonding/bond_3ad.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 7d7661972c5e..367d89977000 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -716,6 +716,8 @@ static int __agg_ports_are_ready(struct aggregator *aggregator)
for (port = aggregator->lag_ports;
port;
port = port->next_port_in_aggregator) {
+ if (!port->is_enabled)
+ continue;
if (!(port->sm_vars & AD_PORT_READY_N)) {
retval = 0;
break;
@@ -1570,6 +1572,9 @@ static void ad_port_selection_logic(struct port *port, bool *update_slave_arr)
struct slave *slave;
int found = 0;
+ if (!port->is_enabled)
+ return;
+
/* if the port is already Selected, do nothing */
if (port->sm_vars & AD_PORT_SELECTED)
return;
@@ -2794,6 +2799,7 @@ void bond_3ad_handle_link_change(struct slave *slave, char link)
/* link has failed */
port->is_enabled = false;
ad_update_actor_keys(port, true);
+ port->sm_vars &= ~AD_PORT_SELECTED;
}
agg = __get_first_agg(port);
ad_agg_selection_logic(agg, &dummy);
--
2.39.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net 1/2] bonding: 3ad: fix carrier when no valid slaves
2026-03-16 13:18 ` [PATCH net 1/2] bonding: 3ad: fix carrier when " Louis Scalbert
@ 2026-03-19 1:14 ` Jay Vosburgh
2026-03-20 14:06 ` Louis Scalbert
0 siblings, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2026-03-19 1:14 UTC (permalink / raw)
To: Louis Scalbert
Cc: netdev, jandrew+netdev, edumazet, kuba, pabeni, fbl, andy,
shemminger
Louis Scalbert <louis.scalbert@6wind.com> wrote:
>When an 802.3ad (LACP) bonding interface has no slaves in the
>collecting/distributing state, the bonding master may still report
>carrier as up. In this situation, no slave is actually able to transmit
>or receive traffic.
>
>As a result, upper-layer daemons consider the interface operational
>while traffic is effectively blackholed.
>
>Fix this by asserting carrier only when at least 'min_links' slaves are
>in the collecting/distributing state (or collecting only if the
>coupled_control default behavior is disabled).
>
>Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
>Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
>---
> drivers/net/bonding/bond_3ad.c | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index af7f74cfdc08..7d7661972c5e 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -745,6 +745,22 @@ static void __set_agg_ports_ready(struct aggregator *aggregator, int val)
> }
> }
>
>+static int __agg_valid_ports(struct aggregator *agg)
>+{
>+ struct port *port;
>+ int valid = 0;
>+
>+ for (port = agg->lag_ports; port;
>+ port = port->next_port_in_aggregator) {
>+ if (port->actor_oper_port_state & LACP_STATE_COLLECTING &&
>+ (!port->slave->bond->params.coupled_control ||
>+ port->actor_oper_port_state & LACP_STATE_DISTRIBUTING))
>+ valid++;
>+ }
>+
>+ return valid;
>+}
Did you consider instead rolling the COLLECTING / DISTRIBUTING
test into __agg_active_ports? It appears at first glance that the users
of __agg_active_ports would do the right thing with the "able to
communicate" logic.
As an example, logically the aggregator selection logic should
count "usable" ports, not simply ports that are enabled and may or may
not be usable.
-J
>+
> static int __agg_active_ports(struct aggregator *agg)
> {
> struct port *port;
>@@ -2120,6 +2136,7 @@ static void ad_enable_collecting_distributing(struct port *port,
> port->actor_port_number,
> port->aggregator->aggregator_identifier);
> __enable_port(port);
>+ bond_3ad_set_carrier(port->slave->bond);
> /* Slave array needs update */
> *update_slave_arr = true;
> /* Should notify peers if possible */
>@@ -2141,6 +2158,7 @@ static void ad_disable_collecting_distributing(struct port *port,
> port->actor_port_number,
> port->aggregator->aggregator_identifier);
> __disable_port(port);
>+ bond_3ad_set_carrier(port->slave->bond);
> /* Slave array needs an update */
> *update_slave_arr = true;
> }
>@@ -2819,8 +2837,8 @@ int bond_3ad_set_carrier(struct bonding *bond)
> }
> active = __get_active_agg(&(SLAVE_AD_INFO(first_slave)->aggregator));
> if (active) {
>- /* are enough slaves available to consider link up? */
>- if (__agg_active_ports(active) < bond->params.min_links) {
>+ /* are enough slaves in collecting (and distributing) state to consider link up? */
>+ if (__agg_valid_ports(active) < bond->params.min_links) {
> if (netif_carrier_ok(bond->dev)) {
> netif_carrier_off(bond->dev);
> goto out;
>--
>2.39.2
>
---
-Jay Vosburgh, jv@jvosburgh.net
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down
2026-03-16 13:18 ` [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down Louis Scalbert
@ 2026-03-19 1:40 ` Jay Vosburgh
2026-03-20 10:01 ` Louis Scalbert
0 siblings, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2026-03-19 1:40 UTC (permalink / raw)
To: Louis Scalbert
Cc: netdev, jandrew+netdev, edumazet, kuba, pabeni, fbl, andy,
shemminger
Louis Scalbert <louis.scalbert@6wind.com> wrote:
>When the bonding interface has carrier down due to the absence of
>valid slaves and a slave transitions from down to up, the bonding
>interface briefly goes carrier up, then down again, and finally up
>once LACP negotiates collecting and distributing on the port.
>
>The interface should not transition to carrier up until LACP
>negotiation is complete.
>
>This happens because the actor and partner port states remain in
>collecting (and distributing) when the port goes down. When the port
>comes back up, it temporarily remains in this state until LACP
>renegotiation occurs.
>
>Previously this was mostly cosmetic, but since the bonding carrier
>state now depends on the LACP negotiation state, it causes the
>interface to flap.
>
>Fix this by unsetting the SELECTED flag when a port goes down so that
>the mux state machine transitions through ATTACHED and DETACHED,
>which clears the actor collecting and distributing flags.
>
>Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
>Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
>---
> drivers/net/bonding/bond_3ad.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index 7d7661972c5e..367d89977000 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -716,6 +716,8 @@ static int __agg_ports_are_ready(struct aggregator *aggregator)
> for (port = aggregator->lag_ports;
> port;
> port = port->next_port_in_aggregator) {
>+ if (!port->is_enabled)
>+ continue;
Your commit message mentions clearing SELECTED, further below,
but not the new two tests for port->is_enabled. In this change,
__agg_ports_are_ready is testing that all ports of an aggregator are
READY_N, but if a port of the aggregator is not is_enabled, then
logically it would not be READY_N, either. So why exclude it here?
Is a this is an optimization to settle the aggregator state more
quickly, as an !is_enabled port is likely to be transitioning out of the
aggregator?
> if (!(port->sm_vars & AD_PORT_READY_N)) {
> retval = 0;
> break;
>@@ -1570,6 +1572,9 @@ static void ad_port_selection_logic(struct port *port, bool *update_slave_arr)
> struct slave *slave;
> int found = 0;
>
>+ if (!port->is_enabled)
>+ return;
>+
Same question as above, why exclude !is_enabled ports here?
-J
> /* if the port is already Selected, do nothing */
> if (port->sm_vars & AD_PORT_SELECTED)
> return;
>@@ -2794,6 +2799,7 @@ void bond_3ad_handle_link_change(struct slave *slave, char link)
> /* link has failed */
> port->is_enabled = false;
> ad_update_actor_keys(port, true);
>+ port->sm_vars &= ~AD_PORT_SELECTED;
> }
> agg = __get_first_agg(port);
> ad_agg_selection_logic(agg, &dummy);
>--
>2.39.2
>
---
-Jay Vosburgh, jv@jvosburgh.net
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down
2026-03-19 1:40 ` Jay Vosburgh
@ 2026-03-20 10:01 ` Louis Scalbert
0 siblings, 0 replies; 7+ messages in thread
From: Louis Scalbert @ 2026-03-20 10:01 UTC (permalink / raw)
To: Jay Vosburgh
Cc: netdev, andrew+netdev, edumazet, kuba, pabeni, fbl, andy,
shemminger
Thank you for the review.
Le jeu. 19 mars 2026 à 02:40, Jay Vosburgh <jv@jvosburgh.net> a écrit :
>
> Louis Scalbert <louis.scalbert@6wind.com> wrote:
>
> >When the bonding interface has carrier down due to the absence of
> >valid slaves and a slave transitions from down to up, the bonding
> >interface briefly goes carrier up, then down again, and finally up
> >once LACP negotiates collecting and distributing on the port.
> >
> >The interface should not transition to carrier up until LACP
> >negotiation is complete.
> >
> >This happens because the actor and partner port states remain in
> >collecting (and distributing) when the port goes down. When the port
> >comes back up, it temporarily remains in this state until LACP
> >renegotiation occurs.
> >
> >Previously this was mostly cosmetic, but since the bonding carrier
> >state now depends on the LACP negotiation state, it causes the
> >interface to flap.
> >
> >Fix this by unsetting the SELECTED flag when a port goes down so that
> >the mux state machine transitions through ATTACHED and DETACHED,
> >which clears the actor collecting and distributing flags.
I will add in the next version: Do not attempt to set the SELECTED flag if the
interface is still down (!port->is_enabled).
>
> >
> >Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
> >Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
> >---
> > drivers/net/bonding/bond_3ad.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >index 7d7661972c5e..367d89977000 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -716,6 +716,8 @@ static int __agg_ports_are_ready(struct aggregator *aggregator)
> > for (port = aggregator->lag_ports;
> > port;
> > port = port->next_port_in_aggregator) {
> >+ if (!port->is_enabled)
> >+ continue;
>
> Your commit message mentions clearing SELECTED, further below,
> but not the new two tests for port->is_enabled. In this change,
>
> __agg_ports_are_ready is testing that all ports of an aggregator are
> READY_N, but if a port of the aggregator is not is_enabled, then
> logically it would not be READY_N, either. So why exclude it here?
__agg_ports_are_ready was testing that all ports were READY_N independently
of the ports operation state (up/down). Now, it is testing that only
the enabled ports
are READY_N.
>
> Is a this is an optimization to settle the aggregator state more
> quickly, as an !is_enabled port is likely to be transitioning out of the
> aggregator?
This is not an optimization, but a fix for a side effect caused by unsetting
the SELECTED flag: after a link comes back up, it may no longer be able to
renegotiate LACP.
Consider the following scenario:
1. All aggregator ports go down
- The SELECTED flag is cleared on all of them.
2. One port comes back up
- Its SELECTED flag is set again.
- It enters the WAITING state and gets its READY_N flag.
- The remaining ports stay UNSELECTED. Because of that, they cannot enter the
WAITING state and therefore never get READY_N.
- __agg_ports_are_ready() returns 0 because it finds a port without READY_N.
- As a result, __set_agg_ports_ready() keeps the READY flag cleared on all
ports.
- The port that came back up is therefore not marked READY and cannot
transition to ATTACHED.
- LACP negotiation becomes stuck, and the port cannot be used.
3. All aggregator ports come back up
- They all regain SELECTED and READY_N.
- __agg_ports_are_ready() now returns 1.
- __set_agg_ports_ready() sets READY on all ports.
- They can then transition to ATTACHED.
- Negotiation resumes and the aggregator becomes operational again.
The fix is to consider only enabled ports, which allows negotiation to proceed
in step 2. In that case, __agg_ports_are_ready() returns 1.
The solution is consistent with IEEE 802.3ad, which states:
"The Selection Logic asserts Ready TRUE when the values of Ready_N for all ports
that are waiting to attach to a given Aggregator are TRUE."
An even more accurate fix would be to check only ports currently in the WAITING
mux state for READY_N. I plan to send a new version implementing that behavior,
and to move the __agg_ports_are_ready() change in a dedicated commit.
>
> > if (!(port->sm_vars & AD_PORT_READY_N)) {
> > retval = 0;
> > break;
> >@@ -1570,6 +1572,9 @@ static void ad_port_selection_logic(struct port *port, bool *update_slave_arr)
> > struct slave *slave;
> > int found = 0;
> >
> >+ if (!port->is_enabled)
> >+ return;
> >+
>
> Same question as above, why exclude !is_enabled ports here?
This function is intended to transition UNSELECTED ports into the SELECTED
state. The change prevents attempting this transition while the port is still
down.
I will add a comment here in version 2.
>
>
> -J
>
> > /* if the port is already Selected, do nothing */
> > if (port->sm_vars & AD_PORT_SELECTED)
> > return;
> >@@ -2794,6 +2799,7 @@ void bond_3ad_handle_link_change(struct slave *slave, char link)
> > /* link has failed */
> > port->is_enabled = false;
> > ad_update_actor_keys(port, true);
> >+ port->sm_vars &= ~AD_PORT_SELECTED;
> > }
> > agg = __get_first_agg(port);
> > ad_agg_selection_logic(agg, &dummy);
> >--
> >2.39.2
> >
>
> ---
> -Jay Vosburgh, jv@jvosburgh.net
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 1/2] bonding: 3ad: fix carrier when no valid slaves
2026-03-19 1:14 ` Jay Vosburgh
@ 2026-03-20 14:06 ` Louis Scalbert
0 siblings, 0 replies; 7+ messages in thread
From: Louis Scalbert @ 2026-03-20 14:06 UTC (permalink / raw)
To: Jay Vosburgh
Cc: netdev, edumazet, kuba, pabeni, fbl, andy, shemminger,
andrew+netdev
Le jeu. 19 mars 2026 à 02:14, Jay Vosburgh <jv@jvosburgh.net> a écrit :
>
> Louis Scalbert <louis.scalbert@6wind.com> wrote:
>
> >When an 802.3ad (LACP) bonding interface has no slaves in the
> >collecting/distributing state, the bonding master may still report
> >carrier as up. In this situation, no slave is actually able to transmit
> >or receive traffic.
> >
> >As a result, upper-layer daemons consider the interface operational
> >while traffic is effectively blackholed.
> >
> >Fix this by asserting carrier only when at least 'min_links' slaves are
> >in the collecting/distributing state (or collecting only if the
> >coupled_control default behavior is disabled).
> >
> >Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
> >Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
> >---
> > drivers/net/bonding/bond_3ad.c | 22 ++++++++++++++++++++--
> > 1 file changed, 20 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >index af7f74cfdc08..7d7661972c5e 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -745,6 +745,22 @@ static void __set_agg_ports_ready(struct aggregator *aggregator, int val)
> > }
> > }
> >
> >+static int __agg_valid_ports(struct aggregator *agg)
> >+{
> >+ struct port *port;
> >+ int valid = 0;
> >+
> >+ for (port = agg->lag_ports; port;
> >+ port = port->next_port_in_aggregator) {
> >+ if (port->actor_oper_port_state & LACP_STATE_COLLECTING &&
> >+ (!port->slave->bond->params.coupled_control ||
> >+ port->actor_oper_port_state & LACP_STATE_DISTRIBUTING))
> >+ valid++;
> >+ }
> >+
> >+ return valid;
> >+}
>
> Did you consider instead rolling the COLLECTING / DISTRIBUTING
> test into __agg_active_ports? It appears at first glance that the users
> of __agg_active_ports would do the right thing with the "able to
> communicate" logic.
Yes, I did, but it is not relevant here. In those other places, the code is
meant to count enabled ports, not ports in the COLLECTING/DISTRIBUTING state.
>
> As an example, logically the aggregator selection logic should
> count "usable" ports, not simply ports that are enabled and may or may
> not be usable.
If by usable you mean COLLECTING / DISTRIBUTING, that cannot work, because the
aggregator selection logic runs before the mux state machine that performs the
transition to COLLECTING / DISTRIBUTING. In bond_3ad_state_machine_handler(),
ad_port_selection_logic() is called before ad_mux_machine().
This ordering is correct. The aggregator selection logic is responsible for
selecting the best set of ports among the configured enslaved ports. Only the
ports selected at this stage are allowed to leave the WAITING mux state and
later transition to ATTACHED, then to COLLECTING / DISTRIBUTING.
ad_agg_selection_test() uses __agg_active_ports() to compare the number of
enabled ports in the current aggregator and in a candidate one, then selects
the aggregator with the highest number of enabled ports. Using the number of
COLLECTING / DISTRIBUTING ports instead would not make sense, because only the
currently selected aggregator can already have ports in that state.
__agg_active_ports() callers and sub-callers are:
- __bond_3ad_get_active_agg_info
=> used to display the enabled ports in /proc/net/bonding/
=> this is correct
- __get_agg_bandwidth
- ad_agg_selection_test
=> compute the total bandwidth available on the enabled ports. Correct
- ad_agg_selection_test
- called by ad_agg_selection_logic
- ad_agg_selection_logic
- ad_port_selection_logic
- bond_3ad_state_machine_handler
- bond_3ad_handle_link_change
- bond_3ad_state_machine_handler
- bond_3ad_unbind_slave
=> the selection logic takes into account the enabled link. Correct
- bond_3ad_unbind_slave
=> When a port is removed from the bond, check whether its aggregator still
has any enabled ports. If none remain and the aggregator was active,
invoke the aggregator selection logic. This behavior is correct.
Best regards,
Louis Scalbert
>
> -J
>
> >+
> > static int __agg_active_ports(struct aggregator *agg)
> > {
> > struct port *port;
> >@@ -2120,6 +2136,7 @@ static void ad_enable_collecting_distributing(struct port *port,
> > port->actor_port_number,
> > port->aggregator->aggregator_identifier);
> > __enable_port(port);
> >+ bond_3ad_set_carrier(port->slave->bond);
> > /* Slave array needs update */
> > *update_slave_arr = true;
> > /* Should notify peers if possible */
> >@@ -2141,6 +2158,7 @@ static void ad_disable_collecting_distributing(struct port *port,
> > port->actor_port_number,
> > port->aggregator->aggregator_identifier);
> > __disable_port(port);
> >+ bond_3ad_set_carrier(port->slave->bond);
> > /* Slave array needs an update */
> > *update_slave_arr = true;
> > }
> >@@ -2819,8 +2837,8 @@ int bond_3ad_set_carrier(struct bonding *bond)
> > }
> > active = __get_active_agg(&(SLAVE_AD_INFO(first_slave)->aggregator));
> > if (active) {
> >- /* are enough slaves available to consider link up? */
> >- if (__agg_active_ports(active) < bond->params.min_links) {
> >+ /* are enough slaves in collecting (and distributing) state to consider link up? */
> >+ if (__agg_valid_ports(active) < bond->params.min_links) {
> > if (netif_carrier_ok(bond->dev)) {
> > netif_carrier_off(bond->dev);
> > goto out;
> >--
> >2.39.2
> >
>
> ---
> -Jay Vosburgh, jv@jvosburgh.net
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-20 14:07 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-16 13:18 [PATCH net 0/2] bonding: 3ad: fix carrier state with no valid slaves Louis Scalbert
2026-03-16 13:18 ` [PATCH net 1/2] bonding: 3ad: fix carrier when " Louis Scalbert
2026-03-19 1:14 ` Jay Vosburgh
2026-03-20 14:06 ` Louis Scalbert
2026-03-16 13:18 ` [PATCH net 2/2] bonding: 3ad: fix mux port state on oper down Louis Scalbert
2026-03-19 1:40 ` Jay Vosburgh
2026-03-20 10:01 ` Louis Scalbert
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox