From: Hangbin Liu <liuhangbin@gmail.com>
To: Jay Vosburgh <jv@jvosburgh.net>
Cc: netdev@vger.kernel.org, Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Shuah Khan <shuah@kernel.org>,
Nikolay Aleksandrov <razor@blackwall.org>,
Mahesh Bandewar <maheshb@google.com>,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
Date: Fri, 27 Feb 2026 06:21:12 +0000
Message-ID: <aaE32DlfrX9S5KNT@fedora>
In-Reply-To: <949695.1772167347@famine>
On Thu, Feb 26, 2026 at 08:42:27PM -0800, Jay Vosburgh wrote:
> >> I'm not sure I'm seeing the problem here, is there an actual
> >> misbehavior being fixed here? The port is receiving LACPDUs, and from
> >> the receive state machine point of view (Figure 6-18) there's no issue.
> >> The "port_enabled" variable (6.4.7) also informs the state machine
> >> behavior, but that's not the same as what's changed by bonding's
> >> __disable_port function.
> >
> >Yes, the reason I do it here is we select another aggregator and called
> >__disable_port() for the old one. If we don't update sm_rx_state, the port
> >will be keep in collecting/distributing state, and the partner will also
> >keep in the c/d state.
> >
> >Here we entered a logical paradox, on one hand we want to disable the port,
> >on the other hand we keep the port in collecting/distributing state.
>
> "disable" the port here really means from bonding's perspective,
> so, generally equivalent to the backup interface of an active-backup
> mode bond.
Oh, got it.
>
> Such a backup interface is typically carrier up and able to send
> or receive packets. The peer generally won't send packets to the backup
> interface, however, as no traffic is sent from the backup, and the MAC
> for the bond uses a different interface, so no forwarding entries will
> direct to the backup interface.
>
> There are a couple of special cases, like LLDP, that are handled
> as an exception, but in general, if a peer does send packets to the
> backup interface (due to a switch flood, for example), they're dropped.
OK, this makes sense to me.
>
> >> Where I'm going with this is that, when multiple aggregator
> >> support was originally implemented, the theory was to keep aggregators
> >> other than the active agg in a state such that they could be put into
> >> service immediately, without having to do LACPDU exchanges in order to
> >> transition into the appropriate state. A hot standby, basically,
> >> analogous to an active-backup mode backup interface with link state up.
> >
> >This sounds good. But without LACPDU exchange, the hot standby actor and
^^ I meant "with LACPDU exchange" here.
> >partner should be in collecting/distributing state. What should we do when
> >partner start send packets to us?
>
> Did you mean "should not be in c/d state" above? I.e., without
> LACPDU exchage, ... not in c/d state?
>
> Regardless, as above, the situation is generally equivalent to a
> backup interface in active-backup mode: incoming traffic that isn't a
> special case is dropped. Normal traffic (bearing the bond source MAC)
> isn't sent, as that would update the peer's forwarding table.
>
> Nothing in the standard prohibits us from having multiple
> aggregators in c/d state simultaneously. A configuration with two
> separate bonds, each with interfaces successfully aggregated together
> with their respective peers, wherein those two bonds are placed into a
> third bond in active-backup mode is essentially the same thing as what
> we're discussing.
In theory this looks good. But in practice, when we do a failover and set the
previous active port to disabled via
- __disable_port(port)
- slave->rx_disabled = 1
this prevents the failover port from getting back to the c/d state. For
example, in my testing (see details in patch 03), we have 4 ports: eth0, eth1,
eth2, and eth3. eth0 and eth1 are in agg1; eth2 and eth3 are in agg2. If we do
a failover on eth1, then when eth1 comes back up, the final state will be:
3: eth0@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 10
4: eth1@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync> actor_port_prio 255
5: eth2@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
bond_slave state ACTIVE ad_aggregator_id 2 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 1000
6: eth3@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
bond_slave state ACTIVE ad_aggregator_id 2 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 255
7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
bond mode 802.3ad actor_port_prio ad_aggregator 2
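So you can see that eth0 is in the c/d state, while eth1 is only active and
aggregating (no in_sync, collecting, or distributing), even though both ports
are in the same aggregator. Do you think this is a correct state?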
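
For reference, here is a rough way to spot this kind of mismatch
automatically. It is only a sketch, not part of the patch or the selftest:
it assumes the two-line-per-slave layout pasted above (real
"ip -d link show" output may wrap differently), and the helper names
parse_slaves(), in_cd() and mixed_aggregators() are made up for this
example. The sample text is abridged from the eth0/eth1 entries above.

```python
import re

# Abridged copy of the eth0/eth1 entries from the output above.
SAMPLE = """\
3: eth0@if3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 master bond0 state UP
bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> actor_port_prio 10
4: eth1@if4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 master bond0 state UP
bond_slave state BACKUP ad_aggregator_id 1 ad_actor_oper_port_state_str <active,short_timeout,aggregating> ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync> actor_port_prio 255
"""

# "3: eth0@if3: ..." -> captures "eth0"
IFACE_RE = re.compile(r"^\d+:\s+([^@:\s]+)")
# Assumed field order, taken from the pasted output only.
SLAVE_RE = re.compile(
    r"bond_slave state (\S+) ad_aggregator_id (\d+) "
    r"ad_actor_oper_port_state_str <([^>]*)> "
    r"ad_partner_oper_port_state_str <([^>]*)>"
)


def parse_slaves(text):
    """Map interface name -> bond state, aggregator id, actor/partner flags."""
    slaves = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        m = IFACE_RE.match(line)
        if m:
            name = m.group(1)
            continue
        m = SLAVE_RE.search(line)
        if m and name is not None:
            slaves[name] = {
                "state": m.group(1),
                "agg": int(m.group(2)),
                "actor": set(m.group(3).split(",")),
                "partner": set(m.group(4).split(",")),
            }
    return slaves


def in_cd(flags):
    """True if both collecting and distributing are set."""
    return {"collecting", "distributing"} <= flags


def mixed_aggregators(slaves):
    """Return aggregators whose ports disagree on actor c/d state."""
    by_agg = {}
    for name, s in slaves.items():
        by_agg.setdefault(s["agg"], {})[name] = in_cd(s["actor"])
    return {agg: ports for agg, ports in by_agg.items()
            if len(set(ports.values())) > 1}


if __name__ == "__main__":
    # Prints {1: {'eth0': True, 'eth1': False}} for the sample above.
    print(mixed_aggregators(parse_slaves(SAMPLE)))
```

With the sample above, aggregator 1 is reported because eth0 is in c/d
and eth1 is not. Whether that counts as a bug is exactly the question
here; the script only makes the mismatch visible.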
Thanks
Hangbin