linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/2] bonding: fix LACP negotiation issues in passive mode
@ 2025-07-09  9:03 Hangbin Liu
  2025-07-09  9:03 ` [PATCH net 1/2] bonding: update ntt to true " Hangbin Liu
  2025-07-09  9:03 ` [PATCH net 2/2] selftests: bonding: add test for passive LACP mode Hangbin Liu
  0 siblings, 2 replies; 15+ messages in thread
From: Hangbin Liu @ 2025-07-09  9:03 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel, Hangbin Liu

This patchset fixes an issue where bonding fails to establish a stable LACP
negotiation when operating in passive mode (lacp_active=off).

In passive mode, the current implementation only replies when the partner's
state changes, which results in LACP timeout and unstable aggregator formation.

With this change, the bond responds to each received LACPDU in passive mode
by setting ntt = true, ensuring timely replies and stable LACP negotiation.

Hangbin Liu (2):
  bonding: update ntt to true in passive mode
  selftests: bonding: add test for passive LACP mode

 drivers/net/bonding/bond_3ad.c                |  6 ++
 .../drivers/net/bonding/bond_passive_lacp.sh  | 21 +++++
 .../drivers/net/bonding/bond_topo_lacp.sh     | 77 +++++++++++++++++++
 3 files changed, 104 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
 create mode 100644 tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh

-- 
2.46.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-09  9:03 [PATCH net 0/2] bonding: fix LACP negotiation issues in passive mode Hangbin Liu
@ 2025-07-09  9:03 ` Hangbin Liu
  2025-07-16  4:19   ` Jay Vosburgh
  2025-07-09  9:03 ` [PATCH net 2/2] selftests: bonding: add test for passive LACP mode Hangbin Liu
  1 sibling, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2025-07-09  9:03 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel, Hangbin Liu

When lacp_active is set to off, the bond operates in passive mode, meaning it
will only "speak when spoken to." However, the current kernel implementation
only sends an LACPDU in response when the partner's state changes.

In this situation, once LACP negotiation succeeds, the actor stops sending
LACPDUs until the partner times out and sends an "expired" LACPDU.
This leads to endless LACP state flapping.

To avoid this, we need update ntt to true once received an LACPDU from the
partner, ensuring an immediate reply. With this fix, the link becomes stable
in most cases, except for one specific scenario:

Actor: lacp_active=off, lacp_rate=slow
Partner: lacp_active=on, lacp_rate=fast

In this case, the partner expects frequent LACPDUs (every 1 second), but the
actor only responds after receiving an LACPDU, which, in this setup, the
partner sends every 30 seconds due to the actor's lacp_rate=slow. By the time
the actor replies, the partner has already timed out and sent an "expired"
LACPDU.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 drivers/net/bonding/bond_3ad.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c6807e473ab7..e001d1c8a49b 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -666,6 +666,8 @@ static void __update_default_selected(struct port *port)
  */
 static void __update_ntt(struct lacpdu *lacpdu, struct port *port)
 {
+	struct bonding *bond;
+
 	/* validate lacpdu and port */
 	if (lacpdu && port) {
 		/* check if any parameter is different then
@@ -683,6 +685,10 @@ static void __update_ntt(struct lacpdu *lacpdu, struct port *port)
 		   ) {
 			port->ntt = true;
 		}
+
+		bond = __get_bond_by_port(port);
+		if (bond && !bond->params.lacp_active)
+			port->ntt = true;
 	}
 }
 
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-09  9:03 [PATCH net 0/2] bonding: fix LACP negotiation issues in passive mode Hangbin Liu
  2025-07-09  9:03 ` [PATCH net 1/2] bonding: update ntt to true " Hangbin Liu
@ 2025-07-09  9:03 ` Hangbin Liu
  2025-07-15  9:37   ` Paolo Abeni
  1 sibling, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2025-07-09  9:03 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel, Hangbin Liu

Add a selftest to verify bonding behavior when lacp_active is set to off.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 .../drivers/net/bonding/bond_passive_lacp.sh  | 21 +++++
 .../drivers/net/bonding/bond_topo_lacp.sh     | 77 +++++++++++++++++++
 2 files changed, 98 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
 create mode 100644 tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh

diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
new file mode 100755
index 000000000000..4cf8a5999aaa
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Testing if bond works with lacp_active = off
+
+lib_dir=$(dirname "$0")
+source ${lib_dir}/bond_topo_lacp.sh
+
+trap cleanup EXIT
+setup_topo_lacp
+
+lacp_bond_reset "${c_ns}" "lacp_active off"
+# make sure the switch state is not expired [A,T,G,S,Ex]
+if slowwait 15 ip netns exec ${s_ns} grep -q 'port state: 143' /proc/net/bonding/bond0; then
+	RET=1
+else
+	RET=0
+fi
+log_test "bond 802.3ad" "lacp_active off"
+
+exit $EXIT_STATUS
diff --git a/tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh b/tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh
new file mode 100644
index 000000000000..33dca6ebdc4b
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh
@@ -0,0 +1,77 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Topology for Bond mode 802.3ad testing
+#
+#  +-------------------------+
+#  |          bond0          |  Switch
+#  |            +            |  192.0.2.254/24
+#  |      eth0  |  eth1      |  2001:db8::254/24
+#  |        +---+---+        |
+#  |        |       |        |
+#  +-------------------------+
+#           |       |
+#  +-------------------------+
+#  |        |       |        |
+#  |        +---+---+        |  Client
+#  |      eth0  |  eth1      |  192.0.2.1/24
+#  |            +            |  2001:db8::1/24
+#  |          bond0          |
+#  +-------------------------+
+
+REQUIRE_MZ=no
+NUM_NETIFS=0
+lib_dir=$(dirname "$0")
+source "$lib_dir"/../../../net/forwarding/lib.sh
+
+s_ip4="192.0.2.254"
+c_ip4="192.0.2.1"
+s_ip6="2001:db8::254"
+c_ip6="2001:db8::1"
+
+switch_create()
+{
+	ip netns exec ${s_ns} sysctl -q net.ipv6.conf.default.keep_addr_on_down=1
+	ip -n ${s_ns} link add eth0 type veth peer name eth0 netns ${c_ns}
+	ip -n ${s_ns} link add eth1 type veth peer name eth1 netns ${c_ns}
+	ip -n ${s_ns} link add bond0 type bond mode 802.3ad miimon 100 lacp_active on lacp_rate fast
+	ip -n ${s_ns} link set eth0 master bond0
+	ip -n ${s_ns} link set eth1 master bond0
+	ip -n ${s_ns} addr add ${s_ip4}/24 dev bond0
+	ip -n ${s_ns} addr add ${s_ip6}/24 dev bond0
+	ip -n ${s_ns} link set bond0 up
+}
+
+client_create()
+{
+	ip netns exec ${c_ns} sysctl -q net.ipv6.conf.default.keep_addr_on_down=1
+	ip -n ${c_ns} link add bond0 type bond mode 802.3ad miimon 100 lacp_active on lacp_rate fast
+	ip -n ${c_ns} link set eth0 master bond0
+	ip -n ${c_ns} link set eth1 master bond0
+	ip -n ${c_ns} addr add ${c_ip4}/24 dev bond0
+	ip -n ${c_ns} addr add ${c_ip6}/24 dev bond0
+	ip -n ${c_ns} link set bond0 up
+}
+
+setup_topo_lacp()
+{
+	setup_ns s_ns c_ns
+	defer cleanup_all_ns
+
+	switch_create
+	client_create
+}
+
+# Reset bond with and options
+lacp_bond_reset()
+{
+	local netns="$1"
+	local param="$2"
+
+	ip -n ${netns} link set bond0 down
+	ip -n ${netns} link set bond0 type bond $param
+	ip -n ${netns} link set bond0 up
+
+	# Wait for IPv6 address ready as it needs DAD
+	slowwait 10 ip netns exec ${c_ns} ping6 ${s_ip6} -c 1 -W 0.1 &> /dev/null
+}
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-09  9:03 ` [PATCH net 2/2] selftests: bonding: add test for passive LACP mode Hangbin Liu
@ 2025-07-15  9:37   ` Paolo Abeni
  2025-07-16 11:23     ` Hangbin Liu
  2025-07-24  4:05     ` Hangbin Liu
  0 siblings, 2 replies; 15+ messages in thread
From: Paolo Abeni @ 2025-07-15  9:37 UTC (permalink / raw)
  To: Hangbin Liu, netdev
  Cc: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Nikolay Aleksandrov, Simon Horman, Shuah Khan,
	linux-kselftest, linux-kernel

On 7/9/25 11:03 AM, Hangbin Liu wrote:
> Add a selftest to verify bonding behavior when lacp_active is set to off.
> 
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>  .../drivers/net/bonding/bond_passive_lacp.sh  | 21 +++++
>  .../drivers/net/bonding/bond_topo_lacp.sh     | 77 +++++++++++++++++++
>  2 files changed, 98 insertions(+)
>  create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
>  create mode 100644 tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh

New test should be listed in the relevant makefile
> diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> new file mode 100755
> index 000000000000..4cf8a5999aaa
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> @@ -0,0 +1,21 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Testing if bond works with lacp_active = off
> +
> +lib_dir=$(dirname "$0")
> +source ${lib_dir}/bond_topo_lacp.sh

shellcheck is not super happy about 'source' usage:

In bond_passive_lacp.sh line 7:
source ${lib_dir}/bond_topo_lacp.sh
^-- SC3046 (warning): In POSIX sh, 'source' in place of '.' is undefined.
^-- SC3051 (warning): In POSIX sh, 'source' in place of '.' is undefined.

either switch to '. ' or use bash instead of 'sh'.

> +lacp_bond_reset "${c_ns}" "lacp_active off"
> +# make sure the switch state is not expired [A,T,G,S,Ex]
> +if slowwait 15 ip netns exec ${s_ns} grep -q 'port state: 143' /proc/net/bonding/bond0; then

Shellcheck wants double quote everywhere. Since in many cases (all the
blamed ones in this patch) we know the variable is really a single word,
I think you could simply disable the warning with:

#shellcheck disable=SC2086

(same in the other test file)

> +	RET=1
> +else
> +	RET=0
> +fi

/P


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-09  9:03 ` [PATCH net 1/2] bonding: update ntt to true " Hangbin Liu
@ 2025-07-16  4:19   ` Jay Vosburgh
  2025-07-16 10:01     ` Hangbin Liu
  2025-07-23 10:27     ` Hangbin Liu
  0 siblings, 2 replies; 15+ messages in thread
From: Jay Vosburgh @ 2025-07-16  4:19 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

Hangbin Liu <liuhangbin@gmail.com> wrote:

>When lacp_active is set to off, the bond operates in passive mode, meaning it
>will only "speak when spoken to." However, the current kernel implementation
>only sends an LACPDU in response when the partner's state changes.
>
>In this situation, once LACP negotiation succeeds, the actor stops sending
>LACPDUs until the partner times out and sends an "expired" LACPDU.
>This leads to endless LACP state flapping.

	From the above, I suspect our implementation isn't compliant to
the standard.  Per IEEE 802.1AX-2014 6.4.1 LACP design elements:

c)	Active or passive participation in LACP is controlled by
	LACP_Activity, an administrative control associated with each
	Aggregation Port, that can take the value Active LACP or Passive
	LACP. Passive LACP indicates the Aggregation Port’s preference
	for not transmitting LACPDUs unless its Partner’s control value
	is Active LACP (i.e., a preference not to speak unless spoken
	to). Active LACP indicates the Aggregation Port’s preference to
	participate in the protocol regardless of the Partner’s control
	value (i.e., a preference to speak regardless).

d)	Periodic transmission of LACPDUs occurs if the LACP_Activity
	control of either the Actor or the Partner is Active LACP. These
	periodic transmissions will occur at either a slow or fast
	transmission rate depending upon the expressed LACP_Timeout
	preference (Long Timeout or Short Timeout) of the Partner
	System.

	Which, in summary, means that if either end (actor or partner)
has LACP_Activity set, both ends must send periodic LACPDUs at the rate
specified by their respective partner's LACP_Timeout rate.

>To avoid this, we need update ntt to true once received an LACPDU from the
>partner, ensuring an immediate reply. With this fix, the link becomes stable
>in most cases, except for one specific scenario:
>
>Actor: lacp_active=off, lacp_rate=slow
>Partner: lacp_active=on, lacp_rate=fast
>
>In this case, the partner expects frequent LACPDUs (every 1 second), but the
>actor only responds after receiving an LACPDU, which, in this setup, the
>partner sends every 30 seconds due to the actor's lacp_rate=slow. By the time
>the actor replies, the partner has already timed out and sent an "expired"
>LACPDU.

	Presuming that I'm correct that we're not implementing 6.4.1 d),
above, correctly, then I don't think this is a proper fix, as it kind of
band-aids over the problem a bit.

	Looking at the code, I suspect the problem revolves around the
"lacp_active" check in ad_periodic_machine():

static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
{
	periodic_states_t last_state;

	/* keep current state machine state to compare later if it was changed */
	last_state = port->sm_periodic_state;

	/* check if port was reinitialized */
	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
	    !bond_params->lacp_active) {
		port->sm_periodic_state = AD_NO_PERIODIC;
	}

	In the above, because all the tests are chained with ||, the
lacp_active test overrides the two correct-looking
LACP_STATE_LACP_ACTIVITY tests.

	It looks like ad_initialize_port() always sets
LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
ever clears it.

	Thinking out loud, perhaps this could be fixed by

	a) remove the test of bond_params->lacp_active here, and,

	b) The lacp_active option setting controls whether LACP_ACTIVITY
is set in port->actor_oper_port_state.

	Thoughts?

	-J

>Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
>---
> drivers/net/bonding/bond_3ad.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index c6807e473ab7..e001d1c8a49b 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -666,6 +666,8 @@ static void __update_default_selected(struct port *port)
>  */
> static void __update_ntt(struct lacpdu *lacpdu, struct port *port)
> {
>+	struct bonding *bond;
>+
> 	/* validate lacpdu and port */
> 	if (lacpdu && port) {
> 		/* check if any parameter is different then
>@@ -683,6 +685,10 @@ static void __update_ntt(struct lacpdu *lacpdu, struct port *port)
> 		   ) {
> 			port->ntt = true;
> 		}
>+
>+		bond = __get_bond_by_port(port);
>+		if (bond && !bond->params.lacp_active)
>+			port->ntt = true;
> 	}
> }
> 
>-- 
>2.46.0
>

---
	-Jay Vosburgh, jv@jvosburgh.net


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-16  4:19   ` Jay Vosburgh
@ 2025-07-16 10:01     ` Hangbin Liu
  2025-07-16 17:35       ` Jay Vosburgh
  2025-07-23 10:27     ` Hangbin Liu
  1 sibling, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2025-07-16 10:01 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

On Tue, Jul 15, 2025 at 09:19:49PM -0700, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
> 
> >When lacp_active is set to off, the bond operates in passive mode, meaning it
> >will only "speak when spoken to." However, the current kernel implementation
> >only sends an LACPDU in response when the partner's state changes.
> >
> >In this situation, once LACP negotiation succeeds, the actor stops sending
> >LACPDUs until the partner times out and sends an "expired" LACPDU.
> >This leads to endless LACP state flapping.
> 
> 	From the above, I suspect our implementation isn't compliant to
> the standard.  Per IEEE 802.1AX-2014 6.4.1 LACP design elements:
> 
> c)	Active or passive participation in LACP is controlled by
> 	LACP_Activity, an administrative control associated with each
> 	Aggregation Port, that can take the value Active LACP or Passive
> 	LACP. Passive LACP indicates the Aggregation Port’s preference
> 	for not transmitting LACPDUs unless its Partner’s control value
> 	is Active LACP (i.e., a preference not to speak unless spoken
> 	to). Active LACP indicates the Aggregation Port’s preference to

OK, so this means the passive side should start sending LACPDUs when receive
passive actor's LACPDUs, with the slow/fast rate based on partner's rate?

Hmm, then when we should stop sending LACPDUs? After
port->sm_mux_state == AD_MUX_DETACHED ?

> 	participate in the protocol regardless of the Partner’s control
> 	value (i.e., a preference to speak regardless).
> 
> d)	Periodic transmission of LACPDUs occurs if the LACP_Activity
> 	control of either the Actor or the Partner is Active LACP. These
> 	periodic transmissions will occur at either a slow or fast
> 	transmission rate depending upon the expressed LACP_Timeout
> 	preference (Long Timeout or Short Timeout) of the Partner
> 	System.
> 
> 	Which, in summary, means that if either end (actor or partner)
> has LACP_Activity set, both ends must send periodic LACPDUs at the rate
> specified by their respective partner's LACP_Timeout rate.
> 
> >To avoid this, we need update ntt to true once received an LACPDU from the
> >partner, ensuring an immediate reply. With this fix, the link becomes stable
> >in most cases, except for one specific scenario:
> >
> >Actor: lacp_active=off, lacp_rate=slow
> >Partner: lacp_active=on, lacp_rate=fast
> >
> >In this case, the partner expects frequent LACPDUs (every 1 second), but the
> >actor only responds after receiving an LACPDU, which, in this setup, the
> >partner sends every 30 seconds due to the actor's lacp_rate=slow. By the time
> >the actor replies, the partner has already timed out and sent an "expired"
> >LACPDU.
> 
> 	Presuming that I'm correct that we're not implementing 6.4.1 d),
> above, correctly, then I don't think this is a proper fix, as it kind of
> band-aids over the problem a bit.
> 
> 	Looking at the code, I suspect the problem revolves around the
> "lacp_active" check in ad_periodic_machine():
> 
> static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
> {
> 	periodic_states_t last_state;
> 
> 	/* keep current state machine state to compare later if it was changed */
> 	last_state = port->sm_periodic_state;
> 
> 	/* check if port was reinitialized */
> 	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
> 	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
> 	    !bond_params->lacp_active) {
> 		port->sm_periodic_state = AD_NO_PERIODIC;
> 	}
> 
> 	In the above, because all the tests are chained with ||, the
> lacp_active test overrides the two correct-looking
> LACP_STATE_LACP_ACTIVITY tests.
> 
> 	It looks like ad_initialize_port() always sets
> LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
> ever clears it.
> 
> 	Thinking out loud, perhaps this could be fixed by
> 
> 	a) remove the test of bond_params->lacp_active here, and,
> 
> 	b) The lacp_active option setting controls whether LACP_ACTIVITY
> is set in port->actor_oper_port_state.
> 
> 	Thoughts?

As the upper question. When should we stop sending the LACPDUs?

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-15  9:37   ` Paolo Abeni
@ 2025-07-16 11:23     ` Hangbin Liu
  2025-07-24  4:05     ` Hangbin Liu
  1 sibling, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2025-07-16 11:23 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Nikolay Aleksandrov, Simon Horman, Shuah Khan,
	linux-kselftest, linux-kernel

On Tue, Jul 15, 2025 at 11:37:54AM +0200, Paolo Abeni wrote:
> On 7/9/25 11:03 AM, Hangbin Liu wrote:
> > Add a selftest to verify bonding behavior when lacp_active is set to off.
> > 
> > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> > ---
> >  .../drivers/net/bonding/bond_passive_lacp.sh  | 21 +++++
> >  .../drivers/net/bonding/bond_topo_lacp.sh     | 77 +++++++++++++++++++
> >  2 files changed, 98 insertions(+)
> >  create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> >  create mode 100644 tools/testing/selftests/drivers/net/bonding/bond_topo_lacp.sh
> 
> New test should be listed in the relevant makefile

Ah, yes, I forgot this.

> > diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> > new file mode 100755
> > index 000000000000..4cf8a5999aaa
> > --- /dev/null
> > +++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> > @@ -0,0 +1,21 @@
> > +#!/bin/sh
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Testing if bond works with lacp_active = off
> > +
> > +lib_dir=$(dirname "$0")
> > +source ${lib_dir}/bond_topo_lacp.sh
> 
> shellcheck is not super happy about 'source' usage:
> 
> In bond_passive_lacp.sh line 7:
> source ${lib_dir}/bond_topo_lacp.sh
> ^-- SC3046 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> ^-- SC3051 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> 
> either switch to '. ' or use bash instead of 'sh'.

OK, I will fix this and other warns.

Thanks
Hangbin
> 
> > +lacp_bond_reset "${c_ns}" "lacp_active off"
> > +# make sure the switch state is not expired [A,T,G,S,Ex]
> > +if slowwait 15 ip netns exec ${s_ns} grep -q 'port state: 143' /proc/net/bonding/bond0; then
> 
> Shellcheck wants double quote everywhere. Since in many cases (all the
> blamed ones in this patch) we know the variable is really a single word,
> I think you could simply disable the warning with:
> 
> #shellcheck disable=SC2086
> 
> (same in the other test file)
> 
> > +	RET=1
> > +else
> > +	RET=0
> > +fi
> 
> /P
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-16 10:01     ` Hangbin Liu
@ 2025-07-16 17:35       ` Jay Vosburgh
  0 siblings, 0 replies; 15+ messages in thread
From: Jay Vosburgh @ 2025-07-16 17:35 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

Hangbin Liu <liuhangbin@gmail.com> wrote:

>On Tue, Jul 15, 2025 at 09:19:49PM -0700, Jay Vosburgh wrote:
>> Hangbin Liu <liuhangbin@gmail.com> wrote:
>> 
>> >When lacp_active is set to off, the bond operates in passive mode, meaning it
>> >will only "speak when spoken to." However, the current kernel implementation
>> >only sends an LACPDU in response when the partner's state changes.
>> >
>> >In this situation, once LACP negotiation succeeds, the actor stops sending
>> >LACPDUs until the partner times out and sends an "expired" LACPDU.
>> >This leads to endless LACP state flapping.
>> 
>> 	From the above, I suspect our implementation isn't compliant to
>> the standard.  Per IEEE 802.1AX-2014 6.4.1 LACP design elements:
>> 
>> c)	Active or passive participation in LACP is controlled by
>> 	LACP_Activity, an administrative control associated with each
>> 	Aggregation Port, that can take the value Active LACP or Passive
>> 	LACP. Passive LACP indicates the Aggregation Port’s preference
>> 	for not transmitting LACPDUs unless its Partner’s control value
>> 	is Active LACP (i.e., a preference not to speak unless spoken
>> 	to). Active LACP indicates the Aggregation Port’s preference to
>
>OK, so this means the passive side should start sending LACPDUs when receive
>passive actor's LACPDUs, with the slow/fast rate based on partner's rate?

	Did you mean "receive active actor's LACPDUs"?

	Regardless, the standard requires both sides to initiate
periodic LACPDU transmission if either or both enable LACP_Activity in
their LACPDUs.

	So, if a received LACPDU from the partner has LACP_Activity set,
then, yes, we would enable periodic LACPDU transmission, regardless of
our local setting of "lacp_active" / LACP_Activity.

>Hmm, then when we should stop sending LACPDUs? After
>port->sm_mux_state == AD_MUX_DETACHED ?

	We stop sending when the criteria for NO_PERIODIC in the
periodic state machine is met (IEEE 802.1AX-2014 6.4.13, Figure 6-19).

	Practically speaking, this happens when a BEGIN event occurs,
due to a port being reinitialized.  The ad_mux_machine() will set the
mux state to AD_MUX_DETACHED when BEGIN occurs, so I don't think we need
to test for DETACHED explicitly.

	The NO_PERIODIC check is the first "if" block in
ad_periodic_machine() that I referenced below.  The code currently tests
all of the criteria from Figure 6-19, but adds a test of "!lacp_active",
which is why I suspect that removing that bit and managing the
lacp_active option via the LACP_Activity in the actor port state would
do the right thing.

	-J

>> 	participate in the protocol regardless of the Partner’s control
>> 	value (i.e., a preference to speak regardless).
>> 
>> d)	Periodic transmission of LACPDUs occurs if the LACP_Activity
>> 	control of either the Actor or the Partner is Active LACP. These
>> 	periodic transmissions will occur at either a slow or fast
>> 	transmission rate depending upon the expressed LACP_Timeout
>> 	preference (Long Timeout or Short Timeout) of the Partner
>> 	System.
>> 
>> 	Which, in summary, means that if either end (actor or partner)
>> has LACP_Activity set, both ends must send periodic LACPDUs at the rate
>> specified by their respective partner's LACP_Timeout rate.
>> 
>> >To avoid this, we need update ntt to true once received an LACPDU from the
>> >partner, ensuring an immediate reply. With this fix, the link becomes stable
>> >in most cases, except for one specific scenario:
>> >
>> >Actor: lacp_active=off, lacp_rate=slow
>> >Partner: lacp_active=on, lacp_rate=fast
>> >
>> >In this case, the partner expects frequent LACPDUs (every 1 second), but the
>> >actor only responds after receiving an LACPDU, which, in this setup, the
>> >partner sends every 30 seconds due to the actor's lacp_rate=slow. By the time
>> >the actor replies, the partner has already timed out and sent an "expired"
>> >LACPDU.
>> 
>> 	Presuming that I'm correct that we're not implementing 6.4.1 d),
>> above, correctly, then I don't think this is a proper fix, as it kind of
>> band-aids over the problem a bit.
>> 
>> 	Looking at the code, I suspect the problem revolves around the
>> "lacp_active" check in ad_periodic_machine():
>> 
>> static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
>> {
>> 	periodic_states_t last_state;
>> 
>> 	/* keep current state machine state to compare later if it was changed */
>> 	last_state = port->sm_periodic_state;
>> 
>> 	/* check if port was reinitialized */
>> 	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
>> 	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
>> 	    !bond_params->lacp_active) {
>> 		port->sm_periodic_state = AD_NO_PERIODIC;
>> 	}
>> 
>> 	In the above, because all the tests are chained with ||, the
>> lacp_active test overrides the two correct-looking
>> LACP_STATE_LACP_ACTIVITY tests.
>> 
>> 	It looks like ad_initialize_port() always sets
>> LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
>> ever clears it.
>> 
>> 	Thinking out loud, perhaps this could be fixed by
>> 
>> 	a) remove the test of bond_params->lacp_active here, and,
>> 
>> 	b) The lacp_active option setting controls whether LACP_ACTIVITY
>> is set in port->actor_oper_port_state.
>> 
>> 	Thoughts?
>
>As the upper question. When should we stop sending the LACPDUs?
>
>Thanks
>Hangbin

---
	-Jay Vosburgh, jv@jvosburgh.net


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-16  4:19   ` Jay Vosburgh
  2025-07-16 10:01     ` Hangbin Liu
@ 2025-07-23 10:27     ` Hangbin Liu
  2025-07-24  9:57       ` Jay Vosburgh
  1 sibling, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2025-07-23 10:27 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

On Tue, Jul 15, 2025 at 09:19:49PM -0700, Jay Vosburgh wrote:
> 	Presuming that I'm correct that we're not implementing 6.4.1 d),
> above, correctly, then I don't think this is a proper fix, as it kind of
> band-aids over the problem a bit.
> 
> 	Looking at the code, I suspect the problem revolves around the
> "lacp_active" check in ad_periodic_machine():
> 
> static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
> {
> 	periodic_states_t last_state;
> 
> 	/* keep current state machine state to compare later if it was changed */
> 	last_state = port->sm_periodic_state;
> 
> 	/* check if port was reinitialized */
> 	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
> 	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
> 	    !bond_params->lacp_active) {
> 		port->sm_periodic_state = AD_NO_PERIODIC;
> 	}
> 
> 	In the above, because all the tests are chained with ||, the
> lacp_active test overrides the two correct-looking
> LACP_STATE_LACP_ACTIVITY tests.
> 
> 	It looks like ad_initialize_port() always sets
> LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
> ever clears it.
> 
> 	Thinking out loud, perhaps this could be fixed by
> 
> 	a) remove the test of bond_params->lacp_active here, and,
> 
> 	b) The lacp_active option setting controls whether LACP_ACTIVITY
> is set in port->actor_oper_port_state.
> 
> 	Thoughts?

Hi Jay,

I did some investigation and testing. In addition to your previous change,
we also need to initialize the partner's state to 0 in ad_initialize_port_tmpl().
Otherwise, the check:
```
!(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)
```
in ad_periodic_machine() will fail even when the actor is in passive mode.

Also, the line:
```
port->partner_oper.port_state |= LACP_STATE_LACP_ACTIVITY;
```
in ad_rx_machine() should be removed, since we can't assume the partner is in
active mode. [1]

With these two changes, we can ensure:
1. In passive mode, the actor will not send LACPDU before receiving any LACPDU from the partner.
2. Once it receives the partner’s LACPDU, it will start sending periodic LACPDUs as expected.

Do you agree with making these changes? If so, I can post a patch for your review.

[1] IEEE 8021AX-2020, Figure 6-14—LACP Receive state diagram, the AD_RX_EXPIRED
statue should be
```
Partner_Oper_Port_State.Synchronization = FALSE;
Partner_Oper_Port_State.Short_Timeout = TRUE;
Actor_Oper_Port_State.Expired = TRUE;
LACP_currentWhile = Short_Timeout_Time;
```

Thanks,
Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-15  9:37   ` Paolo Abeni
  2025-07-16 11:23     ` Hangbin Liu
@ 2025-07-24  4:05     ` Hangbin Liu
  2025-07-24  4:12       ` Hangbin Liu
  1 sibling, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2025-07-24  4:05 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Nikolay Aleksandrov, Simon Horman, Shuah Khan,
	linux-kselftest, linux-kernel

On Tue, Jul 15, 2025 at 11:37:54AM +0200, Paolo Abeni wrote:
> > diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> > new file mode 100755
> > index 000000000000..4cf8a5999aaa
> > --- /dev/null
> > +++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> > @@ -0,0 +1,21 @@
> > +#!/bin/sh
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Testing if bond works with lacp_active = off
> > +
> > +lib_dir=$(dirname "$0")
> > +source ${lib_dir}/bond_topo_lacp.sh
> 
> shellcheck is not super happy about 'source' usage:
> 
> In bond_passive_lacp.sh line 7:
> source ${lib_dir}/bond_topo_lacp.sh
> ^-- SC3046 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> ^-- SC3051 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> 
> either switch to '. ' or use bash instead of 'sh'.

Hi Paolo,

I updated the case and remove the source file bond_topo_lacp.sh.
Instead I source the forwarding lib directly like:

lib_dir=$(dirname "$0")
source "$lib_dir"/../../../net/forwarding/lib.sh

But this cause shell check unable to find the lib.sh as $lib_dir is get
dynamically. This usage is common in selftest. How should we resolves this
problem?

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-24  4:05     ` Hangbin Liu
@ 2025-07-24  4:12       ` Hangbin Liu
  2025-07-25  8:27         ` Petr Machata
  0 siblings, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2025-07-24  4:12 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Nikolay Aleksandrov, Simon Horman, Shuah Khan,
	linux-kselftest, linux-kernel

On Thu, Jul 24, 2025 at 04:06:03AM +0000, Hangbin Liu wrote:
> On Tue, Jul 15, 2025 at 11:37:54AM +0200, Paolo Abeni wrote:
> > > diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> > > new file mode 100755
> > > index 000000000000..4cf8a5999aaa
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> > > @@ -0,0 +1,21 @@
> > > +#!/bin/sh
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +#
> > > +# Testing if bond works with lacp_active = off
> > > +
> > > +lib_dir=$(dirname "$0")
> > > +source ${lib_dir}/bond_topo_lacp.sh
> > 
> > shellcheck is not super happy about 'source' usage:
> > 
> > In bond_passive_lacp.sh line 7:
> > source ${lib_dir}/bond_topo_lacp.sh
> > ^-- SC3046 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> > ^-- SC3051 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> > 
> > either switch to '. ' or use bash instead of 'sh'.
> 
> Hi Paolo,
> 
> I updated the case and remove the source file bond_topo_lacp.sh.
> Instead I source the forwarding lib directly like:
> 
> lib_dir=$(dirname "$0")
> source "$lib_dir"/../../../net/forwarding/lib.sh
> 
> But this cause shell check unable to find the lib.sh as $lib_dir is get
> dynamically. This usage is common in selftest. How should we resolves this
> problem?

OK, I just disabled this warning.

# shellcheck disable=SC1091

Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-23 10:27     ` Hangbin Liu
@ 2025-07-24  9:57       ` Jay Vosburgh
  2025-07-24 12:15         ` Hangbin Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Jay Vosburgh @ 2025-07-24  9:57 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

Hangbin Liu <liuhangbin@gmail.com> wrote:

>On Tue, Jul 15, 2025 at 09:19:49PM -0700, Jay Vosburgh wrote:
>> 	Presuming that I'm correct that we're not implementing 6.4.1 d),
>> above, correctly, then I don't think this is a proper fix, as it kind of
>> band-aids over the problem a bit.
>> 
>> 	Looking at the code, I suspect the problem revolves around the
>> "lacp_active" check in ad_periodic_machine():
>> 
>> static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
>> {
>> 	periodic_states_t last_state;
>> 
>> 	/* keep current state machine state to compare later if it was changed */
>> 	last_state = port->sm_periodic_state;
>> 
>> 	/* check if port was reinitialized */
>> 	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
>> 	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
>> 	    !bond_params->lacp_active) {
>> 		port->sm_periodic_state = AD_NO_PERIODIC;
>> 	}
>> 
>> 	In the above, because all the tests are chained with ||, the
>> lacp_active test overrides the two correct-looking
>> LACP_STATE_LACP_ACTIVITY tests.
>> 
>> 	It looks like ad_initialize_port() always sets
>> LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
>> ever clears it.
>> 
>> 	Thinking out loud, perhaps this could be fixed by
>> 
>> 	a) remove the test of bond_params->lacp_active here, and,
>> 
>> 	b) The lacp_active option setting controls whether LACP_ACTIVITY
>> is set in port->actor_oper_port_state.
>> 
>> 	Thoughts?
>
>Hi Jay,
>
>I did some investigation and testing. In addition to your previous change,
>we also need to initialize the partner's state to 0 in ad_initialize_port_tmpl().
>Otherwise, the check:
>```
>!(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)
>```
>in ad_periodic_machine() will fail even when the actor is in passive mode.

	Agreed; the .port_state in the port_params tmpl should just be
zero; the magic number 1 there now, which is LACP_STATE_LACP_ACTIVITY,
is just wrong.  For the actor side, the lacp_active option will set it
appropriately, and the partner's will be updated by any LACPDUs that
arrive.

>Also, the line:
>```
>port->partner_oper.port_state |= LACP_STATE_LACP_ACTIVITY;
>```
>in ad_rx_machine() should be removed, since we can't assume the partner is in
>active mode. [1]

	Also agreed.

>With these two changes, we can ensure:
>1. In passive mode, the actor will not send LACPDU before receiving any LACPDU from the partner.
>2. Once it receives the partner’s LACPDU, it will start sending periodic LACPDUs as expected.
>
>Do you agree with making these changes? If so, I can post a patch for your review.

	Yes, please post a patch.

>[1] IEEE 8021AX-2020, Figure 6-14—LACP Receive state diagram, the AD_RX_EXPIRED
>statue should be
>```
>Partner_Oper_Port_State.Synchronization = FALSE;
>Partner_Oper_Port_State.Short_Timeout = TRUE;
>Actor_Oper_Port_State.Expired = TRUE;
>LACP_currentWhile = Short_Timeout_Time;
>```

	FWIW, I usually reference the older standards 2008 or 2014, as
the 2020 edition changes a lot of things and bonding isn't necessarily
conformant to those changes (e.g., many of the state machines are
different in large or small ways).  Technically, the bonding
implementation was written to the pre-802.1AX standard when it was still
part of 802.3 (hence the name 802.3ad), clause 43.

	This particular bit (the EXPIRED state actions) is the same,
but, for example, the transition test from EXPIRED to DEFAULTED is
different in the 2014 vs 2020 editions, and we need to be careful not to
implement the state machines piecemeal from different editions of the
standard.

	-J

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
  2025-07-24  9:57       ` Jay Vosburgh
@ 2025-07-24 12:15         ` Hangbin Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2025-07-24 12:15 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

On Thu, Jul 24, 2025 at 11:57:53AM +0200, Jay Vosburgh wrote:
> 	FWIW, I usually reference the older standards 2008 or 2014, as
> the 2020 edition changes a lot of things and bonding isn't necessarily
> conformant to those changes (e.g., many of the state machines are
> different in large or small ways).  Technically, the bonding
> implementation was written to the pre-802.1AX standard when it was still
> part of 802.3 (hence the name 802.3ad), clause 43.
> 
> 	This particular bit (the EXPIRED state actions) is the same,
> but, for example, the transition test from EXPIRED to DEFAULTED is
> different in the 2014 vs 2020 editions, and we need to be careful not to
> implement the state machines piecemeal from different editions of the
> standard.

Thanks for this info. I will download 2014 version and recheck my
changes.

Cheers
Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-24  4:12       ` Hangbin Liu
@ 2025-07-25  8:27         ` Petr Machata
  2025-07-25 12:53           ` Hangbin Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Machata @ 2025-07-25  8:27 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Paolo Abeni, netdev, Jay Vosburgh, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel


Hangbin Liu <liuhangbin@gmail.com> writes:

> On Thu, Jul 24, 2025 at 04:06:03AM +0000, Hangbin Liu wrote:
>> On Tue, Jul 15, 2025 at 11:37:54AM +0200, Paolo Abeni wrote:
>> > > diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
>> > > b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
>> > > new file mode 100755
>> > > index 000000000000..4cf8a5999aaa
>> > > --- /dev/null
>> > > +++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
>> > > @@ -0,0 +1,21 @@
>> > > +#!/bin/sh
>> > > +# SPDX-License-Identifier: GPL-2.0
>> > > +#
>> > > +# Testing if bond works with lacp_active = off
>> > > +
>> > > +lib_dir=$(dirname "$0")
>> > > +source ${lib_dir}/bond_topo_lacp.sh
>> > 
>> > shellcheck is not super happy about 'source' usage:
>> > 
>> > In bond_passive_lacp.sh line 7:
>> > source ${lib_dir}/bond_topo_lacp.sh
>> > ^-- SC3046 (warning): In POSIX sh, 'source' in place of '.' is undefined.
>> > ^-- SC3051 (warning): In POSIX sh, 'source' in place of '.' is undefined.
>> > 
>> > either switch to '. ' or use bash instead of 'sh'.
>> 
>> Hi Paolo,
>> 
>> I updated the case and remove the source file bond_topo_lacp.sh.
>> Instead I source the forwarding lib directly like:
>> 
>> lib_dir=$(dirname "$0")
>> source "$lib_dir"/../../../net/forwarding/lib.sh
>> 
>> But this cause shell check unable to find the lib.sh as $lib_dir is get
>> dynamically. This usage is common in selftest. How should we resolves this
>> problem?
>
> OK, I just disabled this warning.
>
> # shellcheck disable=SC1091

I believe the point was only about using "." instead of "source". The
following should have fixed it:

. ${lib_dir}/bond_topo_lacp.sh

... or just use bash as the interpreter, I suspect lib.sh is not
actually POSIX clean.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/2] selftests: bonding: add test for passive LACP mode
  2025-07-25  8:27         ` Petr Machata
@ 2025-07-25 12:53           ` Hangbin Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2025-07-25 12:53 UTC (permalink / raw)
  To: Petr Machata
  Cc: Paolo Abeni, netdev, Jay Vosburgh, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Nikolay Aleksandrov, Simon Horman,
	Shuah Khan, linux-kselftest, linux-kernel

On Fri, Jul 25, 2025 at 10:27:48AM +0200, Petr Machata wrote:
> 
> Hangbin Liu <liuhangbin@gmail.com> writes:
> 
> > On Thu, Jul 24, 2025 at 04:06:03AM +0000, Hangbin Liu wrote:
> >> On Tue, Jul 15, 2025 at 11:37:54AM +0200, Paolo Abeni wrote:
> >> > > diff --git a/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> >> > > b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> >> > > new file mode 100755
> >> > > index 000000000000..4cf8a5999aaa
> >> > > --- /dev/null
> >> > > +++ b/tools/testing/selftests/drivers/net/bonding/bond_passive_lacp.sh
> >> > > @@ -0,0 +1,21 @@
> >> > > +#!/bin/sh
> >> > > +# SPDX-License-Identifier: GPL-2.0
> >> > > +#
> >> > > +# Testing if bond works with lacp_active = off
> >> > > +
> >> > > +lib_dir=$(dirname "$0")
> >> > > +source ${lib_dir}/bond_topo_lacp.sh
> >> > 
> >> > shellcheck is not super happy about 'source' usage:
> >> > 
> >> > In bond_passive_lacp.sh line 7:
> >> > source ${lib_dir}/bond_topo_lacp.sh
> >> > ^-- SC3046 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> >> > ^-- SC3051 (warning): In POSIX sh, 'source' in place of '.' is undefined.
> >> > 
> >> > either switch to '. ' or use bash instead of 'sh'.
> >> 
> >> Hi Paolo,
> >> 
> >> I updated the case and remove the source file bond_topo_lacp.sh.
> >> Instead I source the forwarding lib directly like:
> >> 
> >> lib_dir=$(dirname "$0")
> >> source "$lib_dir"/../../../net/forwarding/lib.sh
> >> 
> >> But this cause shell check unable to find the lib.sh as $lib_dir is get
> >> dynamically. This usage is common in selftest. How should we resolves this
> >> problem?
> >
> > OK, I just disabled this warning.
> >
> > # shellcheck disable=SC1091
> 
> I believe the point was only about using "." instead of "source". The
> following should have fixed it:
> 
> . ${lib_dir}/bond_topo_lacp.sh
> 
> ... or just use bash as the interpreter, I suspect lib.sh is not
> actually POSIX clean.

Thanks Petr, I know Paolo means to use "." to fix this. The issue is that
I changed the script to source forwarding lib. And shell check could only
analyse static path. Unless use -x to supply the real source path. But I guess
the CI can't do this. So I disabled the SC1091 checking as a workaround.

Regards
Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-07-25 13:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-09  9:03 [PATCH net 0/2] bonding: fix LACP negotiation issues in passive mode Hangbin Liu
2025-07-09  9:03 ` [PATCH net 1/2] bonding: update ntt to true " Hangbin Liu
2025-07-16  4:19   ` Jay Vosburgh
2025-07-16 10:01     ` Hangbin Liu
2025-07-16 17:35       ` Jay Vosburgh
2025-07-23 10:27     ` Hangbin Liu
2025-07-24  9:57       ` Jay Vosburgh
2025-07-24 12:15         ` Hangbin Liu
2025-07-09  9:03 ` [PATCH net 2/2] selftests: bonding: add test for passive LACP mode Hangbin Liu
2025-07-15  9:37   ` Paolo Abeni
2025-07-16 11:23     ` Hangbin Liu
2025-07-24  4:05     ` Hangbin Liu
2025-07-24  4:12       ` Hangbin Liu
2025-07-25  8:27         ` Petr Machata
2025-07-25 12:53           ` Hangbin Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).