linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries
@ 2025-06-25 16:01 Seth Forshee (DigitalOcean)
  2025-06-25 16:13 ` Carlos Bilbao
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Seth Forshee (DigitalOcean) @ 2025-06-25 16:01 UTC (permalink / raw)
  To: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Carlos Bilbao, Tonghao Zhang,
	Seth Forshee (DigitalOcean)

The timer which ensures that no more than 3 LACPDUs are transmitted in
a second rearms itself every 333ms regardless of whether an LACPDU is
transmitted when the timer expires. This causes LACPDU tx to be delayed
until the next expiration of the timer, which effectively aligns LACPDUs
to ~333ms boundaries. This results in a variable amount of jitter in the
timing of periodic LACPDUs.

Change this to only rearm the timer when an LACPDU is actually sent,
allowing tx at any point after the timer has expired.

Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
---
 drivers/net/bonding/bond_3ad.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port)
 	/* check if tx timer expired, to verify that we do not send more than
 	 * 3 packets per second
 	 */
-	if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) {
+	if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) {
 		/* check if there is something to send */
 		if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) {
 			__update_lacpdu_from_port(port);
@@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port)
 				 * again until demanded
 				 */
 				port->ntt = false;
+
+				/* restart tx timer(to verify that we will not
+				 * exceed AD_MAX_TX_IN_SECOND
+				 */
+				port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;
 			}
 		}
-		/* restart tx timer(to verify that we will not exceed
-		 * AD_MAX_TX_IN_SECOND
-		 */
-		port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
 	}
 }
 

---
base-commit: 86731a2a651e58953fc949573895f2fa6d456841
change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab

Best regards,
-- 
Seth Forshee (DigitalOcean) <sforshee@kernel.org>



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries
  2025-06-25 16:01 [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries Seth Forshee (DigitalOcean)
@ 2025-06-25 16:13 ` Carlos Bilbao
  2025-06-25 16:30 ` Jay Vosburgh
  2025-07-03 13:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 6+ messages in thread
From: Carlos Bilbao @ 2025-06-25 16:13 UTC (permalink / raw)
  To: Seth Forshee (DigitalOcean), Jay Vosburgh, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Carlos Bilbao, Tonghao Zhang

Hello,

On 6/25/25 11:01, Seth Forshee (DigitalOcean) wrote:
> The timer which ensures that no more than 3 LACPDUs are transmitted in
> a second rearms itself every 333ms regardless of whether an LACPDU is
> transmitted when the timer expires. This causes LACPDU tx to be delayed
> until the next expiration of the timer, which effectively aligns LACPDUs
> to ~333ms boundaries. This results in a variable amount of jitter in the
> timing of periodic LACPDUs.
>
> Change this to only rearm the timer when an LACPDU is actually sent,
> allowing tx at any point after the timer has expired.
>
> Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
> ---
>   drivers/net/bonding/bond_3ad.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644
> --- a/drivers/net/bonding/bond_3ad.c
> +++ b/drivers/net/bonding/bond_3ad.c
> @@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port)
>   	/* check if tx timer expired, to verify that we do not send more than
>   	 * 3 packets per second
>   	 */
> -	if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) {
> +	if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) {
>   		/* check if there is something to send */
>   		if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) {
>   			__update_lacpdu_from_port(port);
> @@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port)
>   				 * again until demanded
>   				 */
>   				port->ntt = false;
> +
> +				/* restart tx timer(to verify that we will not
> +				 * exceed AD_MAX_TX_IN_SECOND
> +				 */
> +				port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;
>   			}
>   		}
> -		/* restart tx timer(to verify that we will not exceed
> -		 * AD_MAX_TX_IN_SECOND
> -		 */
> -		port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
>   	}
>   }


Reviewed-by: Carlos Bilbao <carlos.bilbao@kernel.org>


>   
>
> ---
> base-commit: 86731a2a651e58953fc949573895f2fa6d456841
> change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab
>
> Best regards,


Thanks,

Carlos


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries
  2025-06-25 16:01 [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries Seth Forshee (DigitalOcean)
  2025-06-25 16:13 ` Carlos Bilbao
@ 2025-06-25 16:30 ` Jay Vosburgh
  2025-06-25 17:00   ` Seth Forshee (DigitalOcean)
  2025-07-03 13:30 ` patchwork-bot+netdevbpf
  2 siblings, 1 reply; 6+ messages in thread
From: Jay Vosburgh @ 2025-06-25 16:30 UTC (permalink / raw)
  To: Seth Forshee (DigitalOcean)
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Carlos Bilbao, Tonghao Zhang

Seth Forshee (DigitalOcean) <sforshee@kernel.org> wrote:

>The timer which ensures that no more than 3 LACPDUs are transmitted in
>a second rearms itself every 333ms regardless of whether an LACPDU is
>transmitted when the timer expires. This causes LACPDU tx to be delayed
>until the next expiration of the timer, which effectively aligns LACPDUs
>to ~333ms boundaries. This results in a variable amount of jitter in the
>timing of periodic LACPDUs.

	To be clear, the "3 per second" limitation that all of this
should to conform to is from IEEE 802.1AX-2014, 6.4.16 Transmit machine:

	"When the LACP_Enabled variable is TRUE and the NTT (6.4.7)
	variable is TRUE, the Transmit machine shall ensure that a
	properly formatted LACPDU (6.4.2) is transmitted [i.e., issue a
	CtrlMuxN:M_UNITDATA.Request(LACPDU) service primitive], subject
	to the restriction that no more than three LACPDUs may be
	transmitted in any Fast_Periodic_Time interval. If NTT is set to
	TRUE when this limit is in force, the transmission shall be
	delayed until such a time as the restriction is no longer in
	force. The NTT variable shall be set to FALSE when the Transmit
	machine has transmitted a LACPDU."

	The current implementation conforms to this as you describe: by
aligning transmission to 1/3 second boundaries, no more than 3 can ever
be sent in one second.

	If, hypothetically, the state machine were to transition, or a
user updates port settings (either of which would set NTT each time)
more than 3 times in a second, would your patched code obey this
restriction?

	For completeness, and to make this email as complicated as
possible, I'll note that 802.1AX-2020 removes this particular
restriction in favor of incorporating the 802.3 generic limit on
transmission rates for Slow Protocols (of which LACP is one) to 10 per
second (802.3-2022, 30.3.1.1.38) into the state machine (802.1AX-2020,
6.4.7, see "txOpportunity" and 6.4.14 LACP Transmit machine).  Linux
bonding doesn't implement the 802.1AX-2020 state machines, though, so I
don't think we can reasonably pick and choose arbitrary pieces from two
differing editions of a standard.

	-J

>Change this to only rearm the timer when an LACPDU is actually sent,
>allowing tx at any point after the timer has expired.
>
>Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
>---
> drivers/net/bonding/bond_3ad.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port)
> 	/* check if tx timer expired, to verify that we do not send more than
> 	 * 3 packets per second
> 	 */
>-	if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) {
>+	if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) {
> 		/* check if there is something to send */
> 		if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) {
> 			__update_lacpdu_from_port(port);
>@@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port)
> 				 * again until demanded
> 				 */
> 				port->ntt = false;
>+
>+				/* restart tx timer(to verify that we will not
>+				 * exceed AD_MAX_TX_IN_SECOND
>+				 */
>+				port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;
> 			}
> 		}
>-		/* restart tx timer(to verify that we will not exceed
>-		 * AD_MAX_TX_IN_SECOND
>-		 */
>-		port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
> 	}
> }
> 
>
>---
>base-commit: 86731a2a651e58953fc949573895f2fa6d456841
>change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab
>
>Best regards,
>-- 
>Seth Forshee (DigitalOcean) <sforshee@kernel.org>

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries
  2025-06-25 16:30 ` Jay Vosburgh
@ 2025-06-25 17:00   ` Seth Forshee (DigitalOcean)
  2025-07-01  8:10     ` Paolo Abeni
  0 siblings, 1 reply; 6+ messages in thread
From: Seth Forshee (DigitalOcean) @ 2025-06-25 17:00 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Carlos Bilbao, Tonghao Zhang

On Wed, Jun 25, 2025 at 09:30:56AM -0700, Jay Vosburgh wrote:
> Seth Forshee (DigitalOcean) <sforshee@kernel.org> wrote:
> 
> >The timer which ensures that no more than 3 LACPDUs are transmitted in
> >a second rearms itself every 333ms regardless of whether an LACPDU is
> >transmitted when the timer expires. This causes LACPDU tx to be delayed
> >until the next expiration of the timer, which effectively aligns LACPDUs
> >to ~333ms boundaries. This results in a variable amount of jitter in the
> >timing of periodic LACPDUs.
> 
> 	To be clear, the "3 per second" limitation that all of this
> should to conform to is from IEEE 802.1AX-2014, 6.4.16 Transmit machine:
> 
> 	"When the LACP_Enabled variable is TRUE and the NTT (6.4.7)
> 	variable is TRUE, the Transmit machine shall ensure that a
> 	properly formatted LACPDU (6.4.2) is transmitted [i.e., issue a
> 	CtrlMuxN:M_UNITDATA.Request(LACPDU) service primitive], subject
> 	to the restriction that no more than three LACPDUs may be
> 	transmitted in any Fast_Periodic_Time interval. If NTT is set to
> 	TRUE when this limit is in force, the transmission shall be
> 	delayed until such a time as the restriction is no longer in
> 	force. The NTT variable shall be set to FALSE when the Transmit
> 	machine has transmitted a LACPDU."
> 
> 	The current implementation conforms to this as you describe: by
> aligning transmission to 1/3 second boundaries, no more than 3 can ever
> be sent in one second.
> 
> 	If, hypothetically, the state machine were to transition, or a
> user updates port settings (either of which would set NTT each time)
> more than 3 times in a second, would your patched code obey this
> restriction?

As long as the transition doesn't reset sm_tx_timer_counter to something
smaller than ad_ticks_per_sec/AD_MAX_TX_IN_SECOND, which nothing does
currently (and if it did it would be at risk of sending more than 3 in a
second already). The timer is reset on each tx, so no two consecutive
LACPDUs can be sent less than 300ms apart, therefore no more than 3 can
be per second. If a state machine transition sets NTT within 300ms of
the previous tx, it will not send another until the timer expires.


> 	For completeness, and to make this email as complicated as
> possible, I'll note that 802.1AX-2020 removes this particular
> restriction in favor of incorporating the 802.3 generic limit on
> transmission rates for Slow Protocols (of which LACP is one) to 10 per
> second (802.3-2022, 30.3.1.1.38) into the state machine (802.1AX-2020,
> 6.4.7, see "txOpportunity" and 6.4.14 LACP Transmit machine).  Linux
> bonding doesn't implement the 802.1AX-2020 state machines, though, so I
> don't think we can reasonably pick and choose arbitrary pieces from two
> differing editions of a standard.
> 
> 	-J
> 
> >Change this to only rearm the timer when an LACPDU is actually sent,
> >allowing tx at any point after the timer has expired.
> >
> >Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
> >---
> > drivers/net/bonding/bond_3ad.c | 11 ++++++-----
> > 1 file changed, 6 insertions(+), 5 deletions(-)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port)
> > 	/* check if tx timer expired, to verify that we do not send more than
> > 	 * 3 packets per second
> > 	 */
> >-	if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) {
> >+	if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) {
> > 		/* check if there is something to send */
> > 		if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) {
> > 			__update_lacpdu_from_port(port);
> >@@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port)
> > 				 * again until demanded
> > 				 */
> > 				port->ntt = false;
> >+
> >+				/* restart tx timer(to verify that we will not
> >+				 * exceed AD_MAX_TX_IN_SECOND
> >+				 */
> >+				port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;
> > 			}
> > 		}
> >-		/* restart tx timer(to verify that we will not exceed
> >-		 * AD_MAX_TX_IN_SECOND
> >-		 */
> >-		port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
> > 	}
> > }
> > 
> >
> >---
> >base-commit: 86731a2a651e58953fc949573895f2fa6d456841
> >change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab
> >
> >Best regards,
> >-- 
> >Seth Forshee (DigitalOcean) <sforshee@kernel.org>
> 
> ---
> 	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries
  2025-06-25 17:00   ` Seth Forshee (DigitalOcean)
@ 2025-07-01  8:10     ` Paolo Abeni
  0 siblings, 0 replies; 6+ messages in thread
From: Paolo Abeni @ 2025-07-01  8:10 UTC (permalink / raw)
  To: Seth Forshee (DigitalOcean), Jay Vosburgh
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netdev, linux-kernel, Carlos Bilbao, Tonghao Zhang

On 6/25/25 7:00 PM, Seth Forshee (DigitalOcean) wrote:
> On Wed, Jun 25, 2025 at 09:30:56AM -0700, Jay Vosburgh wrote:
>> Seth Forshee (DigitalOcean) <sforshee@kernel.org> wrote:
>>
>>> The timer which ensures that no more than 3 LACPDUs are transmitted in
>>> a second rearms itself every 333ms regardless of whether an LACPDU is
>>> transmitted when the timer expires. This causes LACPDU tx to be delayed
>>> until the next expiration of the timer, which effectively aligns LACPDUs
>>> to ~333ms boundaries. This results in a variable amount of jitter in the
>>> timing of periodic LACPDUs.
>>
>> 	To be clear, the "3 per second" limitation that all of this
>> should to conform to is from IEEE 802.1AX-2014, 6.4.16 Transmit machine:
>>
>> 	"When the LACP_Enabled variable is TRUE and the NTT (6.4.7)
>> 	variable is TRUE, the Transmit machine shall ensure that a
>> 	properly formatted LACPDU (6.4.2) is transmitted [i.e., issue a
>> 	CtrlMuxN:M_UNITDATA.Request(LACPDU) service primitive], subject
>> 	to the restriction that no more than three LACPDUs may be
>> 	transmitted in any Fast_Periodic_Time interval. If NTT is set to
>> 	TRUE when this limit is in force, the transmission shall be
>> 	delayed until such a time as the restriction is no longer in
>> 	force. The NTT variable shall be set to FALSE when the Transmit
>> 	machine has transmitted a LACPDU."
>>
>> 	The current implementation conforms to this as you describe: by
>> aligning transmission to 1/3 second boundaries, no more than 3 can ever
>> be sent in one second.
>>
>> 	If, hypothetically, the state machine were to transition, or a
>> user updates port settings (either of which would set NTT each time)
>> more than 3 times in a second, would your patched code obey this
>> restriction?
> 
> As long as the transition doesn't reset sm_tx_timer_counter to something
> smaller than ad_ticks_per_sec/AD_MAX_TX_IN_SECOND, which nothing does
> currently (and if it did it would be at risk of sending more than 3 in a
> second already). The timer is reset on each tx, so no two consecutive
> LACPDUs can be sent less than 300ms apart, therefore no more than 3 can
> be per second. If a state machine transition sets NTT within 300ms of
> the previous tx, it will not send another until the timer expires.

@Jay, I believe the above statement is correct. What I'm missing?

Side note: I'm wondering if this should be considered a fix, and thus
requiring targeting the 'net' tree and a 'fixes' tag.

/P


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries
  2025-06-25 16:01 [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries Seth Forshee (DigitalOcean)
  2025-06-25 16:13 ` Carlos Bilbao
  2025-06-25 16:30 ` Jay Vosburgh
@ 2025-07-03 13:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-07-03 13:30 UTC (permalink / raw)
  To: Seth Forshee
  Cc: jv, andrew+netdev, davem, edumazet, kuba, pabeni, netdev,
	linux-kernel, carlos.bilbao, tonghao

Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Wed, 25 Jun 2025 11:01:24 -0500 you wrote:
> The timer which ensures that no more than 3 LACPDUs are transmitted in
> a second rearms itself every 333ms regardless of whether an LACPDU is
> transmitted when the timer expires. This causes LACPDU tx to be delayed
> until the next expiration of the timer, which effectively aligns LACPDUs
> to ~333ms boundaries. This results in a variable amount of jitter in the
> timing of periodic LACPDUs.
> 
> [...]

Here is the summary with links:
  - bonding: don't force LACPDU tx to ~333 ms boundaries
    https://git.kernel.org/netdev/net-next/c/135faae63218

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-07-03 13:29 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-25 16:01 [PATCH] bonding: don't force LACPDU tx to ~333 ms boundaries Seth Forshee (DigitalOcean)
2025-06-25 16:13 ` Carlos Bilbao
2025-06-25 16:30 ` Jay Vosburgh
2025-06-25 17:00   ` Seth Forshee (DigitalOcean)
2025-07-01  8:10     ` Paolo Abeni
2025-07-03 13:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).