From: Jay Vosburgh <jv@jvosburgh.net>
To: Carlos Bilbao <bilbao@vt.edu>
Cc: carlos.bilbao@kernel.org, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, sforshee@kernel.org
Subject: Re: [PATCH] bonding: Switch periodic LACPDU state machine from counter to jiffies
Date: Wed, 16 Jul 2025 08:30:03 -0700 [thread overview]
Message-ID: <798952.1752679803@famine> (raw)
In-Reply-To: <c9eac8f6-8e7f-4ed0-b34d-5dc50be8078f@vt.edu>
Carlos Bilbao <bilbao@vt.edu> wrote:
>FYI, I was able to test this locally but couldn’t find any kselftests to
>stress the bonding state machine. If anyone knows of additional ways to
>test it, I’d be happy to run them.
Your commit message says this change will "help reduce drift
under contention," but above you say you're unable to stress the state
machine.
How do you induce "drift under contention" to test that your
patch actually improves something? What testing has been done to insure
that the new code doesn't change the behavior in other ways (regressions)?
Without a specific reproducable bug scenario that this change
fixes, I'm leery of applying such a refactor to code that has seemingly
been working fine for 20+ years.
I gather that what this is intending to do is reduce the current
dependency on the scheduling accuracy of the workqueue event that runs
the state machines. The current implementation works on a "number of
invocations" basis, assuming that the event is invoked every 100 msec,
and computes various timeouts based on the number of times the state
machine runs.
-J
>Thanks!
>
>Carlos
>
>On 7/15/25 15:57, carlos.bilbao@kernel.org wrote:
>> From: Carlos Bilbao <carlos.bilbao@kernel.org>
>>
>> Replace the bonding periodic state machine for LACPDU transmission of
>> function ad_periodic_machine() with a jiffies-based mechanism, which is
>> more accurate and can help reduce drift under contention.
>>
>> Signed-off-by: Carlos Bilbao (DigitalOcean) <carlos.bilbao@kernel.org>
>> ---
>> drivers/net/bonding/bond_3ad.c | 79 +++++++++++++---------------------
>> include/net/bond_3ad.h | 2 +-
>> 2 files changed, 32 insertions(+), 49 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index c6807e473ab7..8654a51266a3 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -1421,44 +1421,24 @@ static void ad_periodic_machine(struct port *port, struct bond_params *bond_para
>> (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
>> !bond_params->lacp_active) {
>> port->sm_periodic_state = AD_NO_PERIODIC;
>> - }
>> - /* check if state machine should change state */
>> - else if (port->sm_periodic_timer_counter) {
>> - /* check if periodic state machine expired */
>> - if (!(--port->sm_periodic_timer_counter)) {
>> - /* if expired then do tx */
>> - port->sm_periodic_state = AD_PERIODIC_TX;
>> - } else {
>> - /* If not expired, check if there is some new timeout
>> - * parameter from the partner state
>> - */
>> - switch (port->sm_periodic_state) {
>> - case AD_FAST_PERIODIC:
>> - if (!(port->partner_oper.port_state
>> - & LACP_STATE_LACP_TIMEOUT))
>> - port->sm_periodic_state = AD_SLOW_PERIODIC;
>> - break;
>> - case AD_SLOW_PERIODIC:
>> - if ((port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT)) {
>> - port->sm_periodic_timer_counter = 0;
>> - port->sm_periodic_state = AD_PERIODIC_TX;
>> - }
>> - break;
>> - default:
>> - break;
>> - }
>> - }
>> + } else if (port->sm_periodic_state == AD_NO_PERIODIC)
>> + port->sm_periodic_state = AD_FAST_PERIODIC;
>> + /* check if periodic state machine expired */
>> + else if (time_after_eq(jiffies, port->sm_periodic_next_jiffies)) {
>> + /* if expired then do tx */
>> + port->sm_periodic_state = AD_PERIODIC_TX;
>> } else {
>> + /* If not expired, check if there is some new timeout
>> + * parameter from the partner state
>> + */
>> switch (port->sm_periodic_state) {
>> - case AD_NO_PERIODIC:
>> - port->sm_periodic_state = AD_FAST_PERIODIC;
>> - break;
>> - case AD_PERIODIC_TX:
>> - if (!(port->partner_oper.port_state &
>> - LACP_STATE_LACP_TIMEOUT))
>> + case AD_FAST_PERIODIC:
>> + if (!(port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT))
>> port->sm_periodic_state = AD_SLOW_PERIODIC;
>> - else
>> - port->sm_periodic_state = AD_FAST_PERIODIC;
>> + break;
>> + case AD_SLOW_PERIODIC:
>> + if ((port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT))
>> + port->sm_periodic_state = AD_PERIODIC_TX;
>> break;
>> default:
>> break;
>> @@ -1471,21 +1451,24 @@ static void ad_periodic_machine(struct port *port, struct bond_params *bond_para
>> "Periodic Machine: Port=%d, Last State=%d, Curr State=%d\n",
>> port->actor_port_number, last_state,
>> port->sm_periodic_state);
>> +
>> switch (port->sm_periodic_state) {
>> - case AD_NO_PERIODIC:
>> - port->sm_periodic_timer_counter = 0;
>> - break;
>> - case AD_FAST_PERIODIC:
>> - /* decrement 1 tick we lost in the PERIODIC_TX cycle */
>> - port->sm_periodic_timer_counter = __ad_timer_to_ticks(AD_PERIODIC_TIMER, (u16)(AD_FAST_PERIODIC_TIME))-1;
>> - break;
>> - case AD_SLOW_PERIODIC:
>> - /* decrement 1 tick we lost in the PERIODIC_TX cycle */
>> - port->sm_periodic_timer_counter = __ad_timer_to_ticks(AD_PERIODIC_TIMER, (u16)(AD_SLOW_PERIODIC_TIME))-1;
>> - break;
>> case AD_PERIODIC_TX:
>> port->ntt = true;
>> - break;
>> + if (!(port->partner_oper.port_state &
>> + LACP_STATE_LACP_TIMEOUT))
>> + port->sm_periodic_state = AD_SLOW_PERIODIC;
>> + else
>> + port->sm_periodic_state = AD_FAST_PERIODIC;
>> + fallthrough;
>> + case AD_SLOW_PERIODIC:
>> + case AD_FAST_PERIODIC:
>> + if (port->sm_periodic_state == AD_SLOW_PERIODIC)
>> + port->sm_periodic_next_jiffies = jiffies
>> + + HZ * AD_SLOW_PERIODIC_TIME;
>> + else /* AD_FAST_PERIODIC */
>> + port->sm_periodic_next_jiffies = jiffies
>> + + HZ * AD_FAST_PERIODIC_TIME;
>> default:
>> break;
>> }
>> @@ -1987,7 +1970,7 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
>> port->sm_rx_state = 0;
>> port->sm_rx_timer_counter = 0;
>> port->sm_periodic_state = 0;
>> - port->sm_periodic_timer_counter = 0;
>> + port->sm_periodic_next_jiffies = 0;
>> port->sm_mux_state = 0;
>> port->sm_mux_timer_counter = 0;
>> port->sm_tx_state = 0;
>> diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
>> index 2053cd8e788a..aabb8c97caf4 100644
>> --- a/include/net/bond_3ad.h
>> +++ b/include/net/bond_3ad.h
>> @@ -227,7 +227,7 @@ typedef struct port {
>> rx_states_t sm_rx_state; /* state machine rx state */
>> u16 sm_rx_timer_counter; /* state machine rx timer counter */
>> periodic_states_t sm_periodic_state; /* state machine periodic state */
>> - u16 sm_periodic_timer_counter; /* state machine periodic timer counter */
>> + unsigned long sm_periodic_next_jiffies; /* state machine periodic next expected sent */
>> mux_states_t sm_mux_state; /* state machine mux state */
>> u16 sm_mux_timer_counter; /* state machine mux timer counter */
>> tx_states_t sm_tx_state; /* state machine tx state */
---
-Jay Vosburgh, jv@jvosburgh.net
next prev parent reply other threads:[~2025-07-16 15:30 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-15 20:57 [PATCH] bonding: Switch periodic LACPDU state machine from counter to jiffies carlos.bilbao
2025-07-15 20:59 ` Carlos Bilbao
2025-07-16 15:30 ` Jay Vosburgh [this message]
2025-07-16 19:44 ` Carlos Bilbao
2025-07-16 23:33 ` Jay Vosburgh
2025-08-05 13:14 ` Hangbin Liu
2025-07-16 9:09 ` Simon Horman
2025-07-16 19:45 ` Carlos Bilbao
2025-07-17 4:35 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=798952.1752679803@famine \
--to=jv@jvosburgh.net \
--cc=andrew+netdev@lunn.ch \
--cc=bilbao@vt.edu \
--cc=carlos.bilbao@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sforshee@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).