From: Thomas Bogendoerfer <tbogendoerfer@suse.de>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Andy Gospodarek <andy@greyhouse.net>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] bonding: 802.3ad: Avoid packet loss when switching aggregator
Date: Wed, 10 Apr 2024 17:50:52 +0200
Message-ID: <20240410175052.25ac7638@samweis>
In-Reply-To: <21529.1712592371@famine>

On Mon, 08 Apr 2024 09:06:11 -0700
Jay Vosburgh <jay.vosburgh@canonical.com> wrote:
> Thomas Bogendoerfer <tbogendoerfer@suse.de> wrote:
>
> >If selection logic decides to switch to a new aggregator it disables
> >all ports of the old aggregator, but doesn't enable ports on
> >the new aggregator. These ports will eventually be enabled when
> >the next LACPDU is received, which might take some time and without an
> >active port transmitted frames are dropped. Avoid this by enabling
> >already collected ports of the new aggregator immediately.
> >
> >Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
> >---
> > drivers/net/bonding/bond_3ad.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >index c6807e473ab7..529e2a7c51e2 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -1876,6 +1876,13 @@ static void ad_agg_selection_logic(struct aggregator *agg,
> > 				__disable_port(port);
> > 			}
> > 		}
> >+
> >+		/* enable ports on new active aggregator */
> >+		for (port = best->lag_ports; port;
> >+		     port = port->next_port_in_aggregator) {
> >+			__enable_port(port);
> >+		}
> >+
>
> I think this will do the wrong thing if the port in question is
> not in a valid state to send or receive (i.e., it is not one of
> COLLECTING_DISTRIBUTING, COLLECTING, or DISTRIBUTING).
>
>
> As it happens, this situation, except for the case of individual
> ports, is handled just below this code:
>
> 	/* if the selected aggregator is of join individuals
> 	 * (partner_system is NULL), enable their ports
> 	 */
> 	active = __get_active_agg(origin);
>
> 	if (active) {
> 		if (!__agg_has_partner(active)) {
> 			for (port = active->lag_ports; port;
> 			     port = port->next_port_in_aggregator) {
> 				__enable_port(port);
> 			}
> 			*update_slave_arr = true;
> 		}
> 	}
>
> 	rcu_read_unlock();
>
> FWIW, looking at it, I'm not sure that "__agg_has_partner" is
> the proper test for invididual-ness, but I'd have to do a bit of poking
> to confirm that. In any event, that's not what you want to change right
> now.
>
> Instead of adding another block that does more or less the same
> thing, I'd suggest updating this logic to include tests for C_D, C, or D
> states, and enabling the ports if that is the case. Probably something
> like (I have not tested or compiled this at all):
>
> 	if (active) {
> 		if (!__agg_has_partner(active)) {
> 			[ ... the current !__agg_has_partner() stuff ]
> 		} else {

Moving it here will run this part on every call of ad_agg_selection_logic(),
but it is only relevant when there is a switch to a different aggregator.

> 			for (port = active->lag_ports; port;
> 			     port = port->next_port_in_aggregator) {
> 				switch (port->sm_mux_state) {
> 				case AD_MUX_DISTRIBUTING:
> 				case AD_MUX_COLLECTING_DISTRIBUTING:
> 					ad_enable_collecting_distributing(port,
> 									  update_slave_arr);
> 					port->ntt = true;
> 					break;
> 				case AD_MUX_COLLECTING:
> 					ad_enable_collecting(port);
> 					ad_disable_distributing(port, update_slave_arr);
> 					port->ntt = true;
> 					break;
> 				default:
> 					break;
> 				}

I've tried this in my test environment and it doesn't fix the issue
I'm seeing, because the port of the new aggregator is still in AD_MUX_WAITING...

The issue is that after bringing the bond up, it can happen that the bond link
is up, but no slave can transmit. This happens exactly when the aggregator
is changed due to the timing of the received LACPDUs. So if enabling the port
in AD_MUX_WAITING is wrong, what other ways are there to fix this problem?

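
To make the failure mode concrete, here is a small user-space toy model. It
is not kernel code: every toy_* name is made up for this sketch, and it only
assumes that an aggregator switch disables the old aggregator's ports and
that the new aggregator's single port sits in a given mux state at that
moment. It compares state-gated enabling (like the switch statement quoted
above) with unconditional enabling (like my original patch):

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the mux states named in this thread (hypothetical). */
enum toy_mux_state {
	TOY_MUX_WAITING,
	TOY_MUX_COLLECTING,
	TOY_MUX_DISTRIBUTING,
	TOY_MUX_COLLECTING_DISTRIBUTING,
};

struct toy_port {
	enum toy_mux_state mux_state;
	bool tx_enabled;		/* may this port transmit frames? */
	struct toy_port *next;		/* next port in the same aggregator */
};

/* Models disabling all ports of the former active aggregator. */
static void toy_disable_all(struct toy_port *ports)
{
	for (struct toy_port *p = ports; p; p = p->next)
		p->tx_enabled = false;
}

/* State-gated: only ports already in a distributing state may transmit. */
static void toy_enable_by_state(struct toy_port *ports)
{
	for (struct toy_port *p = ports; p; p = p->next) {
		switch (p->mux_state) {
		case TOY_MUX_DISTRIBUTING:
		case TOY_MUX_COLLECTING_DISTRIBUTING:
			p->tx_enabled = true;
			break;
		default:
			/* WAITING and COLLECTING ports stay unable to transmit */
			break;
		}
	}
}

/* Unconditional: enable every port regardless of its mux state. */
static void toy_enable_all(struct toy_port *ports)
{
	for (struct toy_port *p = ports; p; p = p->next)
		p->tx_enabled = true;
}

static int toy_count_tx(const struct toy_port *ports)
{
	int n = 0;

	for (const struct toy_port *p = ports; p; p = p->next)
		if (p->tx_enabled)
			n++;
	return n;
}

/* Switch from a one-port old aggregator to a one-port new aggregator and
 * return how many ports can transmit right after the switch.
 */
static int toy_tx_ports_after_switch(enum toy_mux_state new_state,
				     bool state_gated)
{
	struct toy_port old_port = { TOY_MUX_COLLECTING_DISTRIBUTING, true, NULL };
	struct toy_port new_port = { new_state, false, NULL };

	toy_disable_all(&old_port);
	if (state_gated)
		toy_enable_by_state(&new_port);
	else
		toy_enable_all(&new_port);

	return toy_count_tx(&old_port) + toy_count_tx(&new_port);
}
```

In this model, toy_tx_ports_after_switch(TOY_MUX_WAITING, true) returns 0,
which matches what I see: the bond is up, but nothing can transmit until the
port leaves the waiting state. With state_gated set to false it returns 1,
at the cost of enabling a port whose mux state does not yet allow it.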
Thomas.
--
SUSE Software Solutions Germany GmbH
HRB 36809 (AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew McDonald, Werner Knoblich