Netdev List
 help / color / mirror / Atom feed
From: Vladimir Oltean <olteanv@gmail.com>
To: Ioana Ciornei <ioana.ciornei@nxp.com>
Cc: andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 09/13] dpaa2-switch: offload FDBs added on an upper bond device
Date: Thu, 14 May 2026 13:46:55 +0300	[thread overview]
Message-ID: <20260514104655.vrn4bkrahimqpcgv@skbuf> (raw)
In-Reply-To: <20260512131554.952971-10-ioana.ciornei@nxp.com> <20260512131554.952971-10-ioana.ciornei@nxp.com>

On Tue, May 12, 2026 at 04:15:50PM +0300, Ioana Ciornei wrote:
> +static bool dpaa2_switch_foreign_dev_check(const struct net_device *dev,
> +					   const struct net_device *foreign_dev)
> +{
> +	struct ethsw_port_priv *port_priv = netdev_priv(dev);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct ethsw_port_priv *other_port;
> +	int i;
> +
> +	if (netif_is_bridge_master(foreign_dev))
> +		if (port_priv->fdb->bridge_dev == foreign_dev)
> +			return false;
> +
> +	if (netif_is_bridge_port(foreign_dev)) {
> +		for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
> +			other_port = ethsw->ports[i];
> +
> +			if (!other_port || !other_port->lag)
> +				continue;
> +			if (other_port->lag->bond_dev == foreign_dev)
> +				return false;
> +		}
> +	}
> +
> +	return true;
> +}

I noticed Sashiko says:

 Does this misclassify a non-LAG dpaa2 switch port as foreign?  For a
 plain "bridge fdb add aa:bb:cc:dd:ee:ff dev sw0p0 master static",
 br_switchdev_fdb_populate() sets info.dev = sw0p0 and
 __switchdev_handle_fdb_event_to_device() calls
 dpaa2_switch_port_fdb_event() with dev == orig_dev == sw0p0.
 Inside dpaa2_switch_foreign_dev_check(sw0p0, sw0p0):
   netif_is_bridge_master(sw0p0) is false
   netif_is_bridge_port(sw0p0)   is true
   the loop only matches when other_port->lag->bond_dev == foreign_dev,
   but sw0p0 is not a bond_dev, so no entry matches
 so the function falls through to "return true" and the FDB event is
 dropped without ever calling dpaa2_switch_port_fdb_add().  Pre-patch the
 equivalent path only required dpaa2_switch_port_dev_check(dev) to be
 true and would have installed the entry.
 The new dpaa2_switch_port_offloads_bridge_port() helper added in
 dpaa2-switch.h covers both the LAG bond_dev case and the direct port
 netdev case:
   static inline bool
   dpaa2_switch_port_offloads_bridge_port(struct ethsw_port_priv *port_priv,
                                          const struct net_device *dev)
   {
           if (port_priv->lag && port_priv->lag->bond_dev == dev)
                   return true;
           if (port_priv->netdev == dev)
                   return true;
           return false;
   }
 Should dpaa2_switch_foreign_dev_check() also consult the per-port netdev
 match (e.g. via this helper) so non-LAG bridge ports aren't classified
 as foreign?


I think that the foreign_dev_check_cb() expectations are poorly defined/
documented, so I'll try to make some notes in the hope it may be improved:

At a high level, it is supposed to determine whether an offloaded
forwarding data path exists between %dev and %foreign_dev, and return
the negation of that.

The %dev is an interface that always passes the %check_cb.
The %foreign_dev may be:
(1) a bridge port with lowers (team, bond) - in that case
    __switchdev_handle_fdb_event_to_device() walks over those lowers to
    determine whether to fan out the FDB event to them or not. Here,
    %dev is one of those lowers. If you say the LAG is foreign to you,
    %you are not interested in processing FDB events from it.
(2) a bridge device - in case the FDB event was emitted on a bridge port
    that didn't pass %check_cb, we search for a neighbour port that
    offloads that bridge, that can trap the FDB entry to the CPU for
    software forwarding. If you say the bridge is foreign to you, you are
    not interested at all in FDB entries from foreign neighbour ports
(3) a neighbour bridge port - which is actually the port that triggered
    the call (2) so we know it is under a bridge not foreign to us.
    Here, if you say the neighbour port is foreign to you, you are
    actually saying that you _are_ interested in FDB entries on this
    device (will trap them to the CPU for software forwarding)

But switchdev_handle_fdb_event_to_device() has a major limitation which
makes things very confusing.

If you have this topology:

     br0
    /   \
sw0p0   sw1p0

where sw0p0 and sw1p0 are different switch instances handled by the same
driver, which have no offloaded data path to each other, and you run
like Sashiko says:

$ bridge fdb add aa:bb:cc:dd:ee:ff dev sw0p0 master static

then switchdev_handle_fdb_event_to_device() is supposed to also ask
sw1p0 whether sw0p0 is foreign to it, in order to trap the FDB entry to
the CPU.

But it doesn't. Instead, it just stops here, i.e. it assumes that if it
could notify the FDB event to sw0p0, there's nothing more to do:

	if (check_cb(dev))
		return mod_cb(dev, orig_dev, event, info->ctx, fdb_info);

So, yes, currently the bug has no consequences for 2 reasons:
- %foreign_dev is never an interface that passes check_cb()
- dpaa2_switch_port_fdb_event() has this:
	/* For the moment, do nothing with entries towards foreign devices */
	if (dpaa2_switch_foreign_dev_check(dev, orig_dev))
		return 0;

but I need to think of an improved recursion algorithm where that case
is handled as well.

Looking at the other implementations, I see dsa_foreign_dev_check() is
prepared to respond correctly to the switch port <-> switch port
question, and lan966x_foreign_dev_check() isn't.

It would be good, however, to do what the LLM says and fix this by
saying that if netif_is_bridge_port(foreign_dev), the dpaa2-switch
port %dev is foreign to it if there is no other_port under ethsw which
has either a lag_dev _or_ a netdev equal to the foreign_dev. This way,
the driver will be prepared when the switchdev_handle_fdb_event_to_device()
algorithm changes.

 dpaa2_switch_port_event() is documented as called under rcu_read_lock().
 This loop reads other_port->lag and other_port->lag->bond_dev with no
 synchronization, while those fields are mutated under rtnl by
 dpaa2_switch_port_set_lag_group() and the bond-leave path
 (port_priv->lag = NULL; lag->bond_dev = NULL).  Pointer reads are
 atomic, but the compound (other_port->lag) && (lag->bond_dev) is not.
 Is the foreign/native classification stable under concurrent bond
 join/leave, or should this rely on RTNL / netdev adjacency RCU lists
 instead?

I think this is valid. DSA has closed this race with the dsa_flush_workqueue()
call placed after switchdev_bridge_port_unoffload(). It ensures that
pending FDB events are all processed while the netdev adjacency
relationships have not yet been destroyed (running in prechangeupper
context).

  reply	other threads:[~2026-05-14 10:47 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-12 13:15 [PATCH net-next v2 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 01/13] dpaa2-switch: add LAG configuration API Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 02/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 03/13] dpaa2-switch: change dpaa2_switch_port_set_fdb() function prototype Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 04/13] dpaa2-switch: extend dpaa2_switch_port_set_fdb() to cover bond scenarios Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 05/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 06/13] dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 07/13] dpaa2-switch: check early if an FDB entry should be added Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 08/13] dpaa2-switch: consolidate unicast and multicast management Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 09/13] dpaa2-switch: offload FDBs added on an upper bond device Ioana Ciornei
2026-05-14 10:46   ` Vladimir Oltean [this message]
2026-05-14 13:46     ` Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 10/13] dpaa2-switch: offload port objects " Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 11/13] dpaa2-switch: trap all link local reserved addresses to the CPU Ioana Ciornei
2026-05-12 13:15 ` [PATCH net-next v2 12/13] dpaa2-switch: add support for imprecise source port Ioana Ciornei
2026-05-12 13:23 ` [PATCH net-next v2 13/13] dpaa2-switch: do not error out when the same VLAN is installed multiple times Ioana Ciornei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260514104655.vrn4bkrahimqpcgv@skbuf \
    --to=olteanv@gmail.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=ioana.ciornei@nxp.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox