Netdev List
* [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload
@ 2026-06-29 11:22 Ioana Ciornei
  2026-06-29 11:22 ` [PATCH net-next v4 01/13] dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls Ioana Ciornei
                   ` (13 more replies)
  0 siblings, 14 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:22 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

This patch set adds support in dpaa2-switch for offloading upper bond
devices.

The first two patches remove the need to hold rtnl_lock in the event
processing work item by ensuring that all events are processed before
any change to the FDB layout happens.

Patch #3 updates the logic around choosing the FDB that should be used
on a switch port. This is necessary since, with the addition of the LAG
offload, we need to take into account all ports which are under the same
bridge, even if only indirectly (through a bond).

The next four patches clean up the FDB event handling, making it easier
to integrate with bond devices, and also add the
dpaa2_switch_port_to_bridge_port() helper to be used in the LAG offload
support.

The 8th patch adds the necessary new APIs for the LAG configuration
while the next one uses them, both in the prechangeupper phase and in
the changeupper one. Which ports can be part of the same LAG group is
configurable at boot time, thus we use the prechangeupper callback to
validate whether a requested configuration can be offloaded.

This set also extends the handling of FDBs and port objects so that they
are handled by the driver even on an offloaded bond device.

Changes in v4:
- Moved and split some patches so that any preparatory work is being
  done before the driver offloads upper bond devices.
- Add a defensive check in dpaa2_switch_port_bond_leave() for a NULL
port_priv->lag
- Extend the dpaa2_switch_prevent_bridging_with_8021q_upper() function
so that we prevent a bond device with VLAN uppers from joining a bridge.
The restriction is related to VLAN management in terms of the FDB which
can change upon a topology change. VLAN uppers can only be added once
the bridge topology is set up.
- Remove all FDB management from the bond join/leave paths. Decided to
reconfigure the FDB only on bridge join/leave since the FDB determines
the forwarding domain and when a bond is not bridged, from a
configuration standpoint, the individual lowers can be viewed as
standalone.
- Moved here the update to the dpaa2_switch_port_to_bridge_port()
function so that the LAG state is taken into account.
- Add a new per LAG field - primary - which is used to keep track of the
primary port of a LAG group instead of determining it each time we need
to use it.
- Set 'skb->offload_fwd_mark' only when the port is under a bridge.
- Migrate FDBs in case the primary interface of a LAG changes.
- Use lag->primary instead of determining each time the primary
interface of a LAG device
- Link to v3: https://lore.kernel.org/all/20260603143623.3712024-1-ioana.ciornei@nxp.com/

Changes in v3:
- Add a check in dpsw_lag_set() for cfg->num_ifs against
DPSW_MAX_LAG_IFS
- Add kerneldoc for the dpsw_lag_cfg structure.
- Fix logic in prechangeupper callback in order to not call
dpaa2_switch_prechangeupper_sanity_checks() on !info->linking
- Fixed up the logic in the dpaa2_switch_port_bond_join()'s error path
so that the FDBs are cleaned up properly and we do not end up with
leaked FDBs, meaning FDBs that were marked as in-use even though no port
was actually using them.
- Mark the port_priv->lag field as __rcu and use the proper accessors for
it. This will eventually become useful in a later patch when the lag
field will be accessed concurrently from the NAPI context and the
join/leave paths
- Access lag field through rtnl_dereference() so that we adapt to the
__rcu change.
- Check that the brport is non-NULL before calling
switchdev_bridge_port_unoffload() on it.
- Get hold of port_priv->ethsw_data only after we know the device is a
dpaa2-switch one
- Update dpaa2_switch_foreign_dev_check() so that we check if there is
any port in the same switch as dev which offloads foreign_dev in case
this is a bridge port.
- Add mutex_destroy on the per LAG fdb_lock
- Make sure that all FDB events were processed on the workqueue on the
.remove() path.
- Delete the refcounted entry in dpaa2_switch_lag_fdb_del() as soon as
possible, even if the HW deletion would fail
- Access the port_priv->lag field only through the proper rcu accessors.
- Change the mask so that we restrict the trap only to the link local
addresses (01:80:c2:00:00:00 to 01:80:c2:00:00:0F) instead of the entire
reserved bridge block of addresses
- Link to v2: https://lore.kernel.org/all/20260512131554.952971-1-ioana.ciornei@nxp.com/
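
As an illustration of the narrower trap match mentioned in the v3
changes above (this is not code from the series): the range
01:80:c2:00:00:00 to 01:80:c2:00:00:0F is exactly the set of addresses
that equal 01:80:c2:00:00:00 once the low four bits of the last octet
are masked off. The kernel's is_link_local_ether_addr() tests the same
range. A minimal standalone sketch, where is_link_local_reserved() is a
hypothetical name chosen for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical standalone helper, not part of the driver. */
static bool is_link_local_reserved(const uint8_t addr[6])
{
	static const uint8_t base[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 };
	static const uint8_t mask[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xf0 };
	size_t i;

	assert(addr != NULL);

	/* Every masked octet must match the base address. */
	for (i = 0; i < 6; i++) {
		if ((addr[i] & mask[i]) != base[i])
			return false;
	}

	return true;
}
```

With this mask, 01:80:c2:00:00:0e (LLDP) matches while 01:80:c2:00:00:10
does not.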

Changes in v2:
- Extend dpaa2_switch_prechangeupper_sanity_checks() with
netdev_walk_all_lower_dev() so that checks are done on all lower devices
of a bridge, even for the lowers of a bridged bond.
- Better manage the default VLAN on bond join
- Clean-up the error path in dpaa2_switch_port_bond_join()
- Call dpaa2_switch_port_bridge_leave() in case a port is leaving a bond
which is also a bridged port
- Update dpaa2_switch_port_bond_leave() so that in case of any failure
the driver tries to cleanup the LAG offload configuration.
- Call switchdev_bridge_port_unoffload() in case a switch port is
leaving a bridged bond device.
- The rollback in dpaa2_switch_port_mdb_add() uses the newly introduced
dpaa2_switch_port_fdb_del() helper instead of the _mc counterpart.
- Update dpaa2_switch_foreign_dev_check() so that we check whether there
is an offloaded path between the switch port and the foreign net_device.
Before this change we also checked if the foreign_dev was offloaded or
not by the switch port.
- Update the switchdev_bridge_port_unoffload() by passing it the proper
context and the notifier blocks.
- Add dev_hold() and dev_put() calls for orig_dev
- In case dev_mc_add() fails, remove the MDB address from HW with the
proper function, dpaa2_switch_lag_fdb_del() or
dpaa2_switch_port_fdb_del(), depending on the LAG offload state.
- Fix 32bit build by using BIT_ULL
- Take a reference to port_priv->lag instead of reading it multiple
times.
- Link to v1: https://lore.kernel.org/all/20260506151540.1242997-1-ioana.ciornei@nxp.com/

Ioana Ciornei (13):
  dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls
  dpaa2-switch: avoid holding rtnl_lock in dpaa2_switch_event_work()
  dpaa2-switch: extend the FDB management to cover bond scenarios
  dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function
  dpaa2-switch: check early if an FDB entry should be added
  dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper
  dpaa2-switch: consolidate unicast and multicast management
  dpaa2-switch: add LAG configuration API
  dpaa2-switch: add support for LAG offload
  dpaa2-switch: offload FDBs added on an upper bond device
  dpaa2-switch: offload port objects on an upper bond device
  dpaa2-switch: trap all link local reserved addresses to the CPU
  dpaa2-switch: add support for imprecise source port

 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 931 +++++++++++++++---
 .../ethernet/freescale/dpaa2/dpaa2-switch.h   |  42 +-
 .../net/ethernet/freescale/dpaa2/dpsw-cmd.h   |  18 +-
 drivers/net/ethernet/freescale/dpaa2/dpsw.c   |  60 ++
 drivers/net/ethernet/freescale/dpaa2/dpsw.h   |  30 +
 5 files changed, 948 insertions(+), 133 deletions(-)

-- 
2.25.1



* [PATCH net-next v4 01/13] dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
@ 2026-06-29 11:22 ` Ioana Ciornei
  2026-06-29 11:22 ` [PATCH net-next v4 02/13] dpaa2-switch: avoid holding rtnl_lock in dpaa2_switch_event_work() Ioana Ciornei
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:22 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

The DPSW object does not implement strict address filtering, thus any
call to dev_mc_add() / dev_mc_del() is pointless. Remove these calls
from the dpaa2_switch_port_mdb_add() and dpaa2_switch_port_mdb_del()
functions.

And since the multicast addresses no longer reach the netdev->mc list,
there is no point in keeping the dpaa2_switch_port_lookup_address()
function which searches through that list to verify if the same address
is added multiple times.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- new patch
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 50 +------------------
 1 file changed, 2 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 858ba844ac51..d70e6f06ac15 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -1860,44 +1860,12 @@ int dpaa2_switch_port_vlans_add(struct net_device *netdev,
 					  vlan->changed);
 }
 
-static int dpaa2_switch_port_lookup_address(struct net_device *netdev, int is_uc,
-					    const unsigned char *addr)
-{
-	struct netdev_hw_addr_list *list = (is_uc) ? &netdev->uc : &netdev->mc;
-	struct netdev_hw_addr *ha;
-
-	netif_addr_lock_bh(netdev);
-	list_for_each_entry(ha, &list->list, list) {
-		if (ether_addr_equal(ha->addr, addr)) {
-			netif_addr_unlock_bh(netdev);
-			return 1;
-		}
-	}
-	netif_addr_unlock_bh(netdev);
-	return 0;
-}
-
 static int dpaa2_switch_port_mdb_add(struct net_device *netdev,
 				     const struct switchdev_obj_port_mdb *mdb)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
-	int err;
-
-	/* Check if address is already set on this port */
-	if (dpaa2_switch_port_lookup_address(netdev, 0, mdb->addr))
-		return -EEXIST;
 
-	err = dpaa2_switch_port_fdb_add_mc(port_priv, mdb->addr);
-	if (err)
-		return err;
-
-	err = dev_mc_add(netdev, mdb->addr);
-	if (err) {
-		netdev_err(netdev, "dev_mc_add err %d\n", err);
-		dpaa2_switch_port_fdb_del_mc(port_priv, mdb->addr);
-	}
-
-	return err;
+	return dpaa2_switch_port_fdb_add_mc(port_priv, mdb->addr);
 }
 
 static int dpaa2_switch_port_obj_add(struct net_device *netdev,
@@ -2000,22 +1968,8 @@ static int dpaa2_switch_port_mdb_del(struct net_device *netdev,
 				     const struct switchdev_obj_port_mdb *mdb)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
-	int err;
 
-	if (!dpaa2_switch_port_lookup_address(netdev, 0, mdb->addr))
-		return -ENOENT;
-
-	err = dpaa2_switch_port_fdb_del_mc(port_priv, mdb->addr);
-	if (err)
-		return err;
-
-	err = dev_mc_del(netdev, mdb->addr);
-	if (err) {
-		netdev_err(netdev, "dev_mc_del err %d\n", err);
-		return err;
-	}
-
-	return err;
+	return dpaa2_switch_port_fdb_del_mc(port_priv, mdb->addr);
 }
 
 static int dpaa2_switch_port_obj_del(struct net_device *netdev,
-- 
2.25.1



* [PATCH net-next v4 02/13] dpaa2-switch: avoid holding rtnl_lock in dpaa2_switch_event_work()
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
  2026-06-29 11:22 ` [PATCH net-next v4 01/13] dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls Ioana Ciornei
@ 2026-06-29 11:22 ` Ioana Ciornei
  2026-06-29 11:22 ` [PATCH net-next v4 03/13] dpaa2-switch: extend the FDB management to cover bond scenarios Ioana Ciornei
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:22 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

The only reason why the rtnl_lock is held in
dpaa2_switch_event_work() is so that there is no concurrency between the
changeupper notifier which manages the per port FDB assignment and the
workqueue which adds / deletes addresses into that forwarding database.

To avoid this kind of concurrency without the rtnl_lock, flush the event
workqueue as the last step of pre_bridge_leave so that any
in-flight operations targeting the current FDB are finalized before the
bridge layout (and the per port FDB assignment) changes.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- New patch.
---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index d70e6f06ac15..67c639fad0db 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2069,7 +2069,15 @@ static int dpaa2_switch_port_restore_rxvlan(struct net_device *vdev, int vid, vo
 
 static void dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev)
 {
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+
 	switchdev_bridge_port_unoffload(netdev, NULL, NULL, NULL);
+
+	/* Make sure that any FDB add/del operations are completed before the
+	 * bridge layout changes
+	 */
+	flush_workqueue(ethsw->workqueue);
 }
 
 static int dpaa2_switch_port_bridge_leave(struct net_device *netdev)
@@ -2281,7 +2289,6 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 	struct switchdev_notifier_fdb_info *fdb_info;
 	int err;
 
-	rtnl_lock();
 	fdb_info = &switchdev_work->fdb_info;
 
 	switch (switchdev_work->event) {
@@ -2310,7 +2317,6 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 		break;
 	}
 
-	rtnl_unlock();
 	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
-- 
2.25.1



* [PATCH net-next v4 03/13] dpaa2-switch: extend the FDB management to cover bond scenarios
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
  2026-06-29 11:22 ` [PATCH net-next v4 01/13] dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls Ioana Ciornei
  2026-06-29 11:22 ` [PATCH net-next v4 02/13] dpaa2-switch: avoid holding rtnl_lock in dpaa2_switch_event_work() Ioana Ciornei
@ 2026-06-29 11:22 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 04/13] dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function Ioana Ciornei
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:22 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

The dpaa2_switch_fdb_for_join() function is responsible for determining
which FDB should be used by a port as a consequence of it joining a
bridge. The rule is that all DPAA2 switch ports under the same bridge
will use the FDB of the first port which joined that bridge. Extend the
function so that it also covers the scenario in which there is a
bridged bond device.

For this to happen, in case a bond device is encountered through the
bridge ports the function needs to descend one level through its lowers
as well.
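
The selection rule described above can be sketched outside the kernel
with toy data structures. This is only an illustration under stated
assumptions: struct toy_dev and toy_fdb_for_join() are hypothetical
stand-ins for net_device / ethsw_port_priv and
dpaa2_switch_fdb_for_join(), and an integer fdb_id replaces the real
FDB pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a net_device plus its switch port private data. */
struct toy_dev {
	bool is_switch_port;     /* analogue of dpaa2_switch_port_dev_check() */
	bool is_lag_master;      /* analogue of netif_is_lag_master() */
	int fdb_id;              /* FDB currently used by a switch port */
	struct toy_dev **lowers; /* direct lower devices */
	size_t num_lowers;
};

/* Pick the FDB that 'self' should use when joining 'bridge'. */
static int toy_fdb_for_join(const struct toy_dev *self,
			    const struct toy_dev *bridge)
{
	size_t i, j;

	assert(self != NULL && bridge != NULL);

	for (i = 0; i < bridge->num_lowers; i++) {
		const struct toy_dev *lower = bridge->lowers[i];

		if (lower->is_lag_master) {
			/* Descend one level through the bond's lowers. */
			for (j = 0; j < lower->num_lowers; j++) {
				const struct toy_dev *cand = lower->lowers[j];

				if (cand->is_switch_port && cand != self)
					return cand->fdb_id;
			}
		} else if (lower->is_switch_port && lower != self) {
			return lower->fdb_id;
		}
	}

	/* No other switch port found: keep the port's own FDB. */
	return self->fdb_id;
}
```

In this sketch a port joining a bridge that only contains a bond still
ends up sharing the FDB of a switch port found under that bond.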

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- New patch. The same idea was present also in v3 but the implementation
changed quite a bit since there was some restructuring work done to the
main function in the meantime.
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 35 +++++++++++++------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 67c639fad0db..eacab00b586a 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -71,9 +71,9 @@ static struct dpaa2_switch_fdb *
 dpaa2_switch_fdb_for_join(struct ethsw_port_priv *port_priv,
 			  struct net_device *upper_dev)
 {
-	struct ethsw_port_priv *other_port_priv;
-	struct net_device *other_dev;
-	struct list_head *iter;
+	struct ethsw_port_priv *other_port_priv = NULL;
+	struct net_device *other_dev, *other_dev2;
+	struct list_head *iter, *iter2;
 
 	/* The below call to netdev_for_each_lower_dev() demands the RTNL lock
 	 * being held. Assert on it so that it's easier to catch new code
@@ -82,17 +82,32 @@ dpaa2_switch_fdb_for_join(struct ethsw_port_priv *port_priv,
 	ASSERT_RTNL();
 
 	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
-	 * to be present in that bridge
+	 * to be present in that bridge. The search descends one level through
+	 * a bridged bond's lowers as well.
 	 */
 	netdev_for_each_lower_dev(upper_dev, other_dev, iter) {
-		if (!dpaa2_switch_port_dev_check(other_dev))
-			continue;
+		if (netif_is_lag_master(other_dev)) {
+			netdev_for_each_lower_dev(other_dev, other_dev2, iter2) {
+				if (!dpaa2_switch_port_dev_check(other_dev2))
+					continue;
 
-		if (other_dev == port_priv->netdev)
-			continue;
+				if (other_dev2 == port_priv->netdev)
+					continue;
 
-		other_port_priv = netdev_priv(other_dev);
-		return other_port_priv->fdb;
+				other_port_priv = netdev_priv(other_dev2);
+				break;
+			}
+		} else {
+			if (!dpaa2_switch_port_dev_check(other_dev))
+				continue;
+
+			if (other_dev == port_priv->netdev)
+				continue;
+
+			other_port_priv = netdev_priv(other_dev);
+		}
+		if (other_port_priv)
+			return other_port_priv->fdb;
 	}
 
 	return port_priv->fdb;
-- 
2.25.1



* [PATCH net-next v4 04/13] dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (2 preceding siblings ...)
  2026-06-29 11:22 ` [PATCH net-next v4 03/13] dpaa2-switch: extend the FDB management to cover bond scenarios Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 05/13] dpaa2-switch: check early if an FDB entry should be added Ioana Ciornei
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

Create a separate dpaa2_switch_port_fdb_event() function that will only
handle the FDB related events. With this change, the
dpaa2_switch_port_event() notifier handler can be written in a way that
it's easier to follow.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- none

Changes in v3:
- Get hold of port_priv->ethsw_data only after we know the device is a
dpaa2-switch one

Changes in v2:
- none
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 28 ++++++++++++++-----
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index eacab00b586a..c7c84bf2fde7 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2337,21 +2337,18 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 	dev_put(dev);
 }
 
-/* Called under rcu_read_lock() */
-static int dpaa2_switch_port_event(struct notifier_block *nb,
-				   unsigned long event, void *ptr)
+static int dpaa2_switch_port_fdb_event(struct notifier_block *nb,
+				       unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
 	struct ethsw_port_priv *port_priv = netdev_priv(dev);
 	struct ethsw_switchdev_event_work *switchdev_work;
 	struct switchdev_notifier_fdb_info *fdb_info = ptr;
-	struct ethsw_core *ethsw = port_priv->ethsw_data;
-
-	if (event == SWITCHDEV_PORT_ATTR_SET)
-		return dpaa2_switch_port_attr_set_event(dev, ptr);
+	struct ethsw_core *ethsw;
 
 	if (!dpaa2_switch_port_dev_check(dev))
 		return NOTIFY_DONE;
+	ethsw = port_priv->ethsw_data;
 
 	switchdev_work = kzalloc_obj(*switchdev_work, GFP_ATOMIC);
 	if (!switchdev_work)
@@ -2390,6 +2387,23 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 	return NOTIFY_BAD;
 }
 
+/* Called under rcu_read_lock() */
+static int dpaa2_switch_port_event(struct notifier_block *nb,
+				   unsigned long event, void *ptr)
+{
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+
+	switch (event) {
+	case SWITCHDEV_PORT_ATTR_SET:
+		return dpaa2_switch_port_attr_set_event(dev, ptr);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		return dpaa2_switch_port_fdb_event(nb, event, ptr);
+	default:
+		return NOTIFY_DONE;
+	}
+}
+
 static int dpaa2_switch_port_obj_event(unsigned long event,
 				       struct net_device *netdev,
 				       struct switchdev_notifier_port_obj_info *port_obj_info)
-- 
2.25.1



* [PATCH net-next v4 05/13] dpaa2-switch: check early if an FDB entry should be added
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (3 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 04/13] dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper Ioana Ciornei
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

Instead of waiting until the last moment to check if an FDB entry should
be added to HW, move the check earlier (before even scheduling the work
item) so that we don't just waste time.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- none

Changes in v3:
- none

Changes in v2:
- none
---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index c7c84bf2fde7..d4975d08fa44 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2308,8 +2308,6 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		if (!fdb_info->added_by_user || fdb_info->is_local)
-			break;
 		if (is_unicast_ether_addr(fdb_info->addr))
 			err = dpaa2_switch_port_fdb_add_uc(netdev_priv(dev),
 							   fdb_info->addr);
@@ -2323,8 +2321,6 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 					 &fdb_info->info, NULL);
 		break;
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		if (!fdb_info->added_by_user || fdb_info->is_local)
-			break;
 		if (is_unicast_ether_addr(fdb_info->addr))
 			dpaa2_switch_port_fdb_del_uc(netdev_priv(dev), fdb_info->addr);
 		else
@@ -2350,6 +2346,9 @@ static int dpaa2_switch_port_fdb_event(struct notifier_block *nb,
 		return NOTIFY_DONE;
 	ethsw = port_priv->ethsw_data;
 
+	if (!fdb_info->added_by_user || fdb_info->is_local)
+		return NOTIFY_DONE;
+
 	switchdev_work = kzalloc_obj(*switchdev_work, GFP_ATOMIC);
 	if (!switchdev_work)
 		return NOTIFY_BAD;
-- 
2.25.1



* [PATCH net-next v4 06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (4 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 05/13] dpaa2-switch: check early if an FDB entry should be added Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-30 13:51   ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 07/13] dpaa2-switch: consolidate unicast and multicast management Ioana Ciornei
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

In preparation for adding offloading support for upper bond devices we
have to let the switchdev framework know if a specific bridge port is
offloaded or not, even if that brport is an upper device.

For this to happen, create the dpaa2_switch_port_to_bridge_port()
function which will determine the bridge port corresponding to a
particular DPAA2 switch interface and use it in the
switchdev_bridge_port_offload() call.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- Split the patch so that the first part only adds the base function and
its call sites and the logic around LAG is added later in the patch
which actually adds the support for LAG.
- Moved the patch so that it's a preparatory patch

Changes in v3:
- Access lag field through rtnl_dereference() so that we adapt to the
__rcu change.
- Check that the brport is non-NULL before calling
switchdev_bridge_port_unoffload() on it.

Changes in v2:
- none
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 23 ++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index d4975d08fa44..88d199befbd9 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2017,6 +2017,15 @@ static int dpaa2_switch_port_attr_set_event(struct net_device *netdev,
 	return notifier_from_errno(err);
 }
 
+static struct net_device *
+dpaa2_switch_port_to_bridge_port(struct ethsw_port_priv *port_priv)
+{
+	if (!port_priv->fdb->bridge_dev)
+		return NULL;
+
+	return port_priv->netdev;
+}
+
 static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 					 struct net_device *upper_dev,
 					 struct netlink_ext_ack *extack)
@@ -2024,6 +2033,7 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	struct dpaa2_switch_fdb *old_fdb = port_priv->fdb;
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct net_device *brport_dev;
 	bool learn_ena;
 	int err;
 
@@ -2035,7 +2045,8 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 	dpaa2_switch_port_set_fdb(port_priv, upper_dev, true);
 
 	/* Inherit the initial bridge port learning state */
-	learn_ena = br_port_flag_is_set(netdev, BR_LEARNING);
+	brport_dev = dpaa2_switch_port_to_bridge_port(port_priv);
+	learn_ena = br_port_flag_is_set(brport_dev, BR_LEARNING);
 	err = dpaa2_switch_port_set_learning(port_priv, learn_ena);
 	port_priv->learn_ena = learn_ena;
 
@@ -2049,7 +2060,8 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 	if (err)
 		goto err_egress_flood;
 
-	err = switchdev_bridge_port_offload(netdev, netdev, NULL,
+	brport_dev = dpaa2_switch_port_to_bridge_port(port_priv);
+	err = switchdev_bridge_port_offload(brport_dev, netdev, NULL,
 					    NULL, NULL, false, extack);
 	if (err)
 		goto err_switchdev_offload;
@@ -2086,8 +2098,13 @@ static void dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct net_device *brport_dev;
+
+	brport_dev = dpaa2_switch_port_to_bridge_port(port_priv);
+	if (!brport_dev)
+		return;
 
-	switchdev_bridge_port_unoffload(netdev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(brport_dev, NULL, NULL, NULL);
 
 	/* Make sure that any FDB add/del operations are completed before the
 	 * bridge layout changes
-- 
2.25.1



* [PATCH net-next v4 07/13] dpaa2-switch: consolidate unicast and multicast management
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (5 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 08/13] dpaa2-switch: add LAG configuration API Ioana Ciornei
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

This patch consolidates the unicast and multicast management by creating
two new functions - dpaa2_switch_port_fdb_[add|del]() - which can be
used for either uc or mc addresses. Having this common entrypoint for
both types of addresses will help us in the next patches to handle the
same addresses on LAG ports.
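
For readers unfamiliar with how the common entrypoint tells the two
address types apart: is_unicast_ether_addr() looks at the group bit,
which is the least significant bit of the first octet. The standalone
sketch below mirrors that dispatch outside the kernel. The toy_* names
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_fdb_kind {
	TOY_FDB_UNICAST,
	TOY_FDB_MULTICAST,
};

/* Group bit clear in the first octet means a unicast address. */
static bool toy_is_unicast(const uint8_t addr[6])
{
	assert(addr != NULL);

	return !(addr[0] & 0x01);
}

/* Decide which of the uc/mc helpers a common entrypoint would call. */
static enum toy_fdb_kind toy_fdb_classify(const uint8_t addr[6])
{
	return toy_is_unicast(addr) ? TOY_FDB_UNICAST : TOY_FDB_MULTICAST;
}
```

Note that the broadcast address has the group bit set, so it is
classified as multicast by this test.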

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- Moved the commit ordering, no actual code changes

Changes in v3:
- none

Changes in v2:
- The rollback in dpaa2_switch_port_mdb_add() uses the newly introduced
dpaa2_switch_port_fdb_del() helper instead of the _mc counterpart.
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 39 +++++++++++++------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 88d199befbd9..3472f5d5b08a 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -552,6 +552,28 @@ static int dpaa2_switch_port_fdb_del_mc(struct ethsw_port_priv *port_priv,
 	return err;
 }
 
+static int dpaa2_switch_port_fdb_add(struct ethsw_port_priv *port_priv,
+				     const unsigned char *addr)
+{
+	int err;
+
+	if (is_unicast_ether_addr(addr))
+		err = dpaa2_switch_port_fdb_add_uc(port_priv, addr);
+	else
+		err = dpaa2_switch_port_fdb_add_mc(port_priv, addr);
+
+	return err;
+}
+
+static int dpaa2_switch_port_fdb_del(struct ethsw_port_priv *port_priv,
+				     const unsigned char *addr)
+{
+	if (is_unicast_ether_addr(addr))
+		return dpaa2_switch_port_fdb_del_uc(port_priv, addr);
+	else
+		return dpaa2_switch_port_fdb_del_mc(port_priv, addr);
+}
+
 static void dpaa2_switch_port_get_stats(struct net_device *netdev,
 					struct rtnl_link_stats64 *stats)
 {
@@ -1880,7 +1902,7 @@ static int dpaa2_switch_port_mdb_add(struct net_device *netdev,
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 
-	return dpaa2_switch_port_fdb_add_mc(port_priv, mdb->addr);
+	return dpaa2_switch_port_fdb_add(port_priv, mdb->addr);
 }
 
 static int dpaa2_switch_port_obj_add(struct net_device *netdev,
@@ -1984,7 +2006,7 @@ static int dpaa2_switch_port_mdb_del(struct net_device *netdev,
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 
-	return dpaa2_switch_port_fdb_del_mc(port_priv, mdb->addr);
+	return dpaa2_switch_port_fdb_del(port_priv, mdb->addr);
 }
 
 static int dpaa2_switch_port_obj_del(struct net_device *netdev,
@@ -2325,12 +2347,8 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		if (is_unicast_ether_addr(fdb_info->addr))
-			err = dpaa2_switch_port_fdb_add_uc(netdev_priv(dev),
-							   fdb_info->addr);
-		else
-			err = dpaa2_switch_port_fdb_add_mc(netdev_priv(dev),
-							   fdb_info->addr);
+		err = dpaa2_switch_port_fdb_add(netdev_priv(dev),
+						fdb_info->addr);
 		if (err)
 			break;
 		fdb_info->offloaded = true;
@@ -2338,10 +2356,7 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 					 &fdb_info->info, NULL);
 		break;
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		if (is_unicast_ether_addr(fdb_info->addr))
-			dpaa2_switch_port_fdb_del_uc(netdev_priv(dev), fdb_info->addr);
-		else
-			dpaa2_switch_port_fdb_del_mc(netdev_priv(dev), fdb_info->addr);
+		dpaa2_switch_port_fdb_del(netdev_priv(dev), fdb_info->addr);
 		break;
 	}
 
-- 
2.25.1



* [PATCH net-next v4 08/13] dpaa2-switch: add LAG configuration API
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (6 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 07/13] dpaa2-switch: consolidate unicast and multicast management Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 09/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

Add the necessary APIs to configure and control the LAG support on the
DPAA2 switch object.
 - The dpsw_lag_set() function will be used either to verify that a LAG
 configuration can be supported or to actually apply it in HW.
 - The dpsw_if_set_lag_state() function will be used in the next patches
 to change the per-port LAG state of a specific DPSW interface.
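
For illustration only (not part of the patch), the check-then-apply use
of the phase field could look roughly like the userspace sketch below.
Everything prefixed with sketch_ is hypothetical: the structure merely
mirrors dpsw_lag_cfg, and sketch_lag_set() is a stand-in for the MC
firmware that performs only the num_ifs bound check visible in this
patch.

```c
#include <errno.h>
#include <stdint.h>

/* Values mirror the patch; all sketch_* names are hypothetical. */
#define DPSW_MAX_LAG_IFS		8
#define DPSW_LAG_SET_PHASE_APPLY	0
#define DPSW_LAG_SET_PHASE_CHECK	1

struct sketch_lag_cfg {
	uint8_t group_id;
	uint8_t num_ifs;
	uint8_t if_id[DPSW_MAX_LAG_IFS];
	uint8_t phase;
};

/* Stand-in for the HW state: remembers the last applied configuration. */
static struct sketch_lag_cfg sketch_hw_state;

/*
 * Stand-in for dpsw_lag_set(): the configuration is always validated, but
 * it is committed only in the APPLY phase. The real validation is done by
 * the firmware and is assumed to be stricter than this single check.
 */
int sketch_lag_set(const struct sketch_lag_cfg *cfg)
{
	if (cfg->num_ifs > DPSW_MAX_LAG_IFS)
		return -EOPNOTSUPP;

	if (cfg->phase == DPSW_LAG_SET_PHASE_APPLY)
		sketch_hw_state = *cfg;

	return 0;
}

/* Caller-side pattern: validate first, apply only if validation passed. */
int sketch_lag_configure(struct sketch_lag_cfg *cfg)
{
	int err;

	cfg->phase = DPSW_LAG_SET_PHASE_CHECK;
	err = sketch_lag_set(cfg);
	if (err)
		return err;

	cfg->phase = DPSW_LAG_SET_PHASE_APPLY;
	return sketch_lag_set(cfg);
}
```

In the series itself the two phases are issued from different
callbacks, the check from the prechangeupper path and the apply from
the changeupper path; the sketch collapses them into one helper only to
keep the example short.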

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- None

Changes in v3:
- Add a check in dpsw_lag_set() for cfg->num_ifs against
DPSW_MAX_LAG_IFS
- Add kerneldoc for the dpsw_lag_cfg structure.

Changes in v2:
- none
---
 .../net/ethernet/freescale/dpaa2/dpsw-cmd.h   | 18 +++++-
 drivers/net/ethernet/freescale/dpaa2/dpsw.c   | 60 +++++++++++++++++++
 drivers/net/ethernet/freescale/dpaa2/dpsw.h   | 30 ++++++++++
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpsw-cmd.h b/drivers/net/ethernet/freescale/dpaa2/dpsw-cmd.h
index 397d55f2bd99..9a2055c64983 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpsw-cmd.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpsw-cmd.h
@@ -12,7 +12,7 @@
 
 /* DPSW Version */
 #define DPSW_VER_MAJOR		8
-#define DPSW_VER_MINOR		9
+#define DPSW_VER_MINOR		13
 
 #define DPSW_CMD_BASE_VERSION	1
 #define DPSW_CMD_VERSION_2	2
@@ -92,11 +92,14 @@
 #define DPSW_CMDID_CTRL_IF_SET_POOLS        DPSW_CMD_ID(0x0A1)
 #define DPSW_CMDID_CTRL_IF_ENABLE           DPSW_CMD_ID(0x0A2)
 #define DPSW_CMDID_CTRL_IF_DISABLE          DPSW_CMD_ID(0x0A3)
+#define DPSW_CMDID_SET_LAG                  DPSW_CMD_V2(0x0A4)
 #define DPSW_CMDID_CTRL_IF_SET_QUEUE        DPSW_CMD_ID(0x0A6)
 
 #define DPSW_CMDID_SET_EGRESS_FLOOD         DPSW_CMD_ID(0x0AC)
 #define DPSW_CMDID_IF_SET_LEARNING_MODE     DPSW_CMD_ID(0x0AD)
 
+#define DPSW_CMDID_IF_SET_LAG_STATE         DPSW_CMD_ID(0x0B0)
+
 /* Macros for accessing command fields smaller than 1byte */
 #define DPSW_MASK(field)        \
 	GENMASK(DPSW_##field##_SHIFT + DPSW_##field##_SIZE - 1, \
@@ -552,5 +555,18 @@ struct dpsw_cmd_if_reflection {
 	/* only 2 bits from the LSB */
 	u8 filter;
 };
+
+struct dpsw_cmd_lag {
+	u8 group_id;
+	u8 num_ifs;
+	u8 pad[6];
+	u8 if_id[DPSW_MAX_LAG_IFS];
+	u8 phase;
+};
+
+struct dpsw_cmd_if_set_lag_state {
+	__le16 if_id;
+	u8 tx_enabled;
+};
 #pragma pack(pop)
 #endif /* __FSL_DPSW_CMD_H */
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpsw.c b/drivers/net/ethernet/freescale/dpaa2/dpsw.c
index ab921d75deb2..f75cbdce42ba 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpsw.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpsw.c
@@ -1659,3 +1659,63 @@ int dpsw_if_remove_reflection(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
 
 	return mc_send_command(mc_io, &cmd);
 }
+
+/**
+ * dpsw_lag_set() - Set LAG configuration
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @cfg:	pointer to LAG configuration
+ *
+ * Return:   '0' on Success; Error code otherwise.
+ */
+int dpsw_lag_set(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+		 const struct dpsw_lag_cfg *cfg)
+{
+	struct fsl_mc_command cmd = { 0 };
+	struct dpsw_cmd_lag *cmd_params;
+	int i = 0;
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_SET_LAG, cmd_flags, token);
+
+	if (cfg->num_ifs > DPSW_MAX_LAG_IFS)
+		return -EOPNOTSUPP;
+
+	cmd_params = (struct dpsw_cmd_lag *)cmd.params;
+	cmd_params->group_id = cfg->group_id;
+	cmd_params->num_ifs = cfg->num_ifs;
+	cmd_params->phase = cfg->phase;
+
+	for (i = 0; i < cfg->num_ifs; i++)
+		cmd_params->if_id[i] = cfg->if_id[i];
+
+	return mc_send_command(mc_io, &cmd);
+}
+
+/**
+ * dpsw_if_set_lag_state() - Change per port LAG state
+ * @mc_io:      Pointer to MC portal's I/O object
+ * @cmd_flags:  Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:      Token of DPSW object
+ * @if_id:      ID of the switch interface
+ * @tx_enabled: Value of the per port LAG state
+ *     - 0 if the interface will not be active as part of the LAG group
+ *     - 1 if the interface will be active in the LAG group
+ *
+ * Return:   '0' on Success; Error code otherwise.
+ */
+int dpsw_if_set_lag_state(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			  u16 if_id, u8 tx_enabled)
+{
+	struct dpsw_cmd_if_set_lag_state *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_IF_SET_LAG_STATE,
+					  cmd_flags, token);
+
+	cmd_params = (struct dpsw_cmd_if_set_lag_state *)cmd.params;
+	cmd_params->if_id = cpu_to_le16(if_id);
+	cmd_params->tx_enabled = tx_enabled;
+
+	return mc_send_command(mc_io, &cmd);
+}
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpsw.h b/drivers/net/ethernet/freescale/dpaa2/dpsw.h
index b90bd363f47a..89f0267de8e9 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpsw.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpsw.h
@@ -20,6 +20,8 @@ struct fsl_mc_io;
 
 #define DPSW_MAX_IF		64
 
+#define DPSW_MAX_LAG_IFS	8
+
 int dpsw_open(struct fsl_mc_io *mc_io, u32 cmd_flags, int dpsw_id, u16 *token);
 
 int dpsw_close(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token);
@@ -788,4 +790,32 @@ int dpsw_if_add_reflection(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
 
 int dpsw_if_remove_reflection(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
 			      u16 if_id, const struct dpsw_reflection_cfg *cfg);
+
+/* Link Aggregation Group configuration */
+
+#define DPSW_LAG_SET_PHASE_APPLY 0
+#define DPSW_LAG_SET_PHASE_CHECK 1
+
+/**
+ * struct dpsw_lag_cfg - Configuration structure for a LAG group
+ * @group_id: Link aggregation group ID. Valid values are in the
+ * [1, DPSW_MAX_LAG_IFS] range.
+ * @num_ifs: Number of interfaces in this LAG group, valid range is
+ * [0, DPSW_MAX_LAG_IFS].
+ * @if_id: Array containing the interface IDs of the ports part of a LAG group
+ * @phase: Use DPSW_LAG_SET_PHASE_APPLY for LAG configuration processing or
+ * DPSW_LAG_SET_PHASE_CHECK for LAG configuration validation.
+ */
+struct dpsw_lag_cfg {
+	u8 group_id;
+	u8 num_ifs;
+	u8 if_id[DPSW_MAX_LAG_IFS];
+	u8 phase;
+};
+
+int dpsw_lag_set(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+		 const struct dpsw_lag_cfg *cfg);
+
+int dpsw_if_set_lag_state(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			  u16 if_id, u8 tx_enabled);
 #endif /* __FSL_DPSW_H */
-- 
2.25.1



* [PATCH net-next v4 09/13] dpaa2-switch: add support for LAG offload
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (7 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 08/13] dpaa2-switch: add LAG configuration API Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-30 14:23   ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device Ioana Ciornei
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

This patch adds the bulk of the changes needed in order to support
offloading of an upper bond device.

First of all, handling of the NETDEV_CHANGEUPPER and
NETDEV_PRECHANGEUPPER events is extended so that the driver is capable
of handling a port joining or leaving an upper bond device.
All the restrictions around the LAG offload support are added in the
newly added dpaa2_switch_pre_lag_join() function.

The same events are extended to also detect if one of our upper bond
devices changes its own upper device. In this case, on each lower device
that is DPAA2 the corresponding dpaa2_switch_port_[pre]changeupper()
function will be called. This will start the process of joining the same
FDB as the one used by the bridge device.

Setting the 'offload_fwd_mark' field on the skbs is also extended so
that it is set not only when the port is under a bridge but also when it
is under an offloaded bond device.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- Add a defensive check in dpaa2_switch_port_bond_leave() for a NULL
port_priv->lag
- Extend the dpaa2_switch_prevent_bridging_with_8021q_upper() function
so that we prevent a bond device with VLAN uppers from joining a bridge.
The restriction is related to VLAN management in terms of the FDB, which
can change upon a topology change. VLAN uppers can only be added once
the bridge topology is set up.
- Remove all FDB management from the bond join/leave paths. Decided to
reconfigure the FDB only on bridge join/leave since the FDB determines
the forwarding domain and when a bond is not bridged, from a
configuration standpoint, the individual lowers can be viewed as
standalone.
- Moved here the update to the dpaa2_switch_port_to_bridge_port()
function so that the LAG state is taken into account.
- Add a new per-LAG field - primary - which is used to keep track of the
primary port of a LAG group instead of determining it each time we need
to use it.
- Set 'skb->offload_fwd_mark' only when the port is under a bridge.

Changes in v3:
- Fix logic in prechangeupper callback in order to not call
dpaa2_switch_prechangeupper_sanity_checks() on !info->linking
- Fixed up the logic in the dpaa2_switch_port_bond_join()'s error path
so that the FDBs are cleaned up properly and we do not end up with
leaked FDBs, meaning FDBs marked as in-use while no port is actually
using them.
- Mark the port_priv->lag field as __rcu and use the proper accessors for
it. This will eventually become useful in a later patch when the lag
field will be accessed concurrently from the NAPI context and the
join/leave paths.

Changes in v2:
- Extend dpaa2_switch_prechangeupper_sanity_checks() with
netdev_walk_all_lower_dev() so that checks are done on all lower devices
of a bridge, even for the lowers of a bridged bond.
- Better manage the default VLAN on bond join
- Clean up the error path in dpaa2_switch_port_bond_join()
- Call dpaa2_switch_port_bridge_leave() in case a port is leaving a bond
which is also a bridged port
- Update dpaa2_switch_port_bond_leave() so that in case of any failure
the driver tries to clean up the LAG offload configuration.
- Call switchdev_bridge_port_unoffload() in case a switch port is
leaving a bridged bond device.
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 473 +++++++++++++++++-
 .../ethernet/freescale/dpaa2/dpaa2-switch.h   |  15 +-
 2 files changed, 476 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 3472f5d5b08a..949a7241a00f 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -51,6 +51,17 @@ dpaa2_switch_filter_block_get_unused(struct ethsw_core *ethsw)
 	return NULL;
 }
 
+static struct dpaa2_switch_lag *
+dpaa2_switch_lag_get_unused(struct ethsw_core *ethsw)
+{
+	int i;
+
+	for (i = 0; i < ethsw->sw_attr.num_ifs; i++)
+		if (!ethsw->lags[i].in_use)
+			return &ethsw->lags[i];
+	return NULL;
+}
+
 static bool dpaa2_switch_fdb_in_use_by_others(struct ethsw_core *ethsw,
 					      struct dpaa2_switch_fdb *fdb,
 					      struct ethsw_port_priv *except)
@@ -2042,9 +2053,15 @@ static int dpaa2_switch_port_attr_set_event(struct net_device *netdev,
 static struct net_device *
 dpaa2_switch_port_to_bridge_port(struct ethsw_port_priv *port_priv)
 {
+	struct dpaa2_switch_lag *lag;
+
 	if (!port_priv->fdb->bridge_dev)
 		return NULL;
 
+	lag = rtnl_dereference(port_priv->lag);
+	if (lag)
+		return lag->bond_dev;
+
 	return port_priv->netdev;
 }
 
@@ -2193,30 +2210,53 @@ static int dpaa2_switch_port_bridge_leave(struct net_device *netdev)
 					  false);
 }
 
+static int
+dpaa2_switch_have_vlan_upper(struct net_device *upper_dev,
+			     __always_unused struct netdev_nested_priv *priv)
+{
+	return is_vlan_dev(upper_dev);
+}
+
 static int dpaa2_switch_prevent_bridging_with_8021q_upper(struct net_device *netdev)
 {
-	struct net_device *upper_dev;
-	struct list_head *iter;
+	struct netdev_nested_priv priv = {};
 
 	/* RCU read lock not necessary because we have write-side protection
-	 * (rtnl_mutex), however a non-rcu iterator does not exist.
+	 * (rtnl_mutex), however a non-rcu iterator does not exist. Walk the
+	 * entire upper chain so that a VLAN device stacked on an intermediate
+	 * bond is caught too.
 	 */
-	netdev_for_each_upper_dev_rcu(netdev, upper_dev, iter)
-		if (is_vlan_dev(upper_dev))
-			return -EOPNOTSUPP;
+	if (netdev_walk_all_upper_dev_rcu(netdev, dpaa2_switch_have_vlan_upper,
+					  &priv))
+		return -EOPNOTSUPP;
 
 	return 0;
 }
 
+static int dpaa2_switch_check_dpsw_instance(struct net_device *dev,
+					    struct netdev_nested_priv *priv)
+{
+	struct ethsw_port_priv *port_priv = (struct ethsw_port_priv *)priv->data;
+	struct ethsw_port_priv *other_priv = netdev_priv(dev);
+
+	if (!dpaa2_switch_port_dev_check(dev))
+		return 0;
+
+	if (other_priv->ethsw_data == port_priv->ethsw_data)
+		return 0;
+
+	return 1;
+}
+
 static int
 dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
 					  struct net_device *upper_dev,
 					  struct netlink_ext_ack *extack)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
-	struct ethsw_port_priv *other_port_priv;
-	struct net_device *other_dev;
-	struct list_head *iter;
+	struct netdev_nested_priv data = {
+		.data = (void *)port_priv,
+	};
 	int err;
 
 	if (!br_vlan_enabled(upper_dev)) {
@@ -2231,6 +2271,70 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
 		return err;
 	}
 
+	err = netdev_walk_all_lower_dev(upper_dev,
+					dpaa2_switch_check_dpsw_instance,
+					&data);
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Interface from a different DPSW is in the bridge already");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int dpaa2_switch_pre_lag_join(struct net_device *netdev,
+				     struct net_device *upper_dev,
+				     struct netdev_lag_upper_info *info,
+				     struct netlink_ext_ack *extack)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct ethsw_port_priv *other_port_priv;
+	struct dpaa2_switch_lag *lag = NULL;
+	struct dpsw_lag_cfg cfg = {0};
+	struct net_device *other_dev;
+	int i, num_ifs = 0, err;
+	struct list_head *iter;
+
+	if (!(ethsw->features & ETHSW_FEATURE_LAG_OFFLOAD)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "LAG offload is supported only for DPSW >= v8.13");
+		return -EOPNOTSUPP;
+	}
+
+	if (info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Can only offload LAG using hash TX type");
+		return -EOPNOTSUPP;
+	}
+
+	if (info->hash_type != NETDEV_LAG_HASH_L23) {
+		NL_SET_ERR_MSG_MOD(extack, "Can only offload L2+L3 Tx hash");
+		return -EOPNOTSUPP;
+	}
+
+	if (!dpaa2_switch_port_has_mac(port_priv)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Only switch interfaces connected to MACs can be under a LAG");
+		return -EINVAL;
+	}
+
+	if (vlan_uses_dev(upper_dev)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Cannot join a LAG upper that has a VLAN");
+		return -EOPNOTSUPP;
+	}
+
+	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
+		if (!ethsw->lags[i].in_use)
+			continue;
+		if (ethsw->lags[i].bond_dev != upper_dev)
+			continue;
+		lag = &ethsw->lags[i];
+		break;
+	}
+
 	netdev_for_each_lower_dev(upper_dev, other_dev, iter) {
 		if (!dpaa2_switch_port_dev_check(other_dev))
 			continue;
@@ -2238,11 +2342,229 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
 		other_port_priv = netdev_priv(other_dev);
 		if (other_port_priv->ethsw_data != port_priv->ethsw_data) {
 			NL_SET_ERR_MSG_MOD(extack,
-					   "Interface from a different DPSW is in the bridge already");
+					   "Interface from a different DPSW is in the bond already");
+			return -EINVAL;
+		}
+
+		cfg.if_id[num_ifs++] = other_port_priv->idx;
+
+		if (num_ifs >= DPSW_MAX_LAG_IFS) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "Cannot add more than 8 DPAA2 switch ports under the same bond");
 			return -EINVAL;
 		}
 	}
 
+	if (lag) {
+		cfg.group_id = lag->id;
+		cfg.if_id[num_ifs++] = port_priv->idx;
+		cfg.num_ifs = num_ifs;
+		cfg.phase = DPSW_LAG_SET_PHASE_CHECK;
+
+		err = dpsw_lag_set(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);
+		if (err) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "Cannot offload LAG configuration");
+			return -EOPNOTSUPP;
+		}
+	}
+
+	return 0;
+}
+
+static void dpaa2_switch_port_set_lag_group(struct ethsw_port_priv *port_priv,
+					    struct net_device *bond_dev)
+{
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct ethsw_port_priv *other_port_priv = NULL;
+	struct dpaa2_switch_lag *lag = NULL;
+	struct dpaa2_switch_lag *other_lag;
+	struct net_device *other_dev;
+	struct list_head *iter;
+
+	netdev_for_each_lower_dev(bond_dev, other_dev, iter) {
+		if (!dpaa2_switch_port_dev_check(other_dev))
+			continue;
+
+		other_port_priv = netdev_priv(other_dev);
+		other_lag = rtnl_dereference(other_port_priv->lag);
+		if (!other_lag)
+			continue;
+
+		if (other_lag->bond_dev == bond_dev) {
+			rcu_assign_pointer(port_priv->lag, other_lag);
+			return;
+		}
+	}
+
+	/* This is the first interface to be added under a bond device. Find an
+	 * unused LAG group. No need to check for NULL since there are the same
+	 * amount of DPSW ports as LAG groups, meaning that each port can have
+	 * its own LAG group.
+	 */
+	lag = dpaa2_switch_lag_get_unused(ethsw);
+	lag->in_use = true;
+	lag->bond_dev = bond_dev;
+	lag->primary = port_priv;
+	rcu_assign_pointer(port_priv->lag, lag);
+}
+
+static bool dpaa2_switch_port_in_lag(struct ethsw_port_priv *port_priv,
+				     struct net_device *bond_dev)
+{
+	struct dpaa2_switch_lag *lag;
+
+	if (!port_priv)
+		return false;
+
+	lag = rtnl_dereference(port_priv->lag);
+	return lag && lag->bond_dev == bond_dev;
+}
+
+static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
+				    struct ethsw_core *ethsw)
+{
+	struct dpaa2_switch_lag *lag = &ethsw->lags[lag_id - 1];
+	struct ethsw_port_priv *primary, *new_primary = NULL;
+	struct ethsw_port_priv *port_priv = NULL;
+	struct dpsw_lag_cfg cfg = {0};
+	u8 num_ifs = 0;
+	int err, i;
+
+	cfg.group_id = lag_id;
+
+	/* Determine the primary port. The caller clears ->lag on the port that
+	 * is leaving, so a NULL ->lag on the current primary means it is the
+	 * one leaving: elect the first remaining member as the new primary.
+	 * Otherwise keep the current primary.
+	 */
+	if (rtnl_dereference(lag->primary->lag)) {
+		primary = lag->primary;
+	} else {
+		primary = NULL;
+		for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
+			if (dpaa2_switch_port_in_lag(ethsw->ports[i], bond_dev)) {
+				new_primary = ethsw->ports[i];
+				primary = new_primary;
+				break;
+			}
+		}
+	}
+
+	/* Build the interface list, always placing the primary first */
+	if (primary)
+		cfg.if_id[num_ifs++] = primary->idx;
+
+	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
+		port_priv = ethsw->ports[i];
+		if (port_priv == primary)
+			continue;
+		if (!dpaa2_switch_port_in_lag(port_priv, bond_dev))
+			continue;
+
+		cfg.if_id[num_ifs++] = port_priv->idx;
+	}
+	cfg.num_ifs = num_ifs;
+
+	/* No more interfaces under this LAG group, mark it as not in use. Wait
+	 * for a grace period so that any readers of the lag structure finished.
+	 */
+	if (!num_ifs) {
+		synchronize_net();
+
+		lag->bond_dev = NULL;
+		lag->primary = NULL;
+		lag->in_use = false;
+	}
+
+	err = dpsw_lag_set(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);
+	if (err)
+		return err;
+
+	if (new_primary) {
+		synchronize_net();
+		lag->primary = new_primary;
+	}
+
+	return 0;
+}
+
+static int dpaa2_switch_port_bond_join(struct net_device *netdev,
+				       struct net_device *bond_dev,
+				       struct netdev_lag_upper_info *info,
+				       struct netlink_ext_ack *extack)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct net_device *bridge_dev;
+	struct dpaa2_switch_lag *lag;
+	int err = 0;
+	u8 lag_id;
+
+	/* Setup the port_priv->lag pointer for this switch port */
+	dpaa2_switch_port_set_lag_group(port_priv, bond_dev);
+
+	/* Create the LAG configuration and apply it in MC */
+	lag = rtnl_dereference(port_priv->lag);
+	lag_id = lag->id;
+	err = dpaa2_switch_set_lag_cfg(bond_dev, lag_id, ethsw);
+	if (err)
+		goto err_lag_cfg;
+
+	/* If the bond device is a switch port, join the bridge as well */
+	bridge_dev = netdev_master_upper_dev_get(bond_dev);
+	if (!bridge_dev || !netif_is_bridge_master(bridge_dev))
+		return 0;
+
+	err = dpaa2_switch_port_bridge_join(netdev, bridge_dev, extack);
+	if (err)
+		goto err_lag_cfg;
+
+	return err;
+
+err_lag_cfg:
+	rcu_assign_pointer(port_priv->lag, NULL);
+	dpaa2_switch_set_lag_cfg(bond_dev, lag_id, ethsw);
+
+	return err;
+}
+
+static int dpaa2_switch_port_bond_leave(struct net_device *netdev,
+					struct net_device *bond_dev)
+{
+	struct net_device *bridge_dev = netdev_master_upper_dev_get(bond_dev);
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct dpaa2_switch_lag *lag = rtnl_dereference(port_priv->lag);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct net_device *brpdev;
+	bool learn_ena;
+	int err;
+
+	if (!lag)
+		return 0;
+
+	/* Recreate the LAG configuration for the LAG group that we left. */
+	rcu_assign_pointer(port_priv->lag, NULL);
+	dpaa2_switch_set_lag_cfg(bond_dev, lag->id, ethsw);
+
+	if (bridge_dev && netif_is_bridge_master(bridge_dev)) {
+		/* Make sure that the new primary inherits the learning state */
+		if (lag->primary) {
+			brpdev = dpaa2_switch_port_to_bridge_port(lag->primary);
+			learn_ena = br_port_flag_is_set(brpdev, BR_LEARNING);
+			err = dpaa2_switch_port_set_learning(lag->primary,
+							     learn_ena);
+			if (err)
+				return err;
+			lag->primary->learn_ena = learn_ena;
+		}
+
+		/* In case the bond is a bridge port, leave the upper bridge as
+		 * well.
+		 */
+		return dpaa2_switch_port_bridge_leave(netdev);
+	}
+
 	return 0;
 }
 
@@ -2250,8 +2572,8 @@ static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
 					    struct netdev_notifier_changeupper_info *info)
 {
 	struct ethsw_port_priv *port_priv;
+	struct net_device *upper_dev, *br;
 	struct netlink_ext_ack *extack;
-	struct net_device *upper_dev;
 	int err;
 
 	if (!dpaa2_switch_port_dev_check(netdev))
@@ -2268,6 +2590,24 @@ static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
 
 		if (!info->linking)
 			dpaa2_switch_port_pre_bridge_leave(netdev);
+	} else if (netif_is_lag_master(upper_dev)) {
+		if (!info->linking) {
+			if (netif_is_bridge_port(upper_dev))
+				dpaa2_switch_port_pre_bridge_leave(netdev);
+			return 0;
+		}
+
+		if (netif_is_bridge_port(upper_dev)) {
+			br = netdev_master_upper_dev_get(upper_dev);
+			err = dpaa2_switch_prechangeupper_sanity_checks(netdev,
+									br,
+									extack);
+			if (err)
+				return err;
+		}
+
+		return dpaa2_switch_pre_lag_join(netdev, upper_dev,
+						 info->upper_info, extack);
 	} else if (is_vlan_dev(upper_dev)) {
 		port_priv = netdev_priv(netdev);
 		if (port_priv->fdb->bridge_dev) {
@@ -2299,6 +2639,80 @@ static int dpaa2_switch_port_changeupper(struct net_device *netdev,
 							     extack);
 		else
 			return dpaa2_switch_port_bridge_leave(netdev);
+	} else if (netif_is_lag_master(upper_dev)) {
+		if (info->linking)
+			return dpaa2_switch_port_bond_join(netdev, upper_dev,
+							   info->upper_info,
+							   extack);
+		else
+			return dpaa2_switch_port_bond_leave(netdev, upper_dev);
+	}
+
+	return 0;
+}
+
+static int
+dpaa2_switch_lag_prechangeupper(struct net_device *netdev,
+				struct netdev_notifier_changeupper_info *info)
+{
+	struct net_device *lower;
+	struct list_head *iter;
+	int err = 0;
+
+	if (!netif_is_lag_master(netdev))
+		return 0;
+
+	netdev_for_each_lower_dev(netdev, lower, iter) {
+		if (!dpaa2_switch_port_dev_check(lower))
+			continue;
+
+		err = dpaa2_switch_port_prechangeupper(lower, info);
+		if (err)
+			return err;
+	}
+
+	return err;
+}
+
+static int
+dpaa2_switch_lag_changeupper(struct net_device *netdev,
+			     struct netdev_notifier_changeupper_info *info)
+{
+	struct net_device *lower;
+	struct list_head *iter;
+	int err = 0;
+
+	if (!netif_is_lag_master(netdev))
+		return 0;
+
+	netdev_for_each_lower_dev(netdev, lower, iter) {
+		if (!dpaa2_switch_port_dev_check(lower))
+			continue;
+
+		err = dpaa2_switch_port_changeupper(lower, info);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int
+dpaa2_switch_port_changelowerstate(struct net_device *netdev,
+				   struct netdev_lag_lower_state_info *linfo)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	int err;
+
+	if (!rtnl_dereference(port_priv->lag))
+		return 0;
+
+	err = dpsw_if_set_lag_state(ethsw->mc_io, 0, ethsw->dpsw_handle,
+				    port_priv->idx, linfo->tx_enabled ? 1 : 0);
+	if (err) {
+		netdev_err(netdev, "dpsw_if_set_lag_state() = %d\n", err);
+		return err;
 	}
 
 	return 0;
@@ -2308,6 +2722,7 @@ static int dpaa2_switch_port_netdevice_event(struct notifier_block *nb,
 					     unsigned long event, void *ptr)
 {
 	struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
+	struct netdev_notifier_changelowerstate_info *info;
 	int err = 0;
 
 	switch (event) {
@@ -2316,13 +2731,29 @@ static int dpaa2_switch_port_netdevice_event(struct notifier_block *nb,
 		if (err)
 			return notifier_from_errno(err);
 
+		err = dpaa2_switch_lag_prechangeupper(netdev, ptr);
+		if (err)
+			return notifier_from_errno(err);
+
 		break;
 	case NETDEV_CHANGEUPPER:
 		err = dpaa2_switch_port_changeupper(netdev, ptr);
 		if (err)
 			return notifier_from_errno(err);
 
+		err = dpaa2_switch_lag_changeupper(netdev, ptr);
+		if (err)
+			return notifier_from_errno(err);
+
 		break;
+	case NETDEV_CHANGELOWERSTATE:
+		info = ptr;
+		if (!dpaa2_switch_port_dev_check(netdev))
+			break;
+
+		err = dpaa2_switch_port_changelowerstate(netdev,
+							 info->lower_state_info);
+		return notifier_from_errno(err);
 	}
 
 	return NOTIFY_DONE;
@@ -2581,6 +3012,9 @@ static void dpaa2_switch_detect_features(struct ethsw_core *ethsw)
 
 	if (ethsw->major > 8 || (ethsw->major == 8 && ethsw->minor >= 6))
 		ethsw->features |= ETHSW_FEATURE_MAC_ADDR;
+
+	if (ethsw->major > 8 || (ethsw->major == 8 && ethsw->minor >= 13))
+		ethsw->features |= ETHSW_FEATURE_LAG_OFFLOAD;
 }
 
 static int dpaa2_switch_setup_fqs(struct ethsw_core *ethsw)
@@ -3370,6 +3804,7 @@ static void dpaa2_switch_remove(struct fsl_mc_device *sw_dev)
 	kfree(ethsw->fdbs);
 	kfree(ethsw->filter_blocks);
 	kfree(ethsw->ports);
+	kfree(ethsw->lags);
 
 	dpaa2_switch_teardown(sw_dev);
 
@@ -3397,6 +3832,7 @@ static int dpaa2_switch_probe_port(struct ethsw_core *ethsw,
 	port_priv = netdev_priv(port_netdev);
 	port_priv->netdev = port_netdev;
 	port_priv->ethsw_data = ethsw;
+	rcu_assign_pointer(port_priv->lag, NULL);
 
 	mutex_init(&port_priv->mac_lock);
 
@@ -3504,6 +3940,19 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 		goto err_free_fdbs;
 	}
 
+	ethsw->lags = kcalloc(ethsw->sw_attr.num_ifs, sizeof(*ethsw->lags),
+			      GFP_KERNEL);
+	if (!ethsw->lags) {
+		err = -ENOMEM;
+		goto err_free_filter;
+	}
+	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
+		ethsw->lags[i].bond_dev = NULL;
+		ethsw->lags[i].ethsw = ethsw;
+		ethsw->lags[i].id = i + 1;
+		ethsw->lags[i].in_use = 0;
+	}
+
 	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
 		err = dpaa2_switch_probe_port(ethsw, i);
 		if (err)
@@ -3550,6 +3999,8 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 err_free_netdev:
 	for (i--; i >= 0; i--)
 		dpaa2_switch_remove_port(ethsw, i);
+	kfree(ethsw->lags);
+err_free_filter:
 	kfree(ethsw->filter_blocks);
 err_free_fdbs:
 	kfree(ethsw->fdbs);
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
index 42b3ca73f55d..c98bddd7e359 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
@@ -41,7 +41,8 @@
 #define ETHSW_MAX_FRAME_LENGTH	(DPAA2_MFL - VLAN_ETH_HLEN - ETH_FCS_LEN)
 #define ETHSW_L2_MAX_FRM(mtu)	((mtu) + VLAN_ETH_HLEN + ETH_FCS_LEN)
 
-#define ETHSW_FEATURE_MAC_ADDR	BIT(0)
+#define ETHSW_FEATURE_MAC_ADDR		BIT(0)
+#define ETHSW_FEATURE_LAG_OFFLOAD	BIT(1)
 
 /* Number of receive queues (one RX and one TX_CONF) */
 #define DPAA2_SWITCH_RX_NUM_FQS	2
@@ -105,6 +106,14 @@ struct dpaa2_switch_fdb {
 	bool			in_use;
 };
 
+struct dpaa2_switch_lag {
+	struct ethsw_core	*ethsw;
+	struct net_device	*bond_dev;
+	bool			in_use;
+	u8			id;
+	struct ethsw_port_priv	*primary;
+};
+
 struct dpaa2_switch_acl_entry {
 	struct list_head	list;
 	u16			prio;
@@ -163,6 +172,8 @@ struct ethsw_port_priv {
 	struct dpaa2_mac	*mac;
 	/* Protects against changes to port_priv->mac */
 	struct mutex		mac_lock;
+
+	struct dpaa2_switch_lag __rcu *lag;
 };
 
 /* Switch data */
@@ -190,6 +201,8 @@ struct ethsw_core {
 	struct dpaa2_switch_fdb		*fdbs;
 	struct dpaa2_switch_filter_block *filter_blocks;
 	u16				mirror_port;
+
+	struct dpaa2_switch_lag		*lags;
 };
 
 static inline int dpaa2_switch_get_index(struct ethsw_core *ethsw,
-- 
2.25.1



* [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (8 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 09/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-30 14:30   ` Ioana Ciornei
  2026-06-30 14:41   ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 11/13] dpaa2-switch: offload port objects " Ioana Ciornei
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

This patch adds support for offloading FDB entries added on upper bond
devices.

First of all, the call to switchdev_bridge_port_offload() is updated so
that the notifier blocks needed for FDB events replay are available to
the bridge core.

Using switchdev_handle_*() helpers is also necessary because each FDB
event needs to be fanned out to any DPAA2 switch lower device. This
triggers another change in the return type used by the
dpaa2_switch_port_fdb_event() - from notifier types to regular errno
types.

Handling of the SWITCHDEV_FDB_ADD_TO_DEVICE/SWITCHDEV_FDB_DEL_TO_DEVICE
events is updated so that the newly added dpaa2_switch_lag_fdb_add() /
dpaa2_switch_lag_fdb_del() functions are called anytime a port is under
a bond device. This will allow us to manage refcounting on FDB entries
which are added on the upper bond devices.

The DPAA2 switch uses shared-VLAN learning which means that the vid
parameter is not used when adding an FDB entry to HW. The current
behavior when dealing with FDB entries with the same MAC address but
different VLANs is to add the entry to HW every time while removal will
get done on the first 'bridge fdb del' command issued by the user.

The same behavior is also kept for FDBs added on bond devices: the
refcount is kept on the {vid, addr} pair while the HW operation entirely
disregards the vid parameter.
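
As a rough userspace sketch of this bookkeeping (not the driver code):
software tracking is keyed on {vid, addr}, while the simulated hardware
operation only sees the address. The struct and function names below
are hypothetical, and the hw_adds/hw_dels counters stand in for the real
dpaa2_switch_port_fdb_add()/dpaa2_switch_port_fdb_del() calls.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for the per-LAG tracking entry. */
struct mac_entry {
	uint8_t addr[ETH_ALEN];
	uint16_t vid;
	unsigned int refcount;
	struct mac_entry *next;
};

struct lag_table {
	struct mac_entry *head;
	int hw_adds;	/* simulated HW add operations */
	int hw_dels;	/* simulated HW delete operations */
};

/* Returns 0 on success, -1 on allocation failure. */
static int lag_fdb_add(struct lag_table *t, const uint8_t *addr, uint16_t vid)
{
	struct mac_entry *e;

	assert(t && addr);

	/* An existing {vid, addr} pair only gains a reference. */
	for (e = t->head; e; e = e->next) {
		if (!memcmp(e->addr, addr, ETH_ALEN) && e->vid == vid) {
			e->refcount++;
			return 0;
		}
	}

	e = calloc(1, sizeof(*e));
	if (!e)
		return -1;

	memcpy(e->addr, addr, ETH_ALEN);
	e->vid = vid;
	e->refcount = 1;
	e->next = t->head;
	t->head = e;
	t->hw_adds++;	/* the HW operation is keyed on addr only */

	return 0;
}

static void lag_fdb_del(struct lag_table *t, const uint8_t *addr, uint16_t vid)
{
	struct mac_entry **pp;

	assert(t && addr);

	for (pp = &t->head; *pp; pp = &(*pp)->next) {
		struct mac_entry *e = *pp;

		if (memcmp(e->addr, addr, ETH_ALEN) || e->vid != vid)
			continue;
		if (--e->refcount)
			return;	/* other users of this pair remain */

		/* Drop the software entry first, then touch the HW. */
		*pp = e->next;
		free(e);
		t->hw_dels++;
		return;
	}
}
```

In this sketch, adding the same MAC on two different VLANs triggers two
HW adds, and the HW delete happens as soon as the last reference on
either pair is dropped, which matches the shared-VLAN behavior described
above.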

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- Migrate FDBs in case the primary interface of a LAG changes.
- Use lag->primary instead of determining each time the primary
interface of a LAG device

Changes in v3:
- Update dpaa2_switch_foreign_dev_check() so that we check if there is
any port in the same switch as dev which offloads foreign_dev in case
this is a bridge port.
- Add mutex_destroy on the per LAG fdb_lock
- Make sure that all FDB events were processed on the workqueue on the
.remove() path.
- Delete the refcounted entry in dpaa2_switch_lag_fdb_del() as soon as
possible, even if the HW deletion would fail
- Access the port_priv->lag field only through the proper rcu accessors.

Changes in v2:
- Update dpaa2_switch_foreign_dev_check() so that we check whether there
is an offloaded path between the switch port and the foreign net_device.
Before this change we also checked if the foreign_dev was offloaded or
not by the switch port.
- Update the switchdev_bridge_port_unoffload() by passing it the proper
context and the notifier blocks.
- Add dev_hold() and dev_put() calls for orig_dev
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 227 ++++++++++++++++--
 .../ethernet/freescale/dpaa2/dpaa2-switch.h   |  24 ++
 2 files changed, 225 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 949a7241a00f..307b3b7a1bfb 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -25,6 +25,9 @@
 
 #define DEFAULT_VLAN_ID			1
 
+static struct notifier_block dpaa2_switch_port_switchdev_nb;
+static struct notifier_block dpaa2_switch_port_switchdev_blocking_nb;
+
 static u16 dpaa2_switch_port_get_fdb_id(struct ethsw_port_priv *port_priv)
 {
 	return port_priv->fdb->fdb_id;
@@ -585,6 +588,81 @@ static int dpaa2_switch_port_fdb_del(struct ethsw_port_priv *port_priv,
 		return dpaa2_switch_port_fdb_del_mc(port_priv, addr);
 }
 
+static struct dpaa2_mac_addr *
+dpaa2_switch_mac_addr_find(struct list_head *addr_list,
+			   const unsigned char *addr, u16 vid)
+{
+	struct dpaa2_mac_addr *a;
+
+	list_for_each_entry(a, addr_list, list)
+		if (ether_addr_equal(a->addr, addr) && a->vid == vid)
+			return a;
+
+	return NULL;
+}
+
+static int dpaa2_switch_lag_fdb_add(struct dpaa2_switch_lag *lag,
+				    const unsigned char *addr, u16 vid)
+{
+	struct ethsw_port_priv *port_priv = lag->primary;
+	struct dpaa2_mac_addr *a;
+	int err = 0;
+
+	mutex_lock(&lag->fdb_lock);
+
+	a = dpaa2_switch_mac_addr_find(&lag->fdbs, addr, vid);
+	if (a) {
+		refcount_inc(&a->refcount);
+		goto out;
+	}
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = dpaa2_switch_port_fdb_add(port_priv, addr);
+	if (err) {
+		kfree(a);
+		goto out;
+	}
+
+	ether_addr_copy(a->addr, addr);
+	a->vid = vid;
+	refcount_set(&a->refcount, 1);
+	list_add_tail(&a->list, &lag->fdbs);
+
+out:
+	mutex_unlock(&lag->fdb_lock);
+
+	return err;
+}
+
+static void dpaa2_switch_lag_fdb_del(struct dpaa2_switch_lag *lag,
+				     const unsigned char *addr, u16 vid)
+{
+	struct ethsw_port_priv *port_priv = lag->primary;
+	struct dpaa2_mac_addr *a;
+
+	mutex_lock(&lag->fdb_lock);
+
+	a = dpaa2_switch_mac_addr_find(&lag->fdbs, addr, vid);
+	if (!a)
+		goto out;
+
+	if (!refcount_dec_and_test(&a->refcount))
+		goto out;
+
+	list_del(&a->list);
+	kfree(a);
+
+	dpaa2_switch_port_fdb_del(port_priv, addr);
+
+out:
+	mutex_unlock(&lag->fdb_lock);
+}
+
 static void dpaa2_switch_port_get_stats(struct net_device *netdev,
 					struct rtnl_link_stats64 *stats)
 {
@@ -1533,6 +1611,33 @@ bool dpaa2_switch_port_dev_check(const struct net_device *netdev)
 	return netdev->netdev_ops == &dpaa2_switch_port_ops;
 }
 
+static bool dpaa2_switch_foreign_dev_check(const struct net_device *dev,
+					   const struct net_device *foreign_dev)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(dev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct ethsw_port_priv *other_port;
+	int i;
+
+	if (netif_is_bridge_master(foreign_dev))
+		if (port_priv->fdb->bridge_dev == foreign_dev)
+			return false;
+
+	if (netif_is_bridge_port(foreign_dev)) {
+		for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
+			other_port = ethsw->ports[i];
+
+			if (!other_port)
+				continue;
+			if (dpaa2_switch_port_offloads_bridge_port(other_port,
+								   foreign_dev))
+				return false;
+		}
+	}
+
+	return true;
+}
+
 static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)
 {
 	struct fsl_mc_device *dpsw_port_dev, *dpmac_dev;
@@ -2100,8 +2205,10 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 		goto err_egress_flood;
 
 	brport_dev = dpaa2_switch_port_to_bridge_port(port_priv);
-	err = switchdev_bridge_port_offload(brport_dev, netdev, NULL,
-					    NULL, NULL, false, extack);
+	err = switchdev_bridge_port_offload(brport_dev, netdev, port_priv,
+					    &dpaa2_switch_port_switchdev_nb,
+					    &dpaa2_switch_port_switchdev_blocking_nb,
+					    false, extack);
 	if (err)
 		goto err_switchdev_offload;
 
@@ -2143,7 +2250,9 @@ static void dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev)
 	if (!brport_dev)
 		return;
 
-	switchdev_bridge_port_unoffload(brport_dev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(brport_dev, port_priv,
+					&dpaa2_switch_port_switchdev_nb,
+					&dpaa2_switch_port_switchdev_blocking_nb);
 
 	/* Make sure that any FDB add/del operations are completed before the
 	 * bridge layout changes
@@ -2425,9 +2534,10 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
 				    struct ethsw_core *ethsw)
 {
 	struct dpaa2_switch_lag *lag = &ethsw->lags[lag_id - 1];
-	struct ethsw_port_priv *primary, *new_primary = NULL;
-	struct ethsw_port_priv *port_priv = NULL;
+	struct ethsw_port_priv *primary, *port_priv;
+	struct ethsw_port_priv *new_primary = NULL;
 	struct dpsw_lag_cfg cfg = {0};
+	struct dpaa2_mac_addr *a;
 	u8 num_ifs = 0;
 	int err, i;
 
@@ -2454,7 +2564,6 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
 	/* Build the interface list, always placing the primary first */
 	if (primary)
 		cfg.if_id[num_ifs++] = primary->idx;
-
 	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
 		port_priv = ethsw->ports[i];
 		if (port_priv == primary)
@@ -2477,11 +2586,32 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
 		lag->in_use = false;
 	}
 
+	/* When the primary changes, migrate the FDB entries from the old
+	 * primary to the new one: remove them before reconfiguring the LAG in
+	 * hardware and re-add them on the new primary afterwards. We do not
+	 * touch any refcounting since the intention is to change the HW entry,
+	 * not the parallel software tracking.
+	 */
+	if (new_primary) {
+		mutex_lock(&lag->fdb_lock);
+		list_for_each_entry(a, &lag->fdbs, list)
+			dpaa2_switch_port_fdb_del(lag->primary, a->addr);
+		mutex_unlock(&lag->fdb_lock);
+	}
+
 	err = dpsw_lag_set(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);
 	if (err)
 		return err;
 
 	if (new_primary) {
+		mutex_lock(&lag->fdb_lock);
+		list_for_each_entry(a, &lag->fdbs, list) {
+			err = dpaa2_switch_port_fdb_add(new_primary, a->addr);
+			if (err)
+				netdev_err(new_primary->netdev, "Unable to migrate FDB\n");
+		}
+		mutex_unlock(&lag->fdb_lock);
+
 		synchronize_net();
 		lag->primary = new_primary;
 	}
@@ -2763,67 +2893,97 @@ struct ethsw_switchdev_event_work {
 	struct work_struct work;
 	struct switchdev_notifier_fdb_info fdb_info;
 	struct net_device *dev;
+	struct net_device *orig_dev;
 	unsigned long event;
+	u16 vid;
 };
 
 static void dpaa2_switch_event_work(struct work_struct *work)
 {
 	struct ethsw_switchdev_event_work *switchdev_work =
 		container_of(work, struct ethsw_switchdev_event_work, work);
+	struct net_device *orig_dev = switchdev_work->orig_dev;
 	struct net_device *dev = switchdev_work->dev;
+	struct ethsw_port_priv *port_priv = netdev_priv(dev);
 	struct switchdev_notifier_fdb_info *fdb_info;
+	struct dpaa2_switch_lag *lag;
 	int err;
 
 	fdb_info = &switchdev_work->fdb_info;
 
+	/* The lag structures are freed only from dpaa2_switch_remove(), which
+	 * first flushes this workqueue, so the pointer stays valid for the
+	 * lifetime of the work item. Only the dereference needs the RCU
+	 * read-side lock; the FDB helpers below can sleep and must run outside
+	 * of it.
+	 */
+	rcu_read_lock();
+	lag = rcu_dereference(port_priv->lag);
+	rcu_read_unlock();
+
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		err = dpaa2_switch_port_fdb_add(netdev_priv(dev),
-						fdb_info->addr);
+		if (lag)
+			err = dpaa2_switch_lag_fdb_add(lag, fdb_info->addr,
+						       switchdev_work->vid);
+		else
+			err = dpaa2_switch_port_fdb_add(port_priv,
+							fdb_info->addr);
 		if (err)
 			break;
 		fdb_info->offloaded = true;
-		call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, dev,
+		call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, orig_dev,
 					 &fdb_info->info, NULL);
 		break;
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		dpaa2_switch_port_fdb_del(netdev_priv(dev), fdb_info->addr);
+		if (lag)
+			dpaa2_switch_lag_fdb_del(lag, fdb_info->addr,
+						 switchdev_work->vid);
+		else
+			dpaa2_switch_port_fdb_del(port_priv, fdb_info->addr);
 		break;
 	}
 
 	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
+	dev_put(orig_dev);
 }
 
-static int dpaa2_switch_port_fdb_event(struct notifier_block *nb,
-				       unsigned long event, void *ptr)
+static int
+dpaa2_switch_port_fdb_event(struct net_device *dev,
+			    struct net_device *orig_dev,
+			    unsigned long event, const void *ctx,
+			    const struct switchdev_notifier_fdb_info *fdb_info)
 {
-	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
 	struct ethsw_port_priv *port_priv = netdev_priv(dev);
 	struct ethsw_switchdev_event_work *switchdev_work;
-	struct switchdev_notifier_fdb_info *fdb_info = ptr;
-	struct ethsw_core *ethsw;
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
 
-	if (!dpaa2_switch_port_dev_check(dev))
-		return NOTIFY_DONE;
-	ethsw = port_priv->ethsw_data;
+	if (ctx && ctx != port_priv)
+		return 0;
+
+	/* For the moment, do nothing with entries towards foreign devices */
+	if (dpaa2_switch_foreign_dev_check(dev, orig_dev))
+		return 0;
 
 	if (!fdb_info->added_by_user || fdb_info->is_local)
-		return NOTIFY_DONE;
+		return 0;
 
 	switchdev_work = kzalloc_obj(*switchdev_work, GFP_ATOMIC);
 	if (!switchdev_work)
-		return NOTIFY_BAD;
+		return -ENOMEM;
 
 	INIT_WORK(&switchdev_work->work, dpaa2_switch_event_work);
 	switchdev_work->dev = dev;
 	switchdev_work->event = event;
+	switchdev_work->orig_dev = orig_dev;
+	switchdev_work->vid = fdb_info->vid;
 
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
 		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
 		if (!switchdev_work->fdb_info.addr)
@@ -2834,19 +2994,20 @@ static int dpaa2_switch_port_fdb_event(struct notifier_block *nb,
 
 		/* Take a reference on the device to avoid being freed. */
 		dev_hold(dev);
+		dev_hold(orig_dev);
 		break;
 	default:
 		kfree(switchdev_work);
-		return NOTIFY_DONE;
+		return 0;
 	}
 
 	queue_work(ethsw->workqueue, &switchdev_work->work);
 
-	return NOTIFY_DONE;
+	return 0;
 
 err_addr_alloc:
 	kfree(switchdev_work);
-	return NOTIFY_BAD;
+	return -ENOMEM;
 }
 
 /* Called under rcu_read_lock() */
@@ -2854,13 +3015,18 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 				   unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	int err;
 
 	switch (event) {
 	case SWITCHDEV_PORT_ATTR_SET:
 		return dpaa2_switch_port_attr_set_event(dev, ptr);
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		return dpaa2_switch_port_fdb_event(nb, event, ptr);
+		err = switchdev_handle_fdb_event_to_device(dev, event, ptr,
+							   dpaa2_switch_port_dev_check,
+							   dpaa2_switch_foreign_dev_check,
+							   dpaa2_switch_port_fdb_event);
+		return notifier_from_errno(err);
 	default:
 		return NOTIFY_DONE;
 	}
@@ -3785,6 +3951,9 @@ static void dpaa2_switch_remove(struct fsl_mc_device *sw_dev)
 	dev = &sw_dev->dev;
 	ethsw = dev_get_drvdata(dev);
 
+	/* Make sure that all events were handled before we kfree anything */
+	flush_workqueue(ethsw->workqueue);
+
 	dpaa2_switch_teardown_irqs(sw_dev);
 
 	dpsw_disable(ethsw->mc_io, 0, ethsw->dpsw_handle);
@@ -3798,8 +3967,10 @@ static void dpaa2_switch_remove(struct fsl_mc_device *sw_dev)
 	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
 		netif_napi_del(&ethsw->fq[i].napi);
 
-	for (i = 0; i < ethsw->sw_attr.num_ifs; i++)
+	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
 		dpaa2_switch_remove_port(ethsw, i);
+		mutex_destroy(&ethsw->lags[i].fdb_lock);
+	}
 
 	kfree(ethsw->fdbs);
 	kfree(ethsw->filter_blocks);
@@ -3951,6 +4122,8 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 		ethsw->lags[i].ethsw = ethsw;
 		ethsw->lags[i].id = i + 1;
 		ethsw->lags[i].in_use = 0;
+		mutex_init(&ethsw->lags[i].fdb_lock);
+		INIT_LIST_HEAD(&ethsw->lags[i].fdbs);
 	}
 
 	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
@@ -3999,6 +4172,8 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 err_free_netdev:
 	for (i--; i >= 0; i--)
 		dpaa2_switch_remove_port(ethsw, i);
+	for (i = 0; i < ethsw->sw_attr.num_ifs; i++)
+		mutex_destroy(&ethsw->lags[i].fdb_lock);
 	kfree(ethsw->lags);
 err_free_filter:
 	kfree(ethsw->filter_blocks);
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
index c98bddd7e359..e8bc1469cbf7 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
@@ -100,6 +100,13 @@ struct dpaa2_switch_fq {
 	u32 fqid;
 };
 
+struct dpaa2_mac_addr {
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+	refcount_t refcount;
+	struct list_head list;
+};
+
 struct dpaa2_switch_fdb {
 	struct net_device	*bridge_dev;
 	u16			fdb_id;
@@ -112,6 +119,9 @@ struct dpaa2_switch_lag {
 	bool			in_use;
 	u8			id;
 	struct ethsw_port_priv	*primary;
+	/* Protects the list of fdbs installed on this LAG */
+	struct mutex		fdb_lock;
+	struct list_head	fdbs;
 };
 
 struct dpaa2_switch_acl_entry {
@@ -287,4 +297,18 @@ int dpaa2_switch_block_offload_mirror(struct dpaa2_switch_filter_block *block,
 
 int dpaa2_switch_block_unoffload_mirror(struct dpaa2_switch_filter_block *block,
 					struct ethsw_port_priv *port_priv);
+
+static inline bool
+dpaa2_switch_port_offloads_bridge_port(struct ethsw_port_priv *port_priv,
+				       const struct net_device *dev)
+{
+	struct dpaa2_switch_lag *lag = rcu_dereference_rtnl(port_priv->lag);
+
+	if (lag && lag->bond_dev == dev)
+		return true;
+	if (port_priv->netdev == dev)
+		return true;
+	return false;
+}
+
 #endif	/* __ETHSW_H */
-- 
2.25.1



* [PATCH net-next v4 11/13] dpaa2-switch: offload port objects on an upper bond device
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (9 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 12/13] dpaa2-switch: trap all link local reserved addresses to the CPU Ioana Ciornei
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

This patch adds support for offloading port objects, VLANs and MDBs,
added on upper bond devices.

First of all, the use of the switchdev_handle_*() replication helpers
is introduced for the SWITCHDEV_PORT_OBJ_ADD/SWITCHDEV_PORT_OBJ_DEL
events. With this change, explicitly setting
'port_obj_info->handled = true' is no longer needed since the new
helpers take care of it.

In the DPAA2 architecture, there is no difference between adding an FDB
or an MDB entry which points towards a LAG port. Unlike other
architectures, we do not need to populate all the possible destinations
which are under the LAG; we only have to specify a single queueing
destination (QDID) which represents the LAG. This means that the
handling of MDBs on bond devices needs the same refcounting mechanism as
the FDBs. This mechanism is triggered by calling the
dpaa2_switch_lag_fdb_add() / dpaa2_switch_lag_fdb_del() functions which
were added in the previous patch.

Also change how dpaa2_switch_port_mdb_del() behaves in case the
underlying HW operation failed. Since the delete operations cannot be
stopped from a switchdev standpoint, go ahead and ignore the return code
from the dpaa2_switch_*_fdb_del() calls.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- Updates necessary for the dev_mc_add/dev_mc_del removal

Changes in v3:
- Access the port_priv->lag field only through the proper rcu accessors.

Changes in v2:
- In case dev_mc_add() fails, remove the MDB address from HW with the
proper function, dpaa2_switch_lag_fdb_del() or
dpaa2_switch_port_fdb_del(), depending on the LAG offload state.
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 69 +++++++++++--------
 1 file changed, 41 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 307b3b7a1bfb..1f7875ecefe2 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2017,15 +2017,28 @@ static int dpaa2_switch_port_mdb_add(struct net_device *netdev,
 				     const struct switchdev_obj_port_mdb *mdb)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct dpaa2_switch_lag *lag;
 
-	return dpaa2_switch_port_fdb_add(port_priv, mdb->addr);
+	lag = rtnl_dereference(port_priv->lag);
+	if (lag)
+		return dpaa2_switch_lag_fdb_add(lag, mdb->addr, mdb->vid);
+	else
+		return dpaa2_switch_port_fdb_add(port_priv, mdb->addr);
 }
 
-static int dpaa2_switch_port_obj_add(struct net_device *netdev,
-				     const struct switchdev_obj *obj)
+static int dpaa2_switch_port_obj_add(struct net_device *netdev, const void *ctx,
+				     const struct switchdev_obj *obj,
+				     struct netlink_ext_ack *extack)
 {
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int err;
 
+	if (ctx && ctx != port_priv)
+		return 0;
+
+	if (!dpaa2_switch_port_offloads_bridge_port(port_priv, obj->orig_dev))
+		return -EOPNOTSUPP;
+
 	switch (obj->id) {
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		err = dpaa2_switch_port_vlans_add(netdev,
@@ -2121,15 +2134,29 @@ static int dpaa2_switch_port_mdb_del(struct net_device *netdev,
 				     const struct switchdev_obj_port_mdb *mdb)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct dpaa2_switch_lag *lag;
+
+	lag = rtnl_dereference(port_priv->lag);
+	if (lag)
+		dpaa2_switch_lag_fdb_del(lag, mdb->addr, mdb->vid);
+	else
+		dpaa2_switch_port_fdb_del(port_priv, mdb->addr);
 
-	return dpaa2_switch_port_fdb_del(port_priv, mdb->addr);
+	return 0;
 }
 
-static int dpaa2_switch_port_obj_del(struct net_device *netdev,
+static int dpaa2_switch_port_obj_del(struct net_device *netdev, const void *ctx,
 				     const struct switchdev_obj *obj)
 {
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int err;
 
+	if (ctx && ctx != port_priv)
+		return 0;
+
+	if (!dpaa2_switch_port_offloads_bridge_port(port_priv, obj->orig_dev))
+		return -EOPNOTSUPP;
+
 	switch (obj->id) {
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		err = dpaa2_switch_port_vlans_del(netdev, SWITCHDEV_OBJ_PORT_VLAN(obj));
@@ -3032,37 +3059,23 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 	}
 }
 
-static int dpaa2_switch_port_obj_event(unsigned long event,
-				       struct net_device *netdev,
-				       struct switchdev_notifier_port_obj_info *port_obj_info)
-{
-	int err = -EOPNOTSUPP;
-
-	if (!dpaa2_switch_port_dev_check(netdev))
-		return NOTIFY_DONE;
-
-	switch (event) {
-	case SWITCHDEV_PORT_OBJ_ADD:
-		err = dpaa2_switch_port_obj_add(netdev, port_obj_info->obj);
-		break;
-	case SWITCHDEV_PORT_OBJ_DEL:
-		err = dpaa2_switch_port_obj_del(netdev, port_obj_info->obj);
-		break;
-	}
-
-	port_obj_info->handled = true;
-	return notifier_from_errno(err);
-}
-
 static int dpaa2_switch_port_blocking_event(struct notifier_block *nb,
 					    unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	int err;
 
 	switch (event) {
 	case SWITCHDEV_PORT_OBJ_ADD:
+		err = switchdev_handle_port_obj_add(dev, ptr,
+						    dpaa2_switch_port_dev_check,
+						    dpaa2_switch_port_obj_add);
+		return notifier_from_errno(err);
 	case SWITCHDEV_PORT_OBJ_DEL:
-		return dpaa2_switch_port_obj_event(event, dev, ptr);
+		err = switchdev_handle_port_obj_del(dev, ptr,
+						    dpaa2_switch_port_dev_check,
+						    dpaa2_switch_port_obj_del);
+		return notifier_from_errno(err);
 	case SWITCHDEV_PORT_ATTR_SET:
 		return dpaa2_switch_port_attr_set_event(dev, ptr);
 	}
-- 
2.25.1



* [PATCH net-next v4 12/13] dpaa2-switch: trap all link local reserved addresses to the CPU
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (10 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 11/13] dpaa2-switch: offload port objects " Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-06-29 11:23 ` [PATCH net-next v4 13/13] dpaa2-switch: add support for imprecise source port Ioana Ciornei
  2026-07-01 16:10 ` [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload patchwork-bot+netdevbpf
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

Do not trap only STP frames to the control interface but rather trap all
link local reserved addresses. This is still done by matching on the
destination MAC address, but with a mask that ignores the low four bits
of the last byte (01:80:c2:00:00:00 to 01:80:c2:00:00:0F).

This change will benefit LACP frames which now will reach the control
interface.
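
To illustrate the effect of the mask, here is a small userspace sketch
of a masked destination MAC comparison. The mac_matches() helper is
hypothetical and assumes the usual ternary semantics (a bit is compared
only where the mask bit is set); the ll_mac and ll_mask values are the
ones used in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Values taken from dpaa2_switch_port_init() in this patch. */
static const uint8_t ll_mac[ETH_ALEN]  = {0x01, 0x80, 0xc2, 0x00, 0x00, 0x00};
static const uint8_t ll_mask[ETH_ALEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xf0};

/* Hypothetical helper: compare dst against key only where mask bits are set. */
static inline bool mac_matches(const uint8_t *dst, const uint8_t *key,
			       const uint8_t *mask)
{
	assert(dst && key && mask);

	for (int i = 0; i < ETH_ALEN; i++)
		if ((dst[i] & mask[i]) != (key[i] & mask[i]))
			return false;

	return true;
}
```

Under that assumption, both the STP address 01:80:c2:00:00:00 and the
LACP (slow protocols) address 01:80:c2:00:00:02 match, while
01:80:c2:00:00:10 does not.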

While at it, change the prototype of the
dpaa2_switch_port_trap_mac_addr() function to take a 'const u8 *'
directly, matching the ether_addr_copy() used.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- none

Changes in v3:
- Change the mask so that we restrict the trap only to the link local
addresses (01:80:c2:00:00:00 to 01:80:c2:00:00:0F) instead of the entire
reserved bridge block of addresses

Changes in v2:
- none
---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 1f7875ecefe2..b94d83f5ef06 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -3828,17 +3828,15 @@ static int dpaa2_switch_init(struct fsl_mc_device *sw_dev)
 	return err;
 }
 
-/* Add an ACL to redirect frames with specific destination MAC address to
- * control interface
- */
+/* Add an ACL to redirect frames to control interface based on the dst MAC */
 static int dpaa2_switch_port_trap_mac_addr(struct ethsw_port_priv *port_priv,
-					   const char *mac)
+					   const u8 *mac, const u8 *mask)
 {
 	struct dpaa2_switch_acl_entry acl_entry = {0};
 
 	/* Match on the destination MAC address */
 	ether_addr_copy(acl_entry.key.match.l2_dest_mac, mac);
-	eth_broadcast_addr(acl_entry.key.mask.l2_dest_mac);
+	ether_addr_copy(acl_entry.key.mask.l2_dest_mac, mask);
 
 	/* Trap to CPU */
 	acl_entry.cfg.precedence = 0;
@@ -3849,7 +3847,8 @@ static int dpaa2_switch_port_trap_mac_addr(struct ethsw_port_priv *port_priv,
 
 static int dpaa2_switch_port_init(struct ethsw_port_priv *port_priv, u16 port)
 {
-	const char stpa[ETH_ALEN] = {0x01, 0x80, 0xc2, 0x00, 0x00, 0x00};
+	const u8 ll_mac[ETH_ALEN] = {0x01, 0x80, 0xc2, 0x00, 0x00, 0x00};
+	const u8 ll_mask[ETH_ALEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xf0};
 	struct switchdev_obj_port_vlan vlan = {
 		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
 		.vid = DEFAULT_VLAN_ID,
@@ -3924,7 +3923,7 @@ static int dpaa2_switch_port_init(struct ethsw_port_priv *port_priv, u16 port)
 	if (err)
 		return err;
 
-	err = dpaa2_switch_port_trap_mac_addr(port_priv, stpa);
+	err = dpaa2_switch_port_trap_mac_addr(port_priv, ll_mac, ll_mask);
 	if (err)
 		return err;
 
-- 
2.25.1



* [PATCH net-next v4 13/13] dpaa2-switch: add support for imprecise source port
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (11 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 12/13] dpaa2-switch: trap all link local reserved addresses to the CPU Ioana Ciornei
@ 2026-06-29 11:23 ` Ioana Ciornei
  2026-07-01 16:10 ` [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload patchwork-bot+netdevbpf
  13 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-29 11:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

Switch ports configured as part of a LAG group are not able to provide
a precise source port for all packets which reach the control interface.

The only frames which will have a precise source port are those that are
explicitly trapped, for example STP and LACP frames. For any other
frames (for example, those which are flooded) we can only know the
ingress LAG group.

Take into account the DPAA2_ETHSW_FLC_IMPRECISE_IF_ID bit and based on
its value target the bond device or the specific source netdevice.
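
As a userspace sketch of the decision described above: the FLC word
carries the ingress interface ID in bits 32..47 and the imprecise flag
in bit 63, as defined by the macros added to dpaa2-switch.h in this
patch. The flc_info struct and the helper names are hypothetical, and
port_in_lag stands in for a non-NULL port_priv->lag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the frame descriptor's FLC word. */
struct flc_info {
	uint16_t if_id;		/* switch ingress interface ID */
	bool imprecise;		/* source port known only at LAG level */
};

static inline void flc_decode(uint64_t flc, struct flc_info *out)
{
	assert(out != NULL);

	out->if_id = (uint16_t)((flc >> 32) & 0xffff);
	out->imprecise = (flc & (UINT64_C(1) << 63)) != 0;
}

/* True when the frame should be delivered on the bond device. */
static inline bool rx_targets_bond(const struct flc_info *info,
				   bool port_in_lag)
{
	assert(info != NULL);

	return info->imprecise && port_in_lag;
}
```

A frame with the imprecise bit set on a port that is not part of a LAG
still targets the port's own netdevice, mirroring the "&& lag" check in
dpaa2_switch_rx().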

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v4:
- None
- Note that I did not address sashiko's feedback related to the
rcu_read_lock() dropped before netif_receive_skb() since even under
PREEMPT_RT NAPI is under rcu protection, rcu_read_lock() being called
from local_bh_disable().

Changes in v3:
- None

Changes in v2:
- Fix 32bit build by using BIT_ULL
- Take a reference to port_priv->lag instead of reading it multiple
times.
---
 .../net/ethernet/freescale/dpaa2/dpaa2-switch.c   | 15 +++++++++++++--
 .../net/ethernet/freescale/dpaa2/dpaa2-switch.h   |  3 +++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index b94d83f5ef06..8320b26c3f72 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -3120,19 +3120,22 @@ static void dpaa2_switch_rx(struct dpaa2_switch_fq *fq,
 	dma_addr_t addr = dpaa2_fd_get_addr(fd);
 	struct ethsw_core *ethsw = fq->ethsw;
 	struct ethsw_port_priv *port_priv;
+	struct dpaa2_switch_lag *lag;
 	struct net_device *netdev;
 	struct vlan_ethhdr *hdr;
 	struct sk_buff *skb;
 	u16 vlan_tci, vid;
 	int if_id, err;
 	void *vaddr;
+	u64 flc;
 
 	vaddr = dpaa2_iova_to_virt(ethsw->iommu_domain, addr);
 	dma_unmap_page(ethsw->dev, addr, DPAA2_SWITCH_RX_BUF_SIZE,
 		       DMA_FROM_DEVICE);
 
 	/* get switch ingress interface ID */
-	if_id = upper_32_bits(dpaa2_fd_get_flc(fd)) & 0x0000FFFF;
+	flc = dpaa2_fd_get_flc(fd);
+	if_id = DPAA2_ETHSW_FLC_IF_ID(flc);
 	if (if_id >= ethsw->sw_attr.num_ifs) {
 		dev_err(ethsw->dev, "Frame received from unknown interface!\n");
 		goto err_free_fd;
@@ -3171,12 +3174,20 @@ static void dpaa2_switch_rx(struct dpaa2_switch_fq *fq,
 		}
 	}
 
-	skb->dev = netdev;
+	rcu_read_lock();
+
+	lag = rcu_dereference(port_priv->lag);
+	if (DPAA2_ETHSW_FLC_IMPRECISE_IF_ID(flc) && lag)
+		skb->dev = lag->bond_dev;
+	else
+		skb->dev = netdev;
 	skb->protocol = eth_type_trans(skb, skb->dev);
 
 	/* Setup the offload_fwd_mark only if the port is under a bridge */
 	skb->offload_fwd_mark = !!(port_priv->fdb->bridge_dev);
 
+	rcu_read_unlock();
+
 	netif_receive_skb(skb);
 
 	return;
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
index e8bc1469cbf7..63b702b0000c 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
@@ -87,6 +87,9 @@
 
 #define DPAA2_ETHSW_PORT_ACL_CMD_BUF_SIZE	256
 
+#define DPAA2_ETHSW_FLC_IF_ID(flc)		(((flc) >> 32) & GENMASK(15, 0))
+#define DPAA2_ETHSW_FLC_IMPRECISE_IF_ID(flc)	((flc) & BIT_ULL(63))
+
 extern const struct ethtool_ops dpaa2_switch_port_ethtool_ops;
 
 struct ethsw_core;
-- 
2.25.1



* Re: [PATCH net-next v4 06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper
  2026-06-29 11:23 ` [PATCH net-next v4 06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper Ioana Ciornei
@ 2026-06-30 13:51   ` Ioana Ciornei
  0 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-30 13:51 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

On Mon, Jun 29, 2026 at 02:23:02PM +0300, Ioana Ciornei wrote:
> In preparation for adding offloading support for upper bond devices we
> have to let the switchdev framework know if a specific bridge port is
> offloaded or not, even if that brport is an upper device.
> 
> For this to happen, create the dpaa2_switch_port_to_bridge_port function
> which will determine the bridge port corresponding to a particular DPAA2
> switch interface and use it in the switchdev_bridge_port_offload call.
> 
> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---
> Changes in v4:
> - Split the patch so that the first part only adds the base function and
> its call sites and the logic around lag is added later in the patch
> which actually adds the support for LAG.
> - Moved the patch so that it's a preparatory patch
> 
> Changes in v3:
> - Access lag field through rtnl_dereference() so that we adapt to the
> __rcu change.
> - Check that the brport is non-NULL before calling
> switchdev_bridge_port_unoffload() on it.
> 
> Changes in v2:
> - none
> ---
>  .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 23 ++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> index d4975d08fa44..88d199befbd9 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> @@ -2017,6 +2017,15 @@ static int dpaa2_switch_port_attr_set_event(struct net_device *netdev,
>  	return notifier_from_errno(err);
>  }
>  
> +static struct net_device *
> +dpaa2_switch_port_to_bridge_port(struct ethsw_port_priv *port_priv)
> +{
> +	if (!port_priv->fdb->bridge_dev)
> +		return NULL;
> +
> +	return port_priv->netdev;
> +}
> +
>  static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
>  					 struct net_device *upper_dev,
>  					 struct netlink_ext_ack *extack)
> @@ -2024,6 +2033,7 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
>  	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
>  	struct dpaa2_switch_fdb *old_fdb = port_priv->fdb;
>  	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct net_device *brport_dev;
>  	bool learn_ena;
>  	int err;
>  
> @@ -2035,7 +2045,8 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
>  	dpaa2_switch_port_set_fdb(port_priv, upper_dev, true);

sashiko.dev notes:

	Does the removal of rtnl_lock() earlier in this patch series
	expose port_priv->fdb to a concurrent data race here?

	dpaa2_switch_event_work() reads port_priv->fdb without taking
	rtnl_lock or using READ_ONCE(), which can race with bridge
	join/leave operations that modify it via
	dpaa2_switch_port_set_fdb().

No, that is not correct. How this is avoided is explained in the commit
message of patch 2/13:

	To avoid this kind of concurrency without a rtnl_lock, flush the
	event workqueue as the last step from the pre_bridge_leave so
	that any in-flight operations targeting the current FDB are
	finalized before the bridge layout (and the per port FDB
	assignment) changes.
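
As a rough illustration of that ordering, here is a single-threaded
userspace sketch. It is not driver code: every model_* name below is
made up for the example, and the real workqueue runs asynchronously.
The only point is that pending items are drained against the old FDB
before the per-port FDB pointer changes.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_WORK 8

/* Hypothetical stand-ins for the driver structures, for illustration only. */
struct model_fdb { int id; int hw_entries; };
struct model_port { struct model_fdb *fdb; };
struct model_work { struct model_port *port; };

struct model_wq {
	struct model_work items[MODEL_MAX_WORK];
	size_t count;
};

/* Queue an FDB add; the port's FDB is looked up only when the work runs. */
static int model_queue_fdb_add(struct model_wq *wq, struct model_port *port)
{
	if (wq->count >= MODEL_MAX_WORK)
		return -1;
	wq->items[wq->count++].port = port;
	return 0;
}

/* Run every pending item to completion, as a workqueue flush would. */
static void model_flush(struct model_wq *wq)
{
	size_t i;

	for (i = 0; i < wq->count; i++)
		wq->items[i].port->fdb->hw_entries++;
	wq->count = 0;
}

/* Flush as the last step before the per-port FDB assignment changes. */
static void model_pre_bridge_leave(struct model_wq *wq,
				   struct model_port *port,
				   struct model_fdb *new_fdb)
{
	model_flush(wq);
	assert(wq->count == 0);	/* nothing in flight targets the old FDB */
	port->fdb = new_fdb;
}
```

In this model, two adds queued while the port is bridged both land in
the bridge FDB, and none can land in the FDB assigned afterwards.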

Ioana

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v4 09/13] dpaa2-switch: add support for LAG offload
  2026-06-29 11:23 ` [PATCH net-next v4 09/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
@ 2026-06-30 14:23   ` Ioana Ciornei
  0 siblings, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-30 14:23 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

On Mon, Jun 29, 2026 at 02:23:05PM +0300, Ioana Ciornei wrote:
> This patch adds the bulk of the changes needed in order to support
> offloading of an upper bond device.
> 
> First of all, handling of the NETDEV_CHANGEUPPER and
> NETDEV_PRECHANGEUPPER events is extended so that the driver is capable
> of handling joining or leaving an upper bond device.
> All the restrictions around the LAG offload support are added in the
> newly added dpaa2_switch_pre_lag_join() function.
> 
> The same events are extended to also detect if one of our upper bond
> devices changes its own upper device. In this case, on each lower device
> that is DPAA2 the corresponding dpaa2_switch_port_[pre]changeupper()
> function will be called. This will start the process of joining the same
> FDB as the one used by the bridge device.
> 
> Setting the 'offload_fwd_mark' field on the skbs is also extended to be
> setup not only when the port is under a bridge but also under a bond
> device that is offloaded.
> 
> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---
> Changes in v4:
> - Add a defensive check in dpaa2_switch_port_bond_leave() for a NULL
> port_priv->lag
> - Extend the dpaa2_switch_prevent_bridging_with_8021q_upper() function
> so that we prevent a bond device with VLAN uppers from joining a bridge.
> The restriction is related to VLAN management in terms of the FDB which
> can change upon a topology change. VLAN uppers can only be added once
> the bridge topology is setup.
> - Remove all FDB management from the bond join/leave paths. Decided to
> reconfigure the FDB only on bridge join/leave since the FDB determines
> the forwarding domain and when a bond is not bridged, from a
> configuration standpoint, the individual lowers can be viewed as
> standalone.
> - Moved here the update to the dpaa2_switch_port_to_bridge_port()
> function so that the LAG state is taken into account.
> - Add a new per LAG field - primary - which is used to keep track of the
> primary port of a LAG group instead of determining each time we need to
> use it.
> - Set 'skb->offload_fwd_mark' only when the port is under a bridge.
> 
> Changes in v3:
> - Fix logic in prechangeupper callback in order to not call
> dpaa2_switch_prechangeupper_sanity_checks() on !info->linking
> - Fixed up the logic in the dpaa2_switch_port_bond_join()'s error path
> so that the FDBs are cleaned-up properly and we do not end-up with FDB's
> leaked, meaning that they could have been marked as in-use but actually
> no port was using it.
> - Mark the port_priv->lag field as __rcu and use the proper accessors for
> it. This will eventually become useful in a later patch when the lag
> field will be accessed concurrently from the NAPI context and the
> join/leave paths
> 
> Changes in v2:
> - Extend dpaa2_switch_prechangeupper_sanity_checks() with
> netdev_walk_all_lower_dev() so that checks are done on all lower devices
> of a bridge, even for the lowers of a bridged bond.
> - Manage better the default VLAN on bond join
> - Clean-up the error path in dpaa2_switch_port_bond_join()
> - Call dpaa2_switch_port_bridge_leave() in case a port is leaving a bond
> which is also a bridged port
> - Update dpaa2_switch_port_bond_leave() so that in case of any failure
> the driver tries to cleanup the LAG offload configuration.
> - Call switchdev_bridge_port_unoffload() if a switch port is leaving a
> bridged bond device.
> ---
>  .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 473 +++++++++++++++++-
>  .../ethernet/freescale/dpaa2/dpaa2-switch.h   |  15 +-
>  2 files changed, 476 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> index 3472f5d5b08a..949a7241a00f 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> @@ -51,6 +51,17 @@ dpaa2_switch_filter_block_get_unused(struct ethsw_core *ethsw)
>  	return NULL;
>  }
>  
> +static struct dpaa2_switch_lag *
> +dpaa2_switch_lag_get_unused(struct ethsw_core *ethsw)
> +{
> +	int i;
> +
> +	for (i = 0; i < ethsw->sw_attr.num_ifs; i++)
> +		if (!ethsw->lags[i].in_use)
> +			return &ethsw->lags[i];
> +	return NULL;
> +}
> +
>  static bool dpaa2_switch_fdb_in_use_by_others(struct ethsw_core *ethsw,
>  					      struct dpaa2_switch_fdb *fdb,
>  					      struct ethsw_port_priv *except)
> @@ -2042,9 +2053,15 @@ static int dpaa2_switch_port_attr_set_event(struct net_device *netdev,
>  static struct net_device *
>  dpaa2_switch_port_to_bridge_port(struct ethsw_port_priv *port_priv)
>  {
> +	struct dpaa2_switch_lag *lag;
> +
>  	if (!port_priv->fdb->bridge_dev)
>  		return NULL;
>  
> +	lag = rtnl_dereference(port_priv->lag);
> +	if (lag)
> +		return lag->bond_dev;
> +
>  	return port_priv->netdev;
>  }
>  
> @@ -2193,30 +2210,53 @@ static int dpaa2_switch_port_bridge_leave(struct net_device *netdev)
>  					  false);
>  }
>  
> +static int
> +dpaa2_switch_have_vlan_upper(struct net_device *upper_dev,
> +			     __always_unused struct netdev_nested_priv *priv)
> +{
> +	return is_vlan_dev(upper_dev);
> +}
> +
>  static int dpaa2_switch_prevent_bridging_with_8021q_upper(struct net_device *netdev)
>  {
> -	struct net_device *upper_dev;
> -	struct list_head *iter;
> +	struct netdev_nested_priv priv = {};
>  
>  	/* RCU read lock not necessary because we have write-side protection
> -	 * (rtnl_mutex), however a non-rcu iterator does not exist.
> +	 * (rtnl_mutex), however a non-rcu iterator does not exist. Walk the
> +	 * entire upper chain so that a VLAN device stacked on a intermediate
> +	 * bond is caught too.
>  	 */
> -	netdev_for_each_upper_dev_rcu(netdev, upper_dev, iter)
> -		if (is_vlan_dev(upper_dev))
> -			return -EOPNOTSUPP;
> +	if (netdev_walk_all_upper_dev_rcu(netdev, dpaa2_switch_have_vlan_upper,
> +					  &priv))
> +		return -EOPNOTSUPP;
>  
>  	return 0;
>  }
>  
> +static int dpaa2_switch_check_dpsw_instance(struct net_device *dev,
> +					    struct netdev_nested_priv *priv)
> +{
> +	struct ethsw_port_priv *port_priv = (struct ethsw_port_priv *)priv->data;
> +	struct ethsw_port_priv *other_priv = netdev_priv(dev);
> +
> +	if (!dpaa2_switch_port_dev_check(dev))
> +		return 0;
> +
> +	if (other_priv->ethsw_data == port_priv->ethsw_data)
> +		return 0;
> +
> +	return 1;
> +}
> +
>  static int
>  dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
>  					  struct net_device *upper_dev,
>  					  struct netlink_ext_ack *extack)
>  {
>  	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
> -	struct ethsw_port_priv *other_port_priv;
> -	struct net_device *other_dev;
> -	struct list_head *iter;
> +	struct netdev_nested_priv data = {
> +		.data = (void *)port_priv,
> +	};
>  	int err;
>  
>  	if (!br_vlan_enabled(upper_dev)) {
> @@ -2231,6 +2271,70 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
>  		return err;
>  	}
>  
> +	err = netdev_walk_all_lower_dev(upper_dev,
> +					dpaa2_switch_check_dpsw_instance,
> +					&data);
> +	if (err) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Interface from a different DPSW is in the bridge already");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int dpaa2_switch_pre_lag_join(struct net_device *netdev,
> +				     struct net_device *upper_dev,
> +				     struct netdev_lag_upper_info *info,
> +				     struct netlink_ext_ack *extack)
> +{
> +	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct ethsw_port_priv *other_port_priv;
> +	struct dpaa2_switch_lag *lag = NULL;
> +	struct dpsw_lag_cfg cfg = {0};
> +	struct net_device *other_dev;
> +	int i, num_ifs = 0, err;
> +	struct list_head *iter;
> +
> +	if (!(ethsw->features & ETHSW_FEATURE_LAG_OFFLOAD)) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "LAG offload is supported only for DPSW >= v8.13");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Can only offload LAG using hash TX type");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (info->hash_type != NETDEV_LAG_HASH_L23) {
> +		NL_SET_ERR_MSG_MOD(extack, "Can only offload L2+L3 Tx hash");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (!dpaa2_switch_port_has_mac(port_priv)) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Only switch interfaces connected to MACs can be under a LAG");
> +		return -EINVAL;
> +	}
> +
> +	if (vlan_uses_dev(upper_dev)) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Cannot join a LAG upper that has a VLAN");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
> +		if (!ethsw->lags[i].in_use)
> +			continue;
> +		if (ethsw->lags[i].bond_dev != upper_dev)
> +			continue;
> +		lag = &ethsw->lags[i];
> +		break;
> +	}
> +
>  	netdev_for_each_lower_dev(upper_dev, other_dev, iter) {
>  		if (!dpaa2_switch_port_dev_check(other_dev))
>  			continue;
> @@ -2238,11 +2342,229 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
>  		other_port_priv = netdev_priv(other_dev);
>  		if (other_port_priv->ethsw_data != port_priv->ethsw_data) {
>  			NL_SET_ERR_MSG_MOD(extack,
> -					   "Interface from a different DPSW is in the bridge already");
> +					   "Interface from a different DPSW is in the bond already");
> +			return -EINVAL;
> +		}
> +
> +		cfg.if_id[num_ifs++] = other_port_priv->idx;
> +
> +		if (num_ifs >= DPSW_MAX_LAG_IFS) {
> +			NL_SET_ERR_MSG_MOD(extack,
> +					   "Cannot add more than 8 DPAA2 switch ports under the same bond");
>  			return -EINVAL;
>  		}
>  	}
>  
> +	if (lag) {
> +		cfg.group_id = lag->id;
> +		cfg.if_id[num_ifs++] = port_priv->idx;
> +		cfg.num_ifs = num_ifs;
> +		cfg.phase = DPSW_LAG_SET_PHASE_CHECK;
> +
> +		err = dpsw_lag_set(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);
> +		if (err) {
> +			NL_SET_ERR_MSG_MOD(extack,
> +					   "Cannot offload LAG configuration");
> +			return -EOPNOTSUPP;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static void dpaa2_switch_port_set_lag_group(struct ethsw_port_priv *port_priv,
> +					    struct net_device *bond_dev)
> +{
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct ethsw_port_priv *other_port_priv = NULL;
> +	struct dpaa2_switch_lag *lag = NULL;
> +	struct dpaa2_switch_lag *other_lag;
> +	struct net_device *other_dev;
> +	struct list_head *iter;
> +
> +	netdev_for_each_lower_dev(bond_dev, other_dev, iter) {
> +		if (!dpaa2_switch_port_dev_check(other_dev))
> +			continue;
> +
> +		other_port_priv = netdev_priv(other_dev);
> +		other_lag = rtnl_dereference(other_port_priv->lag);
> +		if (!other_lag)
> +			continue;
> +
> +		if (other_lag->bond_dev == bond_dev) {
> +			rcu_assign_pointer(port_priv->lag, other_lag);
> +			return;
> +		}
> +	}
> +
> +	/* This is the first interface to be added under a bond device. Find an
> +	 * unused LAG group. No need to check for NULL since there are the same
> +	 * amount of DPSW ports as LAG groups, meaning that each port can have
> +	 * its own LAG group.
> +	 */
> +	lag = dpaa2_switch_lag_get_unused(ethsw);
> +	lag->in_use = true;
> +	lag->bond_dev = bond_dev;
> +	lag->primary = port_priv;
> +	rcu_assign_pointer(port_priv->lag, lag);
> +}
> +
> +static bool dpaa2_switch_port_in_lag(struct ethsw_port_priv *port_priv,
> +				     struct net_device *bond_dev)
> +{
> +	struct dpaa2_switch_lag *lag;
> +
> +	if (!port_priv)
> +		return false;
> +
> +	lag = rtnl_dereference(port_priv->lag);
> +	return lag && lag->bond_dev == bond_dev;
> +}
> +
> +static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
> +				    struct ethsw_core *ethsw)
> +{
> +	struct dpaa2_switch_lag *lag = &ethsw->lags[lag_id - 1];
> +	struct ethsw_port_priv *primary, *new_primary = NULL;
> +	struct ethsw_port_priv *port_priv = NULL;
> +	struct dpsw_lag_cfg cfg = {0};
> +	u8 num_ifs = 0;
> +	int err, i;
> +
> +	cfg.group_id = lag_id;
> +
> +	/* Determine the primary port. The caller clears ->lag on the port that
> +	 * is leaving, so a NULL ->lag on the current primary means it is the
> +	 * one leaving: elect the first remaining member as the new primary.
> +	 * Otherwise keep the current primary.
> +	 */
> +	if (rtnl_dereference(lag->primary->lag)) {
> +		primary = lag->primary;
> +	} else {
> +		primary = NULL;
> +		for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
> +			if (dpaa2_switch_port_in_lag(ethsw->ports[i], bond_dev)) {
> +				new_primary = ethsw->ports[i];
> +				primary = new_primary;
> +				break;
> +			}
> +		}
> +	}
> +
> +	/* Build the interface list, always placing the primary first */
> +	if (primary)
> +		cfg.if_id[num_ifs++] = primary->idx;
> +
> +	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
> +		port_priv = ethsw->ports[i];
> +		if (port_priv == primary)
> +			continue;
> +		if (!dpaa2_switch_port_in_lag(port_priv, bond_dev))
> +			continue;
> +
> +		cfg.if_id[num_ifs++] = port_priv->idx;
> +	}
> +	cfg.num_ifs = num_ifs;
> +
> +	/* No more interfaces under this LAG group, mark it as not in use. Wait
> +	 * for a grace period so that any readers of the lag structure finished.
> +	 */
> +	if (!num_ifs) {
> +		synchronize_net();
> +
> +		lag->bond_dev = NULL;
> +		lag->primary = NULL;
> +		lag->in_use = false;
> +	}
> +
> +	err = dpsw_lag_set(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);
> +	if (err)
> +		return err;
> +
> +	if (new_primary) {
> +		synchronize_net();
> +		lag->primary = new_primary;
> +	}
> +
> +	return 0;
> +}
> +
> +static int dpaa2_switch_port_bond_join(struct net_device *netdev,
> +				       struct net_device *bond_dev,
> +				       struct netdev_lag_upper_info *info,
> +				       struct netlink_ext_ack *extack)
> +{
> +	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct net_device *bridge_dev;
> +	struct dpaa2_switch_lag *lag;
> +	int err = 0;
> +	u8 lag_id;
> +
> +	/* Setup the port_priv->lag pointer for this switch port */
> +	dpaa2_switch_port_set_lag_group(port_priv, bond_dev);
> +
> +	/* Create the LAG configuration and apply it in MC */
> +	lag = rtnl_dereference(port_priv->lag);
> +	lag_id = lag->id;
> +	err = dpaa2_switch_set_lag_cfg(bond_dev, lag_id, ethsw);
> +	if (err)
> +		goto err_lag_cfg;
> +
> +	/* If the bond device is a switch port, join the bridge as well */
> +	bridge_dev = netdev_master_upper_dev_get(bond_dev);
> +	if (!bridge_dev || !netif_is_bridge_master(bridge_dev))
> +		return 0;
> +
> +	err = dpaa2_switch_port_bridge_join(netdev, bridge_dev, extack);
> +	if (err)
> +		goto err_lag_cfg;
> +
> +	return err;
> +
> +err_lag_cfg:
> +	rcu_assign_pointer(port_priv->lag, NULL);
> +	dpaa2_switch_set_lag_cfg(bond_dev, lag_id, ethsw);
> +
> +	return err;
> +}
> +
> +static int dpaa2_switch_port_bond_leave(struct net_device *netdev,
> +					struct net_device *bond_dev)
> +{
> +	struct net_device *bridge_dev = netdev_master_upper_dev_get(bond_dev);
> +	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
> +	struct dpaa2_switch_lag *lag = rtnl_dereference(port_priv->lag);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct net_device *brpdev;
> +	bool learn_ena;
> +	int err;
> +
> +	if (!lag)
> +		return 0;
> +
> +	/* Recreate the LAG configuration for the LAG group that we left. */
> +	rcu_assign_pointer(port_priv->lag, NULL);
> +	dpaa2_switch_set_lag_cfg(bond_dev, lag->id, ethsw);
> +
> +	if (bridge_dev && netif_is_bridge_master(bridge_dev)) {
> +		/* Make sure that the new primary inherits the learning state */
> +		if (lag->primary) {
> +			brpdev = dpaa2_switch_port_to_bridge_port(lag->primary);
> +			learn_ena = br_port_flag_is_set(brpdev, BR_LEARNING);
> +			err = dpaa2_switch_port_set_learning(lag->primary,
> +							     learn_ena);
> +			if (err)
> +				return err;
> +			lag->primary->learn_ena = learn_ena;
> +		}
> +
> +		/* In case the bond is a bridge port, leave the upper bridge as
> +		 * well.
> +		 */
> +		return dpaa2_switch_port_bridge_leave(netdev);
> +	}
> +
>  	return 0;
>  }
>  
> @@ -2250,8 +2572,8 @@ static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
>  					    struct netdev_notifier_changeupper_info *info)
>  {
>  	struct ethsw_port_priv *port_priv;
> +	struct net_device *upper_dev, *br;
>  	struct netlink_ext_ack *extack;
> -	struct net_device *upper_dev;
>  	int err;
>  
>  	if (!dpaa2_switch_port_dev_check(netdev))
> @@ -2268,6 +2590,24 @@ static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
>  
>  		if (!info->linking)
>  			dpaa2_switch_port_pre_bridge_leave(netdev);
> +	} else if (netif_is_lag_master(upper_dev)) {
> +		if (!info->linking) {
> +			if (netif_is_bridge_port(upper_dev))
> +				dpaa2_switch_port_pre_bridge_leave(netdev);
> +			return 0;
> +		}
> +

sashiko-nipa notes:


	When a single DPAA2 port leaves a bond that itself is a bridge port,
	dpaa2_switch_port_pre_bridge_leave(netdev) is called unconditionally,
	regardless of whether other DPAA2 ports still remain in the same bond.

	Inside dpaa2_switch_port_pre_bridge_leave(), the bridge port being
	unoffloaded is computed by dpaa2_switch_port_to_bridge_port(), which
	now returns the bond:

		lag = rtnl_dereference(port_priv->lag);
		if (lag)
			return lag->bond_dev;
		return port_priv->netdev;

	So switchdev_bridge_port_unoffload(bond_dev, NULL, NULL, NULL) is
	issued for every leaving member. Since the matching join path also
	calls switchdev_bridge_port_offload(bond_dev, netdev, NULL, ...) per
	member with the same brport_dev and a NULL ctx, the bridge layer has
	no per-port handle either.

	When the first of several bonded DPAA2 ports leaves, this dispatches
	SWITCHDEV_BRPORT_UNOFFLOADED for the bond while the remaining members
	still rely on the bond being offloaded.

	Should the unoffload only happen when the last DPAA2 port leaves the
	bond, similar to how lan966x tracks per-port bridge offload state?

No, switchdev_bridge_port_offload() and
switchdev_bridge_port_unoffload() can be called multiple times for the
same bridge port, see nbp_switchdev_add():

	/* Tolerate drivers that call switchdev_bridge_port_offload()
	 * more than once for the same bridge port, such as when the
	 * bridge port is an offloaded bonding/team interface.
	 */ 
	p->offload_count++;

The ctx parameter being NULL in this patch does not have any effect on
the offload_count shown above. A proper ctx parameter is provided in the
next patch when we add support for FDBs on bond devices.
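
To make the counting argument concrete, here is a small userspace
sketch. It only mirrors the idea of the offload_count increment quoted
above; the model_* names are hypothetical, and the decrement on
unoffload is an assumption about the matching bridge-side path, not a
copy of the bridge code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a bridge port shared by several bond members. */
struct model_brport {
	int offload_count;
};

/* One call per joining member, all with the same bridge port (the bond). */
static void model_brport_offload(struct model_brport *p)
{
	p->offload_count++;
}

/* One call per leaving member; assumed to drop a single reference. */
static void model_brport_unoffload(struct model_brport *p)
{
	assert(p->offload_count > 0);
	p->offload_count--;
}

/* The bond counts as offloaded while at least one member still holds it. */
static bool model_brport_is_offloaded(const struct model_brport *p)
{
	return p->offload_count > 0;
}
```

With two members joined and one leaving, the count in this model goes
2 -> 1 and the bond is still considered offloaded; only the last leave
brings it to 0.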

Ioana

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device
  2026-06-29 11:23 ` [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device Ioana Ciornei
@ 2026-06-30 14:30   ` Ioana Ciornei
  2026-06-30 14:41   ` Ioana Ciornei
  1 sibling, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-30 14:30 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

On Mon, Jun 29, 2026 at 02:23:06PM +0300, Ioana Ciornei wrote:
> This patch adds support for offloading FDB entries added on upper bond
> devices.
> 
> First of all, the call to switchdev_bridge_port_offload() is updated so
> that the notifier blocks needed for FDB events replay are available to
> the bridge core.
> 
> Using switchdev_handle_*() helpers is also necessary because each FDB
> event needs to be fanned out to any DPAA2 switch lower device. This
> triggers another change in the return type used by the
> dpaa2_switch_port_fdb_event() - from notifier types to regular errno
> types.
> 
> Handling of the SWITCHDEV_FDB_ADD_TO_DEVICE/SWITCHDEV_FDB_DEL_TO_DEVICE
> events is updated so that the newly added dpaa2_switch_lag_fdb_add() /
> dpaa2_switch_lag_fdb_del() functions are called anytime a port is under
> a bond device. This will allow us to manage refcounting on FDB entries
> which are added on the upper bond devices.
> 
> The DPAA2 switch uses shared-VLAN learning which means that the vid
> parameter is not used when adding an FDB entry to HW. The current
> behavior when dealing with FDB entries with the same MAC address but
> different VLANs is to add the entry to HW every time while removal will
> get done on the first 'bridge fdb del' command issued by the user.
> 
> The same behavior is kept also for FDBs added on bond devices by keeping
> the refcount on the {vid, addr} pair while the HW operation disregards
> entirely the vid parameter.
> 
> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---
> Changes in v4:
> - Migrate FDBs in case the primary interface of a LAG changes.
> - Use lag->primary instead of determining each time the primary
> interface of a LAG device
> 
> Changes in v3:
> - Update dpaa2_switch_foreign_dev_check() so that we check if there is
> any port in the same switch as dev which offloads foreign_dev in case
> this is a bridge port.
> - Add mutex_destroy on the per LAG fdb_lock
> - Make sure that all FDB events were processed on the workqueue on the
> .remove() path.
> - Delete the refcounted entry in dpaa2_switch_lag_fdb_del() as soon as
> possible, even if the HW deletion would fail
> - Access the port_priv->lag field only through the proper rcu accessors.
> 
> Changes in v2:
> - Update dpaa2_switch_foreign_dev_check() so that we check if between
> the switch port and the foreign net_device is an offloaded path. Before
> this change we also checked if the foreign_dev was offloaded or not by
> the switch port.
> - Update the switchdev_bridge_port_unoffload() by passing it the proper
> context and the notifier blocks.
> - Add dev_hold() and dev_put() calls for orig_dev
> ---
>  .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 227 ++++++++++++++++--
>  .../ethernet/freescale/dpaa2/dpaa2-switch.h   |  24 ++
>  2 files changed, 225 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> index 949a7241a00f..307b3b7a1bfb 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> @@ -25,6 +25,9 @@
>  
>  #define DEFAULT_VLAN_ID			1
>  
> +static struct notifier_block dpaa2_switch_port_switchdev_nb;
> +static struct notifier_block dpaa2_switch_port_switchdev_blocking_nb;
> +
>  static u16 dpaa2_switch_port_get_fdb_id(struct ethsw_port_priv *port_priv)
>  {
>  	return port_priv->fdb->fdb_id;
> @@ -585,6 +588,81 @@ static int dpaa2_switch_port_fdb_del(struct ethsw_port_priv *port_priv,
>  		return dpaa2_switch_port_fdb_del_mc(port_priv, addr);
>  }
>  
> +static struct dpaa2_mac_addr *
> +dpaa2_switch_mac_addr_find(struct list_head *addr_list,
> +			   const unsigned char *addr, u16 vid)
> +{
> +	struct dpaa2_mac_addr *a;
> +
> +	list_for_each_entry(a, addr_list, list)
> +		if (ether_addr_equal(a->addr, addr) && a->vid == vid)
> +			return a;
> +
> +	return NULL;
> +}
> +
> +static int dpaa2_switch_lag_fdb_add(struct dpaa2_switch_lag *lag,
> +				    const unsigned char *addr, u16 vid)
> +{
> +	struct ethsw_port_priv *port_priv = lag->primary;
> +	struct dpaa2_mac_addr *a;
> +	int err = 0;
> +
> +	mutex_lock(&lag->fdb_lock);
> +
> +	a = dpaa2_switch_mac_addr_find(&lag->fdbs, addr, vid);
> +	if (a) {
> +		refcount_inc(&a->refcount);
> +		goto out;
> +	}
> +
> +	a = kzalloc(sizeof(*a), GFP_KERNEL);
> +	if (!a) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +
> +	err = dpaa2_switch_port_fdb_add(port_priv, addr);
> +	if (err) {
> +		kfree(a);
> +		goto out;
> +	}
> +
> +	ether_addr_copy(a->addr, addr);
> +	a->vid = vid;
> +	refcount_set(&a->refcount, 1);
> +	list_add_tail(&a->list, &lag->fdbs);
> +
> +out:
> +	mutex_unlock(&lag->fdb_lock);
> +
> +	return err;
> +}
> +
> +static void dpaa2_switch_lag_fdb_del(struct dpaa2_switch_lag *lag,
> +				     const unsigned char *addr, u16 vid)
> +{
> +	struct ethsw_port_priv *port_priv = lag->primary;
> +	struct dpaa2_mac_addr *a;
> +
> +	mutex_lock(&lag->fdb_lock);
> +
> +	a = dpaa2_switch_mac_addr_find(&lag->fdbs, addr, vid);
> +	if (!a)
> +		goto out;
> +
> +	if (!refcount_dec_and_test(&a->refcount))
> +		goto out;
> +
> +	list_del(&a->list);
> +	kfree(a);
> +
> +	dpaa2_switch_port_fdb_del(port_priv, addr);
> +
> +out:
> +	mutex_unlock(&lag->fdb_lock);
> +}
> +
>  static void dpaa2_switch_port_get_stats(struct net_device *netdev,
>  					struct rtnl_link_stats64 *stats)
>  {
> @@ -1533,6 +1611,33 @@ bool dpaa2_switch_port_dev_check(const struct net_device *netdev)
>  	return netdev->netdev_ops == &dpaa2_switch_port_ops;
>  }
>  
> +static bool dpaa2_switch_foreign_dev_check(const struct net_device *dev,
> +					   const struct net_device *foreign_dev)
> +{
> +	struct ethsw_port_priv *port_priv = netdev_priv(dev);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	struct ethsw_port_priv *other_port;
> +	int i;
> +
> +	if (netif_is_bridge_master(foreign_dev))
> +		if (port_priv->fdb->bridge_dev == foreign_dev)
> +			return false;
> +
> +	if (netif_is_bridge_port(foreign_dev)) {
> +		for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
> +			other_port = ethsw->ports[i];
> +
> +			if (!other_port)
> +				continue;
> +			if (dpaa2_switch_port_offloads_bridge_port(other_port,
> +								   foreign_dev))
> +				return false;
> +		}
> +	}
> +
> +	return true;
> +}
> +
>  static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)
>  {
>  	struct fsl_mc_device *dpsw_port_dev, *dpmac_dev;
> @@ -2100,8 +2205,10 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
>  		goto err_egress_flood;
>  
>  	brport_dev = dpaa2_switch_port_to_bridge_port(port_priv);
> -	err = switchdev_bridge_port_offload(brport_dev, netdev, NULL,
> -					    NULL, NULL, false, extack);
> +	err = switchdev_bridge_port_offload(brport_dev, netdev, port_priv,
> +					    &dpaa2_switch_port_switchdev_nb,
> +					    &dpaa2_switch_port_switchdev_blocking_nb,
> +					    false, extack);
>  	if (err)
>  		goto err_switchdev_offload;
>  
> @@ -2143,7 +2250,9 @@ static void dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev)
>  	if (!brport_dev)
>  		return;
>  
> -	switchdev_bridge_port_unoffload(brport_dev, NULL, NULL, NULL);
> +	switchdev_bridge_port_unoffload(brport_dev, port_priv,
> +					&dpaa2_switch_port_switchdev_nb,
> +					&dpaa2_switch_port_switchdev_blocking_nb);
>  
>  	/* Make sure that any FDB add/del operations are completed before the
>  	 * bridge layout changes
> @@ -2425,9 +2534,10 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
>  				    struct ethsw_core *ethsw)
>  {
>  	struct dpaa2_switch_lag *lag = &ethsw->lags[lag_id - 1];
> -	struct ethsw_port_priv *primary, *new_primary = NULL;
> -	struct ethsw_port_priv *port_priv = NULL;
> +	struct ethsw_port_priv *primary, *port_priv;
> +	struct ethsw_port_priv *new_primary = NULL;
>  	struct dpsw_lag_cfg cfg = {0};
> +	struct dpaa2_mac_addr *a;
>  	u8 num_ifs = 0;
>  	int err, i;
>  
> @@ -2454,7 +2564,6 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
>  	/* Build the interface list, always placing the primary first */
>  	if (primary)
>  		cfg.if_id[num_ifs++] = primary->idx;
> -
>  	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
>  		port_priv = ethsw->ports[i];
>  		if (port_priv == primary)
> @@ -2477,11 +2586,32 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
>  		lag->in_use = false;
>  	}
>  
> +	/* When the primary changes, migrate the FDB entries from the old
> +	 * primary to the new one: remove them before reconfiguring the LAG in
> +	 * hardware and re-add them on the new primary afterwards. We do not
> +	 * touch any refcounting since the intention is to change the HW entry,
> +	 * not the parallel software tracking.
> +	 */
> +	if (new_primary) {
> +		mutex_lock(&lag->fdb_lock);
> +		list_for_each_entry(a, &lag->fdbs, list)
> +			dpaa2_switch_port_fdb_del(lag->primary, a->addr);
> +		mutex_unlock(&lag->fdb_lock);
> +	}
> +
>  	err = dpsw_lag_set(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);
>  	if (err)
>  		return err;
>  

sashiko-nipa notes:

	[High, Medium] When the last port leaves the bond, the block
	above sets

	  if (!num_ifs) {
	  	synchronize_net();

	  	lag->bond_dev = NULL;
	  	lag->primary = NULL;
	  	lag->in_use = false;
	  }

	Can a queued workqueue item still race with this teardown?
	Looking at dpaa2_switch_event_work():

	  rcu_read_lock();
	  lag = rcu_dereference(port_priv->lag);
	  rcu_read_unlock();

	  switch (switchdev_work->event) {
	  case SWITCHDEV_FDB_ADD_TO_DEVICE:
	  	if (lag)
	  		err = dpaa2_switch_lag_fdb_add(lag, fdb_info->addr, ...);

	The RCU read section ends before the lag is used, so the
	synchronize_net() in set_lag_cfg returns immediately without waiting
	for the work. dpaa2_switch_lag_fdb_add() then reads lag->primary while
	holding only fdb_lock, which the writer does not take. If lag->primary
	has been set to NULL by the writer, port_priv = lag->primary; ... in
	dpaa2_switch_lag_fdb_add() will dereference NULL through
	dpaa2_switch_port_fdb_add() -> dpaa2_switch_port_fdb_add_uc() reading
	port_priv->idx and port_priv->ethsw_data.

	The in-file comment claims the lag pointer staying alive is enough, but
	lag->primary is a separately mutable field with no shared lock between
	this reader and the writer. Should lag->primary itself be protected by
	fdb_lock (or by the rtnl/RCU pattern actually waited on), or should
	the bond-leave path flush_workqueue() before clearing primary?

	A related window exists during the primary migration below: between
	unlock of fdb_lock after the add-loop and the lag->primary = new_primary
	store, a concurrent work item can still observe the OLD primary value
	and install entries on it while the HW LAG is being reconfigured. Is
	that intentional?

That is not correct. As stated in the commit message of patch 2/13, any
concurrency between ongoing work items and changeupper events is
resolved by flushing the workqueue from the prechangeupper event.

Ioana

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device
  2026-06-29 11:23 ` [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device Ioana Ciornei
  2026-06-30 14:30   ` Ioana Ciornei
@ 2026-06-30 14:41   ` Ioana Ciornei
  1 sibling, 0 replies; 19+ messages in thread
From: Ioana Ciornei @ 2026-06-30 14:41 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev; +Cc: linux-kernel

On Mon, Jun 29, 2026 at 02:23:06PM +0300, Ioana Ciornei wrote:
> This patch adds support for offloading FDB entries added on upper bond
> devices.
> 
> First of all, the call to switchdev_bridge_port_offload() is updated so
> that the notifier blocks needed for FDB events replay are available to
> the bridge core.
> 
> Using switchdev_handle_*() helpers is also necessary because each FDB
> event needs to be fanned out to any DPAA2 switch lower device. This
> triggers another change in the return type used by the
> dpaa2_switch_port_fdb_event() - from notifier types to regular errno
> types.
> 
> Handling of the SWITCHDEV_FDB_ADD_TO_DEVICE/SWITCHDEV_FDB_DEL_TO_DEVICE
> events is updated so that the newly added dpaa2_switch_lag_fdb_add() /
> dpaa2_switch_lag_fdb_del() functions are called anytime a port is under
> a bond device. This will allow us to manage refcounting on FDB entries
> which are added on the upper bond devices.
> 
> The DPAA2 switch uses shared-VLAN learning which means that the vid
> parameter is not used when adding an FDB entry to HW. The current
> behavior when dealing with FDB entries with the same MAC address but
> different VLANs is to add the entry to HW every time while removal will
> get done on the first 'bridge fdb del' command issued by the user.
> 
> The same behavior is kept also for FDBs added on bond devices by keeping
> the refcount on the {vid, addr} pair while the HW operation disregards
> entirely the vid parameter.
> 
> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---
> Changes in v4:
> - Migrate FDBs in case the primary interface of a LAG changes.
> - Use lag->primary instead of determining each time the primary
> interface of a LAG device
> 
> Changes in v3:
> - Update dpaa2_switch_foreign_dev_check() so that we check if there is
> any port in the same switch as dev which offloads foreign_dev in case
> this is a bridge port.
> - Add mutex_destroy on the per LAG fdb_lock
> - Make sure that all FDB events were processed on the workqueue on the
> .remove() path.
> - Delete the refcounted entry in dpaa2_switch_lag_fdb_del() as soon as
> possible, even if the HW deletion would fail
> - Access the port_priv->lag field only through the proper rcu accessors.
> 
> Changes in v2:
> - Update dpaa2_switch_foreign_dev_check() so that we check if between
> the switch port and the foreign net_device is an offloaded path. Before
> this change we also checked if the foreign_dev was offloaded or not by
> the switch port.
> - Update the switchdev_bridge_port_unoffload() by passing it the proper
> context and the notifier blocks.
> - Add dev_hold() and dev_put() calls for orig_dev
> ---

(...)
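
The refcounting described in the commit message above can be illustrated with a small userspace sketch. It is not the driver code: all names (model_fdb, model_lag_fdb_add(), model_lag_fdb_del()) are hypothetical, the hardware is reduced to two counters, and it assumes the refcount tracks how many times the same {vid, addr} pair has been added. The software entry is keyed on the pair, while the modelled HW operation ignores the vid.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are driver symbols. */
#define MODEL_ETH_ALEN 6

struct model_fdb {
	struct model_fdb *next;
	uint8_t addr[MODEL_ETH_ALEN];
	uint16_t vid;
	unsigned int refcount;
};

struct model_lag {
	struct model_fdb *fdbs;	/* software tracking, keyed on {vid, addr} */
	int hw_adds;		/* modelled HW adds (address only) */
	int hw_dels;		/* modelled HW deletes (address only) */
};

static struct model_fdb *model_find(struct model_lag *lag,
				    const uint8_t *addr, uint16_t vid)
{
	struct model_fdb *f;

	for (f = lag->fdbs; f; f = f->next)
		if (f->vid == vid && !memcmp(f->addr, addr, MODEL_ETH_ALEN))
			return f;
	return NULL;
}

/* Take a reference; only the first reference on a pair touches HW. */
static int model_lag_fdb_add(struct model_lag *lag, const uint8_t *addr,
			     uint16_t vid)
{
	struct model_fdb *f = model_find(lag, addr, vid);

	if (f) {
		f->refcount++;
		return 0;
	}
	f = calloc(1, sizeof(*f));
	if (!f)
		return -1;
	memcpy(f->addr, addr, MODEL_ETH_ALEN);
	f->vid = vid;
	f->refcount = 1;
	f->next = lag->fdbs;
	lag->fdbs = f;
	lag->hw_adds++;		/* the vid is not part of the HW operation */
	return 0;
}

/* Drop a reference; the last one frees the entry, then touches HW. */
static int model_lag_fdb_del(struct model_lag *lag, const uint8_t *addr,
			     uint16_t vid)
{
	struct model_fdb **pp;

	for (pp = &lag->fdbs; *pp; pp = &(*pp)->next) {
		struct model_fdb *f = *pp;

		if (f->vid != vid || memcmp(f->addr, addr, MODEL_ETH_ALEN))
			continue;
		if (--f->refcount)
			return 0;
		*pp = f->next;
		free(f);
		lag->hw_dels++;
		return 0;
	}
	return -1;	/* no such {vid, addr} pair */
}
```

In this model, two pairs with the same MAC but different VLANs each trigger their own HW add and their own HW delete, which mirrors the shared-VLAN behavior the commit message says is preserved.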

> @@ -2454,7 +2564,6 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
>  	/* Build the interface list, always placing the primary first */
>  	if (primary)
>  		cfg.if_id[num_ifs++] = primary->idx;
> -
>  	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
>  		port_priv = ethsw->ports[i];
>  		if (port_priv == primary)
> @@ -2477,11 +2586,32 @@ static int dpaa2_switch_set_lag_cfg(struct net_device *bond_dev, u8 lag_id,
>  		lag->in_use = false;
>  	}

sashiko.dev notes:

	Does this leak the dynamically allocated items in the lag->fdbs
	list?  When the last interface leaves a LAG, lag->in_use is set
	to false here, but the elements in lag->fdbs are not freed.

The mechanism initiated by nbp_switchdev_unsync_objs() will remove all
the lag->fdbs entries automatically, so there is no need for extra code
that frees each entry manually.

>  
> +	/* When the primary changes, migrate the FDB entries from the old
> +	 * primary to the new one: remove them before reconfiguring the LAG in
> +	 * hardware and re-add them on the new primary afterwards. We do not
> +	 * touch any refcounting since the intention is to change the HW entry,
> +	 * not the parallel software tracking.
> +	 */
> +	if (new_primary) {
> +		mutex_lock(&lag->fdb_lock);
> +		list_for_each_entry(a, &lag->fdbs, list)
> +			dpaa2_switch_port_fdb_del(lag->primary, a->addr);
> +		mutex_unlock(&lag->fdb_lock);
> +	}

sashiko.dev notes:

	Does dropping the fdb_lock here open a race window for leaking
	FDB entries?  While the lock is dropped, a concurrent
	asynchronous work item could execute dpaa2_switch_lag_fdb_add(),
	acquire fdb_lock, add a new FDB entry to the old lag->primary in
	hardware, and append it to the software lag->fdbs list.

	When this thread re-acquires the lock below, it will loop over
	lag->fdbs and add all entries to the new primary. The new entry
	would then be programmed on both the old and new primary ports.
	Later, a delete operation would only remove it from the new
	primary, permanently leaking the hardware entry on the old port.

The LAG primary changes only on bridge leave which will not run
concurrently with any workqueue events since there is a
flush_workqueue() in the pre bridge leave event.
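
As a rough illustration of the migration order in the hunk above (remove from the old primary, reconfigure, re-add on the new primary, refcounts untouched), here is a simplified userspace sketch. All names are hypothetical, the hardware table is reduced to a per-port entry counter, and the LAG reconfiguration step is only a comment. It assumes, as stated above, that no work item runs concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are driver symbols. */
#define MODEL_MAX_FDBS 4

struct model_port { int idx; int hw_entries; };

struct model_lag {
	struct model_port *primary;
	unsigned int refcount[MODEL_MAX_FDBS];	/* software tracking only */
	size_t num_fdbs;
};

/* Move the HW entries to new_primary without touching the refcounts. */
static void model_migrate_primary(struct model_lag *lag,
				  struct model_port *new_primary)
{
	size_t i;

	/* 1. Remove every tracked entry from the old primary. */
	for (i = 0; i < lag->num_fdbs; i++)
		lag->primary->hw_entries--;

	/* 2. The HW LAG reconfiguration would happen here. */

	/* 3. Re-add every tracked entry on the new primary. */
	for (i = 0; i < lag->num_fdbs; i++)
		new_primary->hw_entries++;

	/* 4. Publish the new primary. */
	lag->primary = new_primary;
}
```

The refcount array is never written during the migration, which reflects the stated intent of changing the HW entry rather than the parallel software tracking.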

>  static void dpaa2_switch_event_work(struct work_struct *work)
>  {
>  	struct ethsw_switchdev_event_work *switchdev_work =
>  		container_of(work, struct ethsw_switchdev_event_work, work);
> +	struct net_device *orig_dev = switchdev_work->orig_dev;
>  	struct net_device *dev = switchdev_work->dev;
> +	struct ethsw_port_priv *port_priv = netdev_priv(dev);
>  	struct switchdev_notifier_fdb_info *fdb_info;
> +	struct dpaa2_switch_lag *lag;
>  	int err;
>  
>  	fdb_info = &switchdev_work->fdb_info;
>  
> +	/* The lag structures are freed only from dpaa2_switch_remove(), which
> +	 * first flushes this workqueue, so the pointer stays valid for the
> +	 * lifetime of the work item. Only the dereference needs the RCU
> +	 * read-side lock; the FDB helpers below can sleep and must run outside
> +	 * of it.
> +	 */
> +	rcu_read_lock();
> +	lag = rcu_dereference(port_priv->lag);
> +	rcu_read_unlock();

sashiko.dev notes:

	Does this sequence leave lag unprotected against concurrent
	teardown?

	By dropping the RCU read lock here, a concurrent call to
	dpaa2_switch_set_lag_cfg() could set lag->primary to NULL and
	call synchronize_net(). Since the RCU read lock is dropped,
	synchronize_net() would not wait for this work item to finish.

	If the work item then resumes, dpaa2_switch_lag_fdb_add() reads
	lag->primary locklessly:

	dpaa2_switch_lag_fdb_add() {
		...
		struct ethsw_port_priv *port_priv = lag->primary;
		err = dpaa2_switch_port_fdb_add(port_priv, addr);

	This would pass a NULL port_priv to dpaa2_switch_port_fdb_add(),
	potentially causing a NULL pointer dereference.

Again, the flush_workqueue() avoids this kind of concurrency.

Ioana

* Re: [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload
  2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
                   ` (12 preceding siblings ...)
  2026-06-29 11:23 ` [PATCH net-next v4 13/13] dpaa2-switch: add support for imprecise source port Ioana Ciornei
@ 2026-07-01 16:10 ` patchwork-bot+netdevbpf
  13 siblings, 0 replies; 19+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-07-01 16:10 UTC (permalink / raw)
  To: Ioana Ciornei
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, netdev,
	linux-kernel

Hello:

This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Mon, 29 Jun 2026 14:22:56 +0300 you wrote:
> This patch set adds support in dpaa2-switch for offloading upper bond
> devices.
> 
> The first two patches remove the necessity to hold rtnl_lock during the
> event processing workqueue by ensuring that all event were processed
> before any changes in FDB layout happens.
> 
> [...]

Here is the summary with links:
  - [net-next,v4,01/13] dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls
    https://git.kernel.org/netdev/net-next/c/97cb4ae7511b
  - [net-next,v4,02/13] dpaa2-switch: avoid holding rtnl_lock in dpaa2_switch_event_work()
    https://git.kernel.org/netdev/net-next/c/0cf0b8ac40ae
  - [net-next,v4,03/13] dpaa2-switch: extend the FDB management to cover bond scenarios
    https://git.kernel.org/netdev/net-next/c/900c915030f6
  - [net-next,v4,04/13] dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function
    https://git.kernel.org/netdev/net-next/c/da7ec6b81b0b
  - [net-next,v4,05/13] dpaa2-switch: check early if an FDB entry should be added
    https://git.kernel.org/netdev/net-next/c/0199ff706da1
  - [net-next,v4,06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper
    https://git.kernel.org/netdev/net-next/c/06840a236334
  - [net-next,v4,07/13] dpaa2-switch: consolidate unicast and multicast management
    https://git.kernel.org/netdev/net-next/c/28b79b55852a
  - [net-next,v4,08/13] dpaa2-switch: add LAG configuration API
    https://git.kernel.org/netdev/net-next/c/f27ad9b45b13
  - [net-next,v4,09/13] dpaa2-switch: add support for LAG offload
    https://git.kernel.org/netdev/net-next/c/9ca09640bfc8
  - [net-next,v4,10/13] dpaa2-switch: offload FDBs added on an upper bond device
    https://git.kernel.org/netdev/net-next/c/711c0beea13f
  - [net-next,v4,11/13] dpaa2-switch: offload port objects on an upper bond device
    https://git.kernel.org/netdev/net-next/c/f0a7468fdbeb
  - [net-next,v4,12/13] dpaa2-switch: trap all link local reserved addresses to the CPU
    https://git.kernel.org/netdev/net-next/c/a0a8970b516d
  - [net-next,v4,13/13] dpaa2-switch: add support for imprecise source port
    https://git.kernel.org/netdev/net-next/c/f985358f4ee2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



end of thread, other threads:[~2026-07-01 16:10 UTC | newest]

Thread overview: 19+ messages
2026-06-29 11:22 [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
2026-06-29 11:22 ` [PATCH net-next v4 01/13] dpaa2-switch: remove unnecessary dev_mc_add/dev_mc_del calls Ioana Ciornei
2026-06-29 11:22 ` [PATCH net-next v4 02/13] dpaa2-switch: avoid holding rtnl_lock in dpaa2_switch_event_work() Ioana Ciornei
2026-06-29 11:22 ` [PATCH net-next v4 03/13] dpaa2-switch: extend the FDB management to cover bond scenarios Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 04/13] dpaa2-switch: create a separate dpaa2_switch_port_fdb_event() function Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 05/13] dpaa2-switch: check early if an FDB entry should be added Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 06/13] dpaa2-switch: add dpaa2_switch_port_to_bridge_port() helper Ioana Ciornei
2026-06-30 13:51   ` Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 07/13] dpaa2-switch: consolidate unicast and multicast management Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 08/13] dpaa2-switch: add LAG configuration API Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 09/13] dpaa2-switch: add support for LAG offload Ioana Ciornei
2026-06-30 14:23   ` Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 10/13] dpaa2-switch: offload FDBs added on an upper bond device Ioana Ciornei
2026-06-30 14:30   ` Ioana Ciornei
2026-06-30 14:41   ` Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 11/13] dpaa2-switch: offload port objects " Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 12/13] dpaa2-switch: trap all link local reserved addresses to the CPU Ioana Ciornei
2026-06-29 11:23 ` [PATCH net-next v4 13/13] dpaa2-switch: add support for imprecise source port Ioana Ciornei
2026-07-01 16:10 ` [PATCH net-next v4 00/13] dpaa2-switch: add support for LAG offload patchwork-bot+netdevbpf
