netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers
@ 2023-07-19 11:01 Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 01/17] net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB Petr Machata
                   ` (17 more replies)
  0 siblings, 18 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

The mlxsw driver currently makes the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices need to be added to
the bridge before IP addresses are configured on that bridge or SVI added
on top of it. Enslaving a netdevice to another netdevice that already has
uppers is in fact forbidden by mlxsw for this reason. Despite this safety,
it is rather easy to get into situations where the offloaded configuration
is just plain wrong.

As an example, take a front panel port, configure an IP address: it gets a
RIF. Now enslave the port to the bridge, and the RIF is gone. Remove the
port from the bridge again, but the RIF never comes back. There is a number
of similar situations, where changing the configuration there and back
utterly breaks the offload.

Similarly, detaching a front panel port from a configured topology means
unoffloading of this whole topology -- VLAN uppers, next hops, etc.
Attaching the port back is then not permitted at all. If it were, it would
not result in a working configuration, because much of mlxsw is written to
react to changes in immediate configuration. There is nothing that would go
visit netdevices in the attached-to topology and offload existing routes
and VLAN memberships, for example.

In this patchset, introduce a number of replays to be invoked so that this
sort of post-hoc offload is supported. Then remove the vetoes that
disallowed enslavement of front panel ports to other netdevices with
uppers.

The patchset progresses as follows:

- In patch #1, fix an issue in the bridge driver. To my knowledge, the
  issue could not have resulted in a buggy behavior previously, and thus is
  packaged with this patchset instead of being sent separately to net.

- In patch #2, add a new helper to the switchdev code.

- In patch #3, drop mlxsw selftests that will not be relevant after this
  patchset anymore.

- Patches #4, #5, #6, #7 and #8 prepare the codebase for smoother
  introduction of the rest of the code.

- Patches #9, #10, #11, #12, #13 and #14 replay various aspects of upper
  configuration when a front panel port is introduced into a topology.
  Individual patches take care of bridge and LAG RIF memberships, switchdev
  replay, nexthop and neighbors replay, and MACVLAN offload.

- Patches #15 and #16 introduce RIFs for newly-relevant netdevices when a
  front panel port is enslaved (in which case all uppers are newly
  relevant), or, respectively, deslaved (in which case the newly-relevant
  netdevice is the one being deslaved).

- Up until this point, the introduced scaffolding was not really used,
  because mlxsw still forbids enslavement of mlxsw netdevices to uppers
  with uppers. In patch #17, this condition is finally relaxed.

A sizable selftest suite is available to test all this new code. That will
be sent in a separate patchset.

Petr Machata (17):
  net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB
  net: switchdev: Add a helper to replay objects on a bridge port
  selftests: mlxsw: rtnetlink: Drop obsolete tests
  mlxsw: spectrum_router: Allow address handlers to run on bridge ports
  mlxsw: spectrum_router: Extract a helper to schedule neighbour work
  mlxsw: spectrum: Split a helper out of mlxsw_sp_netdevice_event()
  mlxsw: spectrum: Allow event handlers to check unowned bridges
  mlxsw: spectrum: Add a replay_deslavement argument to event handlers
  mlxsw: spectrum: On port enslavement to a LAG, join upper's bridges
  mlxsw: spectrum_switchdev: Replay switchdev objects on port join
  mlxsw: spectrum_router: Join RIFs of LAG upper VLANs
  mlxsw: spectrum_router: Offload ethernet nexthops when RIF is made
  mlxsw: spectrum_router: Replay MACVLANs when RIF is made
  mlxsw: spectrum_router: Replay neighbours when RIF is made
  mlxsw: spectrum_router: Replay IP NETDEV_UP on device enslavement
  mlxsw: spectrum_router: Replay IP NETDEV_UP on device deslavement
  mlxsw: spectrum: Permit enslavement to netdevices with uppers

 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 312 ++++++++++---
 .../net/ethernet/mellanox/mlxsw/spectrum.h    |   2 +
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 432 ++++++++++++++++--
 .../ethernet/mellanox/mlxsw/spectrum_router.h |   7 +
 .../mellanox/mlxsw/spectrum_switchdev.c       | 138 +++++-
 include/net/switchdev.h                       |   6 +
 net/bridge/br.c                               |   8 +
 net/bridge/br_private.h                       |  16 +
 net/bridge/br_switchdev.c                     |  15 +-
 net/switchdev/switchdev.c                     |  25 +
 .../selftests/drivers/net/mlxsw/rtnetlink.sh  |  31 --
 11 files changed, 862 insertions(+), 130 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net-next 01/17] net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 02/17] net: switchdev: Add a helper to replay objects on a bridge port Petr Machata
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov, bridge

There are two kinds of MDB entries to be replayed: port MDB entries, and
host MDB entries. They are both replayed by br_switchdev_mdb_replay(). If
the driver supports one kind, but lacks the other, the first -EOPNOTSUPP
returned terminates the whole replay, including any further still-supported
objects in the list.

For this to cause issues, there must be MDB entries for both the host and
the port being replayed. In that case, if the driver bails out from
handling the host entry, the port entries are never replayed. However, the
replay is currently only done when a switchdev port joins a bridge. There
would be no port memberships at that point. Thus despite being erroneous,
the code does not cause observable bugs.

This is not an issue with other object kinds either, because there, each
function replays one object kind. If a driver does not support that kind,
it makes sense to bail out early. -EOPNOTSUPP is then ignored in
nbp_switchdev_sync_objs().

For MDB, suppress the -EOPNOTSUPP error code in br_switchdev_mdb_replay()
already, so that the whole list gets replayed.

The reason we need this patch is that a future patch will introduce a
replay that should be used when a front-panel port netdevice is enslaved to
a bridge lower, in particular a LAG. The LAG netdevice can already have
both host and port MDB entries. The port entries need to be replayed so
that they are offloaded on the port that joins the LAG.

Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Ivan Vecera <ivecera@redhat.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: bridge@lists.linux-foundation.org
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 net/bridge/br_switchdev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index ba95c4d74a60..e92e0338afee 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -727,6 +727,8 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
 		err = br_switchdev_mdb_replay_one(nb, dev,
 						  SWITCHDEV_OBJ_PORT_MDB(obj),
 						  action, ctx, extack);
+		if (err == -EOPNOTSUPP)
+			err = 0;
 		if (err)
 			goto out_free_mdb;
 	}
@@ -759,8 +761,10 @@ static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
 
 	err = br_switchdev_mdb_replay(br_dev, dev, ctx, true, blocking_nb,
 				      extack);
-	if (err && err != -EOPNOTSUPP)
+	if (err) {
+		/* -EOPNOTSUPP not propagated from MDB replay. */
 		return err;
+	}
 
 	err = br_switchdev_fdb_replay(br_dev, ctx, true, atomic_nb);
 	if (err && err != -EOPNOTSUPP)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 02/17] net: switchdev: Add a helper to replay objects on a bridge port
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 01/17] net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2024-01-31 15:47   ` Vladimir Oltean
  2023-07-19 11:01 ` [PATCH net-next 03/17] selftests: mlxsw: rtnetlink: Drop obsolete tests Petr Machata
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov, bridge

When a front panel joins a bridge via another netdevice (typically a LAG),
the driver needs to learn about the objects configured on the bridge port.
When the bridge port is offloaded by the driver for the first time, this
can be achieved by passing a notifier to switchdev_bridge_port_offload().
The notifier is then invoked for the individual objects (such as VLANs)
configured on the bridge, and can look for the interesting ones.

Calling switchdev_bridge_port_offload() when the second port joins the
bridge lower is unnecessary, but the replay is still needed. To that end,
add a new function, switchdev_bridge_port_replay(), which does only the
replay part of the _offload() function in exactly the same way as that
function.

Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Ivan Vecera <ivecera@redhat.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: bridge@lists.linux-foundation.org
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 include/net/switchdev.h   |  6 ++++++
 net/bridge/br.c           |  8 ++++++++
 net/bridge/br_private.h   | 16 ++++++++++++++++
 net/bridge/br_switchdev.c |  9 +++++++++
 net/switchdev/switchdev.c | 25 +++++++++++++++++++++++++
 5 files changed, 64 insertions(+)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index ca0312b78294..4d324e2a2eef 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -231,6 +231,7 @@ enum switchdev_notifier_type {
 
 	SWITCHDEV_BRPORT_OFFLOADED,
 	SWITCHDEV_BRPORT_UNOFFLOADED,
+	SWITCHDEV_BRPORT_REPLAY,
 };
 
 struct switchdev_notifier_info {
@@ -299,6 +300,11 @@ void switchdev_bridge_port_unoffload(struct net_device *brport_dev,
 				     const void *ctx,
 				     struct notifier_block *atomic_nb,
 				     struct notifier_block *blocking_nb);
+int switchdev_bridge_port_replay(struct net_device *brport_dev,
+				 struct net_device *dev, const void *ctx,
+				 struct notifier_block *atomic_nb,
+				 struct notifier_block *blocking_nb,
+				 struct netlink_ext_ack *extack);
 
 void switchdev_deferred_process(void);
 int switchdev_port_attr_set(struct net_device *dev,
diff --git a/net/bridge/br.c b/net/bridge/br.c
index 4f5098d33a46..a6e94ceb7c9a 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -234,6 +234,14 @@ static int br_switchdev_blocking_event(struct notifier_block *nb,
 		br_switchdev_port_unoffload(p, b->ctx, b->atomic_nb,
 					    b->blocking_nb);
 		break;
+	case SWITCHDEV_BRPORT_REPLAY:
+		brport_info = ptr;
+		b = &brport_info->brport;
+
+		err = br_switchdev_port_replay(p, b->dev, b->ctx, b->atomic_nb,
+					       b->blocking_nb, extack);
+		err = notifier_from_errno(err);
+		break;
 	}
 
 out:
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index a63b32c1638e..a69774131535 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -2115,6 +2115,12 @@ void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
 				 struct notifier_block *atomic_nb,
 				 struct notifier_block *blocking_nb);
 
+int br_switchdev_port_replay(struct net_bridge_port *p,
+			     struct net_device *dev, const void *ctx,
+			     struct notifier_block *atomic_nb,
+			     struct notifier_block *blocking_nb,
+			     struct netlink_ext_ack *extack);
+
 bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb);
 
 void br_switchdev_frame_set_offload_fwd_mark(struct sk_buff *skb);
@@ -2165,6 +2171,16 @@ br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
 {
 }
 
+static inline int
+br_switchdev_port_replay(struct net_bridge_port *p,
+			 struct net_device *dev, const void *ctx,
+			 struct notifier_block *atomic_nb,
+			 struct notifier_block *blocking_nb,
+			 struct netlink_ext_ack *extack)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb)
 {
 	return false;
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index e92e0338afee..ee84e783e1df 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -829,3 +829,12 @@ void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
 
 	nbp_switchdev_del(p);
 }
+
+int br_switchdev_port_replay(struct net_bridge_port *p,
+			     struct net_device *dev, const void *ctx,
+			     struct notifier_block *atomic_nb,
+			     struct notifier_block *blocking_nb,
+			     struct netlink_ext_ack *extack)
+{
+	return nbp_switchdev_sync_objs(p, ctx, atomic_nb, blocking_nb, extack);
+}
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 8cc42aea19c7..5b045284849e 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -862,3 +862,28 @@ void switchdev_bridge_port_unoffload(struct net_device *brport_dev,
 					  NULL);
 }
 EXPORT_SYMBOL_GPL(switchdev_bridge_port_unoffload);
+
+int switchdev_bridge_port_replay(struct net_device *brport_dev,
+				 struct net_device *dev, const void *ctx,
+				 struct notifier_block *atomic_nb,
+				 struct notifier_block *blocking_nb,
+				 struct netlink_ext_ack *extack)
+{
+	struct switchdev_notifier_brport_info brport_info = {
+		.brport = {
+			.dev = dev,
+			.ctx = ctx,
+			.atomic_nb = atomic_nb,
+			.blocking_nb = blocking_nb,
+		},
+	};
+	int err;
+
+	ASSERT_RTNL();
+
+	err = call_switchdev_blocking_notifiers(SWITCHDEV_BRPORT_REPLAY,
+						brport_dev, &brport_info.info,
+						extack);
+	return notifier_to_errno(err);
+}
+EXPORT_SYMBOL_GPL(switchdev_bridge_port_replay);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 03/17] selftests: mlxsw: rtnetlink: Drop obsolete tests
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 01/17] net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 02/17] net: switchdev: Add a helper to replay objects on a bridge port Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 04/17] mlxsw: spectrum_router: Allow address handlers to run on bridge ports Petr Machata
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Support for enslaving ports to LAGs with uppers will be added in the
following patches. Selftests to make sure it actually does the right thing
are ready and will be sent as a follow-up.

Similarly, ordering of MACVLAN creation and RIF creation will be relaxed
and it will be permitted to create a MACVLAN first.

Thus these two tests are obsolete. Drop them.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../selftests/drivers/net/mlxsw/rtnetlink.sh  | 31 -------------------
 1 file changed, 31 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh b/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh
index 5e89657857c7..893a693ad805 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh
@@ -16,7 +16,6 @@ ALL_TESTS="
 	bridge_deletion_test
 	bridge_vlan_flags_test
 	vlan_1_test
-	lag_bridge_upper_test
 	duplicate_vlans_test
 	vlan_rif_refcount_test
 	subport_rif_refcount_test
@@ -211,33 +210,6 @@ vlan_1_test()
 	ip link del dev $swp1.1
 }
 
-lag_bridge_upper_test()
-{
-	# Test that ports cannot be enslaved to LAG devices that have uppers
-	# and that failure is handled gracefully. See commit b3529af6bb0d
-	# ("spectrum: Reference count VLAN entries") for more details
-	RET=0
-
-	ip link add name bond1 type bond mode 802.3ad
-
-	ip link add name br0 type bridge vlan_filtering 1
-	ip link set dev bond1 master br0
-
-	ip link set dev $swp1 down
-	ip link set dev $swp1 master bond1 &> /dev/null
-	check_fail $? "managed to enslave port to lag when should not"
-
-	# This might generate a trace, if we did not handle the failure
-	# correctly
-	ip -6 address add 2001:db8:1::1/64 dev $swp1
-	ip -6 address del 2001:db8:1::1/64 dev $swp1
-
-	log_test "lag with bridge upper"
-
-	ip link del dev br0
-	ip link del dev bond1
-}
-
 duplicate_vlans_test()
 {
 	# Test that on a given port a VLAN is only used once. Either as VLAN
@@ -510,9 +482,6 @@ vlan_interface_uppers_test()
 	ip link set dev $swp1 master br0
 
 	ip link add link br0 name br0.10 type vlan id 10
-	ip link add link br0.10 name macvlan0 \
-		type macvlan mode private &> /dev/null
-	check_fail $? "managed to create a macvlan when should not"
 
 	ip -6 address add 2001:db8:1::1/64 dev br0.10
 	ip link add link br0.10 name macvlan0 type macvlan mode private
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 04/17] mlxsw: spectrum_router: Allow address handlers to run on bridge ports
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (2 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 03/17] selftests: mlxsw: rtnetlink: Drop obsolete tests Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 05/17] mlxsw: spectrum_router: Extract a helper to schedule neighbour work Petr Machata
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Currently the IP address event handlers bail out when the event is related
to a netdevice that is a bridge port or a member of a LAG. In order to
create a RIF when a bridged or LAG'd port is unenslaved, these event
handlers will be replayed. However, at the point in time when the
NETDEV_CHANGEUPPER event is delivered, informing of the loss of
enslavement, the port is still formally enslaved.

In order for the operation to have any effect, these handlers need an extra
parameter to indicate that the check for bridge or LAG membership should
not be done. In this patch, add an argument "nomaster" to several event
handlers.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 41 +++++++++++--------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 109ac2db0d65..6b1b142c95db 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -8884,10 +8884,11 @@ static int mlxsw_sp_inetaddr_port_vlan_event(struct net_device *l3_dev,
 }
 
 static int mlxsw_sp_inetaddr_port_event(struct net_device *port_dev,
-					unsigned long event,
+					unsigned long event, bool nomaster,
 					struct netlink_ext_ack *extack)
 {
-	if (netif_is_any_bridge_port(port_dev) || netif_is_lag_port(port_dev))
+	if (!nomaster && (netif_is_any_bridge_port(port_dev) ||
+			  netif_is_lag_port(port_dev)))
 		return 0;
 
 	return mlxsw_sp_inetaddr_port_vlan_event(port_dev, port_dev, event,
@@ -8918,10 +8919,10 @@ static int __mlxsw_sp_inetaddr_lag_event(struct net_device *l3_dev,
 }
 
 static int mlxsw_sp_inetaddr_lag_event(struct net_device *lag_dev,
-				       unsigned long event,
+				       unsigned long event, bool nomaster,
 				       struct netlink_ext_ack *extack)
 {
-	if (netif_is_bridge_port(lag_dev))
+	if (!nomaster && netif_is_bridge_port(lag_dev))
 		return 0;
 
 	return __mlxsw_sp_inetaddr_lag_event(lag_dev, lag_dev, event,
@@ -8980,7 +8981,7 @@ static int mlxsw_sp_inetaddr_bridge_event(struct mlxsw_sp *mlxsw_sp,
 
 static int mlxsw_sp_inetaddr_vlan_event(struct mlxsw_sp *mlxsw_sp,
 					struct net_device *vlan_dev,
-					unsigned long event,
+					unsigned long event, bool nomaster,
 					struct netlink_ext_ack *extack)
 {
 	struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
@@ -8988,7 +8989,7 @@ static int mlxsw_sp_inetaddr_vlan_event(struct mlxsw_sp *mlxsw_sp,
 	u16 lower_pvid;
 	int err;
 
-	if (netif_is_bridge_port(vlan_dev))
+	if (!nomaster && netif_is_bridge_port(vlan_dev))
 		return 0;
 
 	if (mlxsw_sp_port_dev_check(real_dev)) {
@@ -9132,19 +9133,21 @@ static int mlxsw_sp_inetaddr_macvlan_event(struct mlxsw_sp *mlxsw_sp,
 
 static int __mlxsw_sp_inetaddr_event(struct mlxsw_sp *mlxsw_sp,
 				     struct net_device *dev,
-				     unsigned long event,
+				     unsigned long event, bool nomaster,
 				     struct netlink_ext_ack *extack)
 {
 	if (mlxsw_sp_port_dev_check(dev))
-		return mlxsw_sp_inetaddr_port_event(dev, event, extack);
+		return mlxsw_sp_inetaddr_port_event(dev, event, nomaster,
+						    extack);
 	else if (netif_is_lag_master(dev))
-		return mlxsw_sp_inetaddr_lag_event(dev, event, extack);
+		return mlxsw_sp_inetaddr_lag_event(dev, event, nomaster,
+						   extack);
 	else if (netif_is_bridge_master(dev))
 		return mlxsw_sp_inetaddr_bridge_event(mlxsw_sp, dev, -1, event,
 						      extack);
 	else if (is_vlan_dev(dev))
 		return mlxsw_sp_inetaddr_vlan_event(mlxsw_sp, dev, event,
-						    extack);
+						    nomaster, extack);
 	else if (netif_is_macvlan(dev))
 		return mlxsw_sp_inetaddr_macvlan_event(mlxsw_sp, dev, event,
 						       extack);
@@ -9171,7 +9174,8 @@ static int mlxsw_sp_inetaddr_event(struct notifier_block *nb,
 	if (!mlxsw_sp_rif_should_config(rif, dev, event))
 		goto out;
 
-	err = __mlxsw_sp_inetaddr_event(router->mlxsw_sp, dev, event, NULL);
+	err = __mlxsw_sp_inetaddr_event(router->mlxsw_sp, dev, event, false,
+					NULL);
 out:
 	mutex_unlock(&router->lock);
 	return notifier_from_errno(err);
@@ -9195,7 +9199,8 @@ static int mlxsw_sp_inetaddr_valid_event(struct notifier_block *unused,
 	if (!mlxsw_sp_rif_should_config(rif, dev, event))
 		goto out;
 
-	err = __mlxsw_sp_inetaddr_event(mlxsw_sp, dev, event, ivi->extack);
+	err = __mlxsw_sp_inetaddr_event(mlxsw_sp, dev, event, false,
+					ivi->extack);
 out:
 	mutex_unlock(&mlxsw_sp->router->lock);
 	return notifier_from_errno(err);
@@ -9224,7 +9229,7 @@ static void mlxsw_sp_inet6addr_event_work(struct work_struct *work)
 	if (!mlxsw_sp_rif_should_config(rif, dev, event))
 		goto out;
 
-	__mlxsw_sp_inetaddr_event(mlxsw_sp, dev, event, NULL);
+	__mlxsw_sp_inetaddr_event(mlxsw_sp, dev, event, false, NULL);
 out:
 	mutex_unlock(&mlxsw_sp->router->lock);
 	rtnl_unlock();
@@ -9278,7 +9283,8 @@ static int mlxsw_sp_inet6addr_valid_event(struct notifier_block *unused,
 	if (!mlxsw_sp_rif_should_config(rif, dev, event))
 		goto out;
 
-	err = __mlxsw_sp_inetaddr_event(mlxsw_sp, dev, event, i6vi->extack);
+	err = __mlxsw_sp_inetaddr_event(mlxsw_sp, dev, event, false,
+					i6vi->extack);
 out:
 	mutex_unlock(&mlxsw_sp->router->lock);
 	return notifier_from_errno(err);
@@ -9598,10 +9604,11 @@ static int mlxsw_sp_port_vrf_join(struct mlxsw_sp *mlxsw_sp,
 	 */
 	rif = mlxsw_sp_rif_find_by_dev(mlxsw_sp, l3_dev);
 	if (rif)
-		__mlxsw_sp_inetaddr_event(mlxsw_sp, l3_dev, NETDEV_DOWN,
+		__mlxsw_sp_inetaddr_event(mlxsw_sp, l3_dev, NETDEV_DOWN, false,
 					  extack);
 
-	return __mlxsw_sp_inetaddr_event(mlxsw_sp, l3_dev, NETDEV_UP, extack);
+	return __mlxsw_sp_inetaddr_event(mlxsw_sp, l3_dev, NETDEV_UP, false,
+					 extack);
 }
 
 static void mlxsw_sp_port_vrf_leave(struct mlxsw_sp *mlxsw_sp,
@@ -9612,7 +9619,7 @@ static void mlxsw_sp_port_vrf_leave(struct mlxsw_sp *mlxsw_sp,
 	rif = mlxsw_sp_rif_find_by_dev(mlxsw_sp, l3_dev);
 	if (!rif)
 		return;
-	__mlxsw_sp_inetaddr_event(mlxsw_sp, l3_dev, NETDEV_DOWN, NULL);
+	__mlxsw_sp_inetaddr_event(mlxsw_sp, l3_dev, NETDEV_DOWN, false, NULL);
 }
 
 static bool mlxsw_sp_is_vrf_event(unsigned long event, void *ptr)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 05/17] mlxsw: spectrum_router: Extract a helper to schedule neighbour work
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (3 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 04/17] mlxsw: spectrum_router: Allow address handlers to run on bridge ports Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 06/17] mlxsw: spectrum: Split a helper out of mlxsw_sp_netdevice_event() Petr Machata
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

This will come in handy for neighbour replay.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 25 ++++++++++++-------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 6b1b142c95db..18e059e94079 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -2872,6 +2872,21 @@ static bool mlxsw_sp_dev_lower_is_port(struct net_device *dev)
 	return !!mlxsw_sp_port;
 }
 
+static int mlxsw_sp_router_schedule_neigh_work(struct mlxsw_sp_router *router,
+					       struct neighbour *n)
+{
+	struct net *net;
+
+	net = neigh_parms_net(n->parms);
+
+	/* Take a reference to ensure the neighbour won't be destructed until we
+	 * drop the reference in delayed work.
+	 */
+	neigh_clone(n);
+	return mlxsw_sp_router_schedule_work(net, router, n,
+					     mlxsw_sp_router_neigh_event_work);
+}
+
 static int mlxsw_sp_router_netevent_event(struct notifier_block *nb,
 					  unsigned long event, void *ptr)
 {
@@ -2879,7 +2894,6 @@ static int mlxsw_sp_router_netevent_event(struct notifier_block *nb,
 	unsigned long interval;
 	struct neigh_parms *p;
 	struct neighbour *n;
-	struct net *net;
 
 	router = container_of(nb, struct mlxsw_sp_router, netevent_nb);
 
@@ -2903,7 +2917,6 @@ static int mlxsw_sp_router_netevent_event(struct notifier_block *nb,
 		break;
 	case NETEVENT_NEIGH_UPDATE:
 		n = ptr;
-		net = neigh_parms_net(n->parms);
 
 		if (n->tbl->family != AF_INET && n->tbl->family != AF_INET6)
 			return NOTIFY_DONE;
@@ -2911,13 +2924,7 @@ static int mlxsw_sp_router_netevent_event(struct notifier_block *nb,
 		if (!mlxsw_sp_dev_lower_is_port(n->dev))
 			return NOTIFY_DONE;
 
-		/* Take a reference to ensure the neighbour won't be
-		 * destructed until we drop the reference in delayed
-		 * work.
-		 */
-		neigh_clone(n);
-		return mlxsw_sp_router_schedule_work(net, router, n,
-				mlxsw_sp_router_neigh_event_work);
+		return mlxsw_sp_router_schedule_neigh_work(router, n);
 
 	case NETEVENT_IPV4_MPATH_HASH_UPDATE:
 	case NETEVENT_IPV6_MPATH_HASH_UPDATE:
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 06/17] mlxsw: spectrum: Split a helper out of mlxsw_sp_netdevice_event()
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (4 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 05/17] mlxsw: spectrum_router: Extract a helper to schedule neighbour work Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 07/17] mlxsw: spectrum: Allow event handlers to check unowned bridges Petr Machata
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Move the meat of mlxsw_sp_netdevice_event() to a separate function that
does just the validation. This separate helper will be possible to call
later for recursive ascent when validating attachment of a front panel port
to a bridge with uppers.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 20 ++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 86e2f0ed64d3..0488fe5695bd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -5146,21 +5146,18 @@ static int mlxsw_sp_netdevice_vxlan_event(struct mlxsw_sp *mlxsw_sp,
 	return 0;
 }
 
-static int mlxsw_sp_netdevice_event(struct notifier_block *nb,
-				    unsigned long event, void *ptr)
+static int __mlxsw_sp_netdevice_event(struct mlxsw_sp *mlxsw_sp,
+				      unsigned long event, void *ptr)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct mlxsw_sp_span_entry *span_entry;
-	struct mlxsw_sp *mlxsw_sp;
 	int err = 0;
 
-	mlxsw_sp = container_of(nb, struct mlxsw_sp, netdevice_nb);
 	if (event == NETDEV_UNREGISTER) {
 		span_entry = mlxsw_sp_span_entry_find_by_port(mlxsw_sp, dev);
 		if (span_entry)
 			mlxsw_sp_span_entry_invalidate(mlxsw_sp, span_entry);
 	}
-	mlxsw_sp_span_respin(mlxsw_sp);
 
 	if (netif_is_vxlan(dev))
 		err = mlxsw_sp_netdevice_vxlan_event(mlxsw_sp, dev, event, ptr);
@@ -5175,6 +5172,19 @@ static int mlxsw_sp_netdevice_event(struct notifier_block *nb,
 	else if (netif_is_macvlan(dev))
 		err = mlxsw_sp_netdevice_macvlan_event(dev, event, ptr);
 
+	return err;
+}
+
+static int mlxsw_sp_netdevice_event(struct notifier_block *nb,
+				    unsigned long event, void *ptr)
+{
+	struct mlxsw_sp *mlxsw_sp;
+	int err;
+
+	mlxsw_sp = container_of(nb, struct mlxsw_sp, netdevice_nb);
+	mlxsw_sp_span_respin(mlxsw_sp);
+	err = __mlxsw_sp_netdevice_event(mlxsw_sp, event, ptr);
+
 	return notifier_from_errno(err);
 }
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 07/17] mlxsw: spectrum: Allow event handlers to check unowned bridges
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (5 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 06/17] mlxsw: spectrum: Split a helper out of mlxsw_sp_netdevice_event() Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 08/17] mlxsw: spectrum: Add a replay_deslavement argument to event handlers Petr Machata
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Currently the bridge-related handlers bail out when the event is related to
a netdevice that is not an upper of one of the front-panel ports. In order
to allow enslavement of front-panel ports to bridges that already have
uppers, it will be necessary to replay CHANGEUPPER events to validate that
the configuration is offloadable. In order for the replay to be effective,
it must be possible to ignore unsupported configuration in the context of
an actual notifier event, but to still "veto" these configurations when the
validation is performed.

To that end, introduce two parameters to a number of handlers: mlxsw_sp,
because it will not be possible to deduce that from the netdevice lowers;
and process_foreign to indicate whether netdevices that are not front panel
uppers should be validated.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 40 +++++++++++--------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 0488fe5695bd..3d1a045ff470 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4936,17 +4936,17 @@ static int mlxsw_sp_netdevice_lag_port_vlan_event(struct net_device *vlan_dev,
 	return 0;
 }
 
-static int mlxsw_sp_netdevice_bridge_vlan_event(struct net_device *vlan_dev,
+static int mlxsw_sp_netdevice_bridge_vlan_event(struct mlxsw_sp *mlxsw_sp,
+						struct net_device *vlan_dev,
 						struct net_device *br_dev,
 						unsigned long event, void *ptr,
-						u16 vid)
+						u16 vid, bool process_foreign)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_lower_get(vlan_dev);
 	struct netdev_notifier_changeupper_info *info = ptr;
 	struct netlink_ext_ack *extack;
 	struct net_device *upper_dev;
 
-	if (!mlxsw_sp)
+	if (!process_foreign && !mlxsw_sp_lower_get(vlan_dev))
 		return 0;
 
 	extack = netdev_notifier_info_to_extack(&info->info);
@@ -4979,8 +4979,10 @@ static int mlxsw_sp_netdevice_bridge_vlan_event(struct net_device *vlan_dev,
 	return 0;
 }
 
-static int mlxsw_sp_netdevice_vlan_event(struct net_device *vlan_dev,
-					 unsigned long event, void *ptr)
+static int mlxsw_sp_netdevice_vlan_event(struct mlxsw_sp *mlxsw_sp,
+					 struct net_device *vlan_dev,
+					 unsigned long event, void *ptr,
+					 bool process_foreign)
 {
 	struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
 	u16 vid = vlan_dev_vlan_id(vlan_dev);
@@ -4993,22 +4995,25 @@ static int mlxsw_sp_netdevice_vlan_event(struct net_device *vlan_dev,
 							      real_dev, event,
 							      ptr, vid);
 	else if (netif_is_bridge_master(real_dev))
-		return mlxsw_sp_netdevice_bridge_vlan_event(vlan_dev, real_dev,
-							    event, ptr, vid);
+		return mlxsw_sp_netdevice_bridge_vlan_event(mlxsw_sp, vlan_dev,
+							    real_dev, event,
+							    ptr, vid,
+							    process_foreign);
 
 	return 0;
 }
 
-static int mlxsw_sp_netdevice_bridge_event(struct net_device *br_dev,
-					   unsigned long event, void *ptr)
+static int mlxsw_sp_netdevice_bridge_event(struct mlxsw_sp *mlxsw_sp,
+					   struct net_device *br_dev,
+					   unsigned long event, void *ptr,
+					   bool process_foreign)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_lower_get(br_dev);
 	struct netdev_notifier_changeupper_info *info = ptr;
 	struct netlink_ext_ack *extack;
 	struct net_device *upper_dev;
 	u16 proto;
 
-	if (!mlxsw_sp)
+	if (!process_foreign && !mlxsw_sp_lower_get(br_dev))
 		return 0;
 
 	extack = netdev_notifier_info_to_extack(&info->info);
@@ -5147,7 +5152,8 @@ static int mlxsw_sp_netdevice_vxlan_event(struct mlxsw_sp *mlxsw_sp,
 }
 
 static int __mlxsw_sp_netdevice_event(struct mlxsw_sp *mlxsw_sp,
-				      unsigned long event, void *ptr)
+				      unsigned long event, void *ptr,
+				      bool process_foreign)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct mlxsw_sp_span_entry *span_entry;
@@ -5166,9 +5172,11 @@ static int __mlxsw_sp_netdevice_event(struct mlxsw_sp *mlxsw_sp,
 	else if (netif_is_lag_master(dev))
 		err = mlxsw_sp_netdevice_lag_event(dev, event, ptr);
 	else if (is_vlan_dev(dev))
-		err = mlxsw_sp_netdevice_vlan_event(dev, event, ptr);
+		err = mlxsw_sp_netdevice_vlan_event(mlxsw_sp, dev, event, ptr,
+						    process_foreign);
 	else if (netif_is_bridge_master(dev))
-		err = mlxsw_sp_netdevice_bridge_event(dev, event, ptr);
+		err = mlxsw_sp_netdevice_bridge_event(mlxsw_sp, dev, event, ptr,
+						      process_foreign);
 	else if (netif_is_macvlan(dev))
 		err = mlxsw_sp_netdevice_macvlan_event(dev, event, ptr);
 
@@ -5183,7 +5191,7 @@ static int mlxsw_sp_netdevice_event(struct notifier_block *nb,
 
 	mlxsw_sp = container_of(nb, struct mlxsw_sp, netdevice_nb);
 	mlxsw_sp_span_respin(mlxsw_sp);
-	err = __mlxsw_sp_netdevice_event(mlxsw_sp, event, ptr);
+	err = __mlxsw_sp_netdevice_event(mlxsw_sp, event, ptr, false);
 
 	return notifier_from_errno(err);
 }
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 08/17] mlxsw: spectrum: Add a replay_deslavement argument to event handlers
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (6 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 07/17] mlxsw: spectrum: Allow event handlers to check unowned bridges Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 09/17] mlxsw: spectrum: On port enslavement to a LAG, join upper's bridges Petr Machata
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

When handling deslavement of LAG or its upper from a bridge device, when
the deslaved netdevice has an IP address, it should join the router. This
should be done after all the lowers of the LAG have left the bridge. The
replay intended to cause the device to join the router therefore cannot be
invoked unconditionally in the event handlers themselves. It can be done
right away if the handler is invoked for a sole device, but when it is
invoked repeated for each LAG lower, the replay needs to be postponed
until after this processing is done.

To that end, add a boolean parameter, replay_deslavement, to
mlxsw_sp_netdevice_port_upper_event(), mlxsw_sp_netdevice_port_vlan_event()
and one helper on the call path. Have the invocations that are done for
sole netdevices pass true, and those done for LAG lowers pass false.

Nothing depends on this flag at this point, but it removes some noise from
the patch that introduces the replay itself.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 20 +++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 3d1a045ff470..b9ceffe258bf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4641,7 +4641,8 @@ static bool mlxsw_sp_bridge_vxlan_is_valid(struct net_device *br_dev,
 
 static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 					       struct net_device *dev,
-					       unsigned long event, void *ptr)
+					       unsigned long event, void *ptr,
+					       bool replay_deslavement)
 {
 	struct netdev_notifier_changeupper_info *info;
 	struct mlxsw_sp_port *mlxsw_sp_port;
@@ -4815,13 +4816,15 @@ static int mlxsw_sp_netdevice_port_lower_event(struct net_device *dev,
 
 static int mlxsw_sp_netdevice_port_event(struct net_device *lower_dev,
 					 struct net_device *port_dev,
-					 unsigned long event, void *ptr)
+					 unsigned long event, void *ptr,
+					 bool replay_deslavement)
 {
 	switch (event) {
 	case NETDEV_PRECHANGEUPPER:
 	case NETDEV_CHANGEUPPER:
 		return mlxsw_sp_netdevice_port_upper_event(lower_dev, port_dev,
-							   event, ptr);
+							   event, ptr,
+							   replay_deslavement);
 	case NETDEV_CHANGELOWERSTATE:
 		return mlxsw_sp_netdevice_port_lower_event(port_dev, event,
 							   ptr);
@@ -4840,7 +4843,7 @@ static int mlxsw_sp_netdevice_lag_event(struct net_device *lag_dev,
 	netdev_for_each_lower_dev(lag_dev, dev, iter) {
 		if (mlxsw_sp_port_dev_check(dev)) {
 			ret = mlxsw_sp_netdevice_port_event(lag_dev, dev, event,
-							    ptr);
+							    ptr, false);
 			if (ret)
 				return ret;
 		}
@@ -4852,7 +4855,7 @@ static int mlxsw_sp_netdevice_lag_event(struct net_device *lag_dev,
 static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 					      struct net_device *dev,
 					      unsigned long event, void *ptr,
-					      u16 vid)
+					      u16 vid, bool replay_deslavement)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
 	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
@@ -4927,7 +4930,7 @@ static int mlxsw_sp_netdevice_lag_port_vlan_event(struct net_device *vlan_dev,
 		if (mlxsw_sp_port_dev_check(dev)) {
 			ret = mlxsw_sp_netdevice_port_vlan_event(vlan_dev, dev,
 								 event, ptr,
-								 vid);
+								 vid, false);
 			if (ret)
 				return ret;
 		}
@@ -4989,7 +4992,8 @@ static int mlxsw_sp_netdevice_vlan_event(struct mlxsw_sp *mlxsw_sp,
 
 	if (mlxsw_sp_port_dev_check(real_dev))
 		return mlxsw_sp_netdevice_port_vlan_event(vlan_dev, real_dev,
-							  event, ptr, vid);
+							  event, ptr, vid,
+							  true);
 	else if (netif_is_lag_master(real_dev))
 		return mlxsw_sp_netdevice_lag_port_vlan_event(vlan_dev,
 							      real_dev, event,
@@ -5168,7 +5172,7 @@ static int __mlxsw_sp_netdevice_event(struct mlxsw_sp *mlxsw_sp,
 	if (netif_is_vxlan(dev))
 		err = mlxsw_sp_netdevice_vxlan_event(mlxsw_sp, dev, event, ptr);
 	else if (mlxsw_sp_port_dev_check(dev))
-		err = mlxsw_sp_netdevice_port_event(dev, dev, event, ptr);
+		err = mlxsw_sp_netdevice_port_event(dev, dev, event, ptr, true);
 	else if (netif_is_lag_master(dev))
 		err = mlxsw_sp_netdevice_lag_event(dev, event, ptr);
 	else if (is_vlan_dev(dev))
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 09/17] mlxsw: spectrum: On port enslavement to a LAG, join upper's bridges
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (7 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 08/17] mlxsw: spectrum: Add a replay_deslavement argument to event handlers Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 10/17] mlxsw: spectrum_switchdev: Replay switchdev objects on port join Petr Machata
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Currently it never happens that a netdevice that is already a bridge slave
would suddenly become mlxsw upper. The only case where this might be
possible as far as mlxsw is concerned, is with LAG netdevices. But if a LAG
already has an upper, enslaving mlxsw port to that LAG is forbidden. Thus
the only way to install a LAG between a bridge and a mlxsw port is by first
enslaving the port to the LAG, and then enslaving that LAG to a bridge.

However in the following patches, the requirement that ports be only
enslaved to masters without uppers, is going to be relaxed. It will
therefore be necessary to join bridges of LAG uppers. Without this replay,
the mlxsw bridge_port objects are not instantiated, which causes issues
later, as a lot of code relies on their presence.

Therefore in this patch, when the first mlxsw physical netdevice is
enslaved to a LAG, consider bridges upper to the LAG (both the direct
master, if any, and any bridge masters of VLAN uppers), and have the
relevant netdevices join their bridges.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 90 +++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index b9ceffe258bf..9a6e1ce4e786 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4337,6 +4337,88 @@ static int mlxsw_sp_port_lag_index_get(struct mlxsw_sp *mlxsw_sp,
 	return -EBUSY;
 }
 
+static int mlxsw_sp_lag_uppers_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
+					   struct net_device *lag_dev,
+					   struct netlink_ext_ack *extack)
+{
+	struct net_device *upper_dev;
+	struct net_device *master;
+	struct list_head *iter;
+	int done = 0;
+	int err;
+
+	master = netdev_master_upper_dev_get(lag_dev);
+	if (master && netif_is_bridge_master(master)) {
+		err = mlxsw_sp_port_bridge_join(mlxsw_sp_port, lag_dev, master,
+						extack);
+		if (err)
+			return err;
+	}
+
+	netdev_for_each_upper_dev_rcu(lag_dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+
+		master = netdev_master_upper_dev_get(upper_dev);
+		if (master && netif_is_bridge_master(master)) {
+			err = mlxsw_sp_port_bridge_join(mlxsw_sp_port,
+							upper_dev, master,
+							extack);
+			if (err)
+				goto err_port_bridge_join;
+		}
+
+		++done;
+	}
+
+	return 0;
+
+err_port_bridge_join:
+	netdev_for_each_upper_dev_rcu(lag_dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+
+		master = netdev_master_upper_dev_get(upper_dev);
+		if (!master || !netif_is_bridge_master(master))
+			continue;
+
+		if (!done--)
+			break;
+
+		mlxsw_sp_port_bridge_leave(mlxsw_sp_port, upper_dev, master);
+	}
+
+	master = netdev_master_upper_dev_get(lag_dev);
+	if (master && netif_is_bridge_master(master))
+		mlxsw_sp_port_bridge_leave(mlxsw_sp_port, lag_dev, master);
+
+	return err;
+}
+
+static void
+mlxsw_sp_lag_uppers_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
+				 struct net_device *lag_dev)
+{
+	struct net_device *upper_dev;
+	struct net_device *master;
+	struct list_head *iter;
+
+	netdev_for_each_upper_dev_rcu(lag_dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+
+		master = netdev_master_upper_dev_get(upper_dev);
+		if (!master)
+			continue;
+
+		mlxsw_sp_port_bridge_leave(mlxsw_sp_port, upper_dev, master);
+	}
+
+	master = netdev_master_upper_dev_get(lag_dev);
+	if (master)
+		mlxsw_sp_port_bridge_leave(mlxsw_sp_port, lag_dev, master);
+}
+
 static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
 				  struct net_device *lag_dev,
 				  struct netlink_ext_ack *extack)
@@ -4361,6 +4443,12 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
 	err = mlxsw_sp_port_lag_index_get(mlxsw_sp, lag_id, &port_index);
 	if (err)
 		return err;
+
+	err = mlxsw_sp_lag_uppers_bridge_join(mlxsw_sp_port, lag_dev,
+					      extack);
+	if (err)
+		goto err_lag_uppers_bridge_join;
+
 	err = mlxsw_sp_lag_col_port_add(mlxsw_sp_port, lag_id, port_index);
 	if (err)
 		goto err_col_port_add;
@@ -4390,6 +4478,8 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
 				     mlxsw_sp_port->local_port);
 	mlxsw_sp_lag_col_port_remove(mlxsw_sp_port, lag_id);
 err_col_port_add:
+	mlxsw_sp_lag_uppers_bridge_leave(mlxsw_sp_port, lag_dev);
+err_lag_uppers_bridge_join:
 	if (!lag->ref_count)
 		mlxsw_sp_lag_destroy(mlxsw_sp, lag_id);
 	return err;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 10/17] mlxsw: spectrum_switchdev: Replay switchdev objects on port join
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (8 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 09/17] mlxsw: spectrum: On port enslavement to a LAG, join upper's bridges Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 11/17] mlxsw: spectrum_router: Join RIFs of LAG upper VLANs Petr Machata
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Currently it never happens that a netdevice that is already a bridge slave
would suddenly become mlxsw upper. The only case where this might be
possible as far as mlxsw is concerned, is with LAG netdevices. But if a LAG
has any upper (e.g. is enslaved), enlaving mlxsw port to that LAG is
forbidden. Thus the only way to install a LAG between a bridge and a mlxsw
port is by first enslaving the port to the LAG, and then enslaving that LAG
to a bridge. At that point there are no bridge objects (such as port VLANs)
to replay. Those are added afterwards, and notified as they are created.
This holds even for the PVID.

However in the following patches, the requirement that ports be only
enslaved to masters without uppers, is going to be relaxed. It will
therefore be necessary to replay the existing bridge objects. Without this
replay, e.g. the mlxsw bridge_port_vlan objects are not instantiated, which
causes issues later, as a lot of code relies on their presence.

To that end, add a new notifier block whose sole role is to filter out
events related to the one relevant upper, and forward those to the existing
switchdev notifier block. Pass the new notifier block to
switchdev_bridge_port_offload() when the bridge port is created.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    |   4 +-
 .../net/ethernet/mellanox/mlxsw/spectrum.h    |   2 +
 .../mellanox/mlxsw/spectrum_switchdev.c       | 131 +++++++++++++++++-
 3 files changed, 132 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 9a6e1ce4e786..8087da00f38f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1132,8 +1132,8 @@ static int mlxsw_sp_port_add_vid(struct net_device *dev,
 	return PTR_ERR_OR_ZERO(mlxsw_sp_port_vlan_create(mlxsw_sp_port, vid));
 }
 
-static int mlxsw_sp_port_kill_vid(struct net_device *dev,
-				  __be16 __always_unused proto, u16 vid)
+int mlxsw_sp_port_kill_vid(struct net_device *dev,
+			   __be16 __always_unused proto, u16 vid)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
 	struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index c6231e62a371..65eaa181e0aa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -700,6 +700,8 @@ int mlxsw_sp_port_pvid_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid,
 struct mlxsw_sp_port_vlan *
 mlxsw_sp_port_vlan_create(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid);
 void mlxsw_sp_port_vlan_destroy(struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan);
+int mlxsw_sp_port_kill_vid(struct net_device *dev,
+			   __be16 __always_unused proto, u16 vid);
 int mlxsw_sp_port_vlan_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid_begin,
 			   u16 vid_end, bool is_member, bool untagged);
 int mlxsw_sp_flow_counter_get(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 79d45c6c6edf..982eae6bd63e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -384,6 +384,91 @@ mlxsw_sp_bridge_port_find(struct mlxsw_sp_bridge *bridge,
 	return __mlxsw_sp_bridge_port_find(bridge_device, brport_dev);
 }
 
+static int mlxsw_sp_port_obj_add(struct net_device *dev, const void *ctx,
+				 const struct switchdev_obj *obj,
+				 struct netlink_ext_ack *extack);
+static int mlxsw_sp_port_obj_del(struct net_device *dev, const void *ctx,
+				 const struct switchdev_obj *obj);
+
+struct mlxsw_sp_bridge_port_replay_switchdev_objs {
+	struct net_device *brport_dev;
+	struct mlxsw_sp_port *mlxsw_sp_port;
+	int done;
+};
+
+static int
+mlxsw_sp_bridge_port_replay_switchdev_objs(struct notifier_block *nb,
+					   unsigned long event, void *ptr)
+{
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	struct switchdev_notifier_port_obj_info *port_obj_info = ptr;
+	struct netlink_ext_ack *extack = port_obj_info->info.extack;
+	struct mlxsw_sp_bridge_port_replay_switchdev_objs *rso;
+	int err = 0;
+
+	rso = (void *)port_obj_info->info.ctx;
+
+	if (event != SWITCHDEV_PORT_OBJ_ADD ||
+	    dev != rso->brport_dev)
+		goto out;
+
+	/* When a port is joining the bridge through a LAG, there likely are
+	 * VLANs configured on that LAG already. The replay will thus attempt to
+	 * have the given port-vlans join the corresponding FIDs. But the LAG
+	 * netdevice has already called the ndo_vlan_rx_add_vid NDO for its VLAN
+	 * memberships, back before CHANGEUPPER was distributed and netdevice
+	 * master set. So now before propagating the VLAN events further, we
+	 * first need to kill the corresponding VID at the mlxsw_sp_port.
+	 *
+	 * Note that this doesn't need to be rolled back on failure -- if the
+	 * replay fails, the enslavement is off, and the VIDs would be killed by
+	 * LAG anyway as part of its rollback.
+	 */
+	if (port_obj_info->obj->id == SWITCHDEV_OBJ_ID_PORT_VLAN) {
+		u16 vid = SWITCHDEV_OBJ_PORT_VLAN(port_obj_info->obj)->vid;
+
+		err = mlxsw_sp_port_kill_vid(rso->mlxsw_sp_port->dev, 0, vid);
+		if (err)
+			goto out;
+	}
+
+	++rso->done;
+	err = mlxsw_sp_port_obj_add(rso->mlxsw_sp_port->dev, NULL,
+				    port_obj_info->obj, extack);
+
+out:
+	return notifier_from_errno(err);
+}
+
+static struct notifier_block mlxsw_sp_bridge_port_replay_switchdev_objs_nb = {
+	.notifier_call = mlxsw_sp_bridge_port_replay_switchdev_objs,
+};
+
+static int
+mlxsw_sp_bridge_port_unreplay_switchdev_objs(struct notifier_block *nb,
+					     unsigned long event, void *ptr)
+{
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	struct switchdev_notifier_port_obj_info *port_obj_info = ptr;
+	struct mlxsw_sp_bridge_port_replay_switchdev_objs *rso;
+
+	rso = (void *)port_obj_info->info.ctx;
+
+	if (event != SWITCHDEV_PORT_OBJ_ADD ||
+	    dev != rso->brport_dev)
+		return NOTIFY_DONE;
+	if (!rso->done--)
+		return NOTIFY_STOP;
+
+	mlxsw_sp_port_obj_del(rso->mlxsw_sp_port->dev, NULL,
+			      port_obj_info->obj);
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block mlxsw_sp_bridge_port_unreplay_switchdev_objs_nb = {
+	.notifier_call = mlxsw_sp_bridge_port_unreplay_switchdev_objs,
+};
+
 static struct mlxsw_sp_bridge_port *
 mlxsw_sp_bridge_port_create(struct mlxsw_sp_bridge_device *bridge_device,
 			    struct net_device *brport_dev,
@@ -2350,6 +2435,33 @@ static struct mlxsw_sp_port *mlxsw_sp_lag_rep_port(struct mlxsw_sp *mlxsw_sp,
 	return NULL;
 }
 
+static int
+mlxsw_sp_bridge_port_replay(struct mlxsw_sp_bridge_port *bridge_port,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    struct netlink_ext_ack *extack)
+{
+	struct mlxsw_sp_bridge_port_replay_switchdev_objs rso = {
+		.brport_dev = bridge_port->dev,
+		.mlxsw_sp_port = mlxsw_sp_port,
+	};
+	struct notifier_block *nb;
+	int err;
+
+	nb = &mlxsw_sp_bridge_port_replay_switchdev_objs_nb;
+	err = switchdev_bridge_port_replay(bridge_port->dev, mlxsw_sp_port->dev,
+					   &rso, NULL, nb, extack);
+	if (err)
+		goto err_replay;
+
+	return 0;
+
+err_replay:
+	nb = &mlxsw_sp_bridge_port_unreplay_switchdev_objs_nb;
+	switchdev_bridge_port_replay(bridge_port->dev, mlxsw_sp_port->dev,
+				     &rso, NULL, nb, extack);
+	return err;
+}
+
 static int
 mlxsw_sp_bridge_vlan_aware_port_join(struct mlxsw_sp_bridge_port *bridge_port,
 				     struct mlxsw_sp_port *mlxsw_sp_port,
@@ -2364,7 +2476,7 @@ mlxsw_sp_bridge_vlan_aware_port_join(struct mlxsw_sp_bridge_port *bridge_port,
 	if (mlxsw_sp_port->default_vlan->fid)
 		mlxsw_sp_port_vlan_router_leave(mlxsw_sp_port->default_vlan);
 
-	return 0;
+	return mlxsw_sp_bridge_port_replay(bridge_port, mlxsw_sp_port, extack);
 }
 
 static int
@@ -2536,6 +2648,7 @@ mlxsw_sp_bridge_8021d_port_join(struct mlxsw_sp_bridge_device *bridge_device,
 	struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan;
 	struct net_device *dev = bridge_port->dev;
 	u16 vid;
+	int err;
 
 	vid = is_vlan_dev(dev) ? vlan_dev_vlan_id(dev) : MLXSW_SP_DEFAULT_VID;
 	mlxsw_sp_port_vlan = mlxsw_sp_port_vlan_find_by_vid(mlxsw_sp_port, vid);
@@ -2551,8 +2664,20 @@ mlxsw_sp_bridge_8021d_port_join(struct mlxsw_sp_bridge_device *bridge_device,
 	if (mlxsw_sp_port_vlan->fid)
 		mlxsw_sp_port_vlan_router_leave(mlxsw_sp_port_vlan);
 
-	return mlxsw_sp_port_vlan_bridge_join(mlxsw_sp_port_vlan, bridge_port,
-					      extack);
+	err = mlxsw_sp_port_vlan_bridge_join(mlxsw_sp_port_vlan, bridge_port,
+					     extack);
+	if (err)
+		return err;
+
+	err = mlxsw_sp_bridge_port_replay(bridge_port, mlxsw_sp_port, extack);
+	if (err)
+		goto err_replay;
+
+	return 0;
+
+err_replay:
+	mlxsw_sp_port_vlan_bridge_leave(mlxsw_sp_port_vlan);
+	return err;
 }
 
 static void
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 11/17] mlxsw: spectrum_router: Join RIFs of LAG upper VLANs
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (9 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 10/17] mlxsw: spectrum_switchdev: Replay switchdev objects on port join Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 12/17] mlxsw: spectrum_router: Offload ethernet nexthops when RIF is made Petr Machata
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

In the following patches, the requirement that ports be only enslaved to
masters without uppers, is going to be relaxed. It will therefore be
necessary to join not only RIF for the immediate LAG, as is currently the
case, but also RIFs for VLAN netdevices upper to the LAG.

In this patch, extend mlxsw_sp_netdevice_router_join_lag() to walk the
uppers of a LAG being joined, and also join any VLAN ones.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 55 ++++++++++++++++++-
 1 file changed, 52 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 18e059e94079..ae2d5e760f1b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -9685,15 +9685,64 @@ mlxsw_sp_port_vid_router_join_existing(struct mlxsw_sp_port *mlxsw_sp_port,
 						       dev, extack);
 }
 
+static void
+mlxsw_sp_port_vid_router_leave(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid,
+			       struct net_device *dev)
+{
+	struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan;
+
+	mlxsw_sp_port_vlan = mlxsw_sp_port_vlan_find_by_vid(mlxsw_sp_port,
+							    vid);
+	if (WARN_ON(!mlxsw_sp_port_vlan))
+		return;
+
+	__mlxsw_sp_port_vlan_router_leave(mlxsw_sp_port_vlan);
+}
+
 static int __mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
 					   struct net_device *lag_dev,
 					   struct netlink_ext_ack *extack)
 {
 	u16 default_vid = MLXSW_SP_DEFAULT_VID;
+	struct net_device *upper_dev;
+	struct list_head *iter;
+	int done = 0;
+	u16 vid;
+	int err;
 
-	return mlxsw_sp_port_vid_router_join_existing(mlxsw_sp_port,
-						      default_vid, lag_dev,
-						      extack);
+	err = mlxsw_sp_port_vid_router_join_existing(mlxsw_sp_port, default_vid,
+						     lag_dev, extack);
+	if (err)
+		return err;
+
+	netdev_for_each_upper_dev_rcu(lag_dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+
+		vid = vlan_dev_vlan_id(upper_dev);
+		err = mlxsw_sp_port_vid_router_join_existing(mlxsw_sp_port, vid,
+							     upper_dev, extack);
+		if (err)
+			goto err_router_join_dev;
+
+		++done;
+	}
+
+	return 0;
+
+err_router_join_dev:
+	netdev_for_each_upper_dev_rcu(lag_dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+		if (!done--)
+			break;
+
+		vid = vlan_dev_vlan_id(upper_dev);
+		mlxsw_sp_port_vid_router_leave(mlxsw_sp_port, vid, upper_dev);
+	}
+
+	mlxsw_sp_port_vid_router_leave(mlxsw_sp_port, default_vid, lag_dev);
+	return err;
 }
 
 int mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 12/17] mlxsw: spectrum_router: Offload ethernet nexthops when RIF is made
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (10 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 11/17] mlxsw: spectrum_router: Join RIFs of LAG upper VLANs Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 13/17] mlxsw: spectrum_router: Replay MACVLANs " Petr Machata
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

As RIF is created, refresh each netxhop group tracked at the CRIF for which
the RIF was created.

Note that nothing needs to be done for IPIP nexthops. The RIF for these is
either available from the get-go, or will never be available, so no after
the fact offloading needs to be done.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 54 +++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index ae2d5e760f1b..fe1855cc2c76 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -4404,6 +4404,19 @@ static int mlxsw_sp_nexthop_type_init(struct mlxsw_sp *mlxsw_sp,
 	return err;
 }
 
+static int mlxsw_sp_nexthop_type_rif_made(struct mlxsw_sp *mlxsw_sp,
+					  struct mlxsw_sp_nexthop *nh)
+{
+	switch (nh->type) {
+	case MLXSW_SP_NEXTHOP_TYPE_ETH:
+		return mlxsw_sp_nexthop_neigh_init(mlxsw_sp, nh);
+	case MLXSW_SP_NEXTHOP_TYPE_IPIP:
+		break;
+	}
+
+	return 0;
+}
+
 static void mlxsw_sp_nexthop_type_rif_gone(struct mlxsw_sp *mlxsw_sp,
 					   struct mlxsw_sp_nexthop *nh)
 {
@@ -4532,6 +4545,35 @@ static void mlxsw_sp_nexthop_rif_update(struct mlxsw_sp *mlxsw_sp,
 	}
 }
 
+static int mlxsw_sp_nexthop_rif_made_sync(struct mlxsw_sp *mlxsw_sp,
+					  struct mlxsw_sp_rif *rif)
+{
+	struct mlxsw_sp_nexthop *nh, *tmp;
+	unsigned int n = 0;
+	int err;
+
+	list_for_each_entry_safe(nh, tmp, &rif->crif->nexthop_list,
+				 crif_list_node) {
+		err = mlxsw_sp_nexthop_type_rif_made(mlxsw_sp, nh);
+		if (err)
+			goto err_nexthop_type_rif;
+		mlxsw_sp_nexthop_group_refresh(mlxsw_sp, nh->nhgi->nh_grp);
+		n++;
+	}
+
+	return 0;
+
+err_nexthop_type_rif:
+	list_for_each_entry_safe(nh, tmp, &rif->crif->nexthop_list,
+				 crif_list_node) {
+		if (!n--)
+			break;
+		mlxsw_sp_nexthop_type_rif_gone(mlxsw_sp, nh);
+		mlxsw_sp_nexthop_group_refresh(mlxsw_sp, nh->nhgi->nh_grp);
+	}
+	return err;
+}
+
 static void mlxsw_sp_nexthop_rif_gone_sync(struct mlxsw_sp *mlxsw_sp,
 					   struct mlxsw_sp_rif *rif)
 {
@@ -7892,6 +7934,12 @@ static int mlxsw_sp_router_rif_disable(struct mlxsw_sp *mlxsw_sp, u16 rif)
 	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ritr), ritr_pl);
 }
 
+static int mlxsw_sp_router_rif_made_sync(struct mlxsw_sp *mlxsw_sp,
+					 struct mlxsw_sp_rif *rif)
+{
+	return mlxsw_sp_nexthop_rif_made_sync(mlxsw_sp, rif);
+}
+
 static void mlxsw_sp_router_rif_gone_sync(struct mlxsw_sp *mlxsw_sp,
 					  struct mlxsw_sp_rif *rif)
 {
@@ -8329,6 +8377,10 @@ mlxsw_sp_rif_create(struct mlxsw_sp *mlxsw_sp,
 			goto err_mr_rif_add;
 	}
 
+	err = mlxsw_sp_router_rif_made_sync(mlxsw_sp, rif);
+	if (err)
+		goto err_rif_made_sync;
+
 	if (netdev_offload_xstats_enabled(params->dev,
 					  NETDEV_OFFLOAD_XSTATS_TYPE_L3)) {
 		err = mlxsw_sp_router_port_l3_stats_enable(rif);
@@ -8343,6 +8395,8 @@ mlxsw_sp_rif_create(struct mlxsw_sp *mlxsw_sp,
 	return rif;
 
 err_stats_enable:
+	mlxsw_sp_router_rif_gone_sync(mlxsw_sp, rif);
+err_rif_made_sync:
 err_mr_rif_add:
 	for (i--; i >= 0; i--)
 		mlxsw_sp_mr_rif_del(vr->mr_table[i], rif);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 13/17] mlxsw: spectrum_router: Replay MACVLANs when RIF is made
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (11 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 12/17] mlxsw: spectrum_router: Offload ethernet nexthops when RIF is made Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 14/17] mlxsw: spectrum_router: Replay neighbours " Petr Machata
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

If IP address is added to a MACVLAN netdevice, the effect is of configuring
VRRP on the RIF for the netdevice linked to the MACVLAN. Because the
MACVLAN offload is tied to existence of a RIF at the linked netdevice,
adding a MACVLAN is currently not allowed until a RIF is present.

If this requirement stays, it will never be possible to attach a first port
into a topology that involves a MACVLAN. Thus topologies would need to be
built in a certain order, which is impractical.

Additionally, IP address removal, which leads to disappearance of the RIF
that the MACVLAN depends on, cannot be vetoed. Thus even as things stand
now it is possible to get to a state where a MACVLAN netdevice exists
without a RIF, despite having mlxsw lowers. And once the MACVLAN is
un-offloaded due to RIF getting destroyed, recreating the RIF does not
bring it back.

In this patch, accept that MACVLAN can be created out of order and support
that use case.

One option would seem to be to simply recognize MACVLAN netdevices as
"interesting", and let the existing replay mechanisms take care of the
offload. However, that does not address the necessity to reoffload MACVLAN
once a RIF is created.

Thus add a new replay hook, symmetrical to mlxsw_sp_rif_macvlan_flush(),
called mlxsw_sp_rif_macvlan_replay(), which instead of unwinding the
existing offloads, applies the configuration as if the netdevice were
created just now.

Additionally, remove all vetoes and warning messages that checked for
presence of a RIF at the linked device.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 22 -------
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 59 +++++++++++++++++--
 2 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 8087da00f38f..4346cc736579 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4786,11 +4786,6 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 			NL_SET_ERR_MSG_MOD(extack, "Can not put a VLAN on a LAG port");
 			return -EINVAL;
 		}
-		if (netif_is_macvlan(upper_dev) &&
-		    !mlxsw_sp_rif_exists(mlxsw_sp, lower_dev)) {
-			NL_SET_ERR_MSG_MOD(extack, "macvlan is only supported on top of router interfaces");
-			return -EOPNOTSUPP;
-		}
 		if (netif_is_ovs_master(upper_dev) && vlan_uses_dev(dev)) {
 			NL_SET_ERR_MSG_MOD(extack, "Master device is an OVS master and this device has a VLAN");
 			return -EINVAL;
@@ -4979,11 +4974,6 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 			NL_SET_ERR_MSG_MOD(extack, "Enslaving a port to a device that already has an upper device is not supported");
 			return -EINVAL;
 		}
-		if (netif_is_macvlan(upper_dev) &&
-		    !mlxsw_sp_rif_exists(mlxsw_sp, vlan_dev)) {
-			NL_SET_ERR_MSG_MOD(extack, "macvlan is only supported on top of router interfaces");
-			return -EOPNOTSUPP;
-		}
 		break;
 	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
@@ -5052,13 +5042,6 @@ static int mlxsw_sp_netdevice_bridge_vlan_event(struct mlxsw_sp *mlxsw_sp,
 			NL_SET_ERR_MSG_MOD(extack, "Unknown upper device type");
 			return -EOPNOTSUPP;
 		}
-		if (!info->linking)
-			break;
-		if (netif_is_macvlan(upper_dev) &&
-		    !mlxsw_sp_rif_exists(mlxsw_sp, vlan_dev)) {
-			NL_SET_ERR_MSG_MOD(extack, "macvlan is only supported on top of router interfaces");
-			return -EOPNOTSUPP;
-		}
 		break;
 	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
@@ -5135,11 +5118,6 @@ static int mlxsw_sp_netdevice_bridge_event(struct mlxsw_sp *mlxsw_sp,
 			NL_SET_ERR_MSG_MOD(extack, "VLAN uppers are only supported with 802.1q VLAN protocol");
 			return -EOPNOTSUPP;
 		}
-		if (netif_is_macvlan(upper_dev) &&
-		    !mlxsw_sp_rif_exists(mlxsw_sp, br_dev)) {
-			NL_SET_ERR_MSG_MOD(extack, "macvlan is only supported on top of router interfaces");
-			return -EOPNOTSUPP;
-		}
 		break;
 	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index fe1855cc2c76..2322429cdb72 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -9121,10 +9121,8 @@ static int mlxsw_sp_rif_macvlan_add(struct mlxsw_sp *mlxsw_sp,
 	int err;
 
 	rif = mlxsw_sp_rif_find_by_dev(mlxsw_sp, vlan->lowerdev);
-	if (!rif) {
-		NL_SET_ERR_MSG_MOD(extack, "macvlan is only supported on top of router interfaces");
-		return -EOPNOTSUPP;
-	}
+	if (!rif)
+		return 0;
 
 	err = mlxsw_sp_rif_fdb_op(mlxsw_sp, macvlan_dev->dev_addr,
 				  mlxsw_sp_fid_index(rif->fid), true);
@@ -9857,6 +9855,40 @@ static int mlxsw_sp_router_netdevice_event(struct notifier_block *nb,
 	return notifier_from_errno(err);
 }
 
+struct mlxsw_sp_macvlan_replay {
+	struct mlxsw_sp *mlxsw_sp;
+	struct netlink_ext_ack *extack;
+};
+
+static int mlxsw_sp_macvlan_replay_upper(struct net_device *dev,
+					 struct netdev_nested_priv *priv)
+{
+	const struct mlxsw_sp_macvlan_replay *rms = priv->data;
+	struct netlink_ext_ack *extack = rms->extack;
+	struct mlxsw_sp *mlxsw_sp = rms->mlxsw_sp;
+
+	if (!netif_is_macvlan(dev))
+		return 0;
+
+	return mlxsw_sp_rif_macvlan_add(mlxsw_sp, dev, extack);
+}
+
+static int mlxsw_sp_macvlan_replay(struct mlxsw_sp_rif *rif,
+				   struct netlink_ext_ack *extack)
+{
+	struct mlxsw_sp_macvlan_replay rms = {
+		.mlxsw_sp = rif->mlxsw_sp,
+		.extack = extack,
+	};
+	struct netdev_nested_priv priv = {
+		.data = &rms,
+	};
+
+	return netdev_walk_all_upper_dev_rcu(mlxsw_sp_rif_dev(rif),
+					     mlxsw_sp_macvlan_replay_upper,
+					     &priv);
+}
+
 static int __mlxsw_sp_rif_macvlan_flush(struct net_device *dev,
 					struct netdev_nested_priv *priv)
 {
@@ -9879,7 +9911,6 @@ static int mlxsw_sp_rif_macvlan_flush(struct mlxsw_sp_rif *rif)
 	if (!netif_is_macvlan_port(dev))
 		return 0;
 
-	netdev_warn(dev, "Router interface is deleted. Upper macvlans will not work\n");
 	return netdev_walk_all_upper_dev_rcu(dev,
 					     __mlxsw_sp_rif_macvlan_flush, &priv);
 }
@@ -9937,6 +9968,10 @@ static int mlxsw_sp_rif_subport_configure(struct mlxsw_sp_rif *rif,
 	if (err)
 		goto err_rif_subport_op;
 
+	err = mlxsw_sp_macvlan_replay(rif, extack);
+	if (err)
+		goto err_macvlan_replay;
+
 	err = mlxsw_sp_rif_fdb_op(rif->mlxsw_sp, dev->dev_addr,
 				  mlxsw_sp_fid_index(rif->fid), true);
 	if (err)
@@ -9952,6 +9987,8 @@ static int mlxsw_sp_rif_subport_configure(struct mlxsw_sp_rif *rif,
 	mlxsw_sp_rif_fdb_op(rif->mlxsw_sp, dev->dev_addr,
 			    mlxsw_sp_fid_index(rif->fid), false);
 err_rif_fdb_op:
+	mlxsw_sp_rif_macvlan_flush(rif);
+err_macvlan_replay:
 	mlxsw_sp_rif_subport_op(rif, false);
 err_rif_subport_op:
 	mlxsw_sp_rif_mac_profile_put(rif->mlxsw_sp, mac_profile);
@@ -10038,6 +10075,10 @@ static int mlxsw_sp_rif_fid_configure(struct mlxsw_sp_rif *rif,
 	if (err)
 		goto err_fid_bc_flood_set;
 
+	err = mlxsw_sp_macvlan_replay(rif, extack);
+	if (err)
+		goto err_macvlan_replay;
+
 	err = mlxsw_sp_rif_fdb_op(rif->mlxsw_sp, dev->dev_addr,
 				  mlxsw_sp_fid_index(rif->fid), true);
 	if (err)
@@ -10053,6 +10094,8 @@ static int mlxsw_sp_rif_fid_configure(struct mlxsw_sp_rif *rif,
 	mlxsw_sp_rif_fdb_op(rif->mlxsw_sp, dev->dev_addr,
 			    mlxsw_sp_fid_index(rif->fid), false);
 err_rif_fdb_op:
+	mlxsw_sp_rif_macvlan_flush(rif);
+err_macvlan_replay:
 	mlxsw_sp_fid_flood_set(rif->fid, MLXSW_SP_FLOOD_TYPE_BC,
 			       mlxsw_sp_router_port(mlxsw_sp), false);
 err_fid_bc_flood_set:
@@ -10200,6 +10243,10 @@ static int mlxsw_sp_rif_vlan_configure(struct mlxsw_sp_rif *rif, u16 efid,
 	if (err)
 		goto err_fid_bc_flood_set;
 
+	err = mlxsw_sp_macvlan_replay(rif, extack);
+	if (err)
+		goto err_macvlan_replay;
+
 	err = mlxsw_sp_rif_fdb_op(rif->mlxsw_sp, dev->dev_addr,
 				  mlxsw_sp_fid_index(rif->fid), true);
 	if (err)
@@ -10215,6 +10262,8 @@ static int mlxsw_sp_rif_vlan_configure(struct mlxsw_sp_rif *rif, u16 efid,
 	mlxsw_sp_rif_fdb_op(rif->mlxsw_sp, dev->dev_addr,
 			    mlxsw_sp_fid_index(rif->fid), false);
 err_rif_fdb_op:
+	mlxsw_sp_rif_macvlan_flush(rif);
+err_macvlan_replay:
 	mlxsw_sp_fid_flood_set(rif->fid, MLXSW_SP_FLOOD_TYPE_BC,
 			       mlxsw_sp_router_port(mlxsw_sp), false);
 err_fid_bc_flood_set:
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 14/17] mlxsw: spectrum_router: Replay neighbours when RIF is made
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (12 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 13/17] mlxsw: spectrum_router: Replay MACVLANs " Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 15/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device enslavement Petr Machata
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

As neighbours are created, mlxsw is involved through the netevent
notifications. When at the time there is no RIF for a given neighbour, the
notification is not acted upon. When the RIF is later created, these
outstanding neighbours are left unoffloaded and cause traffic to go through
the SW datapath.

In order to fix this issue, as a RIF is created, walk the ARP and ND tables
and find neighbours for the netdevice that represents the RIF. Then
schedule neighbour work for them, allowing them to be offloaded.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 62 ++++++++++++++++++-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 2322429cdb72..9263a914bcc7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -2983,6 +2983,52 @@ static void mlxsw_sp_neigh_rif_gone_sync(struct mlxsw_sp *mlxsw_sp,
 	}
 }
 
+struct mlxsw_sp_neigh_rif_made_sync {
+	struct mlxsw_sp *mlxsw_sp;
+	struct mlxsw_sp_rif *rif;
+	int err;
+};
+
+static void mlxsw_sp_neigh_rif_made_sync_each(struct neighbour *n, void *data)
+{
+	struct mlxsw_sp_neigh_rif_made_sync *rms = data;
+	int rc;
+
+	if (rms->err)
+		return;
+	if (n->dev != mlxsw_sp_rif_dev(rms->rif))
+		return;
+	rc = mlxsw_sp_router_schedule_neigh_work(rms->mlxsw_sp->router, n);
+	if (rc != NOTIFY_DONE)
+		rms->err = -ENOMEM;
+}
+
+static int mlxsw_sp_neigh_rif_made_sync(struct mlxsw_sp *mlxsw_sp,
+					struct mlxsw_sp_rif *rif)
+{
+	struct mlxsw_sp_neigh_rif_made_sync rms = {
+		.mlxsw_sp = mlxsw_sp,
+		.rif = rif,
+	};
+
+	neigh_for_each(&arp_tbl, mlxsw_sp_neigh_rif_made_sync_each, &rms);
+	if (rms.err)
+		goto err_arp;
+
+#if IS_ENABLED(CONFIG_IPV6)
+	neigh_for_each(&nd_tbl, mlxsw_sp_neigh_rif_made_sync_each, &rms);
+#endif
+	if (rms.err)
+		goto err_nd;
+
+	return 0;
+
+err_nd:
+err_arp:
+	mlxsw_sp_neigh_rif_gone_sync(mlxsw_sp, rif);
+	return rms.err;
+}
+
 enum mlxsw_sp_nexthop_type {
 	MLXSW_SP_NEXTHOP_TYPE_ETH,
 	MLXSW_SP_NEXTHOP_TYPE_IPIP,
@@ -7937,7 +7983,21 @@ static int mlxsw_sp_router_rif_disable(struct mlxsw_sp *mlxsw_sp, u16 rif)
 static int mlxsw_sp_router_rif_made_sync(struct mlxsw_sp *mlxsw_sp,
 					 struct mlxsw_sp_rif *rif)
 {
-	return mlxsw_sp_nexthop_rif_made_sync(mlxsw_sp, rif);
+	int err;
+
+	err = mlxsw_sp_neigh_rif_made_sync(mlxsw_sp, rif);
+	if (err)
+		return err;
+
+	err = mlxsw_sp_nexthop_rif_made_sync(mlxsw_sp, rif);
+	if (err)
+		goto err_nexthop;
+
+	return 0;
+
+err_nexthop:
+	mlxsw_sp_neigh_rif_gone_sync(mlxsw_sp, rif);
+	return err;
 }
 
 static void mlxsw_sp_router_rif_gone_sync(struct mlxsw_sp *mlxsw_sp,
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 15/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device enslavement
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (13 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 14/17] mlxsw: spectrum_router: Replay neighbours " Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 16/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device deslavement Petr Machata
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Enslaving of front panel ports (and their uppers) to netdevices that
already have uppers is currently forbidden. When this is permitted, any
uppers with IP addresses need to have the NETDEV_UP inetaddr event
replayed, so that any RIFs are created.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    |   6 +
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 119 ++++++++++++++++++
 .../ethernet/mellanox/mlxsw/spectrum_router.h |   5 +
 .../mellanox/mlxsw/spectrum_switchdev.c       |   7 ++
 4 files changed, 137 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 4346cc736579..0dcd988ffce1 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4469,8 +4469,14 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
 	if (err)
 		goto err_router_join;
 
+	err = mlxsw_sp_netdevice_enslavement_replay(mlxsw_sp, lag_dev, extack);
+	if (err)
+		goto err_replay;
+
 	return 0;
 
+err_replay:
+	mlxsw_sp_router_port_leave_lag(mlxsw_sp_port, lag_dev);
 err_router_join:
 	lag->ref_count--;
 	mlxsw_sp_port->lagged = 0;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 9263a914bcc7..18d1bcbb3fdf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -9781,6 +9781,97 @@ mlxsw_sp_netdevice_vrf_event(struct net_device *l3_dev, unsigned long event,
 	return err;
 }
 
+struct mlxsw_sp_router_replay_inetaddr_up {
+	struct mlxsw_sp *mlxsw_sp;
+	struct netlink_ext_ack *extack;
+	unsigned int done;
+};
+
+static int mlxsw_sp_router_replay_inetaddr_up(struct net_device *dev,
+					      struct netdev_nested_priv *priv)
+{
+	struct mlxsw_sp_router_replay_inetaddr_up *ctx = priv->data;
+	struct mlxsw_sp_crif *crif;
+	int err;
+
+	if (mlxsw_sp_dev_addr_list_empty(dev))
+		return 0;
+
+	crif = mlxsw_sp_crif_lookup(ctx->mlxsw_sp->router, dev);
+	if (!crif || crif->rif)
+		return 0;
+
+	if (!mlxsw_sp_rif_should_config(crif->rif, dev, NETDEV_UP))
+		return 0;
+
+	err = __mlxsw_sp_inetaddr_event(ctx->mlxsw_sp, dev, NETDEV_UP,
+					false, ctx->extack);
+	if (err)
+		return err;
+
+	ctx->done++;
+	return 0;
+}
+
+static int mlxsw_sp_router_unreplay_inetaddr_up(struct net_device *dev,
+						struct netdev_nested_priv *priv)
+{
+	struct mlxsw_sp_router_replay_inetaddr_up *ctx = priv->data;
+	struct mlxsw_sp_crif *crif;
+
+	if (!ctx->done)
+		return 0;
+
+	if (mlxsw_sp_dev_addr_list_empty(dev))
+		return 0;
+
+	crif = mlxsw_sp_crif_lookup(ctx->mlxsw_sp->router, dev);
+	if (!crif || !crif->rif)
+		return 0;
+
+	/* We are rolling back NETDEV_UP, so ask for that. */
+	if (!mlxsw_sp_rif_should_config(crif->rif, dev, NETDEV_UP))
+		return 0;
+
+	__mlxsw_sp_inetaddr_event(ctx->mlxsw_sp, dev, NETDEV_DOWN, false, NULL);
+
+	ctx->done--;
+	return 0;
+}
+
+int mlxsw_sp_netdevice_enslavement_replay(struct mlxsw_sp *mlxsw_sp,
+					  struct net_device *upper_dev,
+					  struct netlink_ext_ack *extack)
+{
+	struct mlxsw_sp_router_replay_inetaddr_up ctx = {
+		.mlxsw_sp = mlxsw_sp,
+		.extack = extack,
+	};
+	struct netdev_nested_priv priv = {
+		.data = &ctx,
+	};
+	int err;
+
+	err = mlxsw_sp_router_replay_inetaddr_up(upper_dev, &priv);
+	if (err)
+		return err;
+
+	err = netdev_walk_all_upper_dev_rcu(upper_dev,
+					    mlxsw_sp_router_replay_inetaddr_up,
+					    &priv);
+	if (err)
+		goto err_replay_up;
+
+	return 0;
+
+err_replay_up:
+	netdev_walk_all_upper_dev_rcu(upper_dev,
+				      mlxsw_sp_router_unreplay_inetaddr_up,
+				      &priv);
+	mlxsw_sp_router_unreplay_inetaddr_up(upper_dev, &priv);
+	return err;
+}
+
 static int
 mlxsw_sp_port_vid_router_join_existing(struct mlxsw_sp_port *mlxsw_sp_port,
 				       u16 vid, struct net_device *dev,
@@ -9857,6 +9948,26 @@ static int __mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
 	return err;
 }
 
+static void
+__mlxsw_sp_router_port_leave_lag(struct mlxsw_sp_port *mlxsw_sp_port,
+				 struct net_device *lag_dev)
+{
+	u16 default_vid = MLXSW_SP_DEFAULT_VID;
+	struct net_device *upper_dev;
+	struct list_head *iter;
+	u16 vid;
+
+	netdev_for_each_upper_dev_rcu(lag_dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+
+		vid = vlan_dev_vlan_id(upper_dev);
+		mlxsw_sp_port_vid_router_leave(mlxsw_sp_port, vid, upper_dev);
+	}
+
+	mlxsw_sp_port_vid_router_leave(mlxsw_sp_port, default_vid, lag_dev);
+}
+
 int mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
 				  struct net_device *lag_dev,
 				  struct netlink_ext_ack *extack)
@@ -9870,6 +9981,14 @@ int mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
 	return err;
 }
 
+void mlxsw_sp_router_port_leave_lag(struct mlxsw_sp_port *mlxsw_sp_port,
+				    struct net_device *lag_dev)
+{
+	mutex_lock(&mlxsw_sp_port->mlxsw_sp->router->lock);
+	__mlxsw_sp_router_port_leave_lag(mlxsw_sp_port, lag_dev);
+	mutex_unlock(&mlxsw_sp_port->mlxsw_sp->router->lock);
+}
+
 static int mlxsw_sp_router_netdevice_event(struct notifier_block *nb,
 					   unsigned long event, void *ptr)
 {
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h
index 74242220a0cf..eed04fbf719f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h
@@ -178,5 +178,10 @@ int mlxsw_sp_router_bridge_vlan_add(struct mlxsw_sp *mlxsw_sp,
 int mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
 				  struct net_device *lag_dev,
 				  struct netlink_ext_ack *extack);
+void mlxsw_sp_router_port_leave_lag(struct mlxsw_sp_port *mlxsw_sp_port,
+				    struct net_device *lag_dev);
+int mlxsw_sp_netdevice_enslavement_replay(struct mlxsw_sp *mlxsw_sp,
+					  struct net_device *upper_dev,
+					  struct netlink_ext_ack *extack);
 
 #endif /* _MLXSW_ROUTER_H_*/
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 982eae6bd63e..dffb67c1038e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -2894,8 +2894,15 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
 	if (err)
 		goto err_port_join;
 
+	err = mlxsw_sp_netdevice_enslavement_replay(mlxsw_sp, br_dev, extack);
+	if (err)
+		goto err_replay;
+
 	return 0;
 
+err_replay:
+	bridge_device->ops->port_leave(bridge_device, bridge_port,
+				       mlxsw_sp_port);
 err_port_join:
 	mlxsw_sp_bridge_port_put(mlxsw_sp->bridge, bridge_port);
 	return err;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 16/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device deslavement
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (14 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 15/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device enslavement Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-19 11:01 ` [PATCH net-next 17/17] mlxsw: spectrum: Permit enslavement to netdevices with uppers Petr Machata
  2023-07-21  8:10 ` [PATCH net-next 00/17] mlxsw: " patchwork-bot+netdevbpf
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

When a netdevice is removed from a bridge or a LAG, and it has an IP
address, it should join the router and gain a RIF. Do that by replaying
address addition event on the netdevice.

When handling deslavement of LAG or its upper from a bridge device, the
replay should be done after all the lowers of the LAG have left the bridge.
Thus these scenarios are handled by passing replay_deslavement of false,
and by invoking, after the lowers have been processed, a new helper,
mlxsw_sp_netdevice_post_lag_event(), which does the per-LAG / -upper
handling, and in particular invokes the replay.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 48 ++++++++++++++++---
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 23 ++++++++-
 .../ethernet/mellanox/mlxsw/spectrum_router.h |  2 +
 3 files changed, 65 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 0dcd988ffce1..b955511fe5a2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4838,15 +4838,20 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
 		if (netif_is_bridge_master(upper_dev)) {
-			if (info->linking)
+			if (info->linking) {
 				err = mlxsw_sp_port_bridge_join(mlxsw_sp_port,
 								lower_dev,
 								upper_dev,
 								extack);
-			else
+			} else {
 				mlxsw_sp_port_bridge_leave(mlxsw_sp_port,
 							   lower_dev,
 							   upper_dev);
+				if (!replay_deslavement)
+					break;
+				mlxsw_sp_netdevice_deslavement_replay(mlxsw_sp,
+								      lower_dev);
+			}
 		} else if (netif_is_lag_master(upper_dev)) {
 			if (info->linking) {
 				err = mlxsw_sp_port_lag_join(mlxsw_sp_port,
@@ -4855,6 +4860,8 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 				mlxsw_sp_port_lag_col_dist_disable(mlxsw_sp_port);
 				mlxsw_sp_port_lag_leave(mlxsw_sp_port,
 							upper_dev);
+				mlxsw_sp_netdevice_deslavement_replay(mlxsw_sp,
+								      dev);
 			}
 		} else if (netif_is_ovs_master(upper_dev)) {
 			if (info->linking)
@@ -4924,6 +4931,30 @@ static int mlxsw_sp_netdevice_port_event(struct net_device *lower_dev,
 	return 0;
 }
 
+/* Called for LAG or its upper VLAN after the per-LAG-lower processing was done,
+ * to do any per-LAG / per-LAG-upper processing.
+ */
+static int mlxsw_sp_netdevice_post_lag_event(struct net_device *dev,
+					     unsigned long event,
+					     void *ptr)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_lower_get(dev);
+	struct netdev_notifier_changeupper_info *info = ptr;
+
+	if (!mlxsw_sp)
+		return 0;
+
+	switch (event) {
+	case NETDEV_CHANGEUPPER:
+		if (info->linking)
+			break;
+		if (netif_is_bridge_master(info->upper_dev))
+			mlxsw_sp_netdevice_deslavement_replay(mlxsw_sp, dev);
+		break;
+	}
+	return 0;
+}
+
 static int mlxsw_sp_netdevice_lag_event(struct net_device *lag_dev,
 					unsigned long event, void *ptr)
 {
@@ -4940,7 +4971,7 @@ static int mlxsw_sp_netdevice_lag_event(struct net_device *lag_dev,
 		}
 	}
 
-	return 0;
+	return mlxsw_sp_netdevice_post_lag_event(lag_dev, event, ptr);
 }
 
 static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
@@ -4984,15 +5015,20 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
 		if (netif_is_bridge_master(upper_dev)) {
-			if (info->linking)
+			if (info->linking) {
 				err = mlxsw_sp_port_bridge_join(mlxsw_sp_port,
 								vlan_dev,
 								upper_dev,
 								extack);
-			else
+			} else {
 				mlxsw_sp_port_bridge_leave(mlxsw_sp_port,
 							   vlan_dev,
 							   upper_dev);
+				if (!replay_deslavement)
+					break;
+				mlxsw_sp_netdevice_deslavement_replay(mlxsw_sp,
+								      vlan_dev);
+			}
 		} else if (netif_is_macvlan(upper_dev)) {
 			if (!info->linking)
 				mlxsw_sp_rif_macvlan_del(mlxsw_sp, upper_dev);
@@ -5022,7 +5058,7 @@ static int mlxsw_sp_netdevice_lag_port_vlan_event(struct net_device *vlan_dev,
 		}
 	}
 
-	return 0;
+	return mlxsw_sp_netdevice_post_lag_event(vlan_dev, event, ptr);
 }
 
 static int mlxsw_sp_netdevice_bridge_vlan_event(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 18d1bcbb3fdf..57f0faac836c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -9785,12 +9785,14 @@ struct mlxsw_sp_router_replay_inetaddr_up {
 	struct mlxsw_sp *mlxsw_sp;
 	struct netlink_ext_ack *extack;
 	unsigned int done;
+	bool deslavement;
 };
 
 static int mlxsw_sp_router_replay_inetaddr_up(struct net_device *dev,
 					      struct netdev_nested_priv *priv)
 {
 	struct mlxsw_sp_router_replay_inetaddr_up *ctx = priv->data;
+	bool nomaster = ctx->deslavement;
 	struct mlxsw_sp_crif *crif;
 	int err;
 
@@ -9805,7 +9807,7 @@ static int mlxsw_sp_router_replay_inetaddr_up(struct net_device *dev,
 		return 0;
 
 	err = __mlxsw_sp_inetaddr_event(ctx->mlxsw_sp, dev, NETDEV_UP,
-					false, ctx->extack);
+					nomaster, ctx->extack);
 	if (err)
 		return err;
 
@@ -9817,6 +9819,7 @@ static int mlxsw_sp_router_unreplay_inetaddr_up(struct net_device *dev,
 						struct netdev_nested_priv *priv)
 {
 	struct mlxsw_sp_router_replay_inetaddr_up *ctx = priv->data;
+	bool nomaster = ctx->deslavement;
 	struct mlxsw_sp_crif *crif;
 
 	if (!ctx->done)
@@ -9833,7 +9836,8 @@ static int mlxsw_sp_router_unreplay_inetaddr_up(struct net_device *dev,
 	if (!mlxsw_sp_rif_should_config(crif->rif, dev, NETDEV_UP))
 		return 0;
 
-	__mlxsw_sp_inetaddr_event(ctx->mlxsw_sp, dev, NETDEV_DOWN, false, NULL);
+	__mlxsw_sp_inetaddr_event(ctx->mlxsw_sp, dev, NETDEV_DOWN, nomaster,
+				  NULL);
 
 	ctx->done--;
 	return 0;
@@ -9846,6 +9850,7 @@ int mlxsw_sp_netdevice_enslavement_replay(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_router_replay_inetaddr_up ctx = {
 		.mlxsw_sp = mlxsw_sp,
 		.extack = extack,
+		.deslavement = false,
 	};
 	struct netdev_nested_priv priv = {
 		.data = &ctx,
@@ -9872,6 +9877,20 @@ int mlxsw_sp_netdevice_enslavement_replay(struct mlxsw_sp *mlxsw_sp,
 	return err;
 }
 
+void mlxsw_sp_netdevice_deslavement_replay(struct mlxsw_sp *mlxsw_sp,
+					   struct net_device *dev)
+{
+	struct mlxsw_sp_router_replay_inetaddr_up ctx = {
+		.mlxsw_sp = mlxsw_sp,
+		.deslavement = true,
+	};
+	struct netdev_nested_priv priv = {
+		.data = &ctx,
+	};
+
+	mlxsw_sp_router_replay_inetaddr_up(dev, &priv);
+}
+
 static int
 mlxsw_sp_port_vid_router_join_existing(struct mlxsw_sp_port *mlxsw_sp_port,
 				       u16 vid, struct net_device *dev,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h
index eed04fbf719f..ed3b628caafe 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.h
@@ -183,5 +183,7 @@ void mlxsw_sp_router_port_leave_lag(struct mlxsw_sp_port *mlxsw_sp_port,
 int mlxsw_sp_netdevice_enslavement_replay(struct mlxsw_sp *mlxsw_sp,
 					  struct net_device *upper_dev,
 					  struct netlink_ext_ack *extack);
+void mlxsw_sp_netdevice_deslavement_replay(struct mlxsw_sp *mlxsw_sp,
+					   struct net_device *dev);
 
 #endif /* _MLXSW_ROUTER_H_*/
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 17/17] mlxsw: spectrum: Permit enslavement to netdevices with uppers
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (15 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 16/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device deslavement Petr Machata
@ 2023-07-19 11:01 ` Petr Machata
  2023-07-21  8:10 ` [PATCH net-next 00/17] mlxsw: " patchwork-bot+netdevbpf
  17 siblings, 0 replies; 20+ messages in thread
From: Petr Machata @ 2023-07-19 11:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev
  Cc: Ido Schimmel, Petr Machata, Danielle Ratson, mlxsw

Enslaving of front panel ports (and their uppers) to netdevices that
already have uppers is currently forbidden. In the previous patches, a
number of replays have been added. Those ensure that various bits of state,
such as next hops or switchdev objects, are offloaded when they become
relevant due to a mlxsw lower being introduced into the topology.

However the act of actually, for example, enslaving a front-panel port to
a bridge with uppers, has been vetoed so far. In this patch, remove the
vetoes and permit the operation.

mlxsw currently validates creation of "interesting" uppers. Thus creating
VLAN netdevices on top of 802.1ad bridges is forbidden if the bridge has an
mlxsw lower, but permitted in general. This validation code never gets run
when a port is introduced as a lower of an existing netdevice structure.

Thus when enslaving an mlxsw netdevice to netdevices with uppers, invoke
the PRECHANGEUPPER event handler for each netdevice above the one that the
front panel port is being enslaved to. This way the tower of netdevices
above the attachment point is validated.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 66 +++++++++++++++++--
 1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index b955511fe5a2..f0f6af3ec7c5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4735,6 +4735,58 @@ static bool mlxsw_sp_bridge_vxlan_is_valid(struct net_device *br_dev,
 	return true;
 }
 
+static bool mlxsw_sp_netdev_is_master(struct net_device *upper_dev,
+				      struct net_device *dev)
+{
+	return upper_dev == netdev_master_upper_dev_get(dev);
+}
+
+static int __mlxsw_sp_netdevice_event(struct mlxsw_sp *mlxsw_sp,
+				      unsigned long event, void *ptr,
+				      bool process_foreign);
+
+static int mlxsw_sp_netdevice_validate_uppers(struct mlxsw_sp *mlxsw_sp,
+					      struct net_device *dev,
+					      struct netlink_ext_ack *extack)
+{
+	struct net_device *upper_dev;
+	struct list_head *iter;
+	int err;
+
+	netdev_for_each_upper_dev_rcu(dev, upper_dev, iter) {
+		struct netdev_notifier_changeupper_info info = {
+			.info = {
+				.dev = dev,
+				.extack = extack,
+			},
+			.master = mlxsw_sp_netdev_is_master(upper_dev, dev),
+			.upper_dev = upper_dev,
+			.linking = true,
+
+			/* upper_info is relevant for LAG devices. But we would
+			 * only need this if LAG were a valid upper above
+			 * another upper (e.g. a bridge that is a member of a
+			 * LAG), and that is never a valid configuration. So we
+			 * can keep this as NULL.
+			 */
+			.upper_info = NULL,
+		};
+
+		err = __mlxsw_sp_netdevice_event(mlxsw_sp,
+						 NETDEV_PRECHANGEUPPER,
+						 &info, true);
+		if (err)
+			return err;
+
+		err = mlxsw_sp_netdevice_validate_uppers(mlxsw_sp, upper_dev,
+							 extack);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 					       struct net_device *dev,
 					       unsigned long event, void *ptr,
@@ -4776,8 +4828,11 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 		    (!netif_is_bridge_master(upper_dev) ||
 		     !mlxsw_sp_bridge_device_is_offloaded(mlxsw_sp,
 							  upper_dev))) {
-			NL_SET_ERR_MSG_MOD(extack, "Enslaving a port to a device that already has an upper device is not supported");
-			return -EINVAL;
+			err = mlxsw_sp_netdevice_validate_uppers(mlxsw_sp,
+								 upper_dev,
+								 extack);
+			if (err)
+				return err;
 		}
 		if (netif_is_lag_master(upper_dev) &&
 		    !mlxsw_sp_master_lag_check(mlxsw_sp, upper_dev,
@@ -5008,8 +5063,11 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 		    (!netif_is_bridge_master(upper_dev) ||
 		     !mlxsw_sp_bridge_device_is_offloaded(mlxsw_sp,
 							  upper_dev))) {
-			NL_SET_ERR_MSG_MOD(extack, "Enslaving a port to a device that already has an upper device is not supported");
-			return -EINVAL;
+			err = mlxsw_sp_netdevice_validate_uppers(mlxsw_sp,
+								 upper_dev,
+								 extack);
+			if (err)
+				return err;
 		}
 		break;
 	case NETDEV_CHANGEUPPER:
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers
  2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
                   ` (16 preceding siblings ...)
  2023-07-19 11:01 ` [PATCH net-next 17/17] mlxsw: spectrum: Permit enslavement to netdevices with uppers Petr Machata
@ 2023-07-21  8:10 ` patchwork-bot+netdevbpf
  17 siblings, 0 replies; 20+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-07-21  8:10 UTC (permalink / raw)
  To: Petr Machata
  Cc: davem, edumazet, kuba, pabeni, netdev, idosch, danieller, mlxsw

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Wed, 19 Jul 2023 13:01:15 +0200 you wrote:
> The mlxsw driver currently makes the assumption that the user applies
> configuration in a bottom-up manner. Thus netdevices need to be added to
> the bridge before IP addresses are configured on that bridge or SVI added
> on top of it. Enslaving a netdevice to another netdevice that already has
> uppers is in fact forbidden by mlxsw for this reason. Despite this safety,
> it is rather easy to get into situations where the offloaded configuration
> is just plain wrong.
> 
> [...]

Here is the summary with links:
  - [net-next,01/17] net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB
    https://git.kernel.org/netdev/net-next/c/989280d6ea70
  - [net-next,02/17] net: switchdev: Add a helper to replay objects on a bridge port
    https://git.kernel.org/netdev/net-next/c/f2e2857b3522
  - [net-next,03/17] selftests: mlxsw: rtnetlink: Drop obsolete tests
    https://git.kernel.org/netdev/net-next/c/d7eb1f175153
  - [net-next,04/17] mlxsw: spectrum_router: Allow address handlers to run on bridge ports
    https://git.kernel.org/netdev/net-next/c/6bbc9ca6a3a7
  - [net-next,05/17] mlxsw: spectrum_router: Extract a helper to schedule neighbour work
    https://git.kernel.org/netdev/net-next/c/96c3e45c0130
  - [net-next,06/17] mlxsw: spectrum: Split a helper out of mlxsw_sp_netdevice_event()
    https://git.kernel.org/netdev/net-next/c/721717fafdc4
  - [net-next,07/17] mlxsw: spectrum: Allow event handlers to check unowned bridges
    https://git.kernel.org/netdev/net-next/c/40b7b4236c1f
  - [net-next,08/17] mlxsw: spectrum: Add a replay_deslavement argument to event handlers
    https://git.kernel.org/netdev/net-next/c/1c47e65b8c0b
  - [net-next,09/17] mlxsw: spectrum: On port enslavement to a LAG, join upper's bridges
    https://git.kernel.org/netdev/net-next/c/987c7782f062
  - [net-next,10/17] mlxsw: spectrum_switchdev: Replay switchdev objects on port join
    https://git.kernel.org/netdev/net-next/c/ec4643ca3d98
  - [net-next,11/17] mlxsw: spectrum_router: Join RIFs of LAG upper VLANs
    https://git.kernel.org/netdev/net-next/c/ef59713c26b1
  - [net-next,12/17] mlxsw: spectrum_router: Offload ethernet nexthops when RIF is made
    https://git.kernel.org/netdev/net-next/c/cfc01a92eaff
  - [net-next,13/17] mlxsw: spectrum_router: Replay MACVLANs when RIF is made
    https://git.kernel.org/netdev/net-next/c/49c3a615d382
  - [net-next,14/17] mlxsw: spectrum_router: Replay neighbours when RIF is made
    https://git.kernel.org/netdev/net-next/c/8fdb09a7674c
  - [net-next,15/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device enslavement
    https://git.kernel.org/netdev/net-next/c/31618b22f2c4
  - [net-next,16/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device deslavement
    https://git.kernel.org/netdev/net-next/c/4560cf408eca
  - [net-next,17/17] mlxsw: spectrum: Permit enslavement to netdevices with uppers
    https://git.kernel.org/netdev/net-next/c/2c5ffe8d7226

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 02/17] net: switchdev: Add a helper to replay objects on a bridge port
  2023-07-19 11:01 ` [PATCH net-next 02/17] net: switchdev: Add a helper to replay objects on a bridge port Petr Machata
@ 2024-01-31 15:47   ` Vladimir Oltean
  0 siblings, 0 replies; 20+ messages in thread
From: Vladimir Oltean @ 2024-01-31 15:47 UTC (permalink / raw)
  To: Petr Machata
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, Ido Schimmel, Danielle Ratson, mlxsw, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov, bridge

Hi Petr,

On Wed, Jul 19, 2023 at 01:01:17PM +0200, Petr Machata wrote:
> When a front panel joins a bridge via another netdevice (typically a LAG),
> the driver needs to learn about the objects configured on the bridge port.
> When the bridge port is offloaded by the driver for the first time, this
> can be achieved by passing a notifier to switchdev_bridge_port_offload().
> The notifier is then invoked for the individual objects (such as VLANs)
> configured on the bridge, and can look for the interesting ones.
> 
> Calling switchdev_bridge_port_offload() when the second port joins the
> bridge lower is unnecessary, but the replay is still needed. To that end,
> add a new function, switchdev_bridge_port_replay(), which does only the
> replay part of the _offload() function in exactly the same way as that
> function.
> 
> Cc: Jiri Pirko <jiri@resnulli.us>
> Cc: Ivan Vecera <ivecera@redhat.com>
> Cc: Roopa Prabhu <roopa@nvidia.com>
> Cc: Nikolay Aleksandrov <razor@blackwall.org>
> Cc: bridge@lists.linux-foundation.org
> Signed-off-by: Petr Machata <petrm@nvidia.com>
> Reviewed-by: Danielle Ratson <danieller@nvidia.com>
> ---

I just noticed this commit in the kernel, and the commit message did not
really convince me.

As unnecessary as the second switchdev_bridge_port_offload() call may be,
it does not hurt either, because it just bumps p->offload_count in the bridge.
And with just this justification, adding a new function to the switchdev API
is equally unnecessary.

Even worse, switchdev_bridge_port_replay() is a partial misnomer, and
will become a complete misnomer soon. Out of 3 object classes (FDB, MDB
and VLAN), FDB and VLAN are not actually replayed just for the given
bridge port given as argument, but for all ports of the bridge. And Tobias
wants to change things so that the same will happen for MDB too.
https://lore.kernel.org/netdev/87bk927bxl.fsf@waldekranz.com/

Can mlxsw not use the same replay code path as DSA, through the _offload()
and _unoffload() entry points?

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-01-31 15:47 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-19 11:01 [PATCH net-next 00/17] mlxsw: Permit enslavement to netdevices with uppers Petr Machata
2023-07-19 11:01 ` [PATCH net-next 01/17] net: bridge: br_switchdev: Tolerate -EOPNOTSUPP when replaying MDB Petr Machata
2023-07-19 11:01 ` [PATCH net-next 02/17] net: switchdev: Add a helper to replay objects on a bridge port Petr Machata
2024-01-31 15:47   ` Vladimir Oltean
2023-07-19 11:01 ` [PATCH net-next 03/17] selftests: mlxsw: rtnetlink: Drop obsolete tests Petr Machata
2023-07-19 11:01 ` [PATCH net-next 04/17] mlxsw: spectrum_router: Allow address handlers to run on bridge ports Petr Machata
2023-07-19 11:01 ` [PATCH net-next 05/17] mlxsw: spectrum_router: Extract a helper to schedule neighbour work Petr Machata
2023-07-19 11:01 ` [PATCH net-next 06/17] mlxsw: spectrum: Split a helper out of mlxsw_sp_netdevice_event() Petr Machata
2023-07-19 11:01 ` [PATCH net-next 07/17] mlxsw: spectrum: Allow event handlers to check unowned bridges Petr Machata
2023-07-19 11:01 ` [PATCH net-next 08/17] mlxsw: spectrum: Add a replay_deslavement argument to event handlers Petr Machata
2023-07-19 11:01 ` [PATCH net-next 09/17] mlxsw: spectrum: On port enslavement to a LAG, join upper's bridges Petr Machata
2023-07-19 11:01 ` [PATCH net-next 10/17] mlxsw: spectrum_switchdev: Replay switchdev objects on port join Petr Machata
2023-07-19 11:01 ` [PATCH net-next 11/17] mlxsw: spectrum_router: Join RIFs of LAG upper VLANs Petr Machata
2023-07-19 11:01 ` [PATCH net-next 12/17] mlxsw: spectrum_router: Offload ethernet nexthops when RIF is made Petr Machata
2023-07-19 11:01 ` [PATCH net-next 13/17] mlxsw: spectrum_router: Replay MACVLANs " Petr Machata
2023-07-19 11:01 ` [PATCH net-next 14/17] mlxsw: spectrum_router: Replay neighbours " Petr Machata
2023-07-19 11:01 ` [PATCH net-next 15/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device enslavement Petr Machata
2023-07-19 11:01 ` [PATCH net-next 16/17] mlxsw: spectrum_router: Replay IP NETDEV_UP on device deslavement Petr Machata
2023-07-19 11:01 ` [PATCH net-next 17/17] mlxsw: spectrum: Permit enslavement to netdevices with uppers Petr Machata
2023-07-21  8:10 ` [PATCH net-next 00/17] mlxsw: " patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).