netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions
@ 2023-09-08 13:33 Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 1/5] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Vladimir Oltean
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-09-08 13:33 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, David S. Miller,
	Xiaoliang Yang, Andrew Lunn, Florian Fainelli, Yanan Yang,
	linux-kernel

A report by Yanan Yang has prompted an investigation into the sja1105
driver's behavior w.r.t. multicast. The report states that when adding
multicast L2 addresses with "bridge mdb add", only the most recently
added address works - the others seem to be overwritten. This is solved
by patch 3/5 (with patch 2/5 as a dependency for it).

Patches 4/5 and 5/5 fix a series of race conditions introduced during
the same patch set as the bug above, namely this one:
https://patchwork.kernel.org/project/netdevbpf/cover/20211024171757.3753288-1-vladimir.oltean@nxp.com/

Finally, patch 1/5 fixes an issue found ever since the introduction of
multicast forwarding offload in sja1105, which is that the multicast
addresses are visible (with the "self" flag) in "bridge fdb show".

Vladimir Oltean (5):
  net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"
  net: dsa: sja1105: propagate exact error code from
    sja1105_dynamic_config_poll_valid()
  net: dsa: sja1105: fix multicast forwarding working only for last
    added mdb entry
  net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB
    accesses
  net: dsa: sja1105: block FDB accesses that are concurrent with a
    switch reset

 drivers/net/dsa/sja1105/sja1105.h             |  2 +
 .../net/dsa/sja1105/sja1105_dynamic_config.c  | 93 +++++++++----------
 drivers/net/dsa/sja1105/sja1105_main.c        | 69 ++++++++++----
 3 files changed, 97 insertions(+), 67 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net 1/5] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"
  2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
@ 2023-09-08 13:33 ` Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 2/5] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid() Vladimir Oltean
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-09-08 13:33 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, David S. Miller,
	Xiaoliang Yang, Andrew Lunn, Florian Fainelli, Yanan Yang,
	linux-kernel

Commit 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to
device") has partially hidden some multicast entries from showing up in
the "bridge fdb show" output, but it wasn't enough. Addresses which are
added through "bridge mdb add" still show up. Hide them all.

Fixes: 291d1e72b756 ("net: dsa: sja1105: Add support for FDB and MDB management")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index a23d980d28f5..11c917d5ce43 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1868,13 +1868,14 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
 		if (!(l2_lookup.destports & BIT(port)))
 			continue;
 
-		/* We need to hide the FDB entry for unknown multicast */
-		if (l2_lookup.macaddr == SJA1105_UNKNOWN_MULTICAST &&
-		    l2_lookup.mask_macaddr == SJA1105_UNKNOWN_MULTICAST)
-			continue;
-
 		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
 
+		/* Hardware FDB is shared for fdb and mdb, "bridge fdb show"
+		 * only wants to see unicast
+		 */
+		if (is_multicast_ether_addr(macaddr))
+			continue;
+
 		/* We need to hide the dsa_8021q VLANs from the user. */
 		if (vid_is_dsa_8021q(l2_lookup.vlanid))
 			l2_lookup.vlanid = 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 2/5] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid()
  2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 1/5] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Vladimir Oltean
@ 2023-09-08 13:33 ` Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 3/5] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry Vladimir Oltean
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-09-08 13:33 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, David S. Miller,
	Xiaoliang Yang, Andrew Lunn, Florian Fainelli, Yanan Yang,
	linux-kernel

Currently, sja1105_dynamic_config_wait_complete() returns either 0 or
-ETIMEDOUT, because it just looks at the read_poll_timeout() return code.

There will be future changes which move some more checks to
sja1105_dynamic_config_poll_valid(). It is important that we propagate
their exact return code (-ENOENT, -EINVAL), because callers of
sja1105_dynamic_config_read() depend on them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/sja1105/sja1105_dynamic_config.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
index 7729d3f8b7f5..93d47dab8d3e 100644
--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -1211,13 +1211,14 @@ sja1105_dynamic_config_wait_complete(struct sja1105_private *priv,
 				     struct sja1105_dyn_cmd *cmd,
 				     const struct sja1105_dynamic_table_ops *ops)
 {
-	int rc;
-
-	return read_poll_timeout(sja1105_dynamic_config_poll_valid,
-				 rc, rc != -EAGAIN,
-				 SJA1105_DYNAMIC_CONFIG_SLEEP_US,
-				 SJA1105_DYNAMIC_CONFIG_TIMEOUT_US,
-				 false, priv, cmd, ops);
+	int err, rc;
+
+	err = read_poll_timeout(sja1105_dynamic_config_poll_valid,
+				rc, rc != -EAGAIN,
+				SJA1105_DYNAMIC_CONFIG_SLEEP_US,
+				SJA1105_DYNAMIC_CONFIG_TIMEOUT_US,
+				false, priv, cmd, ops);
+	return err < 0 ? err : rc;
 }
 
 /* Provides read access to the settings through the dynamic interface
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 3/5] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry
  2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 1/5] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 2/5] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid() Vladimir Oltean
@ 2023-09-08 13:33 ` Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 4/5] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses Vladimir Oltean
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-09-08 13:33 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, David S. Miller,
	Xiaoliang Yang, Andrew Lunn, Florian Fainelli, Yanan Yang,
	linux-kernel

The commit cited in Fixes: did 2 things: it refactored the read-back
polling from sja1105_dynamic_config_read() into a new function,
sja1105_dynamic_config_wait_complete(), and it called that from
sja1105_dynamic_config_write() too.

What is problematic is the refactoring.

The refactored code from sja1105_dynamic_config_poll_valid() works like
the previous one, but the problem is that it uses another packed_buf[]
SPI buffer, and there was code at the end of sja1105_dynamic_config_read()
which was relying on the read-back packed_buf[]:

	/* Don't dereference possibly NULL pointer - maybe caller
	 * only wanted to see whether the entry existed or not.
	 */
	if (entry)
		ops->entry_packing(packed_buf, entry, UNPACK);

After the change, the packed_buf[] that this code sees is no longer the
entry read back from hardware, but the original entry that the caller
passed to the sja1105_dynamic_config_read(), packed into this buffer.

This difference is the most notable with the SJA1105_SEARCH uses from
sja1105pqrs_fdb_add() - used for both fdb and mdb. There, we have logic
added by commit 728db843df88 ("net: dsa: sja1105: ignore the FDB entry
for unknown multicast when adding a new address") to figure out whether
the address we're trying to add matches on any existing hardware entry,
with the exception of the catch-all multicast address.

That logic was broken, because with sja1105_dynamic_config_read() not
working properly, it doesn't return us the entry read back from
hardware, but the entry that we passed to it. And, since for multicast,
a match will always exist, it will tell us that any mdb entry already
exists at index=0 L2 Address Lookup table. It is index=0 because the
caller doesn't know the index - it wants to find it out, and
sja1105_dynamic_config_read() does:

	if (index < 0) { // SJA1105_SEARCH
		/* Avoid copying a signed negative number to an u64 */
		cmd.index = 0; // <- this
		cmd.search = true;
	} else {
		cmd.index = index;
		cmd.search = false;
	}

So, to the caller of sja1105_dynamic_config_read(), the returned info
looks entirely legit, and it will add all mdb entries to FDB index 0.
There, they will always overwrite each other (not to mention,
potentially they can also overwrite a pre-existing bridge fdb entry),
and the user-visible impact will be that only the last mdb entry will be
forwarded as it should. The others won't (will be flooded or dropped,
depending on the egress flood settings).

Fixing is a bit more complicated, and involves either passing the same
packed_buf[] to sja1105_dynamic_config_wait_complete(), or moving all
the extra processing on the packed_buf[] to
sja1105_dynamic_config_wait_complete(). I've opted for the latter,
because it makes sja1105_dynamic_config_wait_complete() a bit more
self-contained.

Fixes: df405910ab9f ("net: dsa: sja1105: wait for dynamic config command completion on writes too")
Reported-by: Yanan Yang <yanan.yang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 .../net/dsa/sja1105/sja1105_dynamic_config.c  | 80 +++++++++----------
 1 file changed, 37 insertions(+), 43 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
index 93d47dab8d3e..984c0e604e8d 100644
--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -1175,18 +1175,15 @@ const struct sja1105_dynamic_table_ops sja1110_dyn_ops[BLK_IDX_MAX_DYN] = {
 
 static int
 sja1105_dynamic_config_poll_valid(struct sja1105_private *priv,
-				  struct sja1105_dyn_cmd *cmd,
-				  const struct sja1105_dynamic_table_ops *ops)
+				  const struct sja1105_dynamic_table_ops *ops,
+				  void *entry, bool check_valident,
+				  bool check_errors)
 {
 	u8 packed_buf[SJA1105_MAX_DYN_CMD_SIZE] = {};
+	struct sja1105_dyn_cmd cmd = {};
 	int rc;
 
-	/* We don't _need_ to read the full entry, just the command area which
-	 * is a fixed SJA1105_SIZE_DYN_CMD. But our cmd_packing() API expects a
-	 * buffer that contains the full entry too. Additionally, our API
-	 * doesn't really know how many bytes into the buffer does the command
-	 * area really begin. So just read back the whole entry.
-	 */
+	/* Read back the whole entry + command structure. */
 	rc = sja1105_xfer_buf(priv, SPI_READ, ops->addr, packed_buf,
 			      ops->packed_size);
 	if (rc)
@@ -1195,11 +1192,25 @@ sja1105_dynamic_config_poll_valid(struct sja1105_private *priv,
 	/* Unpack the command structure, and return it to the caller in case it
 	 * needs to perform further checks on it (VALIDENT).
 	 */
-	memset(cmd, 0, sizeof(*cmd));
-	ops->cmd_packing(packed_buf, cmd, UNPACK);
+	ops->cmd_packing(packed_buf, &cmd, UNPACK);
 
 	/* Hardware hasn't cleared VALID => still working on it */
-	return cmd->valid ? -EAGAIN : 0;
+	if (cmd.valid)
+		return -EAGAIN;
+
+	if (check_valident && !cmd.valident && !(ops->access & OP_VALID_ANYWAY))
+		return -ENOENT;
+
+	if (check_errors && cmd.errors)
+		return -EINVAL;
+
+	/* Don't dereference possibly NULL pointer - maybe caller
+	 * only wanted to see whether the entry existed or not.
+	 */
+	if (entry)
+		ops->entry_packing(packed_buf, entry, UNPACK);
+
+	return 0;
 }
 
 /* Poll the dynamic config entry's control area until the hardware has
@@ -1208,8 +1219,9 @@ sja1105_dynamic_config_poll_valid(struct sja1105_private *priv,
  */
 static int
 sja1105_dynamic_config_wait_complete(struct sja1105_private *priv,
-				     struct sja1105_dyn_cmd *cmd,
-				     const struct sja1105_dynamic_table_ops *ops)
+				     const struct sja1105_dynamic_table_ops *ops,
+				     void *entry, bool check_valident,
+				     bool check_errors)
 {
 	int err, rc;
 
@@ -1217,7 +1229,8 @@ sja1105_dynamic_config_wait_complete(struct sja1105_private *priv,
 				rc, rc != -EAGAIN,
 				SJA1105_DYNAMIC_CONFIG_SLEEP_US,
 				SJA1105_DYNAMIC_CONFIG_TIMEOUT_US,
-				false, priv, cmd, ops);
+				false, priv, ops, entry, check_valident,
+				check_errors);
 	return err < 0 ? err : rc;
 }
 
@@ -1287,25 +1300,14 @@ int sja1105_dynamic_config_read(struct sja1105_private *priv,
 	mutex_lock(&priv->dynamic_config_lock);
 	rc = sja1105_xfer_buf(priv, SPI_WRITE, ops->addr, packed_buf,
 			      ops->packed_size);
-	if (rc < 0) {
-		mutex_unlock(&priv->dynamic_config_lock);
-		return rc;
-	}
-
-	rc = sja1105_dynamic_config_wait_complete(priv, &cmd, ops);
-	mutex_unlock(&priv->dynamic_config_lock);
 	if (rc < 0)
-		return rc;
+		goto out;
 
-	if (!cmd.valident && !(ops->access & OP_VALID_ANYWAY))
-		return -ENOENT;
+	rc = sja1105_dynamic_config_wait_complete(priv, ops, entry, true, false);
+out:
+	mutex_unlock(&priv->dynamic_config_lock);
 
-	/* Don't dereference possibly NULL pointer - maybe caller
-	 * only wanted to see whether the entry existed or not.
-	 */
-	if (entry)
-		ops->entry_packing(packed_buf, entry, UNPACK);
-	return 0;
+	return rc;
 }
 
 int sja1105_dynamic_config_write(struct sja1105_private *priv,
@@ -1357,22 +1359,14 @@ int sja1105_dynamic_config_write(struct sja1105_private *priv,
 	mutex_lock(&priv->dynamic_config_lock);
 	rc = sja1105_xfer_buf(priv, SPI_WRITE, ops->addr, packed_buf,
 			      ops->packed_size);
-	if (rc < 0) {
-		mutex_unlock(&priv->dynamic_config_lock);
-		return rc;
-	}
-
-	rc = sja1105_dynamic_config_wait_complete(priv, &cmd, ops);
-	mutex_unlock(&priv->dynamic_config_lock);
 	if (rc < 0)
-		return rc;
+		goto out;
 
-	cmd = (struct sja1105_dyn_cmd) {0};
-	ops->cmd_packing(packed_buf, &cmd, UNPACK);
-	if (cmd.errors)
-		return -EINVAL;
+	rc = sja1105_dynamic_config_wait_complete(priv, ops, NULL, false, true);
+out:
+	mutex_unlock(&priv->dynamic_config_lock);
 
-	return 0;
+	return rc;
 }
 
 static u8 sja1105_crc8_add(u8 crc, u8 byte, u8 poly)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 4/5] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses
  2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
                   ` (2 preceding siblings ...)
  2023-09-08 13:33 ` [PATCH net 3/5] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry Vladimir Oltean
@ 2023-09-08 13:33 ` Vladimir Oltean
  2023-09-08 13:33 ` [PATCH net 5/5] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset Vladimir Oltean
  2023-09-10 15:18 ` [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Simon Horman
  5 siblings, 0 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-09-08 13:33 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, David S. Miller,
	Xiaoliang Yang, Andrew Lunn, Florian Fainelli, Yanan Yang,
	linux-kernel

sja1105_fdb_add() runs from the dsa_owq, and sja1105_port_mcast_flood()
runs from switchdev_deferred_process_work(). Prior to the blamed commit,
they used to be indirectly serialized through the rtnl_lock(), which
no longer holds true because dsa_owq dropped that.

So, it is now possible that we traverse the static config BLK_IDX_L2_LOOKUP
elements concurrently compared to when we change them, in
sja1105_static_fdb_change(). That is not ideal, since it might result in
data corruption.

Introduce a mutex which serializes accesses to the hardware FDB and to
the static config elements for the L2 Address Lookup table.

I can't find a good reason to add locking around sja1105_fdb_dump().
I'll add it later if needed.

Fixes: 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/sja1105/sja1105.h      |  2 +
 drivers/net/dsa/sja1105/sja1105_main.c | 56 ++++++++++++++++++++------
 2 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 0617d5ccd3ff..8c66d3bf61f0 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -266,6 +266,8 @@ struct sja1105_private {
 	 * the switch doesn't confuse them with one another.
 	 */
 	struct mutex mgmt_lock;
+	/* Serializes accesses to the FDB */
+	struct mutex fdb_lock;
 	/* PTP two-step TX timestamp ID, and its serialization lock */
 	spinlock_t ts_id_lock;
 	u8 ts_id;
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 11c917d5ce43..cefd72617af4 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1798,6 +1798,7 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
 			   struct dsa_db db)
 {
 	struct sja1105_private *priv = ds->priv;
+	int rc;
 
 	if (!vid) {
 		switch (db.type) {
@@ -1812,12 +1813,16 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
 		}
 	}
 
-	return priv->info->fdb_add_cmd(ds, port, addr, vid);
+	mutex_lock(&priv->fdb_lock);
+	rc = priv->info->fdb_add_cmd(ds, port, addr, vid);
+	mutex_unlock(&priv->fdb_lock);
+
+	return rc;
 }
 
-static int sja1105_fdb_del(struct dsa_switch *ds, int port,
-			   const unsigned char *addr, u16 vid,
-			   struct dsa_db db)
+static int __sja1105_fdb_del(struct dsa_switch *ds, int port,
+			     const unsigned char *addr, u16 vid,
+			     struct dsa_db db)
 {
 	struct sja1105_private *priv = ds->priv;
 
@@ -1837,6 +1842,20 @@ static int sja1105_fdb_del(struct dsa_switch *ds, int port,
 	return priv->info->fdb_del_cmd(ds, port, addr, vid);
 }
 
+static int sja1105_fdb_del(struct dsa_switch *ds, int port,
+			   const unsigned char *addr, u16 vid,
+			   struct dsa_db db)
+{
+	struct sja1105_private *priv = ds->priv;
+	int rc;
+
+	mutex_lock(&priv->fdb_lock);
+	rc = __sja1105_fdb_del(ds, port, addr, vid, db);
+	mutex_unlock(&priv->fdb_lock);
+
+	return rc;
+}
+
 static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
 			    dsa_fdb_dump_cb_t *cb, void *data)
 {
@@ -1899,6 +1918,8 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 	};
 	int i;
 
+	mutex_lock(&priv->fdb_lock);
+
 	for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
 		struct sja1105_l2_lookup_entry l2_lookup = {0};
 		u8 macaddr[ETH_ALEN];
@@ -1912,7 +1933,7 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 		if (rc) {
 			dev_err(ds->dev, "Failed to read FDB: %pe\n",
 				ERR_PTR(rc));
-			return;
+			break;
 		}
 
 		if (!(l2_lookup.destports & BIT(port)))
@@ -1924,14 +1945,16 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 
 		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
 
-		rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, db);
+		rc = __sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, db);
 		if (rc) {
 			dev_err(ds->dev,
 				"Failed to delete FDB entry %pM vid %lld: %pe\n",
 				macaddr, l2_lookup.vlanid, ERR_PTR(rc));
-			return;
+			break;
 		}
 	}
+
+	mutex_unlock(&priv->fdb_lock);
 }
 
 static int sja1105_mdb_add(struct dsa_switch *ds, int port,
@@ -2955,7 +2978,9 @@ static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to,
 {
 	struct sja1105_l2_lookup_entry *l2_lookup;
 	struct sja1105_table *table;
-	int match;
+	int match, rc;
+
+	mutex_lock(&priv->fdb_lock);
 
 	table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
 	l2_lookup = table->entries;
@@ -2968,7 +2993,8 @@ static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to,
 	if (match == table->entry_count) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "Could not find FDB entry for unknown multicast");
-		return -ENOSPC;
+		rc = -ENOSPC;
+		goto out;
 	}
 
 	if (flags.val & BR_MCAST_FLOOD)
@@ -2976,10 +3002,13 @@ static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to,
 	else
 		l2_lookup[match].destports &= ~BIT(to);
 
-	return sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
-					    l2_lookup[match].index,
-					    &l2_lookup[match],
-					    true);
+	rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
+					  l2_lookup[match].index,
+					  &l2_lookup[match], true);
+out:
+	mutex_unlock(&priv->fdb_lock);
+
+	return rc;
 }
 
 static int sja1105_port_pre_bridge_flags(struct dsa_switch *ds, int port,
@@ -3349,6 +3378,7 @@ static int sja1105_probe(struct spi_device *spi)
 	mutex_init(&priv->ptp_data.lock);
 	mutex_init(&priv->dynamic_config_lock);
 	mutex_init(&priv->mgmt_lock);
+	mutex_init(&priv->fdb_lock);
 	spin_lock_init(&priv->ts_id_lock);
 
 	rc = sja1105_parse_dt(priv);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 5/5] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset
  2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
                   ` (3 preceding siblings ...)
  2023-09-08 13:33 ` [PATCH net 4/5] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses Vladimir Oltean
@ 2023-09-08 13:33 ` Vladimir Oltean
  2023-09-10 15:18 ` [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Simon Horman
  5 siblings, 0 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-09-08 13:33 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, David S. Miller,
	Xiaoliang Yang, Andrew Lunn, Florian Fainelli, Yanan Yang,
	linux-kernel

Currently, when we add the first sja1105 port to a bridge with
vlan_filtering 1, then we sometimes see this output:

sja1105 spi2.2: port 4 failed to read back entry for be:79:b4:9e:9e:96 vid 3088: -ENOENT
sja1105 spi2.2: Reset switch and programmed static config. Reason: VLAN filtering
sja1105 spi2.2: port 0 failed to add be:79:b4:9e:9e:96 vid 0 to fdb: -2

It is because sja1105_fdb_add() runs from the dsa_owq which is no longer
serialized with switch resets since it dropped the rtnl_lock() in the
blamed commit.

Either performing the FDB accesses before the reset, or after the reset,
is equally fine, because sja1105_static_fdb_change() backs up those
changes in the static config, but FDB access during reset isn't ok.

Make sja1105_static_config_reload() take the fdb_lock to fix that.

Fixes: 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index cefd72617af4..1a367e64bc3b 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2297,6 +2297,7 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
 	int rc, i;
 	s64 now;
 
+	mutex_lock(&priv->fdb_lock);
 	mutex_lock(&priv->mgmt_lock);
 
 	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
@@ -2409,6 +2410,7 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
 		goto out;
 out:
 	mutex_unlock(&priv->mgmt_lock);
+	mutex_unlock(&priv->fdb_lock);
 
 	return rc;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions
  2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
                   ` (4 preceding siblings ...)
  2023-09-08 13:33 ` [PATCH net 5/5] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset Vladimir Oltean
@ 2023-09-10 15:18 ` Simon Horman
  5 siblings, 0 replies; 7+ messages in thread
From: Simon Horman @ 2023-09-10 15:18 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David S. Miller, Xiaoliang Yang, Andrew Lunn, Florian Fainelli,
	Yanan Yang, linux-kernel

On Fri, Sep 08, 2023 at 04:33:47PM +0300, Vladimir Oltean wrote:
> A report by Yanan Yang has prompted an investigation into the sja1105
> driver's behavior w.r.t. multicast. The report states that when adding
> multicast L2 addresses with "bridge mdb add", only the most recently
> added address works - the others seem to be overwritten. This is solved
> by patch 3/5 (with patch 2/5 as a dependency for it).
> 
> Patches 4/5 and 5/5 fix a series of race conditions introduced during
> the same patch set as the bug above, namely this one:
> https://patchwork.kernel.org/project/netdevbpf/cover/20211024171757.3753288-1-vladimir.oltean@nxp.com/
> 
> Finally, patch 1/5 fixes an issue found ever since the introduction of
> multicast forwarding offload in sja1105, which is that the multicast
> addresses are visible (with the "self" flag) in "bridge fdb show".

For series,

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-09-10 15:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-08 13:33 [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Vladimir Oltean
2023-09-08 13:33 ` [PATCH net 1/5] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Vladimir Oltean
2023-09-08 13:33 ` [PATCH net 2/5] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid() Vladimir Oltean
2023-09-08 13:33 ` [PATCH net 3/5] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry Vladimir Oltean
2023-09-08 13:33 ` [PATCH net 4/5] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses Vladimir Oltean
2023-09-08 13:33 ` [PATCH net 5/5] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset Vladimir Oltean
2023-09-10 15:18 ` [PATCH net 0/5] Fixes for SJA1105 DSA FDB regressions Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).