[PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption
@ 2025-03-26 10:37 Tobias Waldekranz
  2025-03-26 11:07 ` Maxime Chevallier
  0 siblings, 1 reply; 3+ messages in thread
From: Tobias Waldekranz @ 2025-03-26 10:37 UTC (permalink / raw)
  To: davem, kuba
  Cc: maxime.chevallier, marcin.s.wojtas, linux, andrew, edumazet,
	pabeni, netdev

Protect the parser TCAM/SRAM memory, and the cached (shadow) SRAM
information, from concurrent modifications.

Both the TCAM and SRAM tables are indirectly accessed by configuring
an index register that selects the row to read or write to. This means
that operations must be atomic in order to, e.g., avoid spreading
writes across multiple rows. Since the shadow SRAM array is used to
find free rows in the hardware table, it must also be protected in
order to avoid TOCTOU errors where multiple cores allocate the same
row.

This issue was detected in a situation where `mvpp2_set_rx_mode()` ran
concurrently on two CPUs. In this particular case the
MVPP2_PE_MAC_UC_PROMISCUOUS entry was corrupted, causing the
classifier unit to drop all incoming unicast - indicated by the
`rx_classifier_drops` counter.

Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
---

@Andrew: I did finally manage to trigger sparse warnings that could be
silenced with __must_hold() annotations, but I still do not understand
how they work. I went back to the change that pulled this in:

https://lore.kernel.org/all/C5833F40-2EA6-43DA-B69C-AFF59E76E0C9@coraid.com/T/

The referenced function (tx()), still exists in aoenet.c. Using that
as a template, I could construct an unlock+lock sequence that
triggered a warning without __must_hold(). For example...

spin_unlock_bh(&priv->prs_spinlock);
if (net_ratelimit())
	schedule();
spin_lock_bh(&priv->prs_spinlock);

...would generate a warning. But this...

spin_unlock_bh(&priv->prs_spinlock);
net_ratelimit();
schedule();
spin_lock_bh(&priv->prs_spinlock);

...would not.

Reading through the sparse validation suite, it does not seem to have
any tests that covers this either:

https://web.git.kernel.org/pub/scm/devel/sparse/sparse.git/tree/validation/context.c

Therefore, I decided to take Jakub's advise and add lockdep assertions
instead. That necessitated some more changes, since tables are updated
in the init phase (where I originally omitted locking).

@Maxime: There was enough of a diff between v2->v3 that I did not feel
comfortable including your signoff/testing tags. Would it be possible
for you to run your tests again on this version?

v2 -> v3:
 - Note the fact that prs_spinlock also protects the SRAM shadow table
   (Andrew)
 - Add lockdep assertions for HW table accesses (Andrew/Jakub)
 - Add locking to the init path, since it uses the table access
   functions that are now annotated with lockdep assertions

v1 -> v2:
 - Parser memory is never modified from hard IRQ context, so settle
   for disabling bottom halves in critical sections. (Maxime)


 drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |   3 +
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |   3 +-
 .../net/ethernet/marvell/mvpp2/mvpp2_prs.c    | 184 ++++++++++++------
 3 files changed, 133 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index 44fe9b68d1c2..061fcd444d50 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -1113,6 +1113,9 @@ struct mvpp2 {
 
 	/* Spinlocks for CM3 shared memory configuration */
 	spinlock_t mss_spinlock;
+
+	/* Spinlock for shared PRS parser memory and shadow table */
+	spinlock_t prs_spinlock;
 };
 
 struct mvpp2_pcpu_stats {
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index dd76c1b7ed3a..c63e5f1b168a 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -7722,8 +7722,9 @@ static int mvpp2_probe(struct platform_device *pdev)
 	if (mvpp2_read(priv, MVPP2_VER_ID_REG) == MVPP2_VER_PP23)
 		priv->hw_version = MVPP23;
 
-	/* Init mss lock */
+	/* Init locks for shared packet processor resources */
 	spin_lock_init(&priv->mss_spinlock);
+	spin_lock_init(&priv->prs_spinlock);
 
 	/* Initialize network controller */
 	err = mvpp2_init(pdev, priv);
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
index 9af22f497a40..cdb3f3dba83f 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
@@ -23,6 +23,8 @@ static int mvpp2_prs_hw_write(struct mvpp2 *priv, struct mvpp2_prs_entry *pe)
 {
 	int i;
 
+	lockdep_assert_held(&priv->prs_spinlock);
+
 	if (pe->index > MVPP2_PRS_TCAM_SRAM_SIZE - 1)
 		return -EINVAL;
 
@@ -43,11 +45,13 @@ static int mvpp2_prs_hw_write(struct mvpp2 *priv, struct mvpp2_prs_entry *pe)
 }
 
 /* Initialize tcam entry from hw */
-int mvpp2_prs_init_from_hw(struct mvpp2 *priv, struct mvpp2_prs_entry *pe,
-			   int tid)
+static int mvpp2_prs_init_from_hw_unlocked(struct mvpp2 *priv,
+					   struct mvpp2_prs_entry *pe, int tid)
 {
 	int i;
 
+	lockdep_assert_held(&priv->prs_spinlock);
+
 	if (tid > MVPP2_PRS_TCAM_SRAM_SIZE - 1)
 		return -EINVAL;
 
@@ -73,6 +77,18 @@ int mvpp2_prs_init_from_hw(struct mvpp2 *priv, struct mvpp2_prs_entry *pe,
 	return 0;
 }
 
+int mvpp2_prs_init_from_hw(struct mvpp2 *priv, struct mvpp2_prs_entry *pe,
+			   int tid)
+{
+	int err;
+
+	spin_lock_bh(&priv->prs_spinlock);
+	err = mvpp2_prs_init_from_hw_unlocked(priv, pe, tid);
+	spin_unlock_bh(&priv->prs_spinlock);
+
+	return err;
+}
+
 /* Invalidate tcam hw entry */
 static void mvpp2_prs_hw_inv(struct mvpp2 *priv, int index)
 {
@@ -374,7 +390,7 @@ static int mvpp2_prs_flow_find(struct mvpp2 *priv, int flow)
 		    priv->prs_shadow[tid].lu != MVPP2_PRS_LU_FLOWS)
 			continue;
 
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 		bits = mvpp2_prs_sram_ai_get(&pe);
 
 		/* Sram store classification lookup ID in AI bits [5:0] */
@@ -441,7 +457,7 @@ static void mvpp2_prs_mac_drop_all_set(struct mvpp2 *priv, int port, bool add)
 
 	if (priv->prs_shadow[MVPP2_PE_DROP_ALL].valid) {
 		/* Entry exist - update port only */
-		mvpp2_prs_init_from_hw(priv, &pe, MVPP2_PE_DROP_ALL);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, MVPP2_PE_DROP_ALL);
 	} else {
 		/* Entry doesn't exist - create new */
 		memset(&pe, 0, sizeof(pe));
@@ -469,14 +485,17 @@ static void mvpp2_prs_mac_drop_all_set(struct mvpp2 *priv, int port, bool add)
 }
 
 /* Set port to unicast or multicast promiscuous mode */
-void mvpp2_prs_mac_promisc_set(struct mvpp2 *priv, int port,
-			       enum mvpp2_prs_l2_cast l2_cast, bool add)
+static void mvpp2_prs_mac_promisc_set_unlocked(struct mvpp2 *priv, int port,
+					       enum mvpp2_prs_l2_cast l2_cast,
+					       bool add)
 {
 	struct mvpp2_prs_entry pe;
 	unsigned char cast_match;
 	unsigned int ri;
 	int tid;
 
+	lockdep_assert_held(&priv->prs_spinlock);
+
 	if (l2_cast == MVPP2_PRS_L2_UNI_CAST) {
 		cast_match = MVPP2_PRS_UCAST_VAL;
 		tid = MVPP2_PE_MAC_UC_PROMISCUOUS;
@@ -489,7 +508,7 @@ void mvpp2_prs_mac_promisc_set(struct mvpp2 *priv, int port,
 
 	/* promiscuous mode - Accept unknown unicast or multicast packets */
 	if (priv->prs_shadow[tid].valid) {
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	} else {
 		memset(&pe, 0, sizeof(pe));
 		mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_MAC);
@@ -522,6 +541,14 @@ void mvpp2_prs_mac_promisc_set(struct mvpp2 *priv, int port,
 	mvpp2_prs_hw_write(priv, &pe);
 }
 
+void mvpp2_prs_mac_promisc_set(struct mvpp2 *priv, int port,
+			       enum mvpp2_prs_l2_cast l2_cast, bool add)
+{
+		spin_lock_bh(&priv->prs_spinlock);
+		mvpp2_prs_mac_promisc_set_unlocked(priv, port, l2_cast, add);
+		spin_unlock_bh(&priv->prs_spinlock);
+}
+
 /* Set entry for dsa packets */
 static void mvpp2_prs_dsa_tag_set(struct mvpp2 *priv, int port, bool add,
 				  bool tagged, bool extend)
@@ -539,7 +566,7 @@ static void mvpp2_prs_dsa_tag_set(struct mvpp2 *priv, int port, bool add,
 
 	if (priv->prs_shadow[tid].valid) {
 		/* Entry exist - update port only */
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	} else {
 		/* Entry doesn't exist - create new */
 		memset(&pe, 0, sizeof(pe));
@@ -610,7 +637,7 @@ static void mvpp2_prs_dsa_tag_ethertype_set(struct mvpp2 *priv, int port,
 
 	if (priv->prs_shadow[tid].valid) {
 		/* Entry exist - update port only */
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	} else {
 		/* Entry doesn't exist - create new */
 		memset(&pe, 0, sizeof(pe));
@@ -673,7 +700,7 @@ static int mvpp2_prs_vlan_find(struct mvpp2 *priv, unsigned short tpid, int ai)
 		    priv->prs_shadow[tid].lu != MVPP2_PRS_LU_VLAN)
 			continue;
 
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 		match = mvpp2_prs_tcam_data_cmp(&pe, 0, tpid);
 		if (!match)
 			continue;
@@ -726,7 +753,7 @@ static int mvpp2_prs_vlan_add(struct mvpp2 *priv, unsigned short tpid, int ai,
 			    priv->prs_shadow[tid_aux].lu != MVPP2_PRS_LU_VLAN)
 				continue;
 
-			mvpp2_prs_init_from_hw(priv, &pe, tid_aux);
+			mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid_aux);
 			ri_bits = mvpp2_prs_sram_ri_get(&pe);
 			if ((ri_bits & MVPP2_PRS_RI_VLAN_MASK) ==
 			    MVPP2_PRS_RI_VLAN_DOUBLE)
@@ -760,7 +787,7 @@ static int mvpp2_prs_vlan_add(struct mvpp2 *priv, unsigned short tpid, int ai,
 
 		mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_VLAN);
 	} else {
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	}
 	/* Update ports' mask */
 	mvpp2_prs_tcam_port_map_set(&pe, port_map);
@@ -800,7 +827,7 @@ static int mvpp2_prs_double_vlan_find(struct mvpp2 *priv, unsigned short tpid1,
 		    priv->prs_shadow[tid].lu != MVPP2_PRS_LU_VLAN)
 			continue;
 
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 
 		match = mvpp2_prs_tcam_data_cmp(&pe, 0, tpid1) &&
 			mvpp2_prs_tcam_data_cmp(&pe, 4, tpid2);
@@ -849,7 +876,7 @@ static int mvpp2_prs_double_vlan_add(struct mvpp2 *priv, unsigned short tpid1,
 			    priv->prs_shadow[tid_aux].lu != MVPP2_PRS_LU_VLAN)
 				continue;
 
-			mvpp2_prs_init_from_hw(priv, &pe, tid_aux);
+			mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid_aux);
 			ri_bits = mvpp2_prs_sram_ri_get(&pe);
 			ri_bits &= MVPP2_PRS_RI_VLAN_MASK;
 			if (ri_bits == MVPP2_PRS_RI_VLAN_SINGLE ||
@@ -880,7 +907,7 @@ static int mvpp2_prs_double_vlan_add(struct mvpp2 *priv, unsigned short tpid1,
 
 		mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_VLAN);
 	} else {
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	}
 
 	/* Update ports' mask */
@@ -1213,8 +1240,8 @@ static void mvpp2_prs_mac_init(struct mvpp2 *priv)
 	/* Create dummy entries for drop all and promiscuous modes */
 	mvpp2_prs_drop_fc(priv);
 	mvpp2_prs_mac_drop_all_set(priv, 0, false);
-	mvpp2_prs_mac_promisc_set(priv, 0, MVPP2_PRS_L2_UNI_CAST, false);
-	mvpp2_prs_mac_promisc_set(priv, 0, MVPP2_PRS_L2_MULTI_CAST, false);
+	mvpp2_prs_mac_promisc_set_unlocked(priv, 0, MVPP2_PRS_L2_UNI_CAST, false);
+	mvpp2_prs_mac_promisc_set_unlocked(priv, 0, MVPP2_PRS_L2_MULTI_CAST, false);
 }
 
 /* Set default entries for various types of dsa packets */
@@ -1941,7 +1968,7 @@ static int mvpp2_prs_vid_range_find(struct mvpp2_port *port, u16 vid, u16 mask)
 		    port->priv->prs_shadow[tid].lu != MVPP2_PRS_LU_VID)
 			continue;
 
-		mvpp2_prs_init_from_hw(port->priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(port->priv, &pe, tid);
 
 		mvpp2_prs_tcam_data_byte_get(&pe, 2, &byte[0], &enable[0]);
 		mvpp2_prs_tcam_data_byte_get(&pe, 3, &byte[1], &enable[1]);
@@ -1970,6 +1997,8 @@ int mvpp2_prs_vid_entry_add(struct mvpp2_port *port, u16 vid)
 
 	memset(&pe, 0, sizeof(pe));
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	/* Scan TCAM and see if entry with this <vid,port> already exist */
 	tid = mvpp2_prs_vid_range_find(port, vid, mask);
 
@@ -1988,8 +2017,10 @@ int mvpp2_prs_vid_entry_add(struct mvpp2_port *port, u16 vid)
 						MVPP2_PRS_VLAN_FILT_MAX_ENTRY);
 
 		/* There isn't room for a new VID filter */
-		if (tid < 0)
+		if (tid < 0) {
+			spin_unlock_bh(&priv->prs_spinlock);
 			return tid;
+		}
 
 		mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_VID);
 		pe.index = tid;
@@ -1997,7 +2028,7 @@ int mvpp2_prs_vid_entry_add(struct mvpp2_port *port, u16 vid)
 		/* Mask all ports */
 		mvpp2_prs_tcam_port_map_set(&pe, 0);
 	} else {
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	}
 
 	/* Enable the current port */
@@ -2019,6 +2050,7 @@ int mvpp2_prs_vid_entry_add(struct mvpp2_port *port, u16 vid)
 	mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_VID);
 	mvpp2_prs_hw_write(priv, &pe);
 
+	spin_unlock_bh(&priv->prs_spinlock);
 	return 0;
 }
 
@@ -2028,15 +2060,16 @@ void mvpp2_prs_vid_entry_remove(struct mvpp2_port *port, u16 vid)
 	struct mvpp2 *priv = port->priv;
 	int tid;
 
-	/* Scan TCAM and see if entry with this <vid,port> already exist */
-	tid = mvpp2_prs_vid_range_find(port, vid, 0xfff);
+	spin_lock_bh(&priv->prs_spinlock);
 
-	/* No such entry */
-	if (tid < 0)
-		return;
+	/* Invalidate TCAM entry with this <vid,port>, if it exists */
+	tid = mvpp2_prs_vid_range_find(port, vid, 0xfff);
+	if (tid >= 0) {
+		mvpp2_prs_hw_inv(priv, tid);
+		priv->prs_shadow[tid].valid = false;
+	}
 
-	mvpp2_prs_hw_inv(priv, tid);
-	priv->prs_shadow[tid].valid = false;
+	spin_unlock_bh(&priv->prs_spinlock);
 }
 
 /* Remove all existing VID filters on this port */
@@ -2045,6 +2078,8 @@ void mvpp2_prs_vid_remove_all(struct mvpp2_port *port)
 	struct mvpp2 *priv = port->priv;
 	int tid;
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	for (tid = MVPP2_PRS_VID_PORT_FIRST(port->id);
 	     tid <= MVPP2_PRS_VID_PORT_LAST(port->id); tid++) {
 		if (priv->prs_shadow[tid].valid) {
@@ -2052,6 +2087,8 @@ void mvpp2_prs_vid_remove_all(struct mvpp2_port *port)
 			priv->prs_shadow[tid].valid = false;
 		}
 	}
+
+	spin_unlock_bh(&priv->prs_spinlock);
 }
 
 /* Remove VID filering entry for this port */
@@ -2060,10 +2097,14 @@ void mvpp2_prs_vid_disable_filtering(struct mvpp2_port *port)
 	unsigned int tid = MVPP2_PRS_VID_PORT_DFLT(port->id);
 	struct mvpp2 *priv = port->priv;
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	/* Invalidate the guard entry */
 	mvpp2_prs_hw_inv(priv, tid);
 
 	priv->prs_shadow[tid].valid = false;
+
+	spin_unlock_bh(&priv->prs_spinlock);
 }
 
 /* Add guard entry that drops packets when no VID is matched on this port */
@@ -2079,6 +2120,8 @@ void mvpp2_prs_vid_enable_filtering(struct mvpp2_port *port)
 
 	memset(&pe, 0, sizeof(pe));
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	pe.index = tid;
 
 	reg_val = mvpp2_read(priv, MVPP2_MH_REG(port->id));
@@ -2111,6 +2154,8 @@ void mvpp2_prs_vid_enable_filtering(struct mvpp2_port *port)
 	/* Update shadow table */
 	mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_VID);
 	mvpp2_prs_hw_write(priv, &pe);
+
+	spin_unlock_bh(&priv->prs_spinlock);
 }
 
 /* Parser default initialization */
@@ -2118,6 +2163,8 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
 {
 	int err, index, i;
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	/* Enable tcam table */
 	mvpp2_write(priv, MVPP2_PRS_TCAM_CTRL_REG, MVPP2_PRS_TCAM_EN_MASK);
 
@@ -2139,8 +2186,10 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
 	priv->prs_shadow = devm_kcalloc(&pdev->dev, MVPP2_PRS_TCAM_SRAM_SIZE,
 					sizeof(*priv->prs_shadow),
 					GFP_KERNEL);
-	if (!priv->prs_shadow)
-		return -ENOMEM;
+	if (!priv->prs_shadow) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
 
 	/* Always start from lookup = 0 */
 	for (index = 0; index < MVPP2_MAX_PORTS; index++)
@@ -2158,26 +2207,14 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
 	mvpp2_prs_vid_init(priv);
 
 	err = mvpp2_prs_etype_init(priv);
-	if (err)
-		return err;
+	err = err ? : mvpp2_prs_vlan_init(pdev, priv);
+	err = err ? : mvpp2_prs_pppoe_init(priv);
+	err = err ? : mvpp2_prs_ip6_init(priv);
+	err = err ? : mvpp2_prs_ip4_init(priv);
 
-	err = mvpp2_prs_vlan_init(pdev, priv);
-	if (err)
-		return err;
-
-	err = mvpp2_prs_pppoe_init(priv);
-	if (err)
-		return err;
-
-	err = mvpp2_prs_ip6_init(priv);
-	if (err)
-		return err;
-
-	err = mvpp2_prs_ip4_init(priv);
-	if (err)
-		return err;
-
-	return 0;
+out_unlock:
+	spin_unlock_bh(&priv->prs_spinlock);
+	return err;
 }
 
 /* Compare MAC DA with tcam entry data */
@@ -2217,7 +2254,7 @@ mvpp2_prs_mac_da_range_find(struct mvpp2 *priv, int pmap, const u8 *da,
 		    (priv->prs_shadow[tid].udf != udf_type))
 			continue;
 
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 		entry_pmap = mvpp2_prs_tcam_port_map_get(&pe);
 
 		if (mvpp2_prs_mac_range_equals(&pe, da, mask) &&
@@ -2229,7 +2266,8 @@ mvpp2_prs_mac_da_range_find(struct mvpp2 *priv, int pmap, const u8 *da,
 }
 
 /* Update parser's mac da entry */
-int mvpp2_prs_mac_da_accept(struct mvpp2_port *port, const u8 *da, bool add)
+static int mvpp2_prs_mac_da_accept_unlocked(struct mvpp2_port *port,
+					    const u8 *da, bool add)
 {
 	unsigned char mask[ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
 	struct mvpp2 *priv = port->priv;
@@ -2261,7 +2299,7 @@ int mvpp2_prs_mac_da_accept(struct mvpp2_port *port, const u8 *da, bool add)
 		/* Mask all ports */
 		mvpp2_prs_tcam_port_map_set(&pe, 0);
 	} else {
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 	}
 
 	mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_MAC);
@@ -2317,6 +2355,17 @@ int mvpp2_prs_mac_da_accept(struct mvpp2_port *port, const u8 *da, bool add)
 	return 0;
 }
 
+int mvpp2_prs_mac_da_accept(struct mvpp2_port *port, const u8 *da, bool add)
+{
+	int err;
+
+	spin_lock_bh(&port->priv->prs_spinlock);
+	err = mvpp2_prs_mac_da_accept_unlocked(port, da, add);
+	spin_unlock_bh(&port->priv->prs_spinlock);
+
+	return err;
+}
+
 int mvpp2_prs_update_mac_da(struct net_device *dev, const u8 *da)
 {
 	struct mvpp2_port *port = netdev_priv(dev);
@@ -2345,6 +2394,8 @@ void mvpp2_prs_mac_del_all(struct mvpp2_port *port)
 	unsigned long pmap;
 	int index, tid;
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	for (tid = MVPP2_PE_MAC_RANGE_START;
 	     tid <= MVPP2_PE_MAC_RANGE_END; tid++) {
 		unsigned char da[ETH_ALEN], da_mask[ETH_ALEN];
@@ -2354,7 +2405,7 @@ void mvpp2_prs_mac_del_all(struct mvpp2_port *port)
 		    (priv->prs_shadow[tid].udf != MVPP2_PRS_UDF_MAC_DEF))
 			continue;
 
-		mvpp2_prs_init_from_hw(priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(priv, &pe, tid);
 
 		pmap = mvpp2_prs_tcam_port_map_get(&pe);
 
@@ -2375,14 +2426,17 @@ void mvpp2_prs_mac_del_all(struct mvpp2_port *port)
 			continue;
 
 		/* Remove entry from TCAM */
-		mvpp2_prs_mac_da_accept(port, da, false);
+		mvpp2_prs_mac_da_accept_unlocked(port, da, false);
 	}
+
+	spin_unlock_bh(&priv->prs_spinlock);
 }
 
 int mvpp2_prs_tag_mode_set(struct mvpp2 *priv, int port, int type)
 {
 	switch (type) {
 	case MVPP2_TAG_TYPE_EDSA:
+		spin_lock_bh(&priv->prs_spinlock);
 		/* Add port to EDSA entries */
 		mvpp2_prs_dsa_tag_set(priv, port, true,
 				      MVPP2_PRS_TAGGED, MVPP2_PRS_EDSA);
@@ -2393,9 +2447,11 @@ int mvpp2_prs_tag_mode_set(struct mvpp2 *priv, int port, int type)
 				      MVPP2_PRS_TAGGED, MVPP2_PRS_DSA);
 		mvpp2_prs_dsa_tag_set(priv, port, false,
 				      MVPP2_PRS_UNTAGGED, MVPP2_PRS_DSA);
+		spin_unlock_bh(&priv->prs_spinlock);
 		break;
 
 	case MVPP2_TAG_TYPE_DSA:
+		spin_lock_bh(&priv->prs_spinlock);
 		/* Add port to DSA entries */
 		mvpp2_prs_dsa_tag_set(priv, port, true,
 				      MVPP2_PRS_TAGGED, MVPP2_PRS_DSA);
@@ -2406,10 +2462,12 @@ int mvpp2_prs_tag_mode_set(struct mvpp2 *priv, int port, int type)
 				      MVPP2_PRS_TAGGED, MVPP2_PRS_EDSA);
 		mvpp2_prs_dsa_tag_set(priv, port, false,
 				      MVPP2_PRS_UNTAGGED, MVPP2_PRS_EDSA);
+		spin_unlock_bh(&priv->prs_spinlock);
 		break;
 
 	case MVPP2_TAG_TYPE_MH:
 	case MVPP2_TAG_TYPE_NONE:
+		spin_lock_bh(&priv->prs_spinlock);
 		/* Remove port form EDSA and DSA entries */
 		mvpp2_prs_dsa_tag_set(priv, port, false,
 				      MVPP2_PRS_TAGGED, MVPP2_PRS_DSA);
@@ -2419,6 +2477,7 @@ int mvpp2_prs_tag_mode_set(struct mvpp2 *priv, int port, int type)
 				      MVPP2_PRS_TAGGED, MVPP2_PRS_EDSA);
 		mvpp2_prs_dsa_tag_set(priv, port, false,
 				      MVPP2_PRS_UNTAGGED, MVPP2_PRS_EDSA);
+		spin_unlock_bh(&priv->prs_spinlock);
 		break;
 
 	default:
@@ -2437,11 +2496,15 @@ int mvpp2_prs_add_flow(struct mvpp2 *priv, int flow, u32 ri, u32 ri_mask)
 
 	memset(&pe, 0, sizeof(pe));
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	tid = mvpp2_prs_tcam_first_free(priv,
 					MVPP2_PE_LAST_FREE_TID,
 					MVPP2_PE_FIRST_FREE_TID);
-	if (tid < 0)
+	if (tid < 0) {
+		spin_unlock_bh(&priv->prs_spinlock);
 		return tid;
+	}
 
 	pe.index = tid;
 
@@ -2461,6 +2524,7 @@ int mvpp2_prs_add_flow(struct mvpp2 *priv, int flow, u32 ri, u32 ri_mask)
 	mvpp2_prs_tcam_port_map_set(&pe, MVPP2_PRS_PORT_MASK);
 	mvpp2_prs_hw_write(priv, &pe);
 
+	spin_unlock_bh(&priv->prs_spinlock);
 	return 0;
 }
 
@@ -2472,6 +2536,8 @@ int mvpp2_prs_def_flow(struct mvpp2_port *port)
 
 	memset(&pe, 0, sizeof(pe));
 
+	spin_lock_bh(&port->priv->prs_spinlock);
+
 	tid = mvpp2_prs_flow_find(port->priv, port->id);
 
 	/* Such entry not exist */
@@ -2480,8 +2546,10 @@ int mvpp2_prs_def_flow(struct mvpp2_port *port)
 		tid = mvpp2_prs_tcam_first_free(port->priv,
 						MVPP2_PE_LAST_FREE_TID,
 					       MVPP2_PE_FIRST_FREE_TID);
-		if (tid < 0)
+		if (tid < 0) {
+			spin_unlock_bh(&port->priv->prs_spinlock);
 			return tid;
+		}
 
 		pe.index = tid;
 
@@ -2492,13 +2560,14 @@ int mvpp2_prs_def_flow(struct mvpp2_port *port)
 		/* Update shadow table */
 		mvpp2_prs_shadow_set(port->priv, pe.index, MVPP2_PRS_LU_FLOWS);
 	} else {
-		mvpp2_prs_init_from_hw(port->priv, &pe, tid);
+		mvpp2_prs_init_from_hw_unlocked(port->priv, &pe, tid);
 	}
 
 	mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_FLOWS);
 	mvpp2_prs_tcam_port_map_set(&pe, (1 << port->id));
 	mvpp2_prs_hw_write(port->priv, &pe);
 
+	spin_unlock_bh(&port->priv->prs_spinlock);
 	return 0;
 }
 
@@ -2509,11 +2578,14 @@ int mvpp2_prs_hits(struct mvpp2 *priv, int index)
 	if (index > MVPP2_PRS_TCAM_SRAM_SIZE)
 		return -EINVAL;
 
+	spin_lock_bh(&priv->prs_spinlock);
+
 	mvpp2_write(priv, MVPP2_PRS_TCAM_HIT_IDX_REG, index);
 
 	val = mvpp2_read(priv, MVPP2_PRS_TCAM_HIT_CNT_REG);
 
 	val &= MVPP2_PRS_TCAM_HIT_CNT_MASK;
 
+	spin_unlock_bh(&priv->prs_spinlock);
 	return val;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption
  2025-03-26 10:37 [PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption Tobias Waldekranz
@ 2025-03-26 11:07 ` Maxime Chevallier
  2025-03-26 12:27   ` Tobias Waldekranz
  0 siblings, 1 reply; 3+ messages in thread
From: Maxime Chevallier @ 2025-03-26 11:07 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: davem, kuba, marcin.s.wojtas, linux, andrew, edumazet, pabeni,
	netdev

Hi Tobias,

On Wed, 26 Mar 2025 11:37:33 +0100
Tobias Waldekranz <tobias@waldekranz.com> wrote:

> Protect the parser TCAM/SRAM memory, and the cached (shadow) SRAM
> information, from concurrent modifications.
> 
> Both the TCAM and SRAM tables are indirectly accessed by configuring
> an index register that selects the row to read or write to. This means
> that operations must be atomic in order to, e.g., avoid spreading
> writes across multiple rows. Since the shadow SRAM array is used to
> find free rows in the hardware table, it must also be protected in
> order to avoid TOCTOU errors where multiple cores allocate the same
> row.
> 
> This issue was detected in a situation where `mvpp2_set_rx_mode()` ran
> concurrently on two CPUs. In this particular case the
> MVPP2_PE_MAC_UC_PROMISCUOUS entry was corrupted, causing the
> classifier unit to drop all incoming unicast - indicated by the
> `rx_classifier_drops` counter.
> 
> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> ---
> 
> @Andrew: I did finally manage to trigger sparse warnings that could be
> silenced with __must_hold() annotations, but I still do not understand
> how they work. I went back to the change that pulled this in:
> 
> https://lore.kernel.org/all/C5833F40-2EA6-43DA-B69C-AFF59E76E0C9@coraid.com/T/
> 
> The referenced function (tx()), still exists in aoenet.c. Using that
> as a template, I could construct an unlock+lock sequence that
> triggered a warning without __must_hold(). For example...
> 
> spin_unlock_bh(&priv->prs_spinlock);
> if (net_ratelimit())
> 	schedule();
> spin_lock_bh(&priv->prs_spinlock);
> 
> ...would generate a warning. But this...
> 
> spin_unlock_bh(&priv->prs_spinlock);
> net_ratelimit();
> schedule();
> spin_lock_bh(&priv->prs_spinlock);
> 
> ...would not.
> 
> Reading through the sparse validation suite, it does not seem to have
> any tests that covers this either:
> 
> https://web.git.kernel.org/pub/scm/devel/sparse/sparse.git/tree/validation/context.c
> 
> Therefore, I decided to take Jakub's advise and add lockdep assertions
> instead. That necessitated some more changes, since tables are updated
> in the init phase (where I originally omitted locking).
> 
> @Maxime: There was enough of a diff between v2->v3 that I did not feel
> comfortable including your signoff/testing tags. Would it be possible
> for you to run your tests again on this version?

Sure thing, although I do have some comments :)

[...]

>  /* Parser default initialization */
> @@ -2118,6 +2163,8 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
>  {
>  	int err, index, i;
>  
> +	spin_lock_bh(&priv->prs_spinlock);
> +
>  	/* Enable tcam table */
>  	mvpp2_write(priv, MVPP2_PRS_TCAM_CTRL_REG, MVPP2_PRS_TCAM_EN_MASK);
>  
> @@ -2139,8 +2186,10 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
>  	priv->prs_shadow = devm_kcalloc(&pdev->dev, MVPP2_PRS_TCAM_SRAM_SIZE,
>  					sizeof(*priv->prs_shadow),
>  					GFP_KERNEL);

GFP_KERNEL alloc while holding a spinlock isn't correct and triggers a
splat when building when CONFIG_DEBUG_ATOMIC_SLEEP :

[    4.380325] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
[    4.389217] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
[    4.397120] preempt_count: 201, expected: 0
[    4.401358] RCU nest depth: 0, expected: 0
[    4.405507] 2 locks held by swapper/0/1:
[    4.409488]  #0: ffff000100e168f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x8c/0x1ac
[    4.417971]  #1: ffff00010ae15368 (&priv->prs_spinlock){+...}-{3:3}, at: mvpp2_prs_default_init+0x50/0x1570
[    4.427843] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc7-01963-g02bf787e4750 #68
[    4.427851] Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
[    4.427855] Call trace:
[    4.427858]  show_stack+0x18/0x24 (C)
[    4.427867]  dump_stack_lvl+0xd8/0xf0
[    4.427875]  dump_stack+0x18/0x24
[    4.427880]  __might_resched+0x148/0x24c
[    4.427890]  __might_sleep+0x48/0x7c
[    4.427897]  __kmalloc_node_track_caller_noprof+0x200/0x480
[    4.427903]  devm_kmalloc+0x54/0x118
[    4.427910]  mvpp2_prs_default_init+0x138/0x1570
[    4.427919]  mvpp2_probe+0x904/0xfa4
[    4.427926]  platform_probe+0x68/0xc8
[...]

I suggest you move that alloc and associated error handling outside of
the spinlock.

Thanks,

Maxime

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption
  2025-03-26 11:07 ` Maxime Chevallier
@ 2025-03-26 12:27   ` Tobias Waldekranz
  0 siblings, 0 replies; 3+ messages in thread
From: Tobias Waldekranz @ 2025-03-26 12:27 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: davem, kuba, marcin.s.wojtas, linux, andrew, edumazet, pabeni,
	netdev

On ons, mar 26, 2025 at 12:07, Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:
> Hi Tobias,
>
> On Wed, 26 Mar 2025 11:37:33 +0100
> Tobias Waldekranz <tobias@waldekranz.com> wrote:
>
>> Protect the parser TCAM/SRAM memory, and the cached (shadow) SRAM
>> information, from concurrent modifications.
>> 
>> Both the TCAM and SRAM tables are indirectly accessed by configuring
>> an index register that selects the row to read or write to. This means
>> that operations must be atomic in order to, e.g., avoid spreading
>> writes across multiple rows. Since the shadow SRAM array is used to
>> find free rows in the hardware table, it must also be protected in
>> order to avoid TOCTOU errors where multiple cores allocate the same
>> row.
>> 
>> This issue was detected in a situation where `mvpp2_set_rx_mode()` ran
>> concurrently on two CPUs. In this particular case the
>> MVPP2_PE_MAC_UC_PROMISCUOUS entry was corrupted, causing the
>> classifier unit to drop all incoming unicast - indicated by the
>> `rx_classifier_drops` counter.
>> 
>> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>> ---
>> 
>> @Andrew: I did finally manage to trigger sparse warnings that could be
>> silenced with __must_hold() annotations, but I still do not understand
>> how they work. I went back to the change that pulled this in:
>> 
>> https://lore.kernel.org/all/C5833F40-2EA6-43DA-B69C-AFF59E76E0C9@coraid.com/T/
>> 
>> The referenced function (tx()), still exists in aoenet.c. Using that
>> as a template, I could construct an unlock+lock sequence that
>> triggered a warning without __must_hold(). For example...
>> 
>> spin_unlock_bh(&priv->prs_spinlock);
>> if (net_ratelimit())
>> 	schedule();
>> spin_lock_bh(&priv->prs_spinlock);
>> 
>> ...would generate a warning. But this...
>> 
>> spin_unlock_bh(&priv->prs_spinlock);
>> net_ratelimit();
>> schedule();
>> spin_lock_bh(&priv->prs_spinlock);
>> 
>> ...would not.
>> 
>> Reading through the sparse validation suite, it does not seem to have
>> any tests that covers this either:
>> 
>> https://web.git.kernel.org/pub/scm/devel/sparse/sparse.git/tree/validation/context.c
>> 
>> Therefore, I decided to take Jakub's advise and add lockdep assertions
>> instead. That necessitated some more changes, since tables are updated
>> in the init phase (where I originally omitted locking).
>> 
>> @Maxime: There was enough of a diff between v2->v3 that I did not feel
>> comfortable including your signoff/testing tags. Would it be possible
>> for you to run your tests again on this version?
>
> Sure thing, although I do have some comments :)
>
> [...]
>
>>  /* Parser default initialization */
>> @@ -2118,6 +2163,8 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
>>  {
>>  	int err, index, i;
>>  
>> +	spin_lock_bh(&priv->prs_spinlock);
>> +
>>  	/* Enable tcam table */
>>  	mvpp2_write(priv, MVPP2_PRS_TCAM_CTRL_REG, MVPP2_PRS_TCAM_EN_MASK);
>>  
>> @@ -2139,8 +2186,10 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
>>  	priv->prs_shadow = devm_kcalloc(&pdev->dev, MVPP2_PRS_TCAM_SRAM_SIZE,
>>  					sizeof(*priv->prs_shadow),
>>  					GFP_KERNEL);
>
> GFP_KERNEL alloc while holding a spinlock isn't correct and triggers a
> splat when building when CONFIG_DEBUG_ATOMIC_SLEEP :

I think I had pretty much every other debug flag enabled in my config :)

Thanks for catching this!

> [    4.380325] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
> [    4.389217] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
> [    4.397120] preempt_count: 201, expected: 0
> [    4.401358] RCU nest depth: 0, expected: 0
> [    4.405507] 2 locks held by swapper/0/1:
> [    4.409488]  #0: ffff000100e168f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x8c/0x1ac
> [    4.417971]  #1: ffff00010ae15368 (&priv->prs_spinlock){+...}-{3:3}, at: mvpp2_prs_default_init+0x50/0x1570
> [    4.427843] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc7-01963-g02bf787e4750 #68
> [    4.427851] Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
> [    4.427855] Call trace:
> [    4.427858]  show_stack+0x18/0x24 (C)
> [    4.427867]  dump_stack_lvl+0xd8/0xf0
> [    4.427875]  dump_stack+0x18/0x24
> [    4.427880]  __might_resched+0x148/0x24c
> [    4.427890]  __might_sleep+0x48/0x7c
> [    4.427897]  __kmalloc_node_track_caller_noprof+0x200/0x480
> [    4.427903]  devm_kmalloc+0x54/0x118
> [    4.427910]  mvpp2_prs_default_init+0x138/0x1570
> [    4.427919]  mvpp2_probe+0x904/0xfa4
> [    4.427926]  platform_probe+0x68/0xc8
> [...]
>
> I suggest you move that alloc and associated error handling outside of
> the spinlock.

Will do. Sorry for the noise. I have fixed this locally - will send v4
as soon as the rules permit.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-03-26 12:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-26 10:37 [PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption Tobias Waldekranz
2025-03-26 11:07 ` Maxime Chevallier
2025-03-26 12:27   ` Tobias Waldekranz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).