* [PATCH net v3 0/3] net: bcmgenet: fix queue lock up
From: Justin Chen @ 2026-04-06 17:57 UTC (permalink / raw)
To: netdev
Cc: pabeni, kuba, edumazet, davem, andrew+netdev,
bcm-kernel-feedback-list, florian.fainelli, opendmb, nb,
Justin Chen
We have been seeing reports of logs like this:
[ 39.713199] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 7991 ms
[ 41.761198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 10039 ms
[ 43.745198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 12023 ms
[ 45.729198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 14007 ms
We have two issues: the persistent queue timeouts and the eventual lock up
of the entire transmit path. We address the lock up issue first. The queue
timeouts are due to a fundamental design issue, not a bug per se. Timeouts
still persist with this series, but we should no longer lock up.
v3
- Dropped this patch. Not relevant yet.
    net: bcmgenet: relax the xmit ring full case
v2
- These patches stayed the same.
    net: bcmgenet: fix off-by-one in bcmgenet_put_txcb
    net: bcmgenet: fix leaking free_bds
- New patches that fix a few other slightly related issues.
    net: bcmgenet: fix off-by-one in bcmgenet_put_txcb
    net: bcmgenet: relax the xmit ring full case
Justin Chen (3):
net: bcmgenet: fix off-by-one in bcmgenet_put_txcb
net: bcmgenet: fix leaking free_bds
net: bcmgenet: fix racing timeout handler
.../net/ethernet/broadcom/genet/bcmgenet.c | 30 +++++++++----------
1 file changed, 14 insertions(+), 16 deletions(-)
--
2.34.1
* [PATCH net v3 1/3] net: bcmgenet: fix off-by-one in bcmgenet_put_txcb
From: Justin Chen @ 2026-04-06 17:57 UTC (permalink / raw)
To: netdev
Cc: pabeni, kuba, edumazet, davem, andrew+netdev,
bcm-kernel-feedback-list, florian.fainelli, opendmb, nb,
Justin Chen
The write_ptr points to the next open tx_cb. We want to return the
tx_cb that gets rewound, so we must rewind the pointer first and then
return the tx_cb that it points to. That way the tx_cb can be correctly
cleaned up.
Fixes: 876dbadd53a7 ("net: bcmgenet: Fix unmapping of fragments in bcmgenet_xmit()")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
---
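Note (not part of the commit message): the ordering problem can be
reproduced in user space with a toy ring. This is only a minimal sketch.
The field names mirror the ones used in the diff, but the toy_* types and
functions and the four-slot layout are hypothetical and are not the
driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct enet_cb; the id only identifies the slot. */
struct toy_cb {
	int id;
};

/* Toy stand-in for the tx ring; field names follow the patch. */
struct toy_ring {
	struct toy_cb *cbs;	/* first control block of this ring */
	unsigned int cb_ptr;	/* global index of the first slot */
	unsigned int end_ptr;	/* global index of the last slot */
	unsigned int write_ptr;	/* next open slot */
};

/* Pre-patch order: look up the slot, then rewind.
 * This returns the still-open slot, not the one being given back.
 */
static struct toy_cb *toy_put_txcb_old(struct toy_ring *ring)
{
	struct toy_cb *cb = ring->cbs + (ring->write_ptr - ring->cb_ptr);

	if (ring->write_ptr == ring->cb_ptr)
		ring->write_ptr = ring->end_ptr;
	else
		ring->write_ptr--;

	return cb;
}

/* Patched order: rewind first, then look up the slot the pointer now
 * refers to, which is the most recently filled one.
 */
static struct toy_cb *toy_put_txcb_new(struct toy_ring *ring)
{
	assert(ring->write_ptr >= ring->cb_ptr &&
	       ring->write_ptr <= ring->end_ptr);

	if (ring->write_ptr == ring->cb_ptr)
		ring->write_ptr = ring->end_ptr;
	else
		ring->write_ptr--;

	return ring->cbs + (ring->write_ptr - ring->cb_ptr);
}
```

With cb_ptr = 8, end_ptr = 11 and write_ptr = 10, the old order hands back
the slot at offset 2 (the next open one), while the patched order hands
back offset 1, the slot that was filled last. When write_ptr equals
cb_ptr, the patched order wraps to the last slot of the ring.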
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 482a31e7b72b..0f6e4baba25b 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1819,15 +1819,15 @@ static struct enet_cb *bcmgenet_put_txcb(struct bcmgenet_priv *priv,
{
struct enet_cb *tx_cb_ptr;
- tx_cb_ptr = ring->cbs;
- tx_cb_ptr += ring->write_ptr - ring->cb_ptr;
-
/* Rewinding local write pointer */
if (ring->write_ptr == ring->cb_ptr)
ring->write_ptr = ring->end_ptr;
else
ring->write_ptr--;
+ tx_cb_ptr = ring->cbs;
+ tx_cb_ptr += ring->write_ptr - ring->cb_ptr;
+
return tx_cb_ptr;
}
--
2.34.1
* [PATCH net v3 2/3] net: bcmgenet: fix leaking free_bds
From: Justin Chen @ 2026-04-06 17:57 UTC (permalink / raw)
To: netdev
Cc: pabeni, kuba, edumazet, davem, andrew+netdev,
bcm-kernel-feedback-list, florian.fainelli, opendmb, nb,
Justin Chen
While reclaiming the tx queue we fast forward the write pointer to
drop any data in flight. These dropped frames are not added back
to the pool of free bds. We also need to tell the netdev that we
are dropping said data.
Fixes: f1bacae8b655 ("net: bcmgenet: support reclaiming unsent Tx packets")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Tested-by: Nicolai Buchwitz <nb@tipi-net.de>
---
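Note (not part of the commit message): a minimal user-space sketch of the
accounting, assuming a 16-bit hardware index and a ring where every
completed descriptor has already been reclaimed. TOY_INDEX_MASK, the
toy_* names and the credit_free_bds switch are hypothetical and exist
only to contrast the two behaviours.

```c
#include <assert.h>

/* Assumption: producer/consumer indices are 16-bit counters that wrap. */
#define TOY_INDEX_MASK 0xFFFFu

struct toy_tx_ring {
	unsigned int prod_index;	/* descriptors handed to hardware */
	unsigned int c_index;		/* descriptors hardware has completed */
	unsigned int free_bds;		/* descriptors available for xmit */
	unsigned int size;		/* total descriptors in the ring */
};

/* Drop everything still in flight by pulling prod_index back to c_index.
 * If credit_free_bds is zero (pre-patch behaviour), the dropped
 * descriptors are never returned to the free pool.
 */
static unsigned int toy_drop_in_flight(struct toy_tx_ring *ring,
				       int credit_free_bds)
{
	/* Masked subtraction stays correct across index wrap-around. */
	unsigned int drop = (ring->prod_index - ring->c_index) & TOY_INDEX_MASK;

	ring->prod_index = ring->c_index & TOY_INDEX_MASK;
	if (credit_free_bds)
		ring->free_bds += drop;

	assert(ring->free_bds <= ring->size);
	return drop;
}
```

For a 32-entry ring with c_index = 0xFFFE, prod_index = 0x0003 and
free_bds = 27, five descriptors are in flight. Crediting them brings
free_bds back to 32. Without the credit it stays at 27, and each further
reclaim of this kind shrinks the usable ring.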
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 0f6e4baba25b..e89126a0c20e 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1985,6 +1985,7 @@ static unsigned int bcmgenet_tx_reclaim(struct net_device *dev,
drop = (ring->prod_index - ring->c_index) & DMA_C_INDEX_MASK;
released += drop;
ring->prod_index = ring->c_index & DMA_C_INDEX_MASK;
+ ring->free_bds += drop;
while (drop--) {
cb_ptr = bcmgenet_put_txcb(priv, ring);
skb = cb_ptr->skb;
@@ -1996,6 +1997,7 @@ static unsigned int bcmgenet_tx_reclaim(struct net_device *dev,
}
if (skb)
dev_consume_skb_any(skb);
+ netdev_tx_reset_queue(netdev_get_tx_queue(dev, ring->index));
bcmgenet_tdma_ring_writel(priv, ring->index,
ring->prod_index, TDMA_PROD_INDEX);
wr_ptr = ring->write_ptr * WORDS_PER_BD(priv);
--
2.34.1
* [PATCH net v3 3/3] net: bcmgenet: fix racing timeout handler
From: Justin Chen @ 2026-04-06 17:57 UTC (permalink / raw)
To: netdev
Cc: pabeni, kuba, edumazet, davem, andrew+netdev,
bcm-kernel-feedback-list, florian.fainelli, opendmb, nb,
Justin Chen
The bcmgenet_timeout handler tries to take down all tx queues when
a single queue times out. This is overzealous and causes many race
conditions with queues that are still chugging along. Instead, let's
only restart the timed-out queue.
Fixes: 13ea657806cf ("net: bcmgenet: improve TX timeout")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Tested-by: Nicolai Buchwitz <nb@tipi-net.de>
---
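Note (not part of the commit message): a minimal sketch of how the scope
of the interrupt unmask narrows, assuming one mask bit per ring starting
at bit 0 as the `1 << q` and `1 << txqueue` expressions in the diff
suggest. The toy_* helpers are hypothetical and the queue count used in
the example is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: build a mask covering rings 0..tx_queues inclusive,
 * so every ring's interrupt is touched on any single timeout.
 */
static uint32_t toy_all_queues_mask(unsigned int tx_queues)
{
	uint32_t mask = 0;
	unsigned int q;

	for (q = 0; q <= tx_queues; q++)
		mask |= UINT32_C(1) << q;

	return mask;
}

/* New behaviour: only the bit of the ring that timed out. */
static uint32_t toy_one_queue_mask(unsigned int txqueue)
{
	assert(txqueue < 32);
	return UINT32_C(1) << txqueue;
}
```

Assuming tx_queues = 4, the old loop yields 0x1f, while a timeout on
queue 2 (the queue in the watchdog logs from the cover letter) yields
only 0x04 with the new code, leaving the other rings alone.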
.../net/ethernet/broadcom/genet/bcmgenet.c | 22 ++++++++-----------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index e89126a0c20e..54f71b1e85fc 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -3477,27 +3477,23 @@ static void bcmgenet_dump_tx_queue(struct bcmgenet_tx_ring *ring)
static void bcmgenet_timeout(struct net_device *dev, unsigned int txqueue)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
- u32 int1_enable = 0;
- unsigned int q;
+ struct bcmgenet_tx_ring *ring = &priv->tx_rings[txqueue];
+ struct netdev_queue *txq = netdev_get_tx_queue(dev, txqueue);
netif_dbg(priv, tx_err, dev, "bcmgenet_timeout\n");
- for (q = 0; q <= priv->hw_params->tx_queues; q++)
- bcmgenet_dump_tx_queue(&priv->tx_rings[q]);
-
- bcmgenet_tx_reclaim_all(dev);
+ bcmgenet_dump_tx_queue(ring);
- for (q = 0; q <= priv->hw_params->tx_queues; q++)
- int1_enable |= (1 << q);
+ bcmgenet_tx_reclaim(dev, ring, true);
- /* Re-enable TX interrupts if disabled */
- bcmgenet_intrl2_1_writel(priv, int1_enable, INTRL2_CPU_MASK_CLEAR);
+ /* Re-enable the TX interrupt for this ring */
+ bcmgenet_intrl2_1_writel(priv, 1 << txqueue, INTRL2_CPU_MASK_CLEAR);
- netif_trans_update(dev);
+ txq_trans_cond_update(txq);
- BCMGENET_STATS64_INC((&priv->tx_rings[txqueue].stats64), errors);
+ BCMGENET_STATS64_INC((&ring->stats64), errors);
- netif_tx_wake_all_queues(dev);
+ netif_tx_wake_queue(txq);
}
#define MAX_MDF_FILTER 17
--
2.34.1