From: Nicolai Buchwitz <nb@tipi-net.de>
To: justin.chen@broadcom.com
Cc: netdev@vger.kernel.org, pabeni@redhat.com, kuba@kernel.org,
edumazet@google.com, davem@davemloft.net, andrew+netdev@lunn.ch,
bcm-kernel-feedback-list@broadcom.com,
florian.fainelli@broadcom.com
Subject: Re: [PATCH RFC net] net: bcmgenet: fix leaking and racing timeout handler
Date: Thu, 26 Mar 2026 09:05:04 +0100 [thread overview]
Message-ID: <df5781320ba8e27f55743c0613fea23a@tipi-net.de> (raw)
In-Reply-To: <20260325173602.3676778-1-justin.chen@broadcom.com>
On 25.3.2026 18:36, justin.chen@broadcom.com wrote:
> From: Justin Chen <justin.chen@broadcom.com>
>
> The bcmgenet_timeout handler tries to take down all tx queues when
> a single queue times out. This is over zealous and causes many race
> conditions with queues that are still chugging along. Instead lets
> only restart the timed out queue.
>
> While reclaiming the tx queue we fast forward the write pointer to
> drop any data in flight. These dropped frames are not added back
> to the pool of free bds. We also need to tell the netdev that we
> are dropping said data.
>
> Signed-off-by: Justin Chen <justin.chen@broadcom.com>
> ---
> .../net/ethernet/broadcom/genet/bcmgenet.c | 24 +++++++++----------
> 1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index 482a31e7b72b..f69cf490445c 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -1985,6 +1985,7 @@ static unsigned int bcmgenet_tx_reclaim(struct
> net_device *dev,
> drop = (ring->prod_index - ring->c_index) & DMA_C_INDEX_MASK;
> released += drop;
> ring->prod_index = ring->c_index & DMA_C_INDEX_MASK;
> + ring->free_bds += drop;
> while (drop--) {
> cb_ptr = bcmgenet_put_txcb(priv, ring);
> skb = cb_ptr->skb;
> @@ -1996,6 +1997,7 @@ static unsigned int bcmgenet_tx_reclaim(struct
> net_device *dev,
> }
> if (skb)
> dev_consume_skb_any(skb);
> + netdev_tx_reset_queue(netdev_get_tx_queue(dev, ring->index));
> bcmgenet_tdma_ring_writel(priv, ring->index,
> ring->prod_index, TDMA_PROD_INDEX);
> wr_ptr = ring->write_ptr * WORDS_PER_BD(priv);
> @@ -3475,27 +3477,23 @@ static void bcmgenet_dump_tx_queue(struct
> bcmgenet_tx_ring *ring)
> static void bcmgenet_timeout(struct net_device *dev, unsigned int
> txqueue)
> {
> struct bcmgenet_priv *priv = netdev_priv(dev);
> - u32 int1_enable = 0;
> - unsigned int q;
> + struct bcmgenet_tx_ring *ring = &priv->tx_rings[txqueue];
> + struct netdev_queue *txq = netdev_get_tx_queue(dev, txqueue);
>
> netif_dbg(priv, tx_err, dev, "bcmgenet_timeout\n");
>
> - for (q = 0; q <= priv->hw_params->tx_queues; q++)
> - bcmgenet_dump_tx_queue(&priv->tx_rings[q]);
> -
> - bcmgenet_tx_reclaim_all(dev);
> + bcmgenet_dump_tx_queue(ring);
>
> - for (q = 0; q <= priv->hw_params->tx_queues; q++)
> - int1_enable |= (1 << q);
> + bcmgenet_tx_reclaim(dev, ring, true);
>
> - /* Re-enable TX interrupts if disabled */
> - bcmgenet_intrl2_1_writel(priv, int1_enable, INTRL2_CPU_MASK_CLEAR);
> + /* Re-enable the TX interrupt for this ring */
> + bcmgenet_intrl2_1_writel(priv, 1 << txqueue, INTRL2_CPU_MASK_CLEAR);
>
> - netif_trans_update(dev);
> + txq_trans_cond_update(txq);
>
> - BCMGENET_STATS64_INC((&priv->tx_rings[txqueue].stats64), errors);
> + BCMGENET_STATS64_INC((&ring->stats64), errors);
>
> - netif_tx_wake_all_queues(dev);
> + netif_tx_wake_queue(txq);
> }
>
> #define MAX_MDF_FILTER 17
Can confirm that the lockup is gone with this patch. The watchdog still
reports a timeout (as already mentioned by you):
[Thu Mar 26 08:00:30 2026] bcmgenet fd580000.ethernet eth0: NETDEV
WATCHDOG: CPU: 0: transmit queue 0 timed out 2377 ms
[Thu Mar 26 08:00:32 2026] bcmgenet fd580000.ethernet eth0: NETDEV
WATCHDOG: CPU: 0: transmit queue 4 timed out 4361 ms
[Thu Mar 26 08:00:34 2026] bcmgenet fd580000.ethernet eth0: NETDEV
WATCHDOG: CPU: 0: transmit queue 4 timed out 2001 ms
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de
Tested-by: Nicolai Buchwitz <nb@tipi-net.de
Thanks
Nicolai
prev parent reply other threads:[~2026-03-26 8:05 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-25 17:36 [PATCH RFC net] net: bcmgenet: fix leaking and racing timeout handler justin.chen
2026-03-25 22:00 ` Florian Fainelli
2026-03-25 22:11 ` Justin Chen
2026-03-26 8:05 ` Nicolai Buchwitz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df5781320ba8e27f55743c0613fea23a@tipi-net.de \
--to=nb@tipi-net.de \
--cc=andrew+netdev@lunn.ch \
--cc=bcm-kernel-feedback-list@broadcom.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=florian.fainelli@broadcom.com \
--cc=justin.chen@broadcom.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox