* [PATCH net] gve: Fix stuck TX queue for DQ queue format
From: Harshitha Ramamurthy @ 2025-06-04 20:19 UTC
To: netdev
Cc: davem, edumazet, kuba, pabeni, jeroendb, hramamurthy,
andrew+netdev, willemb, pkaligineedi, joshwash, thostet, jfraker,
awogbemila, linux-kernel, stable
From: Praveen Kaligineedi <pkaligineedi@google.com>

gve_tx_timeout was calculating missed completions in a way that is only
relevant in the GQ queue format. Additionally, it was attempting to
disable device interrupts, which is not needed in either GQ or DQ queue
formats.

As a result, TX timeouts with the DQ queue format likely would have
triggered early resets without kicking the queue at all.

This patch drops the check for pending work altogether and always kicks
the queue after validating the queue has not seen a TX timeout too
recently.

Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
Co-developed-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
---
drivers/net/ethernet/google/gve/gve_main.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index c3791cf..0c6328b 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1921,7 +1921,6 @@ static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
struct gve_notify_block *block;
struct gve_tx_ring *tx = NULL;
struct gve_priv *priv;
- u32 last_nic_done;
u32 current_time;
u32 ntfy_idx;
@@ -1941,17 +1940,10 @@ static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
if (tx->last_kick_msec + MIN_TX_TIMEOUT_GAP > current_time)
goto reset;
- /* Check to see if there are missed completions, which will allow us to
- * kick the queue.
- */
- last_nic_done = gve_tx_load_event_counter(priv, tx);
- if (last_nic_done - tx->done) {
- netdev_info(dev, "Kicking queue %d", txqueue);
- iowrite32be(GVE_IRQ_MASK, gve_irq_doorbell(priv, block));
- napi_schedule(&block->napi);
- tx->last_kick_msec = current_time;
- goto out;
- } // Else reset.
+ netdev_info(dev, "Kicking queue %d", txqueue);
+ napi_schedule(&block->napi);
+ tx->last_kick_msec = current_time;
+ goto out;
reset:
gve_schedule_reset(priv);
--
2.49.0.805.g082f7c87e0-goog
* Re: [PATCH net] gve: Fix stuck TX queue for DQ queue format
From: Jakub Kicinski @ 2025-06-05 15:13 UTC
To: Harshitha Ramamurthy
Cc: netdev, davem, edumazet, pabeni, jeroendb, andrew+netdev, willemb,
pkaligineedi, joshwash, thostet, jfraker, awogbemila,
linux-kernel, stable
On Wed, 4 Jun 2025 20:19:38 +0000 Harshitha Ramamurthy wrote:
> + netdev_info(dev, "Kicking queue %d", txqueue);
> + napi_schedule(&block->napi);
> + tx->last_kick_msec = current_time;
> + goto out;
>
> reset:
> gve_schedule_reset(priv);
gotos at the base level of the function are too ugly to exist.
Please refactor this first to move the logic that decides whether
reset should happen to a separate helper, then you can avoid both
gotos/labels.
goto reset should turn into return true
goto out should turn into return false
--
pw-bot: cr
* Re: [PATCH net] gve: Fix stuck TX queue for DQ queue format
From: Tim Hostetler @ 2025-06-05 17:15 UTC
To: Jakub Kicinski
Cc: Harshitha Ramamurthy, netdev, davem, edumazet, pabeni, jeroendb,
andrew+netdev, willemb, pkaligineedi, joshwash, jfraker,
awogbemila, linux-kernel, stable
On Thu, Jun 5, 2025 at 8:13 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 4 Jun 2025 20:19:38 +0000 Harshitha Ramamurthy wrote:
> > + netdev_info(dev, "Kicking queue %d", txqueue);
> > + napi_schedule(&block->napi);
> > + tx->last_kick_msec = current_time;
> > + goto out;
> >
> > reset:
> > gve_schedule_reset(priv);
>
> gotos at the base level of the function are too ugly to exist.
>
> Please refactor this first to move the logic that decides whether
> reset should happen to a separate helper, then you can avoid both
> gotos/labels.
>
> goto reset should turn into return true
> goto out should turn into return false
That makes sense to me, I'll refactor this in v2.
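For readers following the thread, the refactor requested above might look roughly like the sketch below. It is a user-space model, not the v2 patch: the struct names, the helper name tx_timeout_needs_reset, the counters, and the 10000 ms value given to MIN_TX_TIMEOUT_GAP are all assumptions made for illustration, and the counter increments stand in for napi_schedule() and gve_schedule_reset().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value in milliseconds, for illustration only. */
#define MIN_TX_TIMEOUT_GAP 10000u

/* Hypothetical stand-ins for struct gve_tx_ring and struct gve_priv. */
struct tx_ring_model {
	uint32_t last_kick_msec; /* time of the most recent kick */
	unsigned int kicks;      /* counts simulated napi_schedule() calls */
};

struct priv_model {
	unsigned int resets;     /* counts simulated gve_schedule_reset() calls */
};

/*
 * Decide whether this timeout needs a device reset.
 * Returns true where the posted patch has "goto reset" and
 * false where it has "goto out", after kicking the queue.
 */
static bool tx_timeout_needs_reset(struct tx_ring_model *tx,
				   uint32_t current_time)
{
	/* The queue was kicked too recently: escalate to a reset. */
	if (tx->last_kick_msec + MIN_TX_TIMEOUT_GAP > current_time)
		return true;

	/* Otherwise kick the queue and remember when that happened. */
	tx->kicks++;
	tx->last_kick_msec = current_time;
	return false;
}

/* Model of the timeout handler body: no labels, no gotos. */
static void tx_timeout_model(struct priv_model *priv,
			     struct tx_ring_model *tx,
			     uint32_t current_time)
{
	if (tx_timeout_needs_reset(tx, current_time))
		priv->resets++;
}
```

With the decision isolated in a helper that returns bool, the handler reduces to a single conditional, which is the shape the review asks for. The actual v2 may name and split things differently.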