From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19BA73624B8
    for ; Wed, 18 Mar 2026 13:48:43 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
    t=1773841724; cv=none;
    b=LtoKurwi/YmQedChd7uKvjCXXBuIl5Fs/1wo7cLDJM7ysn8490Rq2ZMDiwQN9Ccn7jzkCrEPTfVHFszlo86wxZR9GOuIqnhmIx5EbfEdw61IbZwvDobhOpmMj2HoXeZxDzwH3d0j2jXzbv2Q0/OgPoj824983O3HMF7vnBBD1fY=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
    s=arc-20240116; t=1773841724; c=relaxed/simple;
    bh=kmAlTVSYkBvlhmijDIelF9x8a46Vp2oGoa0FlSXWQys=;
    h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
    MIME-Version:Content-Type;
    b=nGvmKJtPOobErVtzstsF7Ca3EW1O2fHHMJuNryIY8lXwdjr7WL4K9mDXwUI7rkNEbmJfHLeShlCzEtkKW9Ml6H/S2cOV0mM0zMcRs9DY1amuEWS0xKbEpjNCzuXXHWeBHS3AD7g1NNkg1NtU3Sn4R5gjT2mr/WpPz+BCEvnHOBo=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
    dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VD5Cqnat;
    arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
    dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VD5Cqnat"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 76144C2BC87;
    Wed, 18 Mar 2026 13:48:40 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
    s=k20201202; t=1773841723;
    bh=kmAlTVSYkBvlhmijDIelF9x8a46Vp2oGoa0FlSXWQys=;
    h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
    b=VD5CqnatiIG5518jAOu6zhpUx7CrurQl/5Gd9UdzMqj6qUpLjaYz2UmU7F7KMUq9p
     /2fGrGhoRciFSqpLxsB1Mi1Kkd+hPSPuRDS11LHz852SmM4LqKTlF9H3c3icbRfPJF
     bUjM1D55ssRTz54WLKPRlzenP+QVt6BVMBsOsbYyBzwHUDEz6KataMGWHPxyYoHkjA
     ViFFSZZBAkr01EbDBVpwDfIMwMw/Cfz6k5oQVd7HQcWzfVqxY6C1KBDjGTxeZk5JPJ
     b7WLQIueXlPMhAcZuQhTW8AqdT/dRo/X0+Z4nAdgMyY6omqvjF9K0LjBdptBlKXmfn
     v2Dt2T3HtBVlA==
From: hawk@kernel.org
To: netdev@vger.kernel.org
Cc: edumazet@google.com,
    kuba@kernel.org,
    pabeni@redhat.com,
    davem@davemloft.net,
    andrew+netdev@lunn.ch,
    horms@kernel.org,
    jhs@mojatatu.com,
    jiri@resnulli.us,
    toke@toke.dk,
    sdf@fomichev.me,
    j.koeppeler@tu-berlin.de,
    mfreemon@cloudflare.com,
    carges@cloudflare.com
Subject: [RFC PATCH net-next 3/6] veth: add tx_timeout watchdog as BQL safety net
Date: Wed, 18 Mar 2026 14:48:23 +0100
Message-ID: <20260318134826.1281205-4-hawk@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260318134826.1281205-1-hawk@kernel.org>
References: <20260318134826.1281205-1-hawk@kernel.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Jesper Dangaard Brouer

With the introduction of BQL (Byte Queue Limits) for veth, there are now
two independent mechanisms that can stop a transmit queue:

 - DRV_XOFF: set by netif_tx_stop_queue() when the ptr_ring is full
 - STACK_XOFF: set by BQL when the byte-in-flight limit is reached

If either mechanism stalls without a corresponding wake/completion, the
queue stops permanently. Enable the net device watchdog timer and
implement ndo_tx_timeout as a failsafe recovery.

The timeout handler resets BQL state (clearing STACK_XOFF) and wakes the
queue (clearing DRV_XOFF), covering both stop mechanisms. The watchdog
fires after 16 seconds, which accommodates worst-case NAPI processing
(budget=64 packets x 250ms per-packet consumer delay) without false
positives under normal backpressure.

Signed-off-by: Jesper Dangaard Brouer
Tested-by: Jonas Köppeler
---
 drivers/net/veth.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index bad49e0d0bfc..2717f65fb352 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1423,6 +1423,22 @@ static int veth_set_channels(struct net_device *dev,
 	goto out;
 }
 
+static void veth_tx_timeout(struct net_device *dev, unsigned int txqueue)
+{
+	struct netdev_queue *txq = netdev_get_tx_queue(dev, txqueue);
+
+	netdev_err(dev,
+		   "veth backpressure(0x%lX) stalled(n:%ld) TXQ(%u) re-enable\n",
+		   txq->state, atomic_long_read(&txq->trans_timeout), txqueue);
+
+	/* Clear both stop mechanisms:
+	 * - DRV_XOFF: set by netif_tx_stop_queue (ptr_ring backpressure)
+	 * - STACK_XOFF: set by BQL when byte limit is reached
+	 */
+	netdev_tx_reset_queue(txq);
+	netif_tx_wake_queue(txq);
+}
+
 static int veth_open(struct net_device *dev)
 {
 	struct veth_priv *priv = netdev_priv(dev);
@@ -1761,6 +1777,7 @@ static const struct net_device_ops veth_netdev_ops = {
 	.ndo_bpf		= veth_xdp,
 	.ndo_xdp_xmit		= veth_ndo_xdp_xmit,
 	.ndo_get_peer_dev	= veth_peer_dev,
+	.ndo_tx_timeout		= veth_tx_timeout,
 };
 
 static const struct xdp_metadata_ops veth_xdp_metadata_ops = {
@@ -1800,6 +1817,7 @@ static void veth_setup(struct net_device *dev)
 	dev->priv_destructor = veth_dev_free;
 	dev->pcpu_stat_type = NETDEV_PCPU_STAT_TSTATS;
 	dev->max_mtu = ETH_MAX_MTU;
+	dev->watchdog_timeo = msecs_to_jiffies(16000);
 
 	dev->hw_features = VETH_FEATURES;
 	dev->hw_enc_features = VETH_FEATURES;
-- 
2.43.0