Re: [PATCH net-next v4 2/4] veth: implement Byte Queue Limits (BQL) for latency reduction

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: hawk@kernel.org
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH net-next v4 2/4] veth: implement Byte Queue Limits (BQL) for latency reduction
Date: Sat, 02 May 2026 07:18:27 +0000	[thread overview]
Message-ID: <20260502071828.616C3C19425@smtp.kernel.org> (raw)
In-Reply-To: <20260501071633.644353-3-hawk@kernel.org>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Out-of-bounds array access on peer_dev->_tx in veth_poll() when local RX queues exceed peer TX queues.
- [High] Disabling XDP permanently stops peer TX queues if they were backpressured (DRV_XOFF leak).
--

commit 9bc3ed022846dda6dd5cc131e798f949f63f491f
Author: Jesper Dangaard Brouer <hawk@kernel.org>

veth: implement Byte Queue Limits (BQL) for latency reduction

This commit adds byte queue limits support to the veth driver to provide
qdiscs with feedback and reduce latency. It charges a fixed unit per
packet rather than byte-based charging.

> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index e35df717e65e2..3de25ba34a906 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c

[ ... ]

> @@ -975,7 +1016,7 @@ static int veth_poll(struct napi_struct *napi, int budget)
>  	peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
>  
>  	xdp_set_return_frame_no_direct();
> -	done = veth_xdp_rcv(rq, budget, &bq, &stats);
> +	done = veth_xdp_rcv(rq, budget, &bq, &stats, peer_txq);

Could this lead to an out-of-bounds array access on the peer device's
transmit queues?

If a veth pair is configured asymmetrically, the local device might have
more receive queues than the peer has transmit queues.

If packets are redirected from another interface into a higher-indexed
receive queue, the poll function will run with a queue_idx greater than or
equal to peer_dev->real_num_tx_queues.

Since netdev_get_tx_queue() performs a direct array access without bounds
checking, peer_txq would point out-of-bounds. This out-of-bounds pointer is
then passed into the new veth_xdp_rcv() code paths and accessed by
netif_tx_queue_stopped().

Should queue_idx be checked against peer_dev->real_num_tx_queues before
fetching the transmit queue?

[ ... ]

> @@ -1091,6 +1133,17 @@ static void veth_napi_del_range(struct net_device *dev, int start, int end)
>  		ptr_ring_cleanup(&rq->xdp_ring, veth_ptr_free);
>  	}
>  
> +	/* Reset BQL on peer's txqs: remaining ring items were freed above
> +	 * without BQL completion, so DQL state must be reset.
> +	 */
> +	peer = rtnl_dereference(priv->peer);
> +	if (peer) {
> +		int peer_end = min(end, (int)peer->real_num_tx_queues);
> +
> +		for (i = start; i < peer_end; i++)
> +			netdev_tx_reset_queue(netdev_get_tx_queue(peer, i));
> +	}

Does this cleanup leave peer transmit queues permanently stopped if they
were backpressured when xdp was disabled?

When the ptr_ring is full, veth_xmit() stops the transmit queue using
netif_tx_stop_queue(), which sets the __QUEUE_STATE_DRV_XOFF flag.

While netdev_tx_reset_queue() successfully clears the newly added
__QUEUE_STATE_STACK_XOFF flag, it does not clear __QUEUE_STATE_DRV_XOFF.
If a queue was backpressured exactly when xdp is disabled, it appears it
would remain locked forever.

Would it be appropriate to wake the queues using netif_tx_wake_queue() or
explicitly clear the __QUEUE_STATE_DRV_XOFF flag during this cleanup?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260501071633.644353-1-hawk@kernel.org?part=2

next prev parent reply	other threads:[~2026-05-02  7:18 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-01  7:16 [PATCH net-next v4 0/4] veth: add Byte Queue Limits (BQL) support hawk
2026-05-01  7:16 ` [PATCH net-next v4 1/4] net: add dev->bql flag to allow BQL sysfs for IFF_NO_QUEUE devices hawk
2026-05-01  7:16 ` [PATCH net-next v4 2/4] veth: implement Byte Queue Limits (BQL) for latency reduction hawk
2026-05-02  7:18   ` sashiko-bot [this message]
2026-05-05 12:40     ` Jesper Dangaard Brouer
2026-05-01  7:16 ` [PATCH net-next v4 3/4] veth: add tx_timeout watchdog as BQL safety net hawk
2026-05-01  7:16 ` [PATCH net-next v4 4/4] net: sched: add timeout count to NETDEV WATCHDOG message hawk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260502071828.616C3C19425@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=hawk@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.