All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: hawk@kernel.org
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH net-next v7 5/5] veth: time-based BQL completion coalescing via ethtool tx-usecs
Date: Sat, 13 Jun 2026 08:36:05 +0000	[thread overview]
Message-ID: <20260613083606.A6A491F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260612083530.1650245-6-hawk@kernel.org>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] BQL completions are stranded when NAPI goes to sleep if the timer has not expired, artificially destroying DQL performance estimates.
- [Low] Direct access to `peer_txq->dql.limit` breaks compilation when CONFIG_BQL is disabled.
--

commit 7aba7f3c0951cf87cb2ef78e981e1f66a316a71d
Author: Simon Schippers <simon.schippers@tu-dortmund.de>

veth: time-based BQL completion coalescing via ethtool tx-usecs

This patch accumulates BQL completions and flushes them when a configurable
time threshold (tx-usecs) is exceeded. This lets DQL discover a limit that
bounds actual queuing delay to the configured interval.

> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 2473f730734b7..c62d87a8402cb 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c

[ ... ]

> @@ -969,13 +1029,54 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
>  	return NULL;
>  }
>  
> +static void veth_bql_maybe_complete(struct veth_bql_state *state,
> +				    struct netdev_queue *peer_txq,
> +				    u64 bql_flush_ns)
> +{
> +	u64 current_time;
> +
> +	/* There is no reason to complete with 0 and
> +	 * peer_txq could go away.
> +	 */
> +	if (!state->n_bql || !peer_txq)
> +		return;
> +
> +	current_time = sched_clock();
> +
> +	/* We complete if:
> +	 * 1. We reach bql_flush_ns.
> +	 * 2. We potentially have BQL starvation.
> +	 */
> +	if (state->time + bql_flush_ns <= current_time ||
> +	    state->n_bql > peer_txq->dql.limit) {

[Severity: Low]
Does this direct access to peer_txq->dql.limit cause compilation errors when
CONFIG_BQL is disabled? The dql member of struct netdev_queue is
conditionally compiled, so accessing it directly might break the build.

> +		netdev_tx_completed_queue(peer_txq, state->n_bql,
> +					  state->n_bql * VETH_BQL_UNIT);
> +		state->time = current_time;
> +		state->n_bql = 0;
> +	}
> +}
> +
>  static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>  			struct veth_xdp_tx_bq *bq,
>  			struct veth_stats *stats,
>  			struct netdev_queue *peer_txq)

[ ... ]

> @@ -1000,12 +1101,11 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>  			}
>  		} else {
>  			/* ndo_start_xmit */
> -			bool bql_charged = veth_ptr_is_bql(ptr);
>  			struct sk_buff *skb = veth_ptr_to_skb(ptr);
>  
> +			if (veth_ptr_is_bql(ptr))
> +				state->n_bql++;
>  			stats->xdp_bytes += skb->len;
> -			if (peer_txq && bql_charged)
> -				netdev_tx_completed_queue(peer_txq, 1, VETH_BQL_UNIT);
>  
>  			skb = veth_xdp_rcv_skb(rq, skb, bq, stats);
>  			if (skb) {
> @@ -1015,6 +1115,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>  					napi_gro_receive(&rq->xdp_napi, skb);
>  			}
>  		}
> +		veth_bql_maybe_complete(state, peer_txq, bql_flush_ns);
>  		done++;
>  	}

[Severity: High]
If veth_xdp_rcv() finishes and returns a done count less than the budget,
NAPI will go to sleep in veth_poll(). Do we need to unconditionally flush
any stranded BQL completions in veth_poll() before sleeping?

If completions are left in rq->bql_state indefinitely across NAPI idle
periods, it might present an artificially massive delay to DQL. This could
cause DQL to mistakenly conclude the hardware is extremely slow and
aggressively shrink dql.limit to its minimum, crippling throughput on
subsequent bursts.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260612083530.1650245-1-hawk@kernel.org?part=5

  reply	other threads:[~2026-06-13  8:36 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12  8:35 [PATCH net-next v7 0/5] veth: add Byte Queue Limits (BQL) support hawk
2026-06-12  8:35 ` [PATCH net-next v7 1/5] net: add dev->bql flag to allow BQL sysfs for IFF_NO_QUEUE devices hawk
2026-06-12  8:35 ` [PATCH net-next v7 2/5] veth: implement Byte Queue Limits (BQL) for latency reduction hawk
2026-06-13  8:36   ` sashiko-bot
2026-06-12  8:35 ` [PATCH net-next v7 3/5] veth: add tx_timeout watchdog as BQL safety net hawk
2026-06-12  8:35 ` [PATCH net-next v7 4/5] net: sched: add timeout count to NETDEV WATCHDOG message hawk
2026-06-12  8:35 ` [PATCH net-next v7 5/5] veth: time-based BQL completion coalescing via ethtool tx-usecs hawk
2026-06-13  8:36   ` sashiko-bot [this message]
2026-06-13 14:14   ` Simon Schippers
2026-06-12 14:10 ` [PATCH net-next v7 0/5] veth: add Byte Queue Limits (BQL) support Simon Schippers
2026-06-12 17:21   ` Jonas Köppeler
2026-06-13 13:57     ` Simon Schippers
2026-06-16  1:53 ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260613083606.A6A491F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=hawk@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.