From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: hawk@kernel.org, netdev@vger.kernel.org
Cc: edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	davem@davemloft.net, andrew+netdev@lunn.ch, horms@kernel.org,
	jhs@mojatatu.com, jiri@resnulli.us, sdf@fomichev.me,
	j.koeppeler@tu-berlin.de, mfreemon@cloudflare.com,
	carges@cloudflare.com
Subject: Re: [RFC PATCH net-next 2/6] veth: implement Byte Queue Limits (BQL) for latency reduction
Date: Wed, 18 Mar 2026 15:28:43 +0100
Message-ID: <87ms05nrdw.fsf@toke.dk>
In-Reply-To: <20260318134826.1281205-3-hawk@kernel.org>

hawk@kernel.org writes:

> From: Jesper Dangaard Brouer <hawk@kernel.org>
>
> Add BQL support to the veth driver to dynamically limit the number of
> bytes queued in the ptr_ring, giving the qdisc earlier feedback to shape
> traffic and reduce latency.
>
> The BQL charge (netdev_tx_sent_queue) is placed in veth_xmit() BEFORE
> veth_forward_skb() produces the SKB into the ptr_ring. This ordering is
> critical: with threaded NAPI the consumer runs on a separate CPU and can
> complete the SKB (calling dql_completed) before veth_xmit() returns. If
> the charge happened after the produce, the completion could race ahead
> of the charge, violating dql_completed()'s invariant that completed
> bytes never exceed queued bytes (BUG_ON).
>
> Whether an SKB was BQL-charged is tracked per-SKB using a VETH_BQL_FLAG
> bit in the ptr_ring pointer (BIT(1), alongside the existing VETH_XDP_FLAG
> BIT(0)). The do_bql flag from veth_xmit() propagates through
> veth_forward_skb() and veth_xdp_rx() into the ptr_ring entry. On the
> completion side in veth_xdp_rcv(), veth_ptr_is_bql() reads the flag to
> decide whether to call netdev_tx_completed_queue(). Per-SKB tracking is
> necessary because the qdisc can be replaced live (e.g. noqueue->sfq or
> vice versa via 'tc qdisc replace') while SKBs are already in-flight in
> the ptr_ring. SKBs charged under the old qdisc must complete correctly
> regardless of what qdisc is attached when the consumer runs, so each
> SKB carries its own BQL-charged state rather than re-checking the peer's
> qdisc at completion time.
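
For readers following along, the per-SKB flag and the charge-before-produce
ordering described above can be sketched in userspace C roughly as follows.
This is only an illustration, not the code from the patch: VETH_XDP_FLAG and
VETH_BQL_FLAG are the names from the description, while struct fake_skb,
struct fake_dql and the sketch_*() helpers are hypothetical stand-ins for the
real SKB, the DQL counters, and the veth functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag names follow the patch description; all else is hypothetical. */
#define VETH_XDP_FLAG  ((uintptr_t)1 << 0)
#define VETH_BQL_FLAG  ((uintptr_t)1 << 1)
#define VETH_FLAG_MASK (VETH_XDP_FLAG | VETH_BQL_FLAG)

/* Stand-in for an SKB; 4-byte alignment keeps the two low bits free. */
struct fake_skb { unsigned int len; };

/* Stand-in for the DQL byte counters. */
struct fake_dql { unsigned long queued, completed; };

/* Encode "was this SKB BQL-charged?" into the ring entry itself. */
static void *sketch_ptr_tag(struct fake_skb *skb, bool do_bql)
{
	uintptr_t p = (uintptr_t)skb;

	assert((p & VETH_FLAG_MASK) == 0);
	return (void *)(do_bql ? (p | VETH_BQL_FLAG) : p);
}

static bool sketch_ptr_is_bql(void *entry)
{
	return ((uintptr_t)entry & VETH_BQL_FLAG) != 0;
}

static struct fake_skb *sketch_ptr_to_skb(void *entry)
{
	return (struct fake_skb *)((uintptr_t)entry & ~VETH_FLAG_MASK);
}

/* Producer: charge first, and only then hand the entry to the ring. */
static void *sketch_xmit(struct fake_dql *dql, struct fake_skb *skb,
			 bool do_bql)
{
	if (do_bql)
		dql->queued += skb->len;
	return sketch_ptr_tag(skb, do_bql);
}

/* Consumer: trust the flag in the entry, not the current qdisc state. */
static void sketch_consume(struct fake_dql *dql, void *entry)
{
	struct fake_skb *skb = sketch_ptr_to_skb(entry);

	if (sketch_ptr_is_bql(entry)) {
		dql->completed += skb->len;
		/* The invariant that dql_completed() enforces. */
		assert(dql->completed <= dql->queued);
	}
}
```

Pushing one entry with do_bql set and one without, then consuming both, leaves
queued equal to completed, because the uncharged entry is never completed
against the counters. The sketch is single-threaded, so it shows the
bookkeeping only, not the cross-CPU race itself.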

It's not completely obvious to me why BQL can't be active regardless of
whether there's a qdisc installed or not? If there's no qdisc, shouldn't
BQL auto-tune to a higher value because the queue runs empty more?

-Toke