From: Jesper Dangaard Brouer <hawk@kernel.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Simon Schippers <simon@schippers-hamm.de>,
	Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, kernel-team@cloudflare.com,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH net-next v5 3/5] veth: implement Byte Queue Limits (BQL) for latency reduction
Date: Mon, 11 May 2026 10:11:13 +0200
Message-ID: <daa05c21-dcfc-4cc4-aa22-9e25c7f6c743@kernel.org>
In-Reply-To: <20260510085602.57c7a081@kernel.org>



On 10/05/2026 17.56, Jakub Kicinski wrote:
> On Sat, 9 May 2026 11:09:51 +0200 Jesper Dangaard Brouer wrote:
>> On 09/05/2026 04.06, Jakub Kicinski wrote:
>>> On Thu, 7 May 2026 21:09:09 +0200 Jesper Dangaard Brouer wrote:
>>>> Not against being able to modify VETH_RING_SIZE, but I don't think it is
>>>> the solution here.
>>>
>>> Was it evaluated, tho?
>>>
>>> It's obviously super easy these days have AI spew no end of complex
>>> code. So it'd be great to have some solid, ideally production-like
>>> data to back this all up.
>>>
>>> VETH_RING_SIZE seems trivial, ethtool set ringparam
>>
>> No, unfortunately we cannot just decrease the VETH_RING_SIZE.
> 
> To be clear - I said may it configurable with ethtool -G
> not change the default.
> 

Sure, I understand the desire to make VETH_RING_SIZE configurable.
But by doing so we make the Linux network stack harder to tune and set up
correctly. E.g. adding a qdisc to veth would then also require changing the
ring size, but if the system also uses XDP then tuning below 64 (likely 128)
will lead to hard-to-find packet drops.
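
To illustrate the drop concern, here is a minimal toy simulation, not
kernel code. It assumes a producer without back-pressure that enqueues a
burst of 64 packets (the NAPI budget) per round, followed by a consumer
poll of up to 64 packets. The function name and the round-based model
are hypothetical simplifications.

```python
from collections import deque

NAPI_BUDGET = 64  # per-poll packet budget mentioned in this thread


def simulate(ring_size, bursts, burst_len=NAPI_BUDGET, budget=NAPI_BUDGET):
    """Toy model: each round a producer without back-pressure enqueues
    burst_len packets, then the consumer polls up to `budget` packets.
    Returns (delivered, dropped)."""
    ring = deque()
    delivered = dropped = 0
    for _ in range(bursts):
        for _ in range(burst_len):
            if len(ring) < ring_size:
                ring.append(1)
            else:
                dropped += 1  # no back-pressure: a full ring means a drop
        for _ in range(min(budget, len(ring))):
            ring.popleft()
            delivered += 1
    return delivered, dropped


print(simulate(256, 10))  # ring larger than the burst
print(simulate(32, 10))   # ring smaller than the burst
```

In this model a ring of 256 absorbs every burst, while a ring of 32 loses
half of each 64-packet burst, with nothing signalling the producer.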

I prefer adding something (like BQL) that auto-tunes how much of the ring
queue we are using.  Good queues function as shock absorbers when
concurrent processes in the OS have scheduling noise.

I acknowledge that Simon Schippers found that the BQL implementation was
actually not auto-tuning.  We need to work on this; my prototype
implementation [1] [2] works surprisingly well.


- [1] https://lore.kernel.org/all/3e43117f-356d-4086-a176-abd7fe2e6f0a@kernel.org/2-09-veth-time-based-bql-coalescing.patch
- [2] https://lore.kernel.org/all/3e43117f-356d-4086-a176-abd7fe2e6f0a@kernel.org/


>> The reason is that XDP-redirect into veth don't have any
>> back-pressure and would simply drop packets if queue size becomes
>> less than the NAPI budget (64). (Yes, we use both normal path and
>> XDP-redirect in production).
> 
> Doesn't this mean you have a queue which is not under BQL control?
> 

It is a matter of perspective. BQL needs between 17 and 55 elements of
the 256-entry queue.  At the same time we handle the case where the ring
runs full, e.g. due to a sudden burst of XDP-redirected packets, which
pushes packets into the qdisc layer.
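
A rough sketch of that perspective, as a toy model and not the patch
itself: one ring shared by a limit-aware path and a limit-unaware
(XDP-redirect-like) path. The class and method names are hypothetical,
the limit of 32 is an arbitrary value inside the 17-55 range, and the
model counts packets rather than bytes for simplicity.

```python
RING_SIZE = 256  # ring size quoted in this thread


class ToyVethQueue:
    """Toy model: a ring fed by a limit-aware and a limit-unaware path."""

    def __init__(self, limit, ring_size=RING_SIZE):
        self.limit = limit        # stand-in for a BQL-style in-flight limit
        self.ring_size = ring_size
        self.in_ring = 0          # packets currently in the ring
        self.qdisc_backlog = 0    # packets held back above the ring
        self.dropped = 0

    def xmit_normal(self):
        # Limit-aware path: hold the packet in the qdisc instead of
        # queueing past the limit (or past the ring capacity).
        if self.in_ring >= min(self.limit, self.ring_size):
            self.qdisc_backlog += 1
        else:
            self.in_ring += 1

    def xmit_xdp(self):
        # Limit-unaware path: only the ring capacity protects it.
        if self.in_ring >= self.ring_size:
            self.dropped += 1
        else:
            self.in_ring += 1


q = ToyVethQueue(limit=32)
for _ in range(100):
    q.xmit_normal()
print(q.in_ring, q.qdisc_backlog, q.dropped)
for _ in range(300):
    q.xmit_xdp()
print(q.in_ring, q.qdisc_backlog, q.dropped)
```

In this model the normal path never occupies more than the limit, and
the remainder of the ring stays available to absorb an XDP burst before
any packet is dropped.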


>> My benchmarking shows that an optimal BQL limit is dynamically
>> adjusted between 17-55 depending on veth consumer namespace
>> overhead/speed, when balancing throughput and latency.
> 
> Testing with prod-approximating traffic pattern and load would be great.

That is what I'm doing.  I'm testing with a prod-approximating traffic
pattern and changing the number of iptables rules to simulate the
overhead I measured in production.  I think I explained this in the
cover letter. To be clear, we are going to use this in a production
environment.

Simon found an issue testing the overload scenario.

--Jesper

