From: Pavel Begunkov <asml.silence@gmail.com>
To: netdev@vger.kernel.org
Cc: "David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Michael Chan <michael.chan@broadcom.com>,
Pavan Chebbi <pavan.chebbi@broadcom.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Shuah Khan <shuah@kernel.org>,
Mina Almasry <almasrymina@google.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Pavel Begunkov <asml.silence@gmail.com>,
Yue Haibing <yuehaibing@huawei.com>, David Wei <dw@davidwei.uk>,
Haiyue Wang <haiyuewa@163.com>, Jens Axboe <axboe@kernel.dk>,
Joe Damato <jdamato@fastly.com>, Simon Horman <horms@kernel.org>,
Vishwanath Seshagiri <vishs@fb.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
io-uring@vger.kernel.org, dtatulea@nvidia.com
Subject: [PATCH net-next v7 0/9] Add support for providers with large rx buffer
Date: Sun, 30 Nov 2025 23:35:15 +0000 [thread overview]
Message-ID: <cover.1764542851.git.asml.silence@gmail.com> (raw)
Note: it's net/ only bits and doesn't include changes, which shoulf be
merged separately and are posted separately. The full branch for
convenience is at [1], and the patch is here:
https://lore.kernel.org/io-uring/7486ab32e99be1f614b3ef8d0e9bc77015b173f7.1764265323.git.asml.silence@gmail.com
Many modern NICs support configurable receive buffer lengths, and zcrx and
memory providers can use buffers larger than 4K/PAGE_SIZE on x86 to improve
performance. When paired with hw-gro larger rx buffer sizes can drastically
reduce the number of buffers traversing the stack and save a lot of processing
time. It also allows to give to users larger contiguous chunks of data. The
idea was first floated around by Saeed during netdev conf 2024 and was
asked about by a few folks.
Single stream benchmarks showed up to ~30% CPU util improvement.
E.g. comparison for 4K vs 32K buffers using a 200Gbit NIC:
packets=23987040 (MB=2745098), rps=199559 (MB/s=22837)
CPU %usr %nice %sys %iowait %irq %soft %idle
0 1.53 0.00 27.78 2.72 1.31 66.45 0.22
packets=24078368 (MB=2755550), rps=200319 (MB/s=22924)
CPU %usr %nice %sys %iowait %irq %soft %idle
0 0.69 0.00 8.26 31.65 1.83 57.00 0.57
This series adds net infrastructure for memory providers configuring
the size and implements it for bnxt. It's an opt-in feature for drivers,
they should advertise support for the parameter in the qops and must check
if the hardware supports the given size. It's limited to memory providers
as it drastically simplifies implementation. It doesn't affect the fast
path zcrx uAPI, and the sizes is defined in zcrx terms, which allows it
to be flexible and adjusted in the future, see Patch 8 for details.
A liburing example can be found at [2]
full branch:
[1] https://github.com/isilence/linux.git zcrx/large-buffers-v7
Liburing example:
[2] https://github.com/isilence/liburing.git zcrx/rx-buf-len
v7: - Add xa_destroy
- Rebase
v6: - Update docs and add a selftest
v5: https://lore.kernel.org/netdev/cover.1760440268.git.asml.silence@gmail.com/
- Remove all unnecessary bits like configuration via netlink, and
multi-stage queue configuration.
v4: https://lore.kernel.org/all/cover.1760364551.git.asml.silence@gmail.com/
- Update fbnic qops
- Propagate max buf len for hns3
- Use configured buf size in __bnxt_alloc_rx_netmem
- Minor stylistic changes
v3: https://lore.kernel.org/all/cover.1755499375.git.asml.silence@gmail.com/
- Rebased, excluded zcrx specific patches
- Set agg_size_fac to 1 on warning
v2: https://lore.kernel.org/all/cover.1754657711.git.asml.silence@gmail.com/
- Add MAX_PAGE_ORDER check on pp init
- Applied comments rewording
- Adjust pp.max_len based on order
- Patch up mlx5 queue callbacks after rebase
- Minor ->queue_mgmt_ops refactoring
- Rebased to account for both fill level and agg_size_fac
- Pass providers buf length in struct pp_memory_provider_params and
apply it in __netdev_queue_confi().
- Use ->supported_ring_params to validate drivers support of set
qcfg parameters.
Jakub Kicinski (1):
eth: bnxt: adjust the fill level of agg queues with larger buffers
Pavel Begunkov (8):
net: page pool: xa init with destroy on pp init
net: page_pool: sanitise allocation order
net: memzero mp params when closing a queue
net: let pp memory provider to specify rx buf len
eth: bnxt: store rx buffer size per queue
eth: bnxt: allow providers to set rx buf size
io_uring/zcrx: document area chunking parameter
selftests: iou-zcrx: test large chunk sizes
Documentation/networking/iou-zcrx.rst | 20 +++
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 118 ++++++++++++++----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 6 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h | 2 +-
include/net/netdev_queues.h | 9 ++
include/net/page_pool/types.h | 1 +
net/core/netdev_rx_queue.c | 14 ++-
net/core/page_pool.c | 4 +
.../selftests/drivers/net/hw/iou-zcrx.c | 72 +++++++++--
.../selftests/drivers/net/hw/iou-zcrx.py | 37 ++++++
11 files changed, 236 insertions(+), 49 deletions(-)
--
2.52.0
next reply other threads:[~2025-11-30 23:35 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-30 23:35 Pavel Begunkov [this message]
2025-11-30 23:35 ` [PATCH net-next v7 1/9] net: page pool: xa init with destroy on pp init Pavel Begunkov
2025-11-30 23:35 ` [PATCH net-next v7 2/9] net: page_pool: sanitise allocation order Pavel Begunkov
2025-11-30 23:35 ` [PATCH net-next v7 3/9] net: memzero mp params when closing a queue Pavel Begunkov
2025-11-30 23:35 ` [PATCH net-next v7 4/9] net: let pp memory provider to specify rx buf len Pavel Begunkov
2025-12-02 19:04 ` Jakub Kicinski
2025-12-11 1:31 ` Pavel Begunkov
2025-12-12 23:57 ` Jakub Kicinski
2025-11-30 23:35 ` [PATCH net-next v7 5/9] eth: bnxt: store rx buffer size per queue Pavel Begunkov
2025-11-30 23:35 ` [PATCH net-next v7 6/9] eth: bnxt: adjust the fill level of agg queues with larger buffers Pavel Begunkov
2025-11-30 23:35 ` [PATCH net-next v7 7/9] eth: bnxt: allow providers to set rx buf size Pavel Begunkov
2025-12-02 18:58 ` Jakub Kicinski
2025-12-11 1:39 ` Pavel Begunkov
2025-12-13 0:04 ` Jakub Kicinski
2025-11-30 23:35 ` [PATCH net-next v7 8/9] io_uring/zcrx: document area chunking parameter Pavel Begunkov
2025-11-30 23:35 ` [PATCH net-next v7 9/9] selftests: iou-zcrx: test large chunk sizes Pavel Begunkov
2025-12-02 14:44 ` [PATCH net-next v7 0/9] Add support for providers with large rx buffer Paolo Abeni
2025-12-02 15:36 ` Pavel Begunkov
2025-12-02 19:05 ` Jakub Kicinski
2025-12-02 19:20 ` patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1764542851.git.asml.silence@gmail.com \
--to=asml.silence@gmail.com \
--cc=almasrymina@google.com \
--cc=andrew+netdev@lunn.ch \
--cc=ast@kernel.org \
--cc=axboe@kernel.dk \
--cc=bpf@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dtatulea@nvidia.com \
--cc=dw@davidwei.uk \
--cc=edumazet@google.com \
--cc=haiyuewa@163.com \
--cc=hawk@kernel.org \
--cc=horms@kernel.org \
--cc=ilias.apalodimas@linaro.org \
--cc=io-uring@vger.kernel.org \
--cc=jdamato@fastly.com \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=michael.chan@broadcom.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pavan.chebbi@broadcom.com \
--cc=sdf@fomichev.me \
--cc=shuah@kernel.org \
--cc=vishs@fb.com \
--cc=yuehaibing@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).