From: "Björn Töpel" <bjorn@kernel.org>
To: Alexander Duyck <alexanderduyck@fb.com>,
Jakub Kicinski <kuba@kernel.org>,
kernel-team@meta.com, Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Shuah Khan <shuah@kernel.org>,
netdev@vger.kernel.org
Cc: "Björn Töpel" <bjorn@kernel.org>,
"Jacob Keller" <jacob.e.keller@intel.com>,
"Mohsin Bashir" <mohsin.bashr@gmail.com>,
"Mike Marciniszyn (Meta)" <mike.marciniszyn@gmail.com>,
"Pavel Begunkov" <asml.silence@gmail.com>,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: [PATCH net-next 0/3] fbnic: Support larger io_uring zcrx buffers
Date: Fri, 22 May 2026 13:32:19 +0200
Message-ID: <20260522113225.241337-1-bjorn@kernel.org>
Hi!
Fbnic programs receive buffers through BDQs. The hardware consumes
BDQs as 4 KiB fragments, and receive completions report the consumed
buffer by returning the BDQ buffer ID in the RCD.
The driver currently derives the BDQ fragment layout from PAGE_SIZE.
That works while HPQ and PPQ use the same allocation size, but
io_uring zcrx can provide larger receive buffers through rx_buf_len.
For zcrx, the PPQ page pool allocation size and the PPQ BDQ fragment
geometry need to match the requested buffer size, without changing
HPQ.
Make the BDQ fragment geometry per ring, then use the rendered RX
queue rx_page_size when creating the PPQ page pool. The NIC still
consumes the PPQ as 4 KiB fragments; a larger zcrx buffer is
represented as multiple BDQ fragments belonging to one net_iov.
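
As a rough illustration of the per-ring geometry described above, the
mapping between 4 KiB hardware fragments and one larger software buffer
can be sketched as follows. The struct and function names are
hypothetical and are not taken from the fbnic sources; the sketch also
assumes the buffer size is a non-zero multiple of 4 KiB.

```c
#include <stdbool.h>
#include <stdint.h>

#define HW_FRAG_SIZE 4096u /* fragment size the NIC consumes, per the text */

/* Hypothetical per-ring geometry; names are illustrative, not fbnic's. */
struct bdq_geom {
	uint32_t buf_size;      /* software buffer size, e.g. rx_buf_len */
	uint32_t frags_per_buf; /* number of 4 KiB fragments per buffer */
};

/* Assumption: buf_size must be a non-zero multiple of HW_FRAG_SIZE. */
static bool bdq_geom_init(struct bdq_geom *g, uint32_t buf_size)
{
	if (buf_size == 0 || buf_size % HW_FRAG_SIZE)
		return false;

	g->buf_size = buf_size;
	g->frags_per_buf = buf_size / HW_FRAG_SIZE;
	return true;
}

/* Map a hardware fragment index back to the software buffer it belongs to. */
static uint32_t bdq_frag_to_buf(const struct bdq_geom *g, uint32_t frag_idx)
{
	return frag_idx / g->frags_per_buf;
}

/* Byte offset of that fragment inside its software buffer. */
static uint32_t bdq_frag_offset(const struct bdq_geom *g, uint32_t frag_idx)
{
	return (frag_idx % g->frags_per_buf) * HW_FRAG_SIZE;
}
```

With an 8 KiB buffer each software buffer spans two fragments, so in
this sketch fragment 5 belongs to buffer 2 at byte offset 4096.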
Fbnic also validates rx_page_size against its own queue geometry. The
core validates the zcrx request and checks that the imported memory
can be represented as rx_buf_len-sized DMA chunks, but fbnic still
needs to make sure the PPQ retains usable depth after expanding one
software buffer into multiple 4 KiB hardware fragments.
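
The depth concern above comes down to simple arithmetic: a ring with a
fixed number of 4 KiB slots holds fewer software buffers as the buffer
size grows. A minimal sketch of such a check follows. The function name
and the MIN_BUF_DEPTH floor are assumptions made for illustration; the
limit the driver actually enforces is defined in patch 2.

```c
#include <stdbool.h>
#include <stdint.h>

#define HW_FRAG_SIZE  4096u /* fragment size the NIC consumes, per the text */
#define MIN_BUF_DEPTH 16u   /* hypothetical floor, not fbnic's real limit */

/*
 * ring_frags:   number of 4 KiB fragment slots in the PPQ
 * rx_page_size: requested software buffer size in bytes
 *
 * Returns true if the ring still holds at least MIN_BUF_DEPTH software
 * buffers once each one is expanded into 4 KiB fragments.
 */
static bool ppq_depth_ok(uint32_t ring_frags, uint32_t rx_page_size)
{
	uint32_t frags_per_buf;

	/* Assumption: sizes must be a non-zero multiple of 4 KiB. */
	if (rx_page_size == 0 || rx_page_size % HW_FRAG_SIZE)
		return false;

	frags_per_buf = rx_page_size / HW_FRAG_SIZE;
	return ring_frags / frags_per_buf >= MIN_BUF_DEPTH;
}
```

Under these assumed numbers, a 64-slot ring with 8 KiB buffers keeps 32
usable buffers and passes, while 32 KiB buffers leave only 8 and fail.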
The normal open path uses the rendered per-queue rx_page_size as well.
This preserves a memory-provider binding made while the netdev is
down, instead of falling back to the default PPQ geometry on open.
The selftest change adds an optional iou-zcrx helper check for manual
driver testing. It is not wired into the generic large-chunk test
because different drivers may legitimately return different CQE
boundaries.
Manual testing
==============
The fbnic QEMU model and firmware setup are described here:
https://lore.kernel.org/netdev/20260309113852.2c654de5@kernel.org/
I use something like:
KERNEL=/path/to/linux
DISK=/path/to/fedora-qemu.raw
OVMF_CODE=/path/to/OVMF_CODE.fd
OVMF_VARS=/path/to/OVMF_VARS.fd
MODS=/tmp/fbnic-modules
QEMU=/path/to/fbnic-qemu/build/qemu-system-x86_64
$QEMU \
  -machine type=q35,accel=kvm \
  -drive if=pflash,format=raw,unit=0,file=$OVMF_CODE,readonly=on \
  -drive if=pflash,format=raw,unit=1,file=$OVMF_VARS \
  -smp 16 -m 16G \
  -object memory-backend-memfd,id=mem,size=16G,share=on \
  -numa node,memdev=mem \
  -kernel $KERNEL/arch/x86/boot/bzImage \
  -append "root=/dev/vda2 rw console=ttyS0 earlycon" \
  -drive file=$DISK,format=raw,if=none,id=drive0 \
  -device virtio-blk-pci,drive=drive0 \
  -no-user-config -nodefaults -nographic \
  -virtfs local,path=$MODS/lib/modules,mount_tag=modules,security_model=none,readonly=on \
  -virtfs local,path=$KERNEL,mount_tag=hostshare,security_model=none,readonly=on \
  -netdev user,id=hostnet0,hostfwd=tcp::9999-:9999 \
  -netdev hubport,id=hub_uplink,hubid=0,netdev=hostnet0 \
  -device virtio-net-pci,netdev=n1 \
  -netdev hubport,id=n1,hubid=0 \
  -device pcie-root-port,id=pcie.1,bus=pcie.0,chassis=1 \
  -device fbnic,bus=pcie.1,id=fbnic.1,mac=00:de:ad:be:ef:01,netdev=n2,rbt=skt.0,bar4=ctrl.1 \
  -netdev hubport,id=n2,hubid=0 \
  -chardev socket,id=ctrl.1,path=/tmp/fbnic-ctrl-skt \
  -netdev socket,id=skt.0,connect=localhost:9000 \
  -serial mon:stdio
This gives you an fbnic device, host port forwarding for TCP port
9999, and 9p mounts for the kernel tree and modules.
Inside the guest:
mount -t 9p -o trans=virtio,version=9p2000.L hostshare /host
cd /host/tools/testing/selftests/drivers/net/hw
ethtool -L enp1s0 combined 2
ethtool -G enp1s0 tcp-data-split on hds-thresh 0 rx 64
ethtool -X enp1s0 equal 1
ethtool -N enp1s0 flow-type tcp4 dst-ip 10.0.2.15 dst-port 9999 action 1
echo 64 > /proc/sys/vm/nr_hugepages
./iou-zcrx -s -i enp1s0 -p 9999 -q 1 -x 2
On the host:
cd /path/to/linux/tools/testing/selftests/drivers/net/hw
./iou-zcrx -c -h 127.0.0.1 -p 9999 -l 12840
For an fbnic-specific manual check that traffic reaches the second
4 KiB fragment of an 8 KiB zcrx buffer, run the receiver with:
./iou-zcrx -s -i enp1s0 -p 9999 -q 1 -x 2 -F 4096
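
To make "reaches the second 4 KiB fragment" concrete, here is one way
such a condition could be expressed. This is only an illustrative
sketch and does not reproduce the helper's actual -F logic, which lives
in patch 3. The function name and parameters are hypothetical, and the
sketch assumes a payload never crosses a buffer boundary and that
buffers are laid out back to back in the area.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * area_off: byte offset of the payload within the zcrx area
 * len:      payload length in bytes
 * buf_len:  size of one zcrx buffer, e.g. 8192
 * frag_off: offset inside the buffer that must be reached, e.g. 4096
 *
 * Returns true if at least one payload byte sits at or beyond frag_off
 * within its buffer.
 */
static bool payload_reaches_offset(uint64_t area_off, uint32_t len,
				   uint32_t buf_len, uint32_t frag_off)
{
	uint64_t in_buf;

	if (buf_len == 0 || len == 0)
		return false;

	/* Assumption: buffers are contiguous, so modulo gives the offset. */
	in_buf = area_off % buf_len;
	return in_buf + len > frag_off;
}
```

With 8 KiB buffers and a 4096-byte threshold, a 200-byte payload that
starts 4000 bytes into a buffer satisfies the condition, while a
1000-byte payload that starts 100 bytes in does not.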

Björn Töpel (3):
  fbnic: Track BDQ fragment geometry per ring
  fbnic: Support larger zcrx receive buffers
  selftests: drv-net: Add zcrx payload offset check

 drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 29 +--
 .../net/ethernet/meta/fbnic/fbnic_debugfs.c | 5 +-
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 168 ++++++++++++++----
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 6 +
 .../selftests/drivers/net/hw/iou-zcrx.c | 28 ++-
 5 files changed, 176 insertions(+), 60 deletions(-)

base-commit: 1a1f055318d82e64485a6ff8420e5f70b4267998
--
2.53.0