Linux Kernel Selftest development
From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: Zhu Yanjun <yanjun.zhu@linux.dev>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Alex Shi <alexs@kernel.org>, Yanteng Si <si.yanteng@linux.dev>,
	Dongliang Mu <dzm91@hust.edu.cn>,
	Michael Chan <michael.chan@broadcom.com>,
	Pavan Chebbi <pavan.chebbi@broadcom.com>,
	Joshua Washington <joshwash@google.com>,
	Harshitha Ramamurthy <hramamurthy@google.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>,
	Alexander Duyck <alexanderduyck@fb.com>,
	kernel-team@meta.com, Daniel Borkmann <daniel@iogearbox.net>,
	Nikolay Aleksandrov <razor@blackwall.org>,
	Shuah Khan <shuah@kernel.org>,
	dw@davidwei.uk, sdf.kernel@gmail.com, mohsin.bashr@gmail.com,
	willemb@google.com, jiang.kun2@zte.com.cn, xu.xin16@zte.com.cn,
	wang.yaxin@zte.com.cn, netdev@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, bpf@vger.kernel.org,
	linux-kselftest@vger.kernel.org,
	Stanislav Fomichev <sdf@fomichev.me>,
	Mina Almasry <almasrymina@google.com>,
	Bobby Eshleman <bobbyeshleman@meta.com>
Subject: Re: [PATCH net-next v3 0/8] net: devmem: support devmem with netkit devices
Date: Mon, 11 May 2026 10:01:25 -0700
Message-ID: <agILZaYTrRBlL+Dt@devvm29614.prn0.facebook.com>
In-Reply-To: <d7de2f17-af2d-4b6a-be65-f009d78e3d20@linux.dev>

On Sun, May 10, 2026 at 01:33:18PM -0700, Zhu Yanjun wrote:
> On 2026/5/7 19:27, Bobby Eshleman wrote:
> > This series enables TCP devmem TX through netkit devices.
> > 
> > Netkit now supports queue leasing. A physical NIC's RX queue can be
> > leased to a netkit guest interface inside a container namespace. This
> > gives the container a devmem-capable data path on the RX side (bind-rx,
> > etc.). On the TX side, the container process binds to its netkit guest
> > interface and sends traffic that netkit redirects (via BPF or ip
> > forwarding) to the physical NIC for DMA.
> > 
> > Two things in the existing devmem TX path prevent this from working:
> > 
> > 1. validate_xmit_unreadable_skb() requires dev->netmem_tx before it will
> >     forward a dmabuf-backed (unreadable) skb. This protects skbs from
> >     landing on devices that don't have the IOMMU mappings for the backing
> >     dmabuf or that don't speak netmem. Netkit, however, does not support
> >     DMA, doesn't attempt to read unreadable skb pages and so doesn't
> >     break netmem (it is pure skb routing and redirection). It is
> >     functionally capable of routing unreadable skbs, but there is no way
> >     for the TX validation pathway to distinguish between a device that
> >     will actually attempt DMA-ing the skb and another device
> >     (like netkit) that does not DMA but also does not break
> >     netmem.
> > 
> > 2. bind_tx_doit uses the bound device as the DMA device.  When the user
> >     binds devmem TX to the netkit guest, the bind handler attempts to
> >     create DMA mappings against netkit, which has no DMA capability and
> >     no IOMMU mappings.
> > 
> > This series solves these problems as follows:
> > 
> > 1. Extend netmem_tx to two bits, assigned to one of three values:
> > 
> >     NETMEM_TX_NONE   - netmem not supported
> >     NETMEM_TX_DMA    - netmem supported and performs DMA
> >     NETMEM_TX_NO_DMA - netmem supported, but does not DMA
> > 
> >     With these bits, phys devices can set NETMEM_TX_DMA and devices like
> >     netkit set NETMEM_TX_NO_DMA. The validation TX path ensures that any
> >     DMA-capable netdev exactly matches the bound device, guaranteeing the
> >     correct mapping of the bound dmabuf. The validation TX path also
> >     allows devices with NETMEM_TX_NO_DMA to pass, knowing these devices
> >     will not misuse netmem or run into IOMMU faults. After redirection or
> >     routing, when the skb finally reaches a physical device's TX path,
> >     the NETMEM_TX_DMA check above is performed again to guarantee the
> >     device has the appropriate binding/mappings.
> > 
> > 2. On TX bind, the bind handler recognizes NETMEM_TX_NO_DMA devices and
> >     finds the phys TX device and binds to that instead. For the netkit
> >     case, if it has been leased a queue from a DMA-capable device
> >     already, then the bind action is performed on the DMA-capable device
> >     instead and the dmabuf is mapped correctly.
> > 
> > ---
> > Changes in v3:
> > - Fix validate_xmit_unreadable_skb() logic for non-devmem
> >    unreadable niovs (should not be dropped) (Sashiko)
> > - Simplify lock handling in bind_tx, no premature release (Jakub)
> > - split NO_DMA changes into separate patch (Jakub)
> > - fixed some pylint issues, one required an additional patch ("selftests:
> >    drv-net: make attr _nk_guest_ifname public") to rename a variable from
> >    private to public
> > - see per-patch changelist for more detailed changes
> > - Link to v2: https://lore.kernel.org/r/20260504-tcp-dm-netkit-v2-0-56d52ac72fd4@meta.com
> > 
> > Changes in v2:
> > - Squash driver conversion patches (2-5) into patch 1 (Jakub)
> > - In validate_xmit_unreadable_skb(), check netmem_tx mode before inspecting
> >    frags (Jakub)
> > - Lock bind_dev around netdev_queue_get_dma_dev() when bind_dev != netdev to
> >    fix lockdep (Sashiko)
> > - Move require_devmem() into individual test functions so KsftSkipEx goes up to
> >    ksft_run() (Sashiko)
> > - Add nk_devmem.py to TEST_PROGS in Makefile (Sashiko)
> > - Link to v1:
> >    https://lore.kernel.org/all/20260428-tcp-dm-netkit-v1-0-719280eba4d2@meta.com/
> > 
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> > 
> > ---
> > Bobby Eshleman (8):
> >        net: convert netmem_tx flag to enum
> >        net: netkit: declare NETMEM_TX_NO_DMA mode
> >        net: devmem: support TX over NETMEM_TX_NO_DMA devices
> 
> I applied this patchset to my local kernel tree, built a new kernel
> image, and loaded it in my test environment. All of the test cases
> appear to pass.
> 
> I do not think this patchset causes any regressions in my test
> environment.
> 
> Zhu Yanjun

Thanks for testing!

Best,
Bobby
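
For readers following the thread, the three netmem_tx modes and the two
rules described in the quoted cover letter (TX validation and TX bind
target selection) can be modeled in a few lines of userspace C. This is
only a sketch under stated assumptions, not the kernel code from the
series: struct model_dev, struct model_skb, the lease_src field and the
model_* functions are hypothetical names invented for illustration; only
the NETMEM_TX_* names come from the cover letter. The model also ignores
the non-devmem unreadable niov case mentioned in the v3 changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three modes named in the cover letter. */
enum netmem_tx_mode {
	NETMEM_TX_NONE,   /* netmem not supported */
	NETMEM_TX_DMA,    /* netmem supported and performs DMA */
	NETMEM_TX_NO_DMA, /* netmem supported, but does not DMA */
};

/* Hypothetical stand-in for a net device (not struct net_device). */
struct model_dev {
	const char *name;
	enum netmem_tx_mode netmem_tx;
	struct model_dev *lease_src; /* assumed: device that leased a queue */
};

/* Hypothetical stand-in for an skb that may be dmabuf-backed. */
struct model_skb {
	bool unreadable;                   /* frags are dmabuf-backed */
	const struct model_dev *bound_dev; /* device the dmabuf is mapped for */
};

/* Model of the TX validation rule: may this skb be sent on dev? */
static bool model_xmit_allowed(const struct model_skb *skb,
			       const struct model_dev *dev)
{
	if (!skb->unreadable)
		return true; /* ordinary skbs are not affected */

	switch (dev->netmem_tx) {
	case NETMEM_TX_NO_DMA:
		return true; /* pure routing/redirection, never DMAs */
	case NETMEM_TX_DMA:
		return skb->bound_dev == dev; /* must match the binding */
	case NETMEM_TX_NONE:
	default:
		return false;
	}
}

/* Model of TX bind: which device should the dmabuf be mapped for? */
static struct model_dev *model_bind_target(struct model_dev *dev)
{
	switch (dev->netmem_tx) {
	case NETMEM_TX_DMA:
		return dev;
	case NETMEM_TX_NO_DMA:
		/* Bind to the DMA-capable device behind the leased queue. */
		if (dev->lease_src &&
		    dev->lease_src->netmem_tx == NETMEM_TX_DMA)
			return dev->lease_src;
		return NULL;
	case NETMEM_TX_NONE:
	default:
		return NULL;
	}
}

/* Walk one skb from a netkit-like device to the physical NIC. */
static void model_selfcheck(void)
{
	struct model_dev nic = { "eth0", NETMEM_TX_DMA, NULL };
	struct model_dev other = { "eth1", NETMEM_TX_DMA, NULL };
	struct model_dev nk = { "nk0", NETMEM_TX_NO_DMA, &nic };
	struct model_skb skb = { true, &nic };

	assert(model_bind_target(&nk) == &nic);    /* bind resolves to NIC */
	assert(model_xmit_allowed(&skb, &nk));     /* passes through netkit */
	assert(model_xmit_allowed(&skb, &nic));    /* matches the binding */
	assert(!model_xmit_allowed(&skb, &other)); /* wrong DMA device */
}
```

Calling model_selfcheck() exercises the path the cover letter describes:
the bind on the NO_DMA device resolves to the physical NIC, the skb is
allowed through the NO_DMA hop, and the DMA check is applied again at the
physical device, where only the bound device is accepted.
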

Thread overview: 27+ messages
2026-05-08  2:27 [PATCH net-next v3 0/8] net: devmem: support devmem with netkit devices Bobby Eshleman
2026-05-08  2:27 ` [PATCH net-next v3 1/8] net: convert netmem_tx flag to enum Bobby Eshleman
2026-05-08 14:56   ` Stanislav Fomichev
2026-05-08 16:11     ` Bobby Eshleman
2026-05-08  2:27 ` [PATCH net-next v3 2/8] net: netkit: declare NETMEM_TX_NO_DMA mode Bobby Eshleman
2026-05-08 14:57   ` Stanislav Fomichev
2026-05-08  2:27 ` [PATCH net-next v3 3/8] net: devmem: support TX over NETMEM_TX_NO_DMA devices Bobby Eshleman
2026-05-08 15:01   ` Stanislav Fomichev
2026-05-08 16:19     ` Bobby Eshleman
2026-05-08 20:44     ` Jakub Kicinski
2026-05-08 20:47   ` Jakub Kicinski
2026-05-08 21:28     ` Bobby Eshleman
2026-05-08 22:27       ` Jakub Kicinski
2026-05-08 23:03         ` Bobby Eshleman
2026-05-08  2:27 ` [PATCH net-next v3 4/8] selftests: drv-net: ncdevmem: add -n flag to skip NIC configuration Bobby Eshleman
2026-05-08 15:01   ` Stanislav Fomichev
2026-05-08  2:27 ` [PATCH net-next v3 5/8] selftests: drv-net: make attr _nk_guest_ifname public Bobby Eshleman
2026-05-08 15:01   ` Stanislav Fomichev
2026-05-08  2:27 ` [PATCH net-next v3 6/8] selftests: drv-net: refactor devmem command builders into lib module Bobby Eshleman
2026-05-08 15:03   ` Stanislav Fomichev
2026-05-08 16:19     ` Bobby Eshleman
2026-05-08  2:27 ` [PATCH net-next v3 7/8] selftests: drv-net: add primary_rx_redirect support to NetDrvContEnv Bobby Eshleman
2026-05-08 15:03   ` Stanislav Fomichev
2026-05-08  2:27 ` [PATCH net-next v3 8/8] selftests: drv-net: add netkit devmem tests Bobby Eshleman
2026-05-08 15:03   ` Stanislav Fomichev
2026-05-10 20:33 ` [PATCH net-next v3 0/8] net: devmem: support devmem with netkit devices Zhu Yanjun
2026-05-11 17:01   ` Bobby Eshleman [this message]
