Date: Mon, 11 May 2026 10:01:25 -0700
From: Bobby Eshleman
To: Zhu Yanjun
Cc: Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan, Alex Shi,
    Yanteng Si, Dongliang Mu, Michael Chan, Pavan Chebbi,
    Joshua Washington, Harshitha Ramamurthy, Saeed Mahameed,
    Tariq Toukan, Mark Bloch, Leon Romanovsky, Alexander Duyck,
    kernel-team@meta.com, Daniel Borkmann, Nikolay Aleksandrov,
    Shuah Khan, dw@davidwei.uk, sdf.kernel@gmail.com,
    mohsin.bashr@gmail.com, willemb@google.com, jiang.kun2@zte.com.cn,
    xu.xin16@zte.com.cn, wang.yaxin@zte.com.cn, netdev@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-rdma@vger.kernel.org, bpf@vger.kernel.org,
    linux-kselftest@vger.kernel.org, Stanislav Fomichev, Mina Almasry,
    Bobby Eshleman
Subject: Re: [PATCH net-next v3 0/8] net: devmem: support devmem with netkit devices
References: <20260507-tcp-dm-netkit-v3-0-52821445867c@meta.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

On Sun, May 10, 2026 at 01:33:18PM -0700, Zhu Yanjun wrote:
> On 2026/5/7 19:27, Bobby Eshleman wrote:
> > This series enables TCP devmem TX through netkit devices.
> >
> > Netkit now supports queue leasing.
> > A physical NIC's RX queue can be leased to a netkit guest interface
> > inside a container namespace. This gives the container a
> > devmem-capable data path on the RX side (bind-rx, etc.). On the TX
> > side, the container process binds to its netkit guest interface and
> > sends traffic that netkit redirects (via BPF or IP forwarding) to the
> > physical NIC for DMA.
> >
> > Two things in the existing devmem TX path prevent this from working:
> >
> > 1. validate_xmit_unreadable_skb() requires dev->netmem_tx before it will
> >    forward a dmabuf-backed (unreadable) skb. This protects skbs from
> >    landing on devices that don't have the IOMMU mappings for the backing
> >    dmabuf or that don't speak netmem. Netkit, however, does not support
> >    DMA, doesn't attempt to read unreadable skb pages and so doesn't
> >    break netmem (it is pure skb routing and redirection). It is
> >    functionally capable of routing unreadable skbs, but there is no way
> >    for the TX validation pathway to distinguish between a device that
> >    will actually attempt DMA-ing the skb and another device
> >    (like netkit) that does not DMA but also does not break
> >    netmem.
> >
> > 2. bind_tx_doit uses the bound device as the DMA device. When the user
> >    binds devmem TX to the netkit guest, the bind handler attempts to
> >    create DMA mappings against netkit, which has no DMA capability and
> >    no IOMMU mappings.
> >
> > This series solves these problems as follows:
> >
> > 1. Extend netmem_tx to two bits, assigned to one of three values:
> >
> >      NETMEM_TX_NONE   - netmem not supported
> >      NETMEM_TX_DMA    - netmem supported and performs DMA
> >      NETMEM_TX_NO_DMA - netmem supported, but does not DMA
> >
> >    With these bits, phys devices can set NETMEM_TX_DMA and devices like
> >    netkit set NETMEM_TX_NO_DMA. The validation TX path ensures that any
> >    DMA-capable netdev exactly matches the bound device, guaranteeing the
> >    correct mapping of the bound dmabuf.
> >    The validation TX path also allows devices with NETMEM_TX_NO_DMA to
> >    pass, knowing these devices will not misuse netmem or run into IOMMU
> >    faults. After redirection or routing, once the skb finally makes its
> >    way through the stack to a physical device's TX path, the above
> >    NETMEM_TX_DMA check is performed again to guarantee the device has
> >    the appropriate binding/mappings.
> >
> > 2. On TX bind, the bind handler recognizes NETMEM_TX_NO_DMA devices and
> >    finds the phys TX device and binds to that instead. For the netkit
> >    case, if it has been leased a queue from a DMA-capable device
> >    already, then the bind action is performed on the DMA-capable device
> >    instead and the dmabuf is mapped correctly.
> >
> > ---
> > Changes in v3:
> > - Fix validate_xmit_unreadable_skb() logic for non-devmem
> >   unreadable niovs (should not be dropped) (Sashiko)
> > - Simplify lock handling in bind_tx, no premature release (Jakub)
> > - split NO_DMA changes into separate patch (Jakub)
> > - fixed some pylint issues, one required an additional patch ("selftests:
> >   drv-net: make attr _nk_guest_ifname public") to rename a variable from
> >   private to public
> > - see per-patch changelist for more detailed changes
> > - Link to v2: https://lore.kernel.org/r/20260504-tcp-dm-netkit-v2-0-56d52ac72fd4@meta.com
> >
> > Changes in v2:
> > - Squash driver conversion patches (2-5) into patch 1 (Jakub)
> > - In validate_xmit_unreadable_skb(), check netmem_tx mode before
> >   inspecting frags (Jakub)
> > - Lock bind_dev around netdev_queue_get_dma_dev() when bind_dev != netdev to
> >   fix lockdep (Sashiko)
> > - Move require_devmem() into individual test functions so KsftSkipEx goes up to
> >   ksft_run() (Sashiko)
> > - Add nk_devmem.py to TEST_PROGS in Makefile (Sashiko)
> > - Link to v1:
> >   https://lore.kernel.org/all/20260428-tcp-dm-netkit-v1-0-719280eba4d2@meta.com/
> >
> > Signed-off-by: Bobby Eshleman
> >
> > ---
> > Bobby Eshleman (8):
> >       net: convert netmem_tx flag to enum
> >       net: netkit: declare NETMEM_TX_NO_DMA mode
> >       net: devmem: support TX over NETMEM_TX_NO_DMA devices
>
> I applied this patchset to my local kernel tree and built a new kernel
> image. I loaded this new kernel image in my test environment. All the
> test cases appear to pass.
>
> I don't think this patchset causes any regressions in my test
> environment.
>
> Zhu Yanjun

Thanks for testing!

Best,
Bobby
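
For anyone skimming the thread, the TX validation rules from the quoted cover letter can be modelled roughly as below. This is a user-space sketch under stated assumptions, not the code from the series: the three mode names come from the cover letter, but the enum's numeric values, struct toy_dev, struct toy_skb, toy_validate_xmit() and toy_selfcheck() are hypothetical stand-ins for illustration, and the real validate_xmit_unreadable_skb() works on struct sk_buff and struct net_device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode names are from the cover letter; the numeric values are assumed. */
enum netmem_tx_mode {
	NETMEM_TX_NONE = 0,	/* netmem not supported */
	NETMEM_TX_DMA,		/* netmem supported and performs DMA */
	NETMEM_TX_NO_DMA,	/* netmem supported, but does not DMA */
};

/* Hypothetical stand-in for struct net_device. */
struct toy_dev {
	const char *name;
	enum netmem_tx_mode netmem_tx;
};

/* Hypothetical stand-in for struct sk_buff. */
struct toy_skb {
	bool unreadable;			/* frags are not CPU-readable */
	const struct toy_dev *bound_dev;	/* devmem binding owner, NULL if not devmem */
};

/* Returns true if the skb may be sent on dev, false if it must be dropped. */
static bool toy_validate_xmit(const struct toy_skb *skb,
			      const struct toy_dev *dev)
{
	/* Ordinary readable skbs are never affected by netmem rules. */
	if (!skb->unreadable)
		return true;

	switch (dev->netmem_tx) {
	case NETMEM_TX_NO_DMA:
		/* Pure routing/redirection device: never touches the pages. */
		return true;
	case NETMEM_TX_DMA:
		/*
		 * A devmem skb must reach the device its dmabuf is bound to.
		 * Non-devmem unreadable skbs are let through (v3 changelog).
		 */
		return skb->bound_dev == NULL || skb->bound_dev == dev;
	case NETMEM_TX_NONE:
	default:
		return false;
	}
}

/* Walks one devmem skb through the devices described in the cover letter. */
static void toy_selfcheck(void)
{
	struct toy_dev nic = { "eth0", NETMEM_TX_DMA };
	struct toy_dev other = { "eth1", NETMEM_TX_DMA };
	struct toy_dev nk = { "nk0", NETMEM_TX_NO_DMA };
	struct toy_dev legacy = { "dummy0", NETMEM_TX_NONE };
	struct toy_skb skb = { true, &nic };

	assert(toy_validate_xmit(&skb, &nk));		/* netkit passes it on */
	assert(toy_validate_xmit(&skb, &nic));		/* bound NIC accepts */
	assert(!toy_validate_xmit(&skb, &other));	/* wrong NIC drops */
	assert(!toy_validate_xmit(&skb, &legacy));	/* no netmem support */
}
```

Calling toy_selfcheck() from a test harness follows one skb along the path in the cover letter: the netkit guest lets it pass, the NIC that owns the binding accepts it, and any other DMA-capable or non-netmem device drops it.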