From: zf <zf15750701@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>, netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, kuba@kernel.org, davem@davemloft.net,
	razor@blackwall.org, pabeni@redhat.com, willemb@google.com,
	sdf@fomichev.me, john.fastabend@gmail.com, martin.lau@kernel.org,
	jordan@jrife.io, maciej.fijalkowski@intel.com,
	magnus.karlsson@intel.com, David Wei <dw@davidwei.uk>,
	yangzhenze@bytedance.com,
	Dongdong Wang <wangdongdong.6@bytedance.com>
Subject: Re: [PATCH net-next 18/20] netkit: Add io_uring zero-copy support for TCP
Date: Mon, 22 Sep 2025 11:17:58 +0800
Message-ID: <e9c6903c-e440-46b3-860e-8782bfe4efb2@gmail.com>
In-Reply-To: <20250919213153.103606-19-daniel@iogearbox.net>

On 2025/9/20 05:31, Daniel Borkmann wrote:
> From: David Wei <dw@davidwei.uk>
> 
> This adds the last missing bit to netkit for supporting io_uring with
> zero-copy mode [0]. Up until this point it was not possible to consume
> the latter out of containers or Kubernetes Pods where applications are
> in their own network namespace.
> 
> Thus, as a last missing bit, implement ndo_queue_get_dma_dev() in netkit
> to return the physical device of the real rxq for DMA. This allows memory
> providers like io_uring zero-copy or devmem to bind to the physically
> mapped rxq in netkit.
> 
> io_uring example with eth0 being a physical device with 16 queues where
> netkit is bound to the last queue, iou-zcrx.c is binary from selftests.
> Flow steering to that queue is based on the service VIP:port of the
> server utilizing io_uring:
> 
>    # ethtool -X eth0 start 0 equal 15
>    # ethtool -X eth0 start 15 equal 1 context new
>    # ethtool --config-ntuple eth0 flow-type tcp4 dst-ip 1.2.3.4 dst-port 5000 action 15
>    # ip netns add foo
>    # ip link add numrxqueues 2 type netkit
>    # ynl-bind eth0 15 nk0
>    # ip link set nk0 netns foo
>    # ip link set nk1 up
>    # ip netns exec foo ip link set lo up
>    # ip netns exec foo ip link set nk0 up
>    # ip netns exec foo ip addr add 1.2.3.4/32 dev nk0
>    [ ... setup routing etc to get external traffic into the netns ... ]
>    # ip netns exec foo ./iou-zcrx -s -p 5000 -i nk0 -q 1
> 
> Remote io_uring client:
> 
>    # ./iou-zcrx -c -h 1.2.3.4 -p 5000 -l 12840 -z 65536
> 
> We have tested the above against a dual-port Nvidia ConnectX-6 (mlx5)
> 100G NIC as well as Broadcom BCM957504 (bnxt_en) 100G NIC, both
> supporting TCP header/data split. For Cilium, the plan is to open
> up support for io_uring in zero-copy mode for regular Kubernetes Pods
> when Cilium is configured with netkit datapath mode.
> 

From what we have learned, mlx supports TCP header/data split only
starting with CX7, where it relies on hardware RX GRO. Can CX6 do TCP
header/data split as well? Could you share the mlx driver and firmware
versions you used on your CX6? I would like to test this myself. If CX6
supports it, that would be even better for me. Thanks.


> Signed-off-by: David Wei <dw@davidwei.uk>
> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Link: https://kernel-recipes.org/en/2024/schedule/efficient-zero-copy-networking-using-io_uring [0]
> ---
>   drivers/net/netkit.c | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
> index 27ff84833f28..5129b27a7c3c 100644
> --- a/drivers/net/netkit.c
> +++ b/drivers/net/netkit.c
> @@ -274,6 +274,21 @@ static const struct ethtool_ops netkit_ethtool_ops = {
>   	.get_channels		= netkit_get_channels,
>   };
>   
> +static struct device *netkit_queue_get_dma_dev(struct net_device *dev, int idx)
> +{
> +	struct netdev_rx_queue *rxq, *peer_rxq;
> +	unsigned int peer_idx;
> +
> +	rxq = __netif_get_rx_queue(dev, idx);
> +	if (!rxq->peer)
> +		return NULL;
> +
> +	peer_rxq = rxq->peer;
> +	peer_idx = get_netdev_rx_queue_index(peer_rxq);
> +
> +	return netdev_queue_get_dma_dev(peer_rxq->dev, peer_idx);
> +}
> +
>   static int netkit_queue_create(struct net_device *dev)
>   {
>   	struct netkit *nk = netkit_priv(dev);
> @@ -299,7 +314,8 @@ static int netkit_queue_create(struct net_device *dev)
>   }
>   
>   static const struct netdev_queue_mgmt_ops netkit_queue_mgmt_ops = {
> -	.ndo_queue_create = netkit_queue_create,
> +	.ndo_queue_get_dma_dev		= netkit_queue_get_dma_dev,
> +	.ndo_queue_create		= netkit_queue_create,
>   };
>   
>   static struct net_device *netkit_alloc(struct nlattr *tb[],
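
To check my reading of the hunk above, here is a minimal userspace sketch
of the delegation that netkit_queue_get_dma_dev() performs: the virtual
queue has no DMA device of its own, so the lookup follows the peer pointer
to the physical queue and asks that queue's device. Everything in it is a
hypothetical stand-in (struct rx_queue, netdev_init(), bind_queue(),
phys_queue_get_dma_dev(), virt_queue_get_dma_dev(), run_example()), not the
kernel's definitions, and the bounds checks are my own addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the kernel's structures. */
struct device { const char *name; };
struct net_device;

struct rx_queue {
	struct net_device *dev;  /* owning device */
	struct rx_queue *peer;   /* bound physical queue, NULL if unbound */
};

struct net_device {
	const char *name;
	struct device *dma_dev;  /* NULL for a purely virtual device */
	struct rx_queue *rxqs;
	unsigned int num_rxqs;
};

static void netdev_init(struct net_device *dev, const char *name,
			struct device *dma_dev, struct rx_queue *rxqs,
			unsigned int num_rxqs)
{
	unsigned int i;

	dev->name = name;
	dev->dma_dev = dma_dev;
	dev->rxqs = rxqs;
	dev->num_rxqs = num_rxqs;
	for (i = 0; i < num_rxqs; i++) {
		rxqs[i].dev = dev;
		rxqs[i].peer = NULL;
	}
}

/* Model of the bind step: point the virtual queue at a physical one. */
static void bind_queue(struct rx_queue *virt, struct rx_queue *phys)
{
	virt->peer = phys;
}

/* Index of a queue within its owning device's queue array. */
static unsigned int rxq_index(const struct rx_queue *rxq)
{
	assert(rxq->dev != NULL);
	return (unsigned int)(rxq - rxq->dev->rxqs);
}

/* Physical device: every valid queue uses the device's own DMA device. */
static struct device *phys_queue_get_dma_dev(struct net_device *dev,
					     unsigned int idx)
{
	return idx < dev->num_rxqs ? dev->dma_dev : NULL;
}

/* Virtual device: delegate to the peer queue's owner, NULL if unbound. */
static struct device *virt_queue_get_dma_dev(struct net_device *dev,
					     unsigned int idx)
{
	struct rx_queue *peer;

	if (idx >= dev->num_rxqs)
		return NULL;
	peer = dev->rxqs[idx].peer;
	if (!peer)
		return NULL;
	return phys_queue_get_dma_dev(peer->dev, rxq_index(peer));
}

/* Same shape as the quoted setup: nk0 queue 1 bound to eth0 queue 15.
 * Returns 0 if the lookups behave as described, 1 otherwise. */
static int run_example(void)
{
	struct device phys_dma = { "phys-dma-dev" };
	struct rx_queue eth_q[16], nk_q[2];
	struct net_device eth0, nk0;
	struct device *d;

	netdev_init(&eth0, "eth0", &phys_dma, eth_q, 16);
	netdev_init(&nk0, "nk0", NULL, nk_q, 2);

	if (virt_queue_get_dma_dev(&nk0, 1) != NULL)
		return 1;  /* unbound queue must not resolve */

	bind_queue(&nk0.rxqs[1], &eth0.rxqs[15]);
	d = virt_queue_get_dma_dev(&nk0, 1);
	if (d != &phys_dma)
		return 1;
	printf("nk0 queue 1 -> %s\n", d->name);
	return 0;
}
```

Calling run_example() from a main() exercises both cases. If I read the
patch correctly, the NULL return for an unbound queue is what keeps a
memory provider from binding to a netkit queue that has no physical rxq
behind it.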

