netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: "Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	netdev@vger.kernel.org, "Björn Töpel" <bjorn@kernel.org>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	bpf@vger.kernel.org, virtualization@lists.linux-foundation.org,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Guenter Roeck" <linux@roeck-us.net>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: Re: [PATCH net-next] xsk: introduce xsk_dma_ops
Date: Wed, 19 Apr 2023 09:45:06 -0700
Message-ID: <20230419094506.2658b73f@kernel.org>
In-Reply-To: <ZD95RY9PjVRi7qz3@infradead.org>

On Tue, 18 Apr 2023 22:16:53 -0700 Christoph Hellwig wrote:
> On Mon, Apr 17, 2023 at 11:19:47PM -0700, Jakub Kicinski wrote:
> > Damn, that's unfortunate. Thinking aloud -- that means that if we want 
> > to continue to pull memory management out of networking drivers to
> > improve it for all, cross-optimize with the rest of the stack and
> > allow various upcoming forms of zero copy -- then we need to add an
> > equivalent of dma_ops and DMA API locally in networking?  
> 
> Can you explain what the actual use case is?
> 
> From the original patchset I suspect it is dma mapping something very
> long term and then maybe doing syncs on it as needed?

In this case yes, pinned user memory, it gets sliced up into MTU sized
chunks, fed into an Rx queue of a device, and user can see packets
without any copies.
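
To make the slicing concrete, here is a rough userspace sketch, not the
actual xsk code. The names rx_chunk and slice_region() are made up for
illustration, and the DMA base address is assumed to come from a single
long-lived mapping of the pinned region:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one Rx buffer carved out of the pinned region. */
struct rx_chunk {
	uint64_t dma_addr;	/* device-visible address of this chunk */
	size_t len;		/* chunk length in bytes */
};

/*
 * Slice [dma_base, dma_base + region_len) into chunk_size pieces.
 * Writes at most max descriptors to out and returns how many were written.
 * A trailing partial chunk is skipped.
 */
static size_t slice_region(uint64_t dma_base, size_t region_len,
			   size_t chunk_size, struct rx_chunk *out, size_t max)
{
	size_t n = 0;

	if (chunk_size == 0)
		return 0;

	for (size_t off = 0; off + chunk_size <= region_len && n < max;
	     off += chunk_size) {
		out[n].dma_addr = dma_base + off;
		out[n].len = chunk_size;
		n++;
	}
	return n;
}
```

E.g. an 8192-byte region with 2048-byte chunks gives four descriptors,
each of which could be posted to the Rx queue without remapping.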

Quite similar use case #2 is the upcoming io_uring / "direct placement"
patches (former from Meta, latter from Google) which will try to receive
just the TCP data into pinned user memory.

And, as I think Olek mentioned, #3 is page_pool - which allocates 4k
pages, manages the DMA mappings, gives them to the device and tries 
to recycle back to the device once TCP is done with them (avoiding the
unmapping and even atomic ops on the refcount, as in the good case page
refcount is always 1). See page_pool_return_skb_page() for the
recycling flow.
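
A toy userspace model of that recycling idea, purely illustrative and
not the real page_pool implementation. Everything here (toy_pool,
toy_page, toy_pool_get(), toy_pool_put(), the fake addresses) is
hypothetical; the point is only that the mapping is set up once and
survives each trip through the cache:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_POOL_CACHE 4

struct toy_page {
	uint64_t dma_addr;	/* set once, when the page is first mapped */
	int refcount;		/* 1 means only the pool holds a reference */
};

struct toy_pool {
	struct toy_page *cache[TOY_POOL_CACHE];
	size_t cached;
	unsigned long maps;	/* expensive: counted per new mapping */
	unsigned long unmaps;
	unsigned long syncs;	/* cheap: counted per hand-off to the device */
};

/* Hand a page to the device: reuse a cached one, else map the fresh one. */
static struct toy_page *toy_pool_get(struct toy_pool *pool,
				     struct toy_page *fresh)
{
	struct toy_page *pg;

	if (pool->cached > 0) {
		pg = pool->cache[--pool->cached];	/* mapping still valid */
	} else {
		if (!fresh)
			return NULL;
		pg = fresh;
		pg->dma_addr = 0x1000 * ++pool->maps;	/* pretend mapping */
		pg->refcount = 1;
	}
	pool->syncs++;	/* stand-in for a sync-for-device */
	return pg;
}

/* Return 1 if the page was recycled, 0 if it had to be unmapped. */
static int toy_pool_put(struct toy_pool *pool, struct toy_page *pg)
{
	if (pg->refcount == 1 && pool->cached < TOY_POOL_CACHE) {
		pool->cache[pool->cached++] = pg;	/* keep the mapping */
		return 1;
	}
	pg->dma_addr = 0;	/* pretend unmapping */
	pool->unmaps++;
	return 0;
}
```

Cycling one page through get/put many times leaves maps at 1 while
syncs grows with every hand-off. A page that still has extra references
when it is returned takes the unmap path instead.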

In all those cases it's more flexible (and faster) to hide the DMA
mapping from the driver. All the cases are also opt-in so we don't need
to worry about complete oddball devices. And to answer your question:
in all cases we hope mapping/unmapping will be relatively rare while
syncing will be frequent.

AFAIU the patch we're discussing implements custom dma_ops for case #1,
but the same thing will be needed for #2 and #3. The question to me is
whether we need netdev-wide net_dma_ops or whether the device model can
provide us with a DMA API that'd work for SoC/PCIe/virt devices.
