From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: "Jakub Kicinski" <kuba@kernel.org>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
netdev@vger.kernel.org, "Björn Töpel" <bjorn@kernel.org>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
"Jonathan Lemon" <jonathan.lemon@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Paolo Abeni" <pabeni@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
bpf@vger.kernel.org, virtualization@lists.linux-foundation.org,
"Michael S. Tsirkin" <mst@redhat.com>,
"Guenter Roeck" <linux@roeck-us.net>,
"Gerd Hoffmann" <kraxel@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Jens Axboe" <axboe@kernel.dk>,
"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: Re: [PATCH net-next] xsk: introduce xsk_dma_ops
Date: Thu, 20 Apr 2023 15:59:39 +0200
Message-ID: <ff3d588e-10ac-36dd-06af-d55a79424ede@intel.com>
In-Reply-To: <ZEDYt/EQJk39dTuK@infradead.org>
From: Christoph Hellwig <hch@infradead.org>
Date: Wed, 19 Apr 2023 23:16:23 -0700
> On Wed, Apr 19, 2023 at 03:14:48PM +0200, Alexander Lobakin wrote:
>>>>> dma addresses and thus dma mappings are completely driver specific.
>>>>> Upper layers have no business looking at them.
>>
>> Here it's not an "upper layer". XSk core doesn't look at them or pass
>> them between several drivers.
>
> Same for upper layers :) They just do abstract operations that can sit
> on top of a variety of drivers.
>
>> It maps DMA solely via the struct device
>> passed from the driver and then just gets/sets addresses for this driver
>> only. Just like Page Pool does for regular Rx buffers. This got moved to
>> the XSk core to not repeat the same code pattern in each driver.
>
> Which assumes that:
>
> a) a DMA mapping needs to be done at all
> b) it can be done using a struct device exposed to it
> c) that DMA mapping is actually at the same granularity that it
> operates on
>
> all of which might not be true.
>
>>> From the original patchset I suspect it is dma mapping something very
>>> long term and then maybe doing syncs on it as needed?
>>
>> As I mentioned, XSk provides some handy wrappers to map DMA for drivers.
>> Previously, XSk was supported by real hardware drivers only, but here
>> the developer tries to add support to virtio-net. I suspect he needs to
>> use DMA mapping functions different from the ones the regular drivers use.
>
> Yes. For actual hardware virtio and some more complex virtualized
> setups it works just like real hardware. For legacy virtio there is
> no DMA mapping involved at all. Because of that all DMA mapping needs
> to be done inside of virtio.
>
>> So this is far from dma_map_ops, the author picked the wrong name :D
>> And correct, for XSk we map one big piece of memory only once and then
>> reuse it for buffers, with no in-flight map/unmap on the hotpath (only
>> syncs when needed). So this mapping is long-term and is stored in the
>> XSk core structure assigned to the driver the mapping was done for.
>> I think Jakub thinks of something similar, but for the "regular" Rx/Tx,
>> not only XDP sockets :)
>
> FYI, dma_map_* is not intended for long term mappings, as it can lead
> to starvation issues. You need to use dma_alloc_* instead. And
> "you" in that case is, as I said, the driver, not an upper layer.
> If it's just a helper called by drivers and never from core code,
> that's of course fine.
Hmm, currently almost all Ethernet drivers map Rx pages once and then
just recycle them, keeping the original DMA mapping, which means a page
can keep its very first mapping for a long time, often even for the
lifetime of the struct device. The same goes for XDP sockets: the
lifetime of the DMA mappings equals the lifetime of the sockets.
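Roughly like this (a simplified sketch, not from any particular driver;
all the struct and function names are made up):

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

struct my_rx_buf {
	struct page	*page;
	dma_addr_t	dma;
};

/* Called once at ring init / refill time */
static int my_rx_map_buf(struct device *dev, struct my_rx_buf *buf)
{
	buf->page = dev_alloc_page();
	if (!buf->page)
		return -ENOMEM;

	/* Mapped once here; the mapping is then kept for as long as
	 * the page sits in the recycle pool, potentially for the whole
	 * lifetime of the device
	 */
	buf->dma = dma_map_page(dev, buf->page, 0, PAGE_SIZE,
				DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, buf->dma)) {
		__free_page(buf->page);
		return -ENOMEM;
	}

	return 0;
}

/* Called on recycle: no unmap + remap, only a sync before handing
 * the buffer back to the HW
 */
static void my_rx_recycle_buf(struct device *dev, struct my_rx_buf *buf)
{
	dma_sync_single_for_device(dev, buf->dma, PAGE_SIZE,
				   DMA_FROM_DEVICE);
}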
Does that mean we'd better revisit that approach and try switching to
the dma_alloc_*() family (non-coherent/cacheable in our case)?
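I.e. something like this instead (again just a sketch of how I read the
non-coherent API, error handling omitted):

	void *buf;
	dma_addr_t dma;

	/* The allocation and the mapping come from the DMA layer as
	 * one unit
	 */
	buf = dma_alloc_noncoherent(dev, PAGE_SIZE, &dma,
				    DMA_FROM_DEVICE, GFP_KERNEL);

	/* ... Rx, with dma_sync_single_for_{cpu,device}() as needed ... */

	dma_free_noncoherent(dev, PAGE_SIZE, buf, dma, DMA_FROM_DEVICE);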
Also, I remember trying that in one of my drivers, but the fact that
all those functions zero the whole page(s) before returning them to the
driver ruins the performance: we don't need to zero buffers before
receiving packets into them, and doing so burns a ton of cycles
(especially when 4k gets zeroed each time while the main body of your
traffic is 64-byte frames).
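To illustrate what I mean (assuming the zeroing behaviour is still the
way I remember it):

	/* Page allocator: clearing is opt-in via __GFP_ZERO */
	page = dev_alloc_page();

	/* DMA allocators: the returned memory is always zeroed, i.e.
	 * for a 4096-byte buffer receiving a 64-byte frame, the
	 * clearing writes 64x more bytes than the NIC does
	 */
	buf = dma_alloc_noncoherent(dev, PAGE_SIZE, &dma,
				    DMA_FROM_DEVICE, GFP_KERNEL);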
Thanks,
Olek