From: Edward Cree <ecree.xilinx@gmail.com>
To: Mina Almasry <almasrymina@google.com>, David Ahern <dsahern@kernel.org>
Cc: "Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
"Stanislav Fomichev" <sdf@google.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
linaro-mm-sig@lists.linaro.org,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
"Arnd Bergmann" <arnd@arndb.de>, "Shuah Khan" <shuah@kernel.org>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Christian König" <christian.koenig@amd.com>,
"Shakeel Butt" <shakeelb@google.com>,
"Jeroen de Borst" <jeroendb@google.com>,
"Praveen Kaligineedi" <pkaligineedi@google.com>,
"Willem de Bruijn" <willemb@google.com>,
"Kaiyuan Zhang" <kaiyuanz@google.com>
Subject: Re: [RFC PATCH v3 10/12] tcp: RX path for devmem TCP
Date: Thu, 9 Nov 2023 16:07:15 +0000
Message-ID: <6f853286-e463-b684-cc1e-405119528697@gmail.com>
In-Reply-To: <CAHS8izM_qrEs37F=kPzT_kmqCBV_wSiTf72PtHfJYxks9R9--Q@mail.gmail.com>
On 09/11/2023 02:39, Mina Almasry wrote:
> On Wed, Nov 8, 2023 at 7:36 AM Edward Cree <ecree.xilinx@gmail.com> wrote:
>> If not then surely the way to return a memory area
>> in an io_uring idiom is just to post a new read sqe ('RX descriptor')
>> pointing into it, rather than explicitly returning it with setsockopt.
>
> We're interested in using this with regular TCP sockets, not
> necessarily io_uring.
Fair. I just wanted to push back against the suggestion upthread that
"oh, since io_uring supports setsockopt() we can just ignore it and
it'll all magically work later" (paraphrased).
If you can keep the "allocate buffers out of a devmem region" and "post
RX descriptors built on those buffers" APIs separate (inside the
kernel; obviously both triggered by a single call to the setsockopt()
uAPI) that'll likely make things simpler for the io_uring interface I
describe, which will only want the latter.
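A rough userspace sketch of that split, to make the layering concrete.
Every name here (devmem_alloc_bufs, rx_post_descriptors, sockopt_refill,
uring_post) is hypothetical and does not come from the patch series; the
"region" is a trivial bump allocator and the "ring" a plain array, both
assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; nothing here is kernel code. */
#define REGION_BUFS 8   /* buffers available in the toy devmem region */
#define RING_SLOTS  8   /* descriptors the toy RX ring can hold */

struct devmem_region { size_t next_free; };              /* bump allocator */
struct rx_ring { size_t bufs[RING_SLOTS]; size_t count; };

/* Step 1: carve up to n buffers out of the region; returns how many. */
static size_t devmem_alloc_bufs(struct devmem_region *r, size_t *out, size_t n)
{
    size_t got = 0;

    while (got < n && r->next_free < REGION_BUFS)
        out[got++] = r->next_free++;
    return got;
}

/* Step 2: post RX descriptors for buffers somebody already chose. */
static size_t rx_post_descriptors(struct rx_ring *ring, const size_t *bufs,
                                  size_t n)
{
    size_t posted = 0;

    while (posted < n && ring->count < RING_SLOTS)
        ring->bufs[ring->count++] = bufs[posted++];
    return posted;
}

/* setsockopt()-style entry point: one call triggers both steps. */
static size_t sockopt_refill(struct devmem_region *r, struct rx_ring *ring,
                             size_t n)
{
    size_t bufs[RING_SLOTS];
    size_t room = RING_SLOTS - ring->count;
    size_t got, posted;

    if (n > room)           /* never allocate more than the ring can take */
        n = room;
    got = devmem_alloc_bufs(r, bufs, n);
    posted = rx_post_descriptors(ring, bufs, got);
    assert(posted == got);
    return posted;
}

/* io_uring-style entry point: the caller names the buffers, so only
 * step 2 is needed and the allocator is never touched. */
static size_t uring_post(struct rx_ring *ring, const size_t *bufs, size_t n)
{
    return rx_post_descriptors(ring, bufs, n);
}
```

With this shape, sockopt_refill() and uring_post() share the posting
code, and only the former depends on the allocator.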
-ed
PS: Here's a crazy idea that I haven't thought through at all: what if
you allow device memory to be mmap()ed into process address space
(obviously with none of r/w/x because it's unreachable), so that your
various uAPIs can just operate on pointers (e.g. the setsockopt
becomes the madvise it's named after; recvmsg just uses or populates
the iovec rather than needing a cmsg). Then if future devices have
their memory CXL-accessible, that can potentially be enabled with no
change to the uAPI (userland just starts being able to access the
region without faulting).
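To show what "operate on pointers" could look like from userland, here
is a sketch that uses an anonymous PROT_NONE mapping as a stand-in for
device memory. That stand-in is an assumption: no existing uAPI maps
devmem this way, and the helper names (frag_size, reserve_unreachable,
frag_ptr, frag_release) are made up. Only mmap(), madvise(), munmap()
and sysconf() are real calls.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical fragment size: one page, so fragments stay page-aligned. */
static size_t frag_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/* Reserve address space with none of r/w/x. Anonymous memory stands in
 * for a device-memory mapping here (an assumption of this sketch). */
static void *reserve_unreachable(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* A buffer is named by an ordinary pointer into the reserved range. */
static void *frag_ptr(void *base, size_t index)
{
    assert(base != NULL);
    return (char *)base + index * frag_size();
}

/* Handing a fragment back becomes a plain madvise() on its address
 * range instead of a dedicated setsockopt(). */
static int frag_release(void *frag)
{
    return madvise(frag, frag_size(), MADV_DONTNEED);
}
```

Touching any byte of the range faults today; if the backing memory ever
became CPU-reachable, the same pointers would simply start working.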
And you can maybe add a semantic flag to recvmsg saying "if you don't
use all the buffers in my iovec, keep hold of the rest of them for
future incoming traffic, and if I post new buffers with my next
recvmsg, add those to the tail of the RXQ rather than replacing the
ones you've got". That way you can still have the "userland
directly fills the RX ring" behaviour even with TCP sockets.
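A toy model of that flag's semantics, purely to pin down the difference
between "replace" and "append to tail". The queue, the keep_unused
parameter and all function names are hypothetical; no such recvmsg flag
exists, and real RX queues are not managed like this.

```c
#include <assert.h>
#include <stddef.h>

#define RXQ_CAP 16

/* FIFO of posted buffers: consumed at the head, refilled at the tail. */
struct rxq { void *slot[RXQ_CAP]; size_t head, len; };

/* Add buffers at the tail; returns how many fit. */
static size_t rxq_append(struct rxq *q, void *const *bufs, size_t n)
{
    size_t i = 0;

    while (i < n && q->len < RXQ_CAP) {
        q->slot[(q->head + q->len) % RXQ_CAP] = bufs[i++];
        q->len++;
    }
    assert(q->len <= RXQ_CAP);
    return i;
}

/* Incoming traffic takes the oldest posted buffer, or NULL if none. */
static void *rxq_consume(struct rxq *q)
{
    void *b;

    if (q->len == 0)
        return NULL;
    b = q->slot[q->head];
    q->head = (q->head + 1) % RXQ_CAP;
    q->len--;
    return b;
}

/* recvmsg-like post. Without the (hypothetical) flag each call's iovec
 * replaces whatever was queued; with it, leftovers stay queued and the
 * new buffers go behind them. */
static size_t rx_post(struct rxq *q, void *const *bufs, size_t n,
                      int keep_unused)
{
    if (!keep_unused) {
        q->head = 0;
        q->len = 0;
    }
    return rxq_append(q, bufs, n);
}
```

Posting three buffers, consuming one, then posting a fourth with
keep_unused set leaves the queue holding the second, third and fourth
in that order.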