From: Pavel Begunkov <asml.silence@gmail.com>
To: Mina Almasry <almasrymina@google.com>
Cc: David Wei <dw@davidwei.uk>,
	Stanislav Fomichev <stfomichev@gmail.com>,
	io-uring@vger.kernel.org, netdev@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>, Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	David Ahern <dsahern@kernel.org>
Subject: Re: [PATCH v1 03/15] net: generalise net_iov chunk owners
Date: Sat, 12 Oct 2024 00:12:25 +0100
Message-ID: <f762abbd-5732-456f-97f2-df91b006016e@gmail.com>
In-Reply-To: <CAHS8izM4AVsB5+H4p05D_m-cwO5TqHfn28XfNUM-rDAO5=BTew@mail.gmail.com>

On 10/11/24 23:25, Mina Almasry wrote:
> On Fri, Oct 11, 2024 at 3:02 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 10/11/24 19:44, David Wei wrote:
>>> On 2024-10-09 09:28, Stanislav Fomichev wrote:
>>>> On 10/08, Pavel Begunkov wrote:
>>>>> On 10/8/24 16:46, Stanislav Fomichev wrote:
>>>>>> On 10/07, David Wei wrote:
>>>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>>
>>>>>>> Currently net_iov stores a pointer to struct dmabuf_genpool_chunk_owner,
>>>>>>> which serves as a useful abstraction to share data and provide a
>>>>>>> context. However, it's too devmem specific, and we want to reuse it for
>>>>>>> other memory providers, and for that we need to decouple net_iov from
>>>>>>> devmem. Make net_iov to point to a new base structure called
>>>>>>> net_iov_area, which dmabuf_genpool_chunk_owner extends.
>>>>>>>
>>>>>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>> Signed-off-by: David Wei <dw@davidwei.uk>
>>>>>>> ---
>>>>>>>     include/net/netmem.h | 21 ++++++++++++++++++++-
>>>>>>>     net/core/devmem.c    | 25 +++++++++++++------------
>>>>>>>     net/core/devmem.h    | 25 +++++++++----------------
>>>>>>>     3 files changed, 42 insertions(+), 29 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>>>>> index 8a6e20be4b9d..3795ded30d2c 100644
>>>>>>> --- a/include/net/netmem.h
>>>>>>> +++ b/include/net/netmem.h
>>>>>>> @@ -24,11 +24,20 @@ struct net_iov {
>>>>>>>            unsigned long __unused_padding;
>>>>>>>            unsigned long pp_magic;
>>>>>>>            struct page_pool *pp;
>>>>>>> - struct dmabuf_genpool_chunk_owner *owner;
>>>>>>> + struct net_iov_area *owner;
>>>>>>
>>>>>> Any reason not to use dmabuf_genpool_chunk_owner as is (or rename it
>>>>>> to net_iov_area to generalize) with the fields that you don't need
>>>>>> set to 0/NULL? container_of makes everything harder to follow :-(
>>>>>
>>>>> It can be that, but then io_uring would have a (null) pointer to
>>>>> struct net_devmem_dmabuf_binding it knows nothing about and other
>>>>> fields devmem might add in the future. Also, it reduces the
>>>>> temptation for the common code to make assumptions about the origin
>>>>> of the area / pp memory provider. IOW, I think it's cleaner
>>>>> when separated like in this patch.
>>>>
>>>> Ack, let's see whether other people find any issues with this approach.
>>>> For me, it makes the devmem parts harder to read, so my preference
>>>> is on dropping this patch and keeping owner=null on your side.
>>>
>>> I don't mind at this point which approach to take right now. I would
>>> prefer keeping dmabuf_genpool_chunk_owner today even if it results in a
>>> nullptr in io_uring's case. Once there are more memory providers in the
>>> future, I think it'll be clearer what sort of abstraction we might need
>>> here.
>>
>> That's the thing about abstractions, if we say that devmem is the
>> only first class citizen for net_iov and everything else by definition
>> is 2nd class that should strictly follow devmem TCP patterns, and/or
>> that struct dmabuf_genpool_chunk_owner is an integral part of net_iov
>> and should be reused by everyone, then preserving the current state
>> of the chunk owner is likely the right long term approach. If not, and
>> net_iov is actually a generic piece of infrastructure, then IMHO there
>> is no place for devmem sticking out of every single bit of it, with
>> structures that are devmem specific and can even be left undefined
>> without devmem TCP enabled (fwiw, not an actual problem for
>> compilation, just oddness).
>>
> 
> There is no intention of devmem TCP being a first class citizen or
> anything.

Let me note, to avoid being misread: that kind of prioritisation can
have its place, and that's fine, but it usually happens when you build
on top of older code or when the user base sizes are very different.
And again, dmabuf_genpool_chunk_owner could in theory be common code,
i.e. if you want to use dmabuf you'd use the structure regardless of
the provider of choice, and it would do all the dmabuf handling. But
the current chunk owner goes beyond that, and would need some
splitting if someone tried to build that kind of an abstraction.

> Abstractly speaking, we're going to draw a line in the sand
> and say everything past this line is devmem specific and should be
> replaced by other users. In this patch you drew the line between
> dmabuf_genpool_chunk_owner and net_iov_area, which is fine by me on
> first look. What Stan and I were thinking at first glance is
> preserving dmabuf_* (and renaming) and drawing the line somewhere
> else, which would have also been fine.

True enough, I drew the line where it was convenient: io_uring
needs an extensible abstraction that binds net_iovs, and we'll
also have several different sets of net_iovs, so it fell onto
the object holding the net_iov array as the most natural option.
In that sense, we could have had the binding point to an
allocated io_zcrx_area, which would then point further into
io_uring, but that's one extra indirection.

As a viable alternative, though one I don't like that much:
instead of trying to share struct net_iov_area, we could make
struct net_iov::owner completely provider dependent and turn it
into a void *, and providers would be allowed to store there
whatever they wish.

> My real issue is whether its safe to do all this container_of while
> not always checking explicitly for the type of net_iov. I'm not 100%
> sure checking in tcp.c alone is enough, yet. I need to take a deeper
> look, no changes requested from me yet.

That's how inheritance typically works everywhere in the kernel.
By the time you get into the page pool callbacks in devmem.c, you
know for sure that the net_iovs passed in are devmem's; nobody
should ever take one net_iov's ops and call them with a second
net_iov without care, and the page pool follows that.

When someone wants to operate on devmem net_iovs but doesn't
have a callback, they have to validate the origin, as tcp.c now
does in 5/15. The nice part is that this patch changes types, so
all such places are either explicitly listed in this patch, or
they have to pass through one of the devmem.h helpers, which is
trivially checkable.

> FWIW I'm out for the next couple of weeks. I'll have time to take a
> look during that but not as much as now.
> 

-- 
Pavel Begunkov
