All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: David Wei <dw@davidwei.uk>, Mina Almasry <almasrymina@google.com>
Cc: io-uring@vger.kernel.org, netdev@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>, Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	David Ahern <dsahern@kernel.org>
Subject: Re: [PATCH v1 06/15] net: page_pool: add ->scrub mem provider callback
Date: Mon, 14 Oct 2024 14:37:26 +0100	[thread overview]
Message-ID: <4a33bf2c-49ef-4cfd-97af-8341cdb977d3@gmail.com> (raw)
In-Reply-To: <5d7925ed-91bf-4c78-8b70-598ae9ab3885@davidwei.uk>

On 10/13/24 18:25, David Wei wrote:
> On 2024-10-10 10:54, Mina Almasry wrote:
>> On Wed, Oct 9, 2024 at 2:58 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>>
>>> On 10/9/24 22:00, Mina Almasry wrote:
>>>> On Mon, Oct 7, 2024 at 3:16 PM David Wei <dw@davidwei.uk> wrote:
>>>>>
>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>
>>>>> page pool is now waiting for all ppiovs to return before destroying
>>>>> itself, and for that to happen the memory provider might need to push
>>>>> some buffers, flush caches and so on.
>>>>>
>>>>> todo: we'll try to get by without it before the final release
>>>>>
>>>>
>>>> Is the intention to drop this todo and stick with this patch, or to
>>>> move ahead with this patch?
>>>
>>> Heh, I overlooked this todo. The plan is to actually leave it
>>> as is, it's by far the simplest way and doesn't really gets
>>> into anyone's way as it's a slow path.
>>>
>>>> To be honest, I think I read in a follow up patch that you want to
>>>> unref all the memory on page_pool_destory, which is not how the
>>>> page_pool is used today. Tdoay page_pool_destroy does not reclaim
>>>> memory. Changing that may be OK.
>>>
>>> It doesn't because it can't (not breaking anything), which is a
>>> problem as the page pool might never get destroyed. io_uring
>>> doesn't change that, a buffer can't be reclaimed while anything
>>> in the kernel stack holds it. It's only when it's given to the
>>> user we can force it back out of there.
> 
> The page pool will definitely be destroyed, the call to
> netdev_rx_queue_restart() with mp_ops/mp_priv set to null and netdev
> core will ensure that.
> 
>>>
>>> And it has to happen one way or another, we can't trust the
>>> user to put buffers back, it's just devmem does that by temporarily
>>> attaching the lifetime of such buffers to a socket.
>>>
>>
>> (noob question) does io_uring not have a socket equivalent that you
>> can tie the lifetime of the buffers to? I'm thinking there must be
>> one, because in your patches IIRC you have the fill queues and the
>> memory you bind from the userspace, there should be something that
>> tells you that the userspace has exited/crashed and it's time to now
>> destroy the fill queue and unbind the memory, right?
>>
>> I'm thinking you may want to bind the lifetime of the buffers to that,
>> instead of the lifetime of the pool. The pool will not be destroyed
>> until the next driver/reset reconfiguration happens, right? That could
>> be long long after the userspace has stopped using the memory.
>>
> 
> Yes, there are io_uring objects e.g. interface queue that hold
> everything together. IIRC page pool destroy doesn't unref but it waits
> for all pages that are handed out to skbs to be returned. So for us,
> below might work:
> 
> 1. Call netdev_rx_queue_restart() which allocates a new pp for the rx
>     queue and tries to free the old pp
> 2. At this point we're guaranteed that any packets hitting this rx queue
>     will not go to user pages from our memory provider
> 3. Assume userspace is gone (either crash or gracefully terminating),
>     unref the uref for all pages, same as what scrub() is doing today
> 4. Any pages that are still in skb frags will get freed when the sockets
>     etc are closed
> 5. Rely on the pp delay release to eventually terminate and clean up
> 
> Let me know what you think Pavel.

I'll get to this comment a bit later when I get some time to
remember what races we have to deal with without the callback.

-- 
Pavel Begunkov

  reply	other threads:[~2024-10-14 13:36 UTC|newest]

Thread overview: 126+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-07 22:15 [PATCH v1 00/15] io_uring zero copy rx David Wei
2024-10-07 22:15 ` [PATCH v1 01/15] net: devmem: pull struct definitions out of ifdef David Wei
2024-10-09 20:17   ` Mina Almasry
2024-10-09 23:16     ` Pavel Begunkov
2024-10-10 18:01       ` Mina Almasry
2024-10-10 18:57         ` Pavel Begunkov
2024-10-13 22:38           ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 02/15] net: prefix devmem specific helpers David Wei
2024-10-09 20:19   ` Mina Almasry
2024-10-07 22:15 ` [PATCH v1 03/15] net: generalise net_iov chunk owners David Wei
2024-10-08 15:46   ` Stanislav Fomichev
2024-10-08 16:34     ` Pavel Begunkov
2024-10-09 16:28       ` Stanislav Fomichev
2024-10-11 18:44         ` David Wei
2024-10-11 22:02           ` Pavel Begunkov
2024-10-11 22:25             ` Mina Almasry
2024-10-11 23:12               ` Pavel Begunkov
2024-10-09 20:44   ` Mina Almasry
2024-10-09 22:13     ` Pavel Begunkov
2024-10-09 22:19     ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 04/15] net: page_pool: create hooks for custom page providers David Wei
2024-10-09 20:49   ` Mina Almasry
2024-10-09 22:02     ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 05/15] net: prepare for non devmem TCP memory providers David Wei
2024-10-09 20:56   ` Mina Almasry
2024-10-09 21:45     ` Pavel Begunkov
2024-10-13 22:33       ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 06/15] net: page_pool: add ->scrub mem provider callback David Wei
2024-10-09 21:00   ` Mina Almasry
2024-10-09 21:59     ` Pavel Begunkov
2024-10-10 17:54       ` Mina Almasry
2024-10-13 17:25         ` David Wei
2024-10-14 13:37           ` Pavel Begunkov [this message]
2024-10-14 22:58           ` Mina Almasry
2024-10-16 17:42             ` Pavel Begunkov
2024-11-01 17:18               ` Mina Almasry
2024-11-01 18:35                 ` Pavel Begunkov
2024-11-01 19:24                   ` Mina Almasry
2024-11-01 21:38                     ` Pavel Begunkov
2024-11-04 20:42                       ` Mina Almasry
2024-11-04 21:27                         ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 07/15] net: page pool: add helper creating area from pages David Wei
2024-10-09 21:11   ` Mina Almasry
2024-10-09 21:34     ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 08/15] net: add helper executing custom callback from napi David Wei
2024-10-08 22:25   ` Joe Damato
2024-10-09 15:09     ` Pavel Begunkov
2024-10-09 16:13       ` Joe Damato
2024-10-09 19:12         ` Pavel Begunkov
2024-10-07 22:15 ` [PATCH v1 09/15] io_uring/zcrx: add interface queue and refill queue David Wei
2024-10-09 17:50   ` Jens Axboe
2024-10-09 18:09     ` Jens Axboe
2024-10-09 19:08     ` Pavel Begunkov
2024-10-11 22:11     ` Pavel Begunkov
2024-10-13 17:32     ` David Wei
2024-10-07 22:15 ` [PATCH v1 10/15] io_uring/zcrx: add io_zcrx_area David Wei
2024-10-09 18:02   ` Jens Axboe
2024-10-09 19:05     ` Pavel Begunkov
2024-10-09 19:06       ` Jens Axboe
2024-10-09 21:29   ` Mina Almasry
2024-10-07 22:15 ` [PATCH v1 11/15] io_uring/zcrx: implement zerocopy receive pp memory provider David Wei
2024-10-09 18:10   ` Jens Axboe
2024-10-09 22:01   ` Mina Almasry
2024-10-09 22:58     ` Pavel Begunkov
2024-10-10 18:19       ` Mina Almasry
2024-10-10 20:26         ` Pavel Begunkov
2024-10-10 20:53           ` Mina Almasry
2024-10-10 20:58             ` Mina Almasry
2024-10-10 21:22             ` Pavel Begunkov
2024-10-11  0:32               ` Mina Almasry
2024-10-11  1:49                 ` Pavel Begunkov
2024-10-07 22:16 ` [PATCH v1 12/15] io_uring/zcrx: add io_recvzc request David Wei
2024-10-09 18:28   ` Jens Axboe
2024-10-09 18:51     ` Pavel Begunkov
2024-10-09 19:01       ` Jens Axboe
2024-10-09 19:27         ` Pavel Begunkov
2024-10-09 19:42           ` Jens Axboe
2024-10-09 19:47             ` Pavel Begunkov
2024-10-09 19:50               ` Jens Axboe
2024-10-07 22:16 ` [PATCH v1 13/15] io_uring/zcrx: add copy fallback David Wei
2024-10-08 15:58   ` Stanislav Fomichev
2024-10-08 16:39     ` Pavel Begunkov
2024-10-08 16:40     ` David Wei
2024-10-09 16:30       ` Stanislav Fomichev
2024-10-09 23:05         ` Pavel Begunkov
2024-10-11  6:22           ` David Wei
2024-10-11 14:43             ` Stanislav Fomichev
2024-10-09 18:38   ` Jens Axboe
2024-10-07 22:16 ` [PATCH v1 14/15] io_uring/zcrx: set pp memory provider for an rx queue David Wei
2024-10-09 18:42   ` Jens Axboe
2024-10-10 13:09     ` Pavel Begunkov
2024-10-10 13:19       ` Jens Axboe
2024-10-07 22:16 ` [PATCH v1 15/15] io_uring/zcrx: throttle receive requests David Wei
2024-10-09 18:43   ` Jens Axboe
2024-10-07 22:20 ` [PATCH v1 00/15] io_uring zero copy rx David Wei
2024-10-08 23:10 ` Joe Damato
2024-10-09 15:07   ` Pavel Begunkov
2024-10-09 16:10     ` Joe Damato
2024-10-09 16:12     ` Jens Axboe
2024-10-11  6:15       ` David Wei
2024-10-09 15:27 ` Jens Axboe
2024-10-09 15:38   ` David Ahern
2024-10-09 15:43     ` Jens Axboe
2024-10-09 15:49       ` Pavel Begunkov
2024-10-09 15:50         ` Jens Axboe
2024-10-09 16:35       ` David Ahern
2024-10-09 16:50         ` Jens Axboe
2024-10-09 16:53           ` Jens Axboe
2024-10-09 17:12             ` Jens Axboe
2024-10-10 14:21               ` Jens Axboe
2024-10-10 15:03                 ` David Ahern
2024-10-10 15:15                   ` Jens Axboe
2024-10-10 18:11                 ` Jens Axboe
2024-10-14  8:42                   ` David Laight
2024-10-09 16:55 ` Mina Almasry
2024-10-09 16:57   ` Jens Axboe
2024-10-09 19:32     ` Mina Almasry
2024-10-09 19:43       ` Pavel Begunkov
2024-10-09 19:47       ` Jens Axboe
2024-10-09 17:19   ` David Ahern
2024-10-09 18:21   ` Pedro Tammela
2024-10-10 13:19     ` Pavel Begunkov
2024-10-11  0:35     ` David Wei
2024-10-11 14:28       ` Pedro Tammela
2024-10-11  0:29   ` David Wei
2024-10-11 19:43     ` Mina Almasry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4a33bf2c-49ef-4cfd-97af-8341cdb977d3@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=almasrymina@google.com \
    --cc=axboe@kernel.dk \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=dw@davidwei.uk \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=io-uring@vger.kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.