From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Mina Almasry <almasrymina@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
pabeni@redhat.com, andrew+netdev@lunn.ch, horms@kernel.org,
hawk@kernel.org, ilias.apalodimas@linaro.org,
asml.silence@gmail.com, kaiyuanz@google.com, willemb@google.com,
mkarsten@uwaterloo.ca, jdamato@fastly.com
Subject: Re: [PATCH net] net: page_pool: don't try to stash the napi id
Date: Mon, 27 Jan 2025 14:31:10 +0100 [thread overview]
Message-ID: <877c6gpen5.fsf@toke.dk> (raw)
In-Reply-To: <CAHS8izOv=tUiuzha6NFq1-ZurLGz9Jdi78jb3ey4ExVJirMprA@mail.gmail.com>
Mina Almasry <almasrymina@google.com> writes:
> On Fri, Jan 24, 2025 at 2:18 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Mina Almasry <almasrymina@google.com> writes:
>>
>> > On Thu, Jan 23, 2025 at 3:16 PM Jakub Kicinski <kuba@kernel.org> wrote:
>> >>
>> >> Page ppol tried to cache the NAPI ID in page pool info to avoid
>> >
>> > Page pool
>> >
>> >> having a dependency on the life cycle of the NAPI instance.
>> >> Since commit under Fixes the NAPI ID is not populated until
>> >> napi_enable() and there's a good chance that page pool is
>> >> created before NAPI gets enabled.
>> >>
>> >> Protect the NAPI pointer with the existing page pool mutex,
>> >> the reading path already holds it. napi_id itself we need
>> >
>> > The reading paths in page_pool.c don't hold the lock, no? Only the
>> > reading paths in page_pool_user.c seem to do.
>> >
>> > I could not immediately wrap my head around why pool->p.napi can be
>> > accessed in page_pool_napi_local with no lock, but needs to be
>> > protected in the code in page_pool_user.c. It seems
>> > READ_ONCE/WRITE_ONCE protection is good enough to make sure
>> > page_pool_napi_local doesn't race with
>> > page_pool_disable_direct_recycling in a way that can crash (the
>> > reading code either sees a valid pointer or NULL). Why is that not
>> > good enough to also synchronize the accesses between
>> > page_pool_disable_direct_recycling and page_pool_nl_fill? I.e., drop
>> > the locking?
>>
>> It actually seems that this is *not* currently the case. See the
>> discussion here:
>>
>> https://lore.kernel.org/all/8734h8qgmz.fsf@toke.dk/
>>
>> IMO (as indicated in the message linked above), we should require users
>> to destroy the page pool before freeing the NAPI memory, rather than add
>> additional synchronisation.
>>
>
> Ah, I see. I wonder if we should make this part of the API via comment
> and/or add DEBUG_NET_WARN_ON to catch misuse, something like:
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index ed4cd114180a..3919ca302e95 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -257,6 +257,10 @@ struct xdp_mem_info;
>
> #ifdef CONFIG_PAGE_POOL
> void page_pool_disable_direct_recycling(struct page_pool *pool);
> +
> +/* page_pool_destroy or page_pool_disable_direct_recycling must be
> called before
> + * netif_napi_del if pool->p.napi is set.
> + */
> void page_pool_destroy(struct page_pool *pool);
> void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
> const struct xdp_mem_info *mem);
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 5c4b788b811b..dc82767b2516 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -1161,6 +1161,8 @@ void page_pool_destroy(struct page_pool *pool)
> if (!page_pool_put(pool))
> return;
>
> + DEBUG_NET_WARN_ON(pool->p.napi && !napi_is_valid(pool->p.napi));
> +
> page_pool_disable_direct_recycling(pool);
> page_pool_free_frag(pool);
Yeah, good idea; care to send a proper patch? :)
> I also took a quick spot check - which could be wrong - but it seems
> to me both gve and bnxt free the napi before destroying the pool :(
Right, that fits with what Yunsheng found over in that other thread, at
least (for bnxt).
> But I think this entire discussion is unrelated to this patch, so and
> the mutex sync in this patch seems necessary for the page_pool_user.c
> code which runs outside of softirq context:
>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Yeah, didn't really mean this comment to have anything to do with this
patch, just mentioned it since you were talking about the data path :)
For this patch, I agree, the mutex seems fine:
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
next prev parent reply other threads:[~2025-01-27 13:31 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-23 23:16 [PATCH net] net: page_pool: don't try to stash the napi id Jakub Kicinski
2025-01-24 21:00 ` Mina Almasry
2025-01-24 22:18 ` Toke Høiland-Jørgensen
2025-01-24 23:49 ` Mina Almasry
2025-01-27 13:31 ` Toke Høiland-Jørgensen [this message]
2025-01-27 19:37 ` Jakub Kicinski
2025-01-27 19:41 ` Mina Almasry
2025-01-25 1:31 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877c6gpen5.fsf@toke.dk \
--to=toke@redhat.com \
--cc=almasrymina@google.com \
--cc=andrew+netdev@lunn.ch \
--cc=asml.silence@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=horms@kernel.org \
--cc=ilias.apalodimas@linaro.org \
--cc=jdamato@fastly.com \
--cc=kaiyuanz@google.com \
--cc=kuba@kernel.org \
--cc=mkarsten@uwaterloo.ca \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.