From: Paolo Abeni <pabeni@redhat.com>
To: Mina Almasry <almasrymina@google.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
linaro-mm-sig@lists.linaro.org
Cc: "Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
"Jeroen de Borst" <jeroendb@google.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Arnd Bergmann" <arnd@arndb.de>,
"Christian König" <christian.koenig@amd.com>,
"David Ahern" <dsahern@kernel.org>,
"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Eric Dumazet" <edumazet@google.com>,
"Shakeel Butt" <shakeelb@google.com>,
"Praveen Kaligineedi" <pkaligineedi@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Shuah Khan" <shuah@kernel.org>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [RFC PATCH v3 07/12] page-pool: device memory support
Date: Thu, 09 Nov 2023 10:01:45 +0100
Message-ID: <429f6303c9c61d50ba6c5fdddec30c22dc0b2c09.camel@redhat.com>
In-Reply-To: <20231106024413.2801438-8-almasrymina@google.com>
On Sun, 2023-11-05 at 18:44 -0800, Mina Almasry wrote:
> Overload the LSB of struct page* to indicate that it's a page_pool_iov.
>
> Refactor mm calls on struct page* into helpers, and add page_pool_iov
> handling on those helpers. Modify callers of these mm APIs with calls to
> these helpers instead.
>
> In areas where struct page* is dereferenced, add a check for special
> handling of page_pool_iov.
>
> Signed-off-by: Mina Almasry <almasrymina@google.com>
>
> ---
> include/net/page_pool/helpers.h | 74 ++++++++++++++++++++++++++++++++-
> net/core/page_pool.c | 63 ++++++++++++++++++++--------
> 2 files changed, 118 insertions(+), 19 deletions(-)
>
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index b93243c2a640..08f1a2cc70d2 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -151,6 +151,64 @@ static inline struct page_pool_iov *page_to_page_pool_iov(struct page *page)
> return NULL;
> }
>
> +static inline int page_pool_page_ref_count(struct page *page)
> +{
> + if (page_is_page_pool_iov(page))
> + return page_pool_iov_refcount(page_to_page_pool_iov(page));
> +
> + return page_ref_count(page);
> +}
> +
> +static inline void page_pool_page_get_many(struct page *page,
> + unsigned int count)
> +{
> + if (page_is_page_pool_iov(page))
> + return page_pool_iov_get_many(page_to_page_pool_iov(page),
> + count);
> +
> + return page_ref_add(page, count);
> +}
> +
> +static inline void page_pool_page_put_many(struct page *page,
> + unsigned int count)
> +{
> + if (page_is_page_pool_iov(page))
> + return page_pool_iov_put_many(page_to_page_pool_iov(page),
> + count);
> +
> + if (count > 1)
> + page_ref_sub(page, count - 1);
> +
> + put_page(page);
> +}
> +
> +static inline bool page_pool_page_is_pfmemalloc(struct page *page)
> +{
> + if (page_is_page_pool_iov(page))
> + return false;
> +
> + return page_is_pfmemalloc(page);
> +}
> +
> +static inline bool page_pool_page_is_pref_nid(struct page *page, int pref_nid)
> +{
> + /* Assume page_pool_iov are on the preferred node without actually
> + * checking...
> + *
> + * This check is only used to check for recycling memory in the page
> + * pool's fast paths. Currently the only implementation of page_pool_iov
> + * is dmabuf device memory. It's a deliberate decision by the user to
> + * bind a certain dmabuf to a certain netdev, and the netdev rx queue
> + * would not be able to reallocate memory from another dmabuf that
> + * exists on the preferred node, so, this check doesn't make much sense
> + * in this case. Assume all page_pool_iovs can be recycled for now.
> + */
> + if (page_is_page_pool_iov(page))
> + return true;
> +
> + return page_to_nid(page) == pref_nid;
> +}
> +
> /**
> * page_pool_dev_alloc_pages() - allocate a page.
> * @pool: pool from which to allocate
> @@ -301,6 +359,9 @@ static inline long page_pool_defrag_page(struct page *page, long nr)
> {
> long ret;
>
> + if (page_is_page_pool_iov(page))
> + return -EINVAL;
> +
> /* If nr == pp_frag_count then we have cleared all remaining
> * references to the page:
> * 1. 'n == 1': no need to actually overwrite it.
> @@ -431,7 +492,12 @@ static inline void page_pool_free_va(struct page_pool *pool, void *va,
> */
> static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
> {
> - dma_addr_t ret = page->dma_addr;
> + dma_addr_t ret;
> +
> + if (page_is_page_pool_iov(page))
> + return page_pool_iov_dma_addr(page_to_page_pool_iov(page));
Should the above conditional be guarded by the page_pool_mem_providers
static key? This looks like fast-path code. Same question for the refcount
helper above.

Minor nit: possibly cache 'page_is_page_pool_iov(page)' to make the
code more readable.
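
A minimal userspace sketch of such a guarded check, assuming a plain
bool as a stand-in for the page_pool_mem_providers static key (in the
kernel this would presumably be a static-branch test instead). The
struct layouts and the iov_to_tagged_page() helper are hypothetical and
exist only to make the example self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Illustrative stand-ins for the kernel types; real layouts differ. */
struct page { dma_addr_t dma_addr; };
struct page_pool_iov { dma_addr_t dma_addr; };

#define PP_IOV_TAG ((uintptr_t)1)

/* Assumption: plain bool standing in for the page_pool_mem_providers key. */
static bool mem_providers_enabled;

/* Hypothetical helper: tag an iov pointer so it can travel as struct page *. */
static inline struct page *iov_to_tagged_page(struct page_pool_iov *iov)
{
	/* The LSB must be free, i.e. the object is at least 2-byte aligned. */
	assert(((uintptr_t)iov & PP_IOV_TAG) == 0);
	return (struct page *)((uintptr_t)iov | PP_IOV_TAG);
}

static inline bool page_is_page_pool_iov(const struct page *page)
{
	/* When no provider is active, skip the tag test entirely. */
	return mem_providers_enabled && ((uintptr_t)page & PP_IOV_TAG);
}

static inline struct page_pool_iov *page_to_page_pool_iov(struct page *page)
{
	return (struct page_pool_iov *)((uintptr_t)page & ~PP_IOV_TAG);
}

static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
{
	/* Evaluate the check once and reuse the result. */
	const bool is_iov = page_is_page_pool_iov(page);

	if (is_iov)
		return page_to_page_pool_iov(page)->dma_addr;

	return page->dma_addr;
}
```

With the flag off, the only cost on the regular page path is one
predictable test, which is the property a static key would give without
the load.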
> +
> + ret = page->dma_addr;
>
> if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA)
> ret <<= PAGE_SHIFT;
> @@ -441,6 +507,12 @@ static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
>
> static inline bool page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
> {
> + /* page_pool_iovs are mapped and their dma-addr can't be modified. */
> + if (page_is_page_pool_iov(page)) {
> + DEBUG_NET_WARN_ON_ONCE(true);
> + return false;
> + }
Quickly skimming over the page_pool code, it looks like
page_pool_set_dma_addr() usage is guarded by the PP_FLAG_DMA_MAP page
pool flag. Could the device mem provider enforce that this flag is
cleared on the page pool?
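
One way this could look, as a rough userspace sketch: reject the
DMA-related flags once, at pool setup, when a device memory provider is
attached, so the per-page helpers never need the check. struct
pool_params, its has_devmem_provider field and pool_check_devmem_flags()
are hypothetical names, and the flag values are mirrored here only for
illustration:

```c
#include <errno.h>
#include <stdbool.h>

/* Flag bits mirrored for illustration; the kernel headers are authoritative. */
#define PP_FLAG_DMA_MAP      (1U << 0)
#define PP_FLAG_DMA_SYNC_DEV (1U << 1)

/* Hypothetical, trimmed-down view of the pool creation parameters. */
struct pool_params {
	unsigned int flags;
	bool has_devmem_provider;
};

/*
 * Hypothetical init-time check: a device memory provider hands out
 * already-mapped memory, so the pool must neither map nor sync it.
 * Returns 0 if the combination is acceptable, -EINVAL otherwise.
 */
static int pool_check_devmem_flags(const struct pool_params *p)
{
	if (!p->has_devmem_provider)
		return 0;

	if (p->flags & (PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV))
		return -EINVAL;

	return 0;
}
```

If the check ran at pool creation, the DEBUG_NET_WARN_ON_ONCE() branches
in the DMA map/sync/set helpers would become unreachable for
page_pool_iov and could be dropped.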
> +
> if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA) {
> page->dma_addr = addr >> PAGE_SHIFT;
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 138ddea0b28f..d211996d423b 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -317,7 +317,7 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
> if (unlikely(!page))
> break;
>
> - if (likely(page_to_nid(page) == pref_nid)) {
> + if (likely(page_pool_page_is_pref_nid(page, pref_nid))) {
> pool->alloc.cache[pool->alloc.count++] = page;
> } else {
> /* NUMA mismatch;
> @@ -362,7 +362,15 @@ static void page_pool_dma_sync_for_device(struct page_pool *pool,
> struct page *page,
> unsigned int dma_sync_size)
> {
> - dma_addr_t dma_addr = page_pool_get_dma_addr(page);
> + dma_addr_t dma_addr;
> +
> + /* page_pool_iov memory provider do not support PP_FLAG_DMA_SYNC_DEV */
> + if (page_is_page_pool_iov(page)) {
> + DEBUG_NET_WARN_ON_ONCE(true);
> + return;
> + }
Similar to the above point, mutatis mutandis.
> +
> + dma_addr = page_pool_get_dma_addr(page);
>
> dma_sync_size = min(dma_sync_size, pool->p.max_len);
> dma_sync_single_range_for_device(pool->p.dev, dma_addr,
> @@ -374,6 +382,12 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
> {
> dma_addr_t dma;
>
> + if (page_is_page_pool_iov(page)) {
> + /* page_pool_iovs are already mapped */
> + DEBUG_NET_WARN_ON_ONCE(true);
> + return true;
> + }
Ditto.
Cheers,
Paolo