All of lore.kernel.org
 help / color / mirror / Atom feed
From: Byungchul Park <byungchul@sk.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>,
	David Hildenbrand <david@redhat.com>,
	"willy@infradead.org" <willy@infradead.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, kernel_team@skhynix.com, kuba@kernel.org,
	ilias.apalodimas@linaro.org, harry.yoo@oracle.com,
	hawk@kernel.org, akpm@linux-foundation.org, davem@davemloft.net,
	john.fastabend@gmail.com, andrew+netdev@lunn.ch, toke@redhat.com,
	tariqt@nvidia.com, edumazet@google.com, pabeni@redhat.com,
	saeedm@nvidia.com, leon@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, horms@kernel.org,
	linux-rdma@vger.kernel.org, bpf@vger.kernel.org,
	vishal.moola@gmail.com, hannes@cmpxchg.org, ziy@nvidia.com,
	jackmanb@google.com
Subject: Re: [PATCH net-next v9 3/8] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
Date: Thu, 17 Jul 2025 12:08:58 +0900	[thread overview]
Message-ID: <20250717030858.GA26168@system.software.com> (raw)
In-Reply-To: <582f41c0-2742-4400-9c81-0d46bf4e8314@gmail.com>

On Sat, Jul 12, 2025 at 02:58:14PM +0100, Pavel Begunkov wrote:
> On 7/11/25 02:14, Byungchul Park wrote:
> ...>>> +#ifdef CONFIG_PAGE_POOL
> > > > +/* XXX: This would better be moved to mm, once mm gets its way to
> > > > + * identify the type of page for page pool.
> > > > + */
> > > > +static inline bool page_pool_page_is_pp(struct page *page)
> > > > +{
> > > > +       struct netmem_desc *desc = page_to_nmdesc(page);
> > > > +
> > > > +       return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > > > +}
> > > 
> > > pages can be pp pages (where they have pp fields inside of them) or
> > > non-pp pages (where they don't have pp fields inside them, because
> > > they were never allocated from the page_pool).
> > > 
> > > Casting a page to a netmem_desc, and then checking if the page was a
> > > pp page doesn't makes sense to me on a fundamental level. The
> > > netmem_desc is only valid if the page was a pp page in the first
> > > place. Maybe page_to_nmdesc should reject the cast if the page is not
> > > a pp page or something.
> > 
> > Right, as you already know, the current mainline code already has the
> > same problem but we've been using the werid way so far, in other words,
> > mm code is checking if it's a pp page or not by using ->pp_magic, but
> > it's ->lur, ->buddy_list, or ->pcp_list if it's not a pp page.
> > 
> > Both the mainline code and this patch can make sense *only if* it's
> > actually a pp page.  It's unevitable until mm provides a way to identify
> > the type of page for page pool.  Thoughts?
> Question to mm folks, can we add a new PGTY for page pool and use
> that to filter page pool originated pages? Like in the incomplete
> and untested diff below?
> 
> 
> commit 8fc2347fb3ff4a3fc7929c70a5a21e1128935d4a
> Author: Pavel Begunkov <asml.silence@gmail.com>
> Date:   Sat Jul 12 14:29:52 2025 +0100
> 
>     net/mm: use PGTY for tracking page pool pages
> 
>     Currently, we use page->pp_magic to determine whether a page belongs to
>     a page pool. It's not ideal as the field is aliased with other page
>     types, and thus needs to to rely on elaborated rules to work. Add a new
>     page type for page pool.

Hi Pavel,

I need this work to be done to remove ->pp_magic in struct page.  Will
you let me work on this work?  Or can you please refine and post this
work?  If you let me, I will go for this as a separate patch from this
series.

	Byungchul

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0ef2ba0c667a..975a013f1f17 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4175,7 +4175,7 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>  #ifdef CONFIG_PAGE_POOL
>  static inline bool page_pool_page_is_pp(struct page *page)
>  {
> -       return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> +       return PageNetpp(page);
>  }
>  #else
>  static inline bool page_pool_page_is_pp(struct page *page)
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 4fe5ee67535b..9bd1dfded2fc 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -957,6 +957,7 @@ enum pagetype {
>        PGTY_zsmalloc           = 0xf6,
>        PGTY_unaccepted         = 0xf7,
>        PGTY_large_kmalloc      = 0xf8,
> +       PGTY_netpp              = 0xf9,
> 
>        PGTY_mapcount_underflow = 0xff
>  };
> @@ -1101,6 +1102,11 @@ PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
>  PAGE_TYPE_OPS(Unaccepted, unaccepted, unaccepted)
>  FOLIO_TYPE_OPS(large_kmalloc, large_kmalloc)
> 
> +/*
> + * Marks page_pool allocated pages
> + */
> +PAGE_TYPE_OPS(Netpp, netpp, netpp)
> +
>  /**
>   * PageHuge - Determine if the page belongs to hugetlbfs
>   * @page: The page to test.
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index de1d95f04076..20f5dbb08149 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -113,6 +113,8 @@ static inline bool netmem_is_net_iov(const netmem_ref netmem)
>   */
>  static inline struct page *__netmem_to_page(netmem_ref netmem)
>  {
> +       DEBUG_NET_WARN_ON_ONCE(netmem_is_net_iov(netmem));
> +
>        return (__force struct page *)netmem;
>  }
> 
> diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
> index cd95394399b4..e38c64da1a78 100644
> --- a/net/core/netmem_priv.h
> +++ b/net/core/netmem_priv.h
> @@ -13,16 +13,11 @@ static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
>        __netmem_clear_lsb(netmem)->pp_magic |= pp_magic;
>  }
> 
> -static inline void netmem_clear_pp_magic(netmem_ref netmem)
> -{
> -       WARN_ON_ONCE(__netmem_clear_lsb(netmem)->pp_magic & PP_DMA_INDEX_MASK);
> -
> -       __netmem_clear_lsb(netmem)->pp_magic = 0;
> -}
> -
>  static inline bool netmem_is_pp(netmem_ref netmem)
>  {
> -       return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
> +       if (netmem_is_net_iov(netmem))
> +               return true;
> +       return page_pool_page_is_pp(netmem_to_page(netmem));
>  }
> 
>  static inline void netmem_set_pp(netmem_ref netmem, struct page_pool *pool)
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 05e2e22a8f7c..52120e2912a6 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -371,6 +371,13 @@ struct page_pool *page_pool_create(const struct page_pool_params *params)
>  }
>  EXPORT_SYMBOL(page_pool_create);
> 
> +static void page_pool_set_page_pp_info(struct page_pool *pool,
> +                                      struct page *page)
> +{
> +       __SetPageNetpp(page);
> +       page_pool_set_pp_info(page_to_netmem(page));
> +}
> +
>  static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem);
> 
>  static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
> @@ -534,7 +541,7 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
>        }
> 
>        alloc_stat_inc(pool, slow_high_order);
> -       page_pool_set_pp_info(pool, page_to_netmem(page));
> +       page_pool_set_page_pp_info(pool, page);
> 
>        /* Track how many pages are held 'in-flight' */
>        pool->pages_state_hold_cnt++;
> @@ -579,7 +586,7 @@ static noinline netmem_ref __page_pool_alloc_netmems_slow(struct page_pool *pool
>                        continue;
>                }
> 
> -               page_pool_set_pp_info(pool, netmem);
> +               page_pool_set_page_pp_info(pool, __netmem_to_page(netmem));
>                pool->alloc.cache[pool->alloc.count++] = netmem;
>                /* Track how many pages are held 'in-flight' */
>                pool->pages_state_hold_cnt++;
> @@ -654,7 +661,6 @@ s32 page_pool_inflight(const struct page_pool *pool, bool strict)
>  void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
>  {
>        netmem_set_pp(netmem, pool);
> -       netmem_or_pp_magic(netmem, PP_SIGNATURE);
> 
>        /* Ensuring all pages have been split into one fragment initially:
>         * page_pool_set_pp_info() is only called once for every page when it
> @@ -669,7 +675,6 @@ void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
> 
>  void page_pool_clear_pp_info(netmem_ref netmem)
>  {
> -       netmem_clear_pp_magic(netmem);
>        netmem_set_pp(netmem, NULL);
>  }
> 
> @@ -730,8 +735,11 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
>        trace_page_pool_state_release(pool, netmem, count);
> 
>        if (put) {
> +               struct page *page = netmem_to_page(netmem);
> +
>                page_pool_clear_pp_info(netmem);
> -               put_page(netmem_to_page(netmem));
> +               __ClearPageNetpp(page);
> +               put_page(page);
>        }
>        /* An optimization would be to call __free_pages(page, pool->p.order)
>         * knowing page is not part of page-cache (thus avoiding a
> 
> --
> Pavel Begunkov

  parent reply	other threads:[~2025-07-17  3:09 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-10  8:27 [PATCH net-next v9 0/8] Split netmem from struct page Byungchul Park
2025-07-10  8:28 ` [PATCH net-next v9 1/8] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-07-12 14:39   ` Pavel Begunkov
2025-07-14  4:23     ` Byungchul Park
2025-07-14 11:30       ` Pavel Begunkov
2025-07-14 11:58         ` Byungchul Park
2025-07-14 19:17         ` Mina Almasry
2025-07-15 10:01           ` Pavel Begunkov
2025-07-10  8:28 ` [PATCH net-next v9 2/8] netmem: introduce utility APIs to use struct netmem_desc Byungchul Park
2025-07-10 18:11   ` Mina Almasry
2025-07-11  1:02     ` Byungchul Park
2025-07-12 12:16       ` Pavel Begunkov
2025-07-12 12:05     ` Pavel Begunkov
2025-07-12 11:59   ` Pavel Begunkov
2025-07-13 23:07     ` Byungchul Park
2025-07-13 23:39       ` Byungchul Park
2025-07-14  9:43       ` Pavel Begunkov
2025-07-14 10:05         ` Byungchul Park
2025-07-14 11:45           ` Pavel Begunkov
2025-07-14 12:06             ` Byungchul Park
2025-07-10  8:28 ` [PATCH net-next v9 3/8] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
2025-07-10 18:19   ` Mina Almasry
2025-07-11  1:14     ` Byungchul Park
2025-07-12 13:58       ` Pavel Begunkov
2025-07-12 14:52         ` David Hildenbrand
2025-07-12 15:09           ` Pavel Begunkov
2025-07-13 23:22             ` Byungchul Park
2025-07-17  3:08         ` Byungchul Park [this message]
2025-07-22  1:23           ` Byungchul Park
2025-07-28 18:19             ` Pavel Begunkov
2025-07-14 19:09       ` Mina Almasry
2025-07-15  9:53         ` Pavel Begunkov
2025-07-10  8:28 ` [PATCH net-next v9 4/8] netmem: use netmem_desc instead of page to access ->pp in __netmem_get_pp() Byungchul Park
2025-07-10 18:25   ` Mina Almasry
2025-07-11  1:17     ` Byungchul Park
2025-07-10  8:28 ` [PATCH net-next v9 5/8] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-07-10 18:26   ` Mina Almasry
2025-07-10  8:28 ` [PATCH net-next v9 6/8] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
2025-07-10 18:29   ` Mina Almasry
2025-07-11  1:32     ` Byungchul Park
2025-07-14 19:02       ` Mina Almasry
2025-07-10  8:28 ` [PATCH net-next v9 7/8] netdevsim: " Byungchul Park
2025-07-10 18:26   ` Mina Almasry
2025-07-10  8:28 ` [PATCH net-next v9 8/8] mt76: " Byungchul Park
2025-07-12 14:22   ` Pavel Begunkov
2025-07-14  2:13     ` Byungchul Park
2025-07-10  8:47 ` [PATCH net-next v9 0/8] Split netmem from struct page Byungchul Park
2025-07-10 18:35 ` Mina Almasry
2025-07-11  0:42   ` Byungchul Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250717030858.GA26168@system.software.com \
    --to=byungchul@sk.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=david@redhat.com \
    --cc=edumazet@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=harry.yoo@oracle.com \
    --cc=hawk@kernel.org \
    --cc=horms@kernel.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=jackmanb@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=kernel_team@skhynix.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rppt@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=surenb@google.com \
    --cc=tariqt@nvidia.com \
    --cc=toke@redhat.com \
    --cc=vbabka@suse.cz \
    --cc=vishal.moola@gmail.com \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.