public inbox for linux-mm@kvack.org
From: Bjorn Helgaas <helgaas@kernel.org>
To: Gorbunov Ivan <gorbunov.ivan@h-partners.com>
Cc: david@kernel.org, Liam.Howlett@oracle.com,
	akpm@linux-foundation.org, apopple@nvidia.com,
	baolin.wang@linux.alibaba.com, gladyshev.ilya1@h-partners.com,
	harry.yoo@oracle.com, kirill@shutemov.name,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	lorenzo.stoakes@oracle.com, mhocko@suse.com,
	muchun.song@linux.dev, rppt@kernel.org, surenb@google.com,
	torvalds@linuxfoundation.org, vbabka@suse.cz,
	willy@infradead.org, yuzhao@google.com, ziy@nvidia.com,
	artem.kuzin@huawei.com
Subject: Re: [PATCH v2 1/2] mm: drop page refcount zero state semantics
Date: Thu, 23 Apr 2026 13:07:52 -0500	[thread overview]
Message-ID: <20260423180752.GA31613@bhelgaas> (raw)
In-Reply-To: <9fd8ebbc0f4f45be611bae0d03dd25dd994233c0.1776350895.git.gorbunov.ivan@h-partners.com>

On Mon, Apr 20, 2026 at 08:01:18AM +0000, Gorbunov Ivan wrote:
> Right now the 'zero' refcount state can be interpreted in two ways:
> 1) An unfrozen page which currently has no explicit owner
> 2) A frozen page
> 
> These states can only be distinguished 'logically', by which operations
> are applied to them (page_ref_add(), page_ref_inc(), etc.). In the first
> state we would want the counter to increase, so today one can write:
> 
> page = alloc_frozen_page(...);
> page_ref_inc(page);
> 
> even though in the second state increasing the counter of a frozen page
> should not be valid at all.
> 
> Another reason for this change is our second patch ("mm: implement page
> refcount locking via dedicated bit"), in which frozen pages no longer
> have the value 0 in their refcount.
> 
> This patch makes two changes:
> 1) Drop the invariant that the value stored in the reference count of
>    a frozen page is 0 (the getters folio_ref_count()/page_ref_count()
>    must still return 0 for frozen pages)
> 2) Allow modification operations like page_ref_add() to be used only
>    on pages that have owners
> 
> We've looked at the places where pages are allocated, and they are
> always initialized via functions like set_page_count(page, 1).
> However, for clarity, we've added debug BUG_ONs inside the
> modification functions to ensure they are called only on pages with
> owners. In the future these checks could be improved by replacing the
> operations with result-returning analogs, if needed.
> 
> Co-developed-by: Gladyshev Ilya <gladyshev.ilya1@h-partners.com>
> Signed-off-by: Gladyshev Ilya <gladyshev.ilya1@h-partners.com>
> Signed-off-by: Gorbunov Ivan <gorbunov.ivan@h-partners.com>

No opinion about the rest of the content, but the p2pdma.c change
looks like a no-op, so:

Acked-by: Bjorn Helgaas <bhelgaas@google.com> # p2pdma.c

You might consider rewrapping this commit log to fit in 75 columns or
so, as the log for the second patch does.

> ---
>  drivers/pci/p2pdma.c               |  2 +-
>  include/linux/page_ref.h           | 17 +++++++++++++++++
>  kernel/liveupdate/kexec_handover.c |  2 +-
>  mm/hugetlb.c                       |  2 +-
>  mm/mm_init.c                       |  6 +++---
>  mm/page_alloc.c                    |  4 ++--
>  6 files changed, 25 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index e0f546166eb8..e060ae7e1644 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -158,7 +158,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
>  			 * because we don't want to trigger the
>  			 * p2pdma_folio_free() path.
>  			 */
> -			set_page_count(page, 0);
> +			set_page_count_as_frozen(page);
>  			percpu_ref_put(ref);
>  			return ret;
>  		}
> diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
> index 94d3f0e71c06..a7a07b61d2ae 100644
> --- a/include/linux/page_ref.h
> +++ b/include/linux/page_ref.h
> @@ -62,6 +62,11 @@ static inline void __page_ref_unfreeze(struct page *page, int v)
>  
>  #endif
>  
> +static inline bool __page_count_is_frozen(int count)
> +{
> +	return count == 0;
> +}
> +
>  static inline int page_ref_count(const struct page *page)
>  {
>  	return atomic_read(&page->_refcount);
> @@ -115,8 +120,14 @@ static inline void init_page_count(struct page *page)
>  	set_page_count(page, 1);
>  }
>  
> +static inline void set_page_count_as_frozen(struct page *page)
> +{
> +	set_page_count(page, 0);
> +}
> +
>  static inline void page_ref_add(struct page *page, int nr)
>  {
> +	VM_BUG_ON(__page_count_is_frozen(page_count(page)));
>  	atomic_add(nr, &page->_refcount);
>  	if (page_ref_tracepoint_active(page_ref_mod))
>  		__page_ref_mod(page, nr);
> @@ -129,6 +140,7 @@ static inline void folio_ref_add(struct folio *folio, int nr)
>  
>  static inline void page_ref_sub(struct page *page, int nr)
>  {
> +	VM_BUG_ON(__page_count_is_frozen(page_count(page)));
>  	atomic_sub(nr, &page->_refcount);
>  	if (page_ref_tracepoint_active(page_ref_mod))
>  		__page_ref_mod(page, -nr);
> @@ -142,6 +154,7 @@ static inline void folio_ref_sub(struct folio *folio, int nr)
>  static inline int folio_ref_sub_return(struct folio *folio, int nr)
>  {
>  	int ret = atomic_sub_return(nr, &folio->_refcount);
> +	VM_BUG_ON(__page_count_is_frozen(ret + nr));
>  
>  	if (page_ref_tracepoint_active(page_ref_mod_and_return))
>  		__page_ref_mod_and_return(&folio->page, -nr, ret);
> @@ -150,6 +163,7 @@ static inline int folio_ref_sub_return(struct folio *folio, int nr)
>  
>  static inline void page_ref_inc(struct page *page)
>  {
> +	VM_BUG_ON(__page_count_is_frozen(page_count(page)));
>  	atomic_inc(&page->_refcount);
>  	if (page_ref_tracepoint_active(page_ref_mod))
>  		__page_ref_mod(page, 1);
> @@ -162,6 +176,7 @@ static inline void folio_ref_inc(struct folio *folio)
>  
>  static inline void page_ref_dec(struct page *page)
>  {
> +	VM_BUG_ON(__page_count_is_frozen(page_count(page)));
>  	atomic_dec(&page->_refcount);
>  	if (page_ref_tracepoint_active(page_ref_mod))
>  		__page_ref_mod(page, -1);
> @@ -189,6 +204,7 @@ static inline int folio_ref_sub_and_test(struct folio *folio, int nr)
>  static inline int page_ref_inc_return(struct page *page)
>  {
>  	int ret = atomic_inc_return(&page->_refcount);
> +	VM_BUG_ON(__page_count_is_frozen(ret - 1));
>  
>  	if (page_ref_tracepoint_active(page_ref_mod_and_return))
>  		__page_ref_mod_and_return(page, 1, ret);
> @@ -217,6 +233,7 @@ static inline int folio_ref_dec_and_test(struct folio *folio)
>  static inline int page_ref_dec_return(struct page *page)
>  {
>  	int ret = atomic_dec_return(&page->_refcount);
> +	VM_BUG_ON(__page_count_is_frozen(ret + 1));
>  
>  	if (page_ref_tracepoint_active(page_ref_mod_and_return))
>  		__page_ref_mod_and_return(page, -1, ret);
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index b64f36a45296..36c21f3d8250 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -390,7 +390,7 @@ static void kho_init_folio(struct page *page, unsigned int order)
>  
>  	/* For higher order folios, tail pages get a page count of zero. */
>  	for (unsigned long i = 1; i < nr_pages; i++)
> -		set_page_count(page + i, 0);
> +		set_page_count_as_frozen(page + i);
>  
>  	if (order > 0)
>  		prep_compound_page(page, order);
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 1d41fa3dd43e..b364fda29111 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3186,7 +3186,7 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
>  	for (pfn = head_pfn + start_page_number; pfn < end_pfn; page++, pfn++) {
>  		__init_single_page(page, pfn, zone, nid);
>  		prep_compound_tail(page, &folio->page, order);
> -		set_page_count(page, 0);
> +		set_page_count_as_frozen(page);
>  	}
>  }
>  
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index cec7bb758bdd..e4ec672a9f51 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1066,7 +1066,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
>  	case MEMORY_DEVICE_PRIVATE:
>  	case MEMORY_DEVICE_COHERENT:
>  	case MEMORY_DEVICE_PCI_P2PDMA:
> -		set_page_count(page, 0);
> +		set_page_count_as_frozen(page);
>  		break;
>  
>  	case MEMORY_DEVICE_GENERIC:
> @@ -1112,7 +1112,7 @@ static void __ref memmap_init_compound(struct page *head,
>  
>  		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
>  		prep_compound_tail(page, head, order);
> -		set_page_count(page, 0);
> +		set_page_count_as_frozen(page);
>  	}
>  	prep_compound_head(head, order);
>  }
> @@ -2250,7 +2250,7 @@ void __init init_cma_reserved_pageblock(struct page *page)
>  
>  	do {
>  		__ClearPageReserved(p);
> -		set_page_count(p, 0);
> +		set_page_count_as_frozen(p);
>  	} while (++p, --i);
>  
>  	init_pageblock_migratetype(page, MIGRATE_CMA, false);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 65e702fade61..27734cf795da 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1639,14 +1639,14 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
>  		for (loop = 0; loop < nr_pages; loop++, p++) {
>  			VM_WARN_ON_ONCE(PageReserved(p));
>  			__ClearPageOffline(p);
> -			set_page_count(p, 0);
> +			set_page_count_as_frozen(p);
>  		}
>  
>  		adjust_managed_page_count(page, nr_pages);
>  	} else {
>  		for (loop = 0; loop < nr_pages; loop++, p++) {
>  			__ClearPageReserved(p);
> -			set_page_count(p, 0);
> +			set_page_count_as_frozen(p);
>  		}
>  
>  		/* memblock adjusts totalram_pages() manually. */
> -- 
> 2.43.0
> 



Thread overview: 12+ messages
2026-04-20  8:01 [PATCH v2 0/2] mm: improve folio refcount scalability Gorbunov Ivan
2026-04-20  8:01 ` [PATCH v2 1/2] mm: drop page refcount zero state semantics Gorbunov Ivan
2026-04-23 18:07   ` Bjorn Helgaas [this message]
2026-04-23 19:32   ` Zi Yan
2026-04-20  8:01 ` [PATCH v2 2/2] mm: implement page refcount locking via dedicated bit Gorbunov Ivan
2026-04-23 18:24   ` Matthew Wilcox
2026-04-23 18:31     ` Linus Torvalds
2026-04-23 19:20     ` David Hildenbrand (Arm)
2026-04-23 19:37   ` Zi Yan
2026-04-20 10:07 ` [syzbot ci] Re: mm: improve folio refcount scalability syzbot ci
2026-04-20 12:29   ` Gorbunov Ivan
2026-04-20 13:21     ` syzbot ci
