The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <ljs@kernel.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	"David Hildenbrand (Arm)" <david@kernel.org>,
	"Jason Wang" <jasowang@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Liam R. Howlett" <liam@infradead.org>,
	"Vlastimil Babka" <vbabka@kernel.org>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Michal Hocko" <mhocko@suse.com>,
	"Brendan Jackman" <jackmanb@google.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>, "Zi Yan" <ziy@nvidia.com>,
	"Baolin Wang" <baolin.wang@linux.alibaba.com>,
	"Nico Pache" <npache@redhat.com>,
	"Ryan Roberts" <ryan.roberts@arm.com>,
	"Dev Jain" <dev.jain@arm.com>, "Barry Song" <baohua@kernel.org>,
	"Lance Yang" <lance.yang@linux.dev>,
	"Hugh Dickins" <hughd@google.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Joshua Hahn" <joshua.hahnjy@gmail.com>,
	"Rakie Kim" <rakie.kim@sk.com>,
	"Byungchul Park" <byungchul@sk.com>,
	"Gregory Price" <gourry@gourry.net>,
	"Ying Huang" <ying.huang@linux.alibaba.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Christoph Lameter" <cl@gentwo.org>,
	"David Rientjes" <rientjes@google.com>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Harry Yoo" <harry.yoo@oracle.com>,
	"Axel Rasmussen" <axelrasmussen@google.com>,
	"Yuanchu Xie" <yuanchu@google.com>, "Wei Xu" <weixugc@google.com>,
	"Chris Li" <chrisl@kernel.org>,
	"Kairui Song" <kasong@tencent.com>,
	"Kemeng Shi" <shikemeng@huaweicloud.com>,
	"Nhat Pham" <nphamcs@gmail.com>, "Baoquan He" <bhe@redhat.com>,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	"Andrea Arcangeli" <aarcange@redhat.com>
Subject: Re: [PATCH v10 03/37] mm: page_alloc: propagate PageReported flag across buddy splits
Date: Mon, 8 Jun 2026 10:52:28 +0100	[thread overview]
Message-ID: <aiaPBRm0Mp_WDPFa@lucifer> (raw)
In-Reply-To: <1a407d39a9ebd95e9de67e44f9ad2db37c9f31e9.1780906288.git.mst@redhat.com>

On Mon, Jun 08, 2026 at 04:34:39AM -0400, Michael S. Tsirkin wrote:
> When a reported free page is split via expand() to satisfy a
> smaller allocation, the sub-pages placed back on the free lists
> lose the PageReported flag.  This means they will be unnecessarily
> re-reported to the hypervisor in the next reporting cycle, wasting
> work.
>
> While I was unable to quantify the performance difference, it is
> an obvious waste, even if small.

Hmm, that makes me wonder if this is really worth it?

You're making everything more complicated propagating yet another parameter
around (which could actually in itself cause perf/bloat issues) for
something you say yourself isn't really doing much?

>
> Propagate the PageReported flag to sub-pages during expand(),
> both in page_del_and_expand() and try_to_claim_block(), so
>
> split_large_buddy() also propagates PageReported via a bool
> parameter: the caller saves PageReported before
> del_page_from_free_list() clears it, then passes the saved
> value. The flag is set after __free_one_page() with a
> PageBuddy check, matching the page_reporting_drain() pattern.
> Free-path callers pass false (freshly freed pages are never
> reported).
> that they are recognized as already-reported.

The sentence is completely garbled?

>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Again this really doesn't feel like it belongs to this series.

If it's really a bug, it should have fixes/Cc: stable, but honestly my
assessment here is this doesn't seem worthwhile at all.

You're not making a case for it and you're adding complexity. We don't just
take AI-reported bugs at face value (if this was indeed one).

(If sashiko-reported some people do Reported-by: sashiko-bot@kernel.org
now)

> Assisted-by: Claude:claude-opus-4-6
> ---
>  mm/page_alloc.c | 32 +++++++++++++++++++++++++-------
>  1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d49c254174da..8dae5b3f5876 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1502,7 +1502,8 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>
>  /* Split a multi-block free page into its individual pageblocks. */
>  static void split_large_buddy(struct zone *zone, struct page *page,
> -			      unsigned long pfn, int order, fpi_t fpi)
> +			      unsigned long pfn, int order, fpi_t fpi,
> +			      bool reported)
>  {
>  	unsigned long end = pfn + (1 << order);
>
> @@ -1517,6 +1518,8 @@ static void split_large_buddy(struct zone *zone, struct page *page,
>  		int mt = get_pfnblock_migratetype(page, pfn);
>
>  		__free_one_page(page, pfn, zone, order, mt, fpi);
> +		if (reported && PageBuddy(page) && buddy_order(page) == order)

Isn't this racey? We just freed this page to the buddy allocator, couldn't
it be allocated before we set this flag?

> +			__SetPageReported(page);
>  		pfn += 1 << order;
>  		if (pfn == end)
>  			break;
> @@ -1559,11 +1562,12 @@ static void free_one_page(struct zone *zone, struct page *page,
>  		llist_for_each_entry_safe(p, tmp, llnode, pcp_llist) {
>  			unsigned int p_order = p->private;
>
> -			split_large_buddy(zone, p, page_to_pfn(p), p_order, fpi_flags);
> +			split_large_buddy(zone, p, page_to_pfn(p), p_order,
> +					  fpi_flags, false);

I don't love adding a mystery meat boolean parameter like this.

>  			__count_vm_events(PGFREE, 1 << p_order);
>  		}
>  	}
> -	split_large_buddy(zone, page, pfn, order, fpi_flags);
> +	split_large_buddy(zone, page, pfn, order, fpi_flags, false);
>  	spin_unlock_irqrestore(&zone->lock, flags);
>
>  	__count_vm_events(PGFREE, 1 << order);
> @@ -1694,7 +1698,7 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
>   * -- nyc
>   */
>  static inline unsigned int expand(struct zone *zone, struct page *page, int low,
> -				  int high, int migratetype)
> +				  int high, int migratetype, bool reported)
>  {
>  	unsigned int size = 1 << high;
>  	unsigned int nr_added = 0;
> @@ -1716,6 +1720,15 @@ static inline unsigned int expand(struct zone *zone, struct page *page, int low,
>  		__add_to_free_list(&page[size], zone, high, migratetype, false);
>  		set_buddy_order(&page[size], high);
>  		nr_added += size;
> +
> +		/*
> +		 * The parent page has been reported to the host.  The
> +		 * sub-pages are part of the same reported block, so mark
> +		 * them reported too.  This avoids re-reporting pages that
> +		 * the host already knows about.
> +		 */

This comment seems entirely redundant (and very much like something claude would
add, it's too verbose like this).

> +		if (reported)
> +			__SetPageReported(&page[size]);
>  	}
>
>  	return nr_added;
> @@ -1726,9 +1739,10 @@ static __always_inline void page_del_and_expand(struct zone *zone,
>  						int high, int migratetype)
>  {
>  	int nr_pages = 1 << high;
> +	bool was_reported = page_reported(page);
>
>  	__del_page_from_free_list(page, zone, high, migratetype);
> -	nr_pages -= expand(zone, page, low, high, migratetype);
> +	nr_pages -= expand(zone, page, low, high, migratetype, was_reported);

I don't love propagating yet another parameter like this.

>  	account_freepages(zone, -nr_pages, migratetype);
>  }
>
> @@ -2116,11 +2130,13 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>  	/* We're a part of a larger buddy */
>  	if (PageBuddy(buddy) && buddy_order(buddy) > pageblock_order) {
>  		int order = buddy_order(buddy);
> +		bool reported = PageReported(buddy);
>
>  		del_page_from_free_list(buddy, zone, order,
>  					get_pfnblock_migratetype(buddy, buddy_pfn));
>  		toggle_pageblock_isolate(page, isolate);
> -		split_large_buddy(zone, buddy, buddy_pfn, order, FPI_NONE);
> +		split_large_buddy(zone, buddy, buddy_pfn, order, FPI_NONE,
> +				  reported);
>  		return true;
>  	}
>
> @@ -2283,10 +2299,12 @@ try_to_claim_block(struct zone *zone, struct page *page,
>  	/* Take ownership for orders >= pageblock_order */
>  	if (current_order >= pageblock_order) {
>  		unsigned int nr_added;
> +		bool was_reported = page_reported(page);
>
>  		del_page_from_free_list(page, zone, current_order, block_type);
>  		change_pageblock_range(page, current_order, start_type);
> -		nr_added = expand(zone, page, order, current_order, start_type);
> +		nr_added = expand(zone, page, order, current_order, start_type,
> +				  was_reported);
>  		account_freepages(zone, nr_added, start_type);
>  		return page;
>  	}
> --
> MST
>

Thanks, Lorenzo

  reply	other threads:[~2026-06-08  9:52 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08  8:33 [PATCH v10 00/37] mm/virtio: skip redundant zeroing of host-zeroed pages Michael S. Tsirkin
2026-06-08  8:34 ` [PATCH v10 01/37] mm: mempolicy: fix interleave index calculation Michael S. Tsirkin
2026-06-08  9:43   ` Lorenzo Stoakes
2026-06-08  8:34 ` [PATCH v10 02/37] mm: memory-failure: serialize TestSetPageHWPoison with zone->lock Michael S. Tsirkin
2026-06-08  9:43   ` Lorenzo Stoakes
2026-06-08 13:48     ` Michael S. Tsirkin
2026-06-08 14:14       ` Lorenzo Stoakes
2026-06-08 16:20       ` Andrew Morton
2026-06-08  8:34 ` [PATCH v10 03/37] mm: page_alloc: propagate PageReported flag across buddy splits Michael S. Tsirkin
2026-06-08  9:52   ` Lorenzo Stoakes [this message]
2026-06-08 12:50     ` Matthew Wilcox
2026-06-08  8:34 ` [PATCH v10 04/37] mm: page_reporting: allow driver to set batch capacity Michael S. Tsirkin
2026-06-08  8:34 ` [PATCH v10 05/37] mm: hugetlb: remove dead alloc_hugetlb_folio stub Michael S. Tsirkin
2026-06-08  9:56   ` Lorenzo Stoakes
2026-06-08  8:35 ` [PATCH v10 06/37] mm: move vma_alloc_folio_noprof to page_alloc.c Michael S. Tsirkin
2026-06-08 10:05   ` Lorenzo Stoakes
2026-06-08  8:35 ` [PATCH v10 07/37] mm: thread user_addr through page allocator for cache-friendly zeroing Michael S. Tsirkin
2026-06-08 10:23   ` Lorenzo Stoakes
2026-06-08 11:06     ` Lorenzo Stoakes
2026-06-08 13:04       ` Matthew Wilcox
2026-06-08 13:09         ` Lorenzo Stoakes
2026-06-08 14:26           ` David Hildenbrand (Arm)
2026-06-08 14:31             ` Matthew Wilcox
2026-06-08 14:37               ` David Hildenbrand (Arm)
2026-06-08 14:44                 ` Matthew Wilcox
2026-06-08 14:55                   ` David Hildenbrand (Arm)
2026-06-08 19:33                   ` Michael S. Tsirkin
2026-06-08 11:08     ` David Hildenbrand (Arm)
2026-06-08 15:27       ` Zi Yan
2026-06-08 18:39         ` David Hildenbrand (Arm)
2026-06-08  8:35 ` [PATCH v10 08/37] mm: add alloc_contig_frozen_pages_user " Michael S. Tsirkin
2026-06-08 10:29   ` Lorenzo Stoakes
2026-06-08  8:35 ` [PATCH v10 09/37] mm: hugetlb: thread user_addr through gigantic page allocation Michael S. Tsirkin
2026-06-08  8:36 ` [PATCH v10 10/37] mm: add folio_zero_user stub for configs without THP/HUGETLBFS Michael S. Tsirkin
2026-06-08  9:12   ` Lorenzo Stoakes
2026-06-08  8:36 ` [PATCH v10 11/37] mm: page_alloc: move prep_compound_page before post_alloc_hook Michael S. Tsirkin
2026-06-08 10:33   ` Lorenzo Stoakes
2026-06-08  8:36 ` [PATCH v10 12/37] mm: use folio_zero_user for user pages in post_alloc_hook Michael S. Tsirkin
2026-06-08 11:23   ` Lorenzo Stoakes
2026-06-08 15:53     ` Gregory Price
2026-06-08  8:36 ` [PATCH v10 13/37] mm: use __GFP_ZERO in vma_alloc_zeroed_movable_folio Michael S. Tsirkin
2026-06-08 10:39   ` Lorenzo Stoakes
2026-06-08 10:55     ` Lorenzo Stoakes
2026-06-08  8:37 ` [PATCH v10 14/37] mm: remove arch vma_alloc_zeroed_movable_folio overrides Michael S. Tsirkin
2026-06-08 11:29   ` Lorenzo Stoakes
2026-06-08  8:37 ` [PATCH v10 15/37] mm: alloc_anon_folio: pass raw fault address to vma_alloc_folio Michael S. Tsirkin
2026-06-08 11:35   ` Lorenzo Stoakes
2026-06-08  8:37 ` [PATCH v10 16/37] mm: alloc_swap_folio: " Michael S. Tsirkin
2026-06-08 11:37   ` Lorenzo Stoakes
2026-06-08 15:59     ` Gregory Price
2026-06-08  8:37 ` [PATCH v10 17/37] mm: page_reporting: skip redundant zeroing of host-zeroed reported pages Michael S. Tsirkin
2026-06-08 12:00   ` Lorenzo Stoakes
2026-06-08 16:09     ` Gregory Price
2026-06-08 18:40       ` Matthew Wilcox
2026-06-08  8:38 ` [PATCH v10 18/37] mm: page_alloc: use aliasing checks instead of user_alloc_needs_zeroing Michael S. Tsirkin
2026-06-08 11:39   ` Lorenzo Stoakes
2026-06-08  8:38 ` [PATCH v10 19/37] mm: page_alloc: clear PG_zeroed on buddy merge if not both zero Michael S. Tsirkin
2026-06-08 11:47   ` Lorenzo Stoakes
2026-06-08  8:38 ` [PATCH v10 20/37] mm: page_alloc: preserve PG_zeroed in page_del_and_expand Michael S. Tsirkin
2026-06-08  8:38 ` [PATCH v10 21/37] mm: page_alloc: propagate PG_zeroed in split_large_buddy Michael S. Tsirkin
2026-06-08  8:38 ` [PATCH v10 22/37] mm: add free_frozen_pages_zeroed Michael S. Tsirkin
2026-06-08 12:06   ` Lorenzo Stoakes
2026-06-08  8:38 ` [PATCH v10 23/37] mm: page_alloc: skip kernel_init_pages for FPI_ZEROED when safe Michael S. Tsirkin
2026-06-08 12:18   ` Lorenzo Stoakes
2026-06-08  8:38 ` [PATCH v10 24/37] mm: add put_page_zeroed and folio_put_zeroed Michael S. Tsirkin
2026-06-08 12:25   ` Lorenzo Stoakes
2026-06-08 12:46     ` David Hildenbrand (Arm)
2026-06-08 14:08       ` Michael S. Tsirkin
2026-06-08 14:28         ` David Hildenbrand (Arm)
2026-06-08  8:39 ` [PATCH v10 25/37] mm: use __GFP_ZERO in alloc_anon_folio Michael S. Tsirkin
2026-06-08 12:29   ` Lorenzo Stoakes
2026-06-08  8:39 ` [PATCH v10 26/37] mm: vma_alloc_anon_folio_pmd: pass raw fault address to vma_alloc_folio Michael S. Tsirkin
2026-06-08 12:30   ` Lorenzo Stoakes
2026-06-08  8:39 ` [PATCH v10 27/37] mm: use __GFP_ZERO in vma_alloc_anon_folio_pmd Michael S. Tsirkin
2026-06-08 12:32   ` Lorenzo Stoakes
2026-06-08  8:39 ` [PATCH v10 28/37] mm: hugetlb: add gfp parameter and skip zeroing for zeroed pages Michael S. Tsirkin
2026-06-08 12:44   ` Lorenzo Stoakes
2026-06-08  8:39 ` [PATCH v10 29/37] mm: memfd: skip zeroing for zeroed hugetlb pool pages Michael S. Tsirkin
2026-06-08 12:47   ` Lorenzo Stoakes
2026-06-08  8:39 ` [PATCH v10 30/37] mm: page_reporting: add per-page zeroed bitmap for host feedback Michael S. Tsirkin
2026-06-08  8:39 ` [PATCH v10 31/37] virtio_balloon: submit reported pages as individual buffers Michael S. Tsirkin
2026-06-08  8:40 ` [PATCH v10 32/37] virtio_balloon: disable indirect descriptors Michael S. Tsirkin
2026-06-08  8:40 ` [PATCH v10 33/37] mm: page_reporting: add flush parameter with page budget Michael S. Tsirkin
2026-06-08  8:40 ` [PATCH v10 34/37] virtio_balloon: skip zeroing for host-zeroed reported pages Michael S. Tsirkin
2026-06-08  8:40 ` [PATCH v10 35/37] virtio_balloon: disable reporting zeroed optimization for confidential guests Michael S. Tsirkin
2026-06-08  8:40 ` [PATCH v10 36/37] mm: balloon: use put_page_zeroed for zeroed balloon pages Michael S. Tsirkin
2026-06-08 11:10   ` David Hildenbrand (Arm)
2026-06-08  8:40 ` [PATCH v10 37/37] virtio_balloon: implement VIRTIO_BALLOON_F_DEVICE_INIT_ON_INFLATE Michael S. Tsirkin
2026-06-08  9:17 ` [PATCH v10 00/37] mm/virtio: skip redundant zeroing of host-zeroed pages Lorenzo Stoakes
2026-06-08 12:52   ` Lorenzo Stoakes
2026-06-08 11:02 ` Vlastimil Babka (SUSE)
2026-06-08 11:13   ` Vlastimil Babka (SUSE)
2026-06-08 15:45     ` Gregory Price
2026-06-08 17:50       ` Lorenzo Stoakes
2026-06-08 14:21 ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aiaPBRm0Mp_WDPFa@lucifer \
    --to=ljs@kernel.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=axelrasmussen@google.com \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bhe@redhat.com \
    --cc=byungchul@sk.com \
    --cc=chrisl@kernel.org \
    --cc=cl@gentwo.org \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=eperezma@redhat.com \
    --cc=gourry@gourry.net \
    --cc=hannes@cmpxchg.org \
    --cc=harry.yoo@oracle.com \
    --cc=hughd@google.com \
    --cc=jackmanb@google.com \
    --cc=jasowang@redhat.com \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kasong@tencent.com \
    --cc=lance.yang@linux.dev \
    --cc=liam@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.brost@intel.com \
    --cc=mhocko@suse.com \
    --cc=mst@redhat.com \
    --cc=muchun.song@linux.dev \
    --cc=npache@redhat.com \
    --cc=nphamcs@gmail.com \
    --cc=osalvador@suse.de \
    --cc=rakie.kim@sk.com \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=shikemeng@huaweicloud.com \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=virtualization@lists.linux.dev \
    --cc=weixugc@google.com \
    --cc=xuanzhuo@linux.alibaba.com \
    --cc=ying.huang@linux.alibaba.com \
    --cc=yuanchu@google.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox