[PATCH] mm/page_alloc: don't increase highatomic reserve after pcp alloc

public inbox for linux-mm@kvack.org

From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org
Cc: Vlastimil Babka <vbabka@kernel.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Frank van der Linden <fvdl@google.com>,
	 Zhiguo Jiang <justinjiang@vivo.com>
Subject: [PATCH] mm/page_alloc: don't increase highatomic reserve after pcp alloc
Date: Fri, 20 Mar 2026 17:34:25 +0000
Message-ID: <20260320173426.1831267-1-fvdl@google.com>

Higher order GFP_ATOMIC allocations can be served through a
PCP list with ALLOC_HIGHATOMIC set. This can happen, for
example, if a zone is between the low and min watermarks
and get_page_from_freelist() is retried after the alloc_flags
are relaxed.

The call to reserve_highatomic_pageblock() after such a PCP
allocation will grow the reserve every single time:
the page from the (unmovable) PCP list will never have
migrate type MIGRATE_HIGHATOMIC, since MIGRATE_HIGHATOMIC
pages do not appear on the unmovable PCP list. So a new
pageblock is converted to MIGRATE_HIGHATOMIC on each call.

Eventually that leads to the maximum of 1% of the zone being
used up by (often mostly free) MIGRATE_HIGHATOMIC pageblocks,
for no good reason. Since this space is not available for
normal allocations, this wastes memory and pushes the zone
into reclaim too soon.
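
As an illustration of the growth described above, here is a toy
userspace model. It is not kernel code: every toy_* name is
hypothetical, and the cap rule (about 1% of the zone, rounded up to
whole pageblocks, with no reserve at all for very small zones) is an
assumption of the model.

```c
#include <assert.h>

/* Toy userspace model, not kernel code. All names are hypothetical. */
enum toy_mt { TOY_UNMOVABLE, TOY_HIGHATOMIC };

struct toy_zone {
	unsigned long managed_pages;   /* pages managed by the zone */
	unsigned long pageblock_pages; /* pages per pageblock */
	unsigned long reserved_pages;  /* pages in highatomic pageblocks */
};

/* Model assumption: cap is ~1% of the zone, in whole pageblocks. */
static unsigned long toy_reserve_cap(const struct toy_zone *z)
{
	unsigned long one_percent = z->managed_pages / 100;

	assert(z->pageblock_pages > 0);
	if (one_percent < z->pageblock_pages)
		return 0;
	return (one_percent + z->pageblock_pages - 1) /
	       z->pageblock_pages * z->pageblock_pages;
}

/*
 * Models the reserve check that runs after a successful high-order
 * atomic allocation. page_mt is the migrate type of the pageblock
 * that the returned page belongs to.
 */
static void toy_reserve_after_alloc(struct toy_zone *z, enum toy_mt page_mt)
{
	if (z->reserved_pages >= toy_reserve_cap(z))
		return;
	if (page_mt == TOY_HIGHATOMIC)
		return; /* pageblock is already part of the reserve */
	z->reserved_pages += z->pageblock_pages;
}

/* n allocations, all served from the unmovable PCP list. */
static unsigned long toy_run_pcp_allocs(struct toy_zone *z, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		toy_reserve_after_alloc(z, TOY_UNMOVABLE);
	return z->reserved_pages;
}
```

In this model a page from the unmovable PCP list never reports
TOY_HIGHATOMIC, so each call adds one pageblock until the cap stops it.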

This was observed on a system that ran a test with bursts of
memory activity, paired with GFP_ATOMIC SLUB activity. These
would lead to a new slab being allocated with GFP_ATOMIC,
sometimes hitting the get_page_from_freelist() retry path by
being below the low watermark. While the frequency of those
allocations was low, it kept adding up over time, and the
number of MIGRATE_HIGHATOMIC pageblocks kept increasing.

If a higher order atomic allocation can be served by
the unmovable PCP list, there is probably no need yet to
extend the reserves. So, move the check and possible extension
of the highatomic reserves to the buddy case only, and
do not refill the PCP list for ALLOC_HIGHATOMIC if it's
empty. This way, the PCP list is tried for ALLOC_HIGHATOMIC
for a fast atomic allocation. But it will immediately fall
back to rmqueue_buddy() if it's empty. In rmqueue_buddy(),
the MIGRATE_HIGHATOMIC buddy lists are tried first (as before),
and the reserves are extended only if that fails.

With this change, the test was stable. Highatomic reserves
were built up, but to a normal level. No highatomic failures
were seen.

This is similar to the patch proposed in [1] by Zhiguo Jiang,
but re-arranged a bit.

Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
Signed-off-by: Frank van der Linden <fvdl@google.com>
Link: https://lore.kernel.org/all/20231122013925.1507-1-justinjiang@vivo.com/ [1]
Fixes: 44042b4498728 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
---
 mm/page_alloc.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2d4b6f1a554e..57e17a15dae5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -243,6 +243,8 @@ unsigned int pageblock_order __read_mostly;

 static void __free_pages_ok(struct page *page, unsigned int order,
 			    fpi_t fpi_flags);
+static void reserve_highatomic_pageblock(struct page *page, int order,
+					 struct zone *zone);

 /*
  * results with 256, 32 in the lowmem_reserve sysctl:
@@ -3275,6 +3277,13 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 		spin_unlock_irqrestore(&zone->lock, flags);
 	} while (check_new_pages(page, order));

+	/*
+	 * If this is a high-order atomic allocation then check
+	 * if the pageblock should be reserved for the future
+	 */
+	if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
+		reserve_highatomic_pageblock(page, order, zone);
+
 	__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
 	zone_statistics(preferred_zone, zone, 1);

@@ -3346,6 +3355,20 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 			int batch = nr_pcp_alloc(pcp, zone, order);
 			int alloced;

+			/*
+			 * Don't refill the list for a higher order atomic
+			 * allocation under memory pressure, as this would
+			 * not build up any HIGHATOMIC reserves, which
+			 * might be needed soon.
+			 *
+			 * Instead, direct it towards the reserves by
+			 * returning NULL, which will make the caller fall
+			 * back to rmqueue_buddy. This will try to use the
+			 * reserves first and grow them if needed.
+			 */
+			if (alloc_flags & ALLOC_HIGHATOMIC)
+				return NULL;
+
 			alloced = rmqueue_bulk(zone, order,
 					batch, list,
 					migratetype, alloc_flags);
@@ -3961,13 +3984,6 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 		if (page) {
 			prep_new_page(page, order, gfp_mask, alloc_flags);

-			/*
-			 * If this is a high-order atomic allocation then check
-			 * if the pageblock should be reserved for the future
-			 */
-			if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
-				reserve_highatomic_pageblock(page, order, zone);
-
 			return page;
 		} else {
 			if (cond_accept_memory(zone, order, alloc_flags))
-- 
2.53.0.959.g497ff81fa9-goog
