public inbox for linux-mm@kvack.org
From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Frank van der Linden <fvdl@google.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Zhiguo Jiang <justinjiang@vivo.com>
Subject: Re: [PATCH] mm/page_alloc: don't increase highatomic reserve after pcp alloc
Date: Mon, 23 Mar 2026 14:36:28 +0100
Message-ID: <44aadc9c-28d2-497d-ba4e-659517e8ca47@kernel.org>
In-Reply-To: <20260320173426.1831267-1-fvdl@google.com>

On 3/20/26 6:34 PM, Frank van der Linden wrote:
> Higher order GFP_ATOMIC allocations can be served through a
> PCP list with ALLOC_HIGHATOMIC set. Such an allocation can
> e.g. happen if a zone is between the low and min watermarks,
> and get_page_from_freelist is retried after the alloc_flags
> are relaxed.
> 
> The call to reserve_highatomic_pageblock() after such a PCP
> allocation will result in an increase every single time:
> the page from the (unmovable) PCP list will never have
> migrate type MIGRATE_HIGHATOMIC, since MIGRATE_HIGHATOMIC
> pages do not appear on the unmovable PCP list. So a new
> pageblock is converted to MIGRATE_HIGHATOMIC.
> 
> Eventually that leads to the maximum of 1% of the zone being
> used up by (often mostly free) MIGRATE_HIGHATOMIC pageblocks,
> for no good reason. Since this space is not available for
> normal allocations, this wastes memory and will push things
> into reclaim too soon.
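
For a sense of scale, here is a rough userspace sketch of that 1% cap.
It is not kernel code: highatomic_cap_pages() and round_up_to() are
hypothetical helpers, and the rounding up to whole pageblocks and the
small-zone cutoff are assumptions based on my reading of
reserve_highatomic_pageblock(), to be checked against the actual tree.

```c
#include <assert.h>

/* Round x up to the next multiple of a (a must be non-zero). */
static unsigned long round_up_to(unsigned long x, unsigned long a)
{
	return (x + a - 1) / a * a;
}

/*
 * Hypothetical helper: approximate upper bound, in pages, on the
 * highatomic reserve of a zone. Assumed to be 1% of the managed
 * pages, rounded up to a whole number of pageblocks.
 */
static unsigned long highatomic_cap_pages(unsigned long managed_pages,
					  unsigned long pageblock_pages)
{
	assert(pageblock_pages > 0);

	/* Assumption: if 1% is less than one pageblock, reserve nothing. */
	if (managed_pages / 100 < pageblock_pages)
		return 0;

	return round_up_to(managed_pages / 100, pageblock_pages);
}
```

Assuming 4 KiB pages and 512-page (2 MiB) pageblocks, a 16 GiB zone has
4194304 managed pages, so highatomic_cap_pages(4194304, 512) gives 41984
pages, i.e. 82 pageblocks or 164 MiB that can end up parked in the
reserve.
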
> 
> This was observed on a system that ran a test with bursts of
> memory activity, paired with GFP_ATOMIC SLUB activity. These
> would lead to a new slab being allocated with GFP_ATOMIC,
> sometimes hitting the get_page_from_freelist retry path by
> being below the low watermark. While the frequency of those
> allocations was low, it kept adding up over time, and the
> number of MIGRATE_HIGHATOMIC pageblocks kept increasing.
> 
> If a higher order atomic allocation can be served by
> the unmovable PCP list, there is probably no need yet to
> extend the reserves. So, move the check and possible extension
> of the highatomic reserves to the buddy case only, and
> do not refill the PCP list for ALLOC_HIGHATOMIC if it's
> empty. This way, the PCP list is tried for ALLOC_HIGHATOMIC
> for a fast atomic allocation. But it will immediately fall
> back to rmqueue_buddy() if it's empty. In rmqueue_buddy(),
> the MIGRATE_HIGHATOMIC buddy lists are tried first (as before),
> and the reserves are extended only if that fails.
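
The before/after behaviour described above can be sketched with a toy
userspace model. This is only an illustration under simplifying
assumptions, not the kernel implementation: every toy_* name and
constant is hypothetical, one "unit" stands for one high-order page,
frees are not modelled, and the cap is an arbitrary stand-in for the 1%
limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code. Every name here is hypothetical. */
enum {
	TOY_UNITS_PER_BLOCK = 8,	/* high-order pages per pageblock (toy value) */
	TOY_PCP_BATCH = 4,		/* PCP refill size (toy value) */
	TOY_MAX_BLOCKS = 10,		/* stand-in for the ~1% cap */
};

struct toy_zone {
	int pcp_units;		/* pages on the unmovable PCP list */
	int highatomic_free;	/* free pages in MIGRATE_HIGHATOMIC blocks */
	int reserved_blocks;	/* pageblocks converted so far */
};

/* Stand-in for reserve_highatomic_pageblock() on a non-HIGHATOMIC page. */
static void toy_reserve(struct toy_zone *z)
{
	if (z->reserved_blocks >= TOY_MAX_BLOCKS)
		return;
	z->reserved_blocks++;
	/* The rest of the converted block becomes free reserve. */
	z->highatomic_free += TOY_UNITS_PER_BLOCK - 1;
}

/* Before the patch: the PCP list is refilled and the check always runs. */
static void toy_alloc_old(struct toy_zone *z)
{
	if (z->pcp_units == 0)
		z->pcp_units = TOY_PCP_BATCH;
	z->pcp_units--;
	toy_reserve(z);
}

/* After the patch: a PCP hit skips the check, an empty PCP goes to buddy. */
static void toy_alloc_new(struct toy_zone *z)
{
	if (z->pcp_units > 0) {
		z->pcp_units--;
		return;
	}
	if (z->highatomic_free > 0) {	/* reserves are tried first */
		z->highatomic_free--;
		return;
	}
	toy_reserve(z);			/* grow only when the reserve is empty */
}

/* Run nr_allocs allocations and return how many pageblocks got reserved. */
static int toy_run(bool patched, int nr_allocs, int pcp_start)
{
	struct toy_zone z = { .pcp_units = pcp_start };

	assert(nr_allocs >= 0 && pcp_start >= 0);
	for (int i = 0; i < nr_allocs; i++) {
		if (patched)
			toy_alloc_new(&z);
		else
			toy_alloc_old(&z);
	}
	return z.reserved_blocks;
}
```

With these toy values, toy_run(false, 20, 20) runs into the cap of 10
blocks although every allocation was a PCP hit, while
toy_run(true, 20, 20) reserves nothing. Starting from an empty PCP
list, toy_run(true, 9, 0) reserves 2 blocks: one for the first
allocation and a second only once the first block's free pages are
used up.
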
> 
> With this change, the test was stable. Highatomic reserves
> were built up, but to a normal level. No highatomic failures
> were seen.
> 
> This is similar to the patch proposed in [1] by Zhiguo Jiang,
> but re-arranged a bit.
> 
> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Link: https://lore.kernel.org/all/20231122013925.1507-1-justinjiang@vivo.com/ [1]
> Fixes: 44042b4498728 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")

Makes sense to me and looks ok. Thanks.

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
>  mm/page_alloc.c | 30 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2d4b6f1a554e..57e17a15dae5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -243,6 +243,8 @@ unsigned int pageblock_order __read_mostly;
>  
>  static void __free_pages_ok(struct page *page, unsigned int order,
>  			    fpi_t fpi_flags);
> +static void reserve_highatomic_pageblock(struct page *page, int order,
> +					 struct zone *zone);
>  
>  /*
>   * results with 256, 32 in the lowmem_reserve sysctl:
> @@ -3275,6 +3277,13 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
>  		spin_unlock_irqrestore(&zone->lock, flags);
>  	} while (check_new_pages(page, order));
>  
> +	/*
> +	 * If this is a high-order atomic allocation then check
> +	 * if the pageblock should be reserved for the future
> +	 */
> +	if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
> +		reserve_highatomic_pageblock(page, order, zone);
> +
>  	__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
>  	zone_statistics(preferred_zone, zone, 1);
>  
> @@ -3346,6 +3355,20 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
>  			int batch = nr_pcp_alloc(pcp, zone, order);
>  			int alloced;
>  
> +			/*
> +			 * Don't refill the list for a higher order atomic
> +			 * allocation under memory pressure, as this would
> +			 * not build up any HIGHATOMIC reserves, which
> +			 * might be needed soon.
> +			 *
> +			 * Instead, direct it towards the reserves by
> +			 * returning NULL, which will make the caller fall
> +			 * back to rmqueue_buddy. This will try to use the
> +			 * reserves first and grow them if needed.
> +			 */
> +			if (alloc_flags & ALLOC_HIGHATOMIC)
> +				return NULL;
> +
>  			alloced = rmqueue_bulk(zone, order,
>  					batch, list,
>  					migratetype, alloc_flags);
> @@ -3961,13 +3984,6 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>  		if (page) {
>  			prep_new_page(page, order, gfp_mask, alloc_flags);
>  
> -			/*
> -			 * If this is a high-order atomic allocation then check
> -			 * if the pageblock should be reserved for the future
> -			 */
> -			if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
> -				reserve_highatomic_pageblock(page, order, zone);
> -
>  			return page;
>  		} else {
>  			if (cond_accept_memory(zone, order, alloc_flags))


