From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Frank van der Linden <fvdl@google.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Zhiguo Jiang <justinjiang@vivo.com>
Subject: Re: [PATCH] mm/page_alloc: don't increase highatomic reserve after pcp alloc
Date: Mon, 23 Mar 2026 14:36:28 +0100
Message-ID: <44aadc9c-28d2-497d-ba4e-659517e8ca47@kernel.org>
In-Reply-To: <20260320173426.1831267-1-fvdl@google.com>

On 3/20/26 6:34 PM, Frank van der Linden wrote:
> Higher order GFP_ATOMIC allocations can be served through a
> PCP list with ALLOC_HIGHATOMIC set. Such an allocation can
> e.g. happen if a zone is between the low and min watermarks,
> and get_page_from_freelist is retried after the alloc_flags
> are relaxed.
>
> The call to reserve_highatomic_pageblock() after such a PCP
> allocation will result in an increase every single time:
> the page from the (unmovable) PCP list will never have
> migrate type MIGRATE_HIGHATOMIC, since MIGRATE_HIGHATOMIC
> pages do not appear on the unmovable PCP list. So a new
> pageblock is converted to MIGRATE_HIGHATOMIC.
>
> Eventually that leads to the maximum of 1% of the zone being
> used up by (often mostly free) MIGRATE_HIGHATOMIC pageblocks,
> for no good reason. Since this space is not available for
> normal allocations, this wastes memory and will push things
> into reclaim too soon.
>
> This was observed on a system that ran a test with bursts of
> memory activity, paired with GFP_ATOMIC SLUB activity. These
> would lead to a new slab being allocated with GFP_ATOMIC,
> sometimes hitting the get_page_from_freelist retry path by
> being below the low watermark. While the frequency of those
> allocations was low, it kept adding up over time, and the
> number of MIGRATE_HIGHATOMIC pageblocks kept increasing.
>
> If a higher order atomic allocation can be served by
> the unmovable PCP list, there is probably no need yet to
> extend the reserves. So, move the check and possible extension
> of the highatomic reserves to the buddy case only, and
> do not refill the PCP list for ALLOC_HIGHATOMIC if it's
> empty. This way, the PCP list is tried for ALLOC_HIGHATOMIC
> for a fast atomic allocation. But it will immediately fall
> back to rmqueue_buddy() if it's empty. In rmqueue_buddy(),
> the MIGRATE_HIGHATOMIC buddy lists are tried first (as before),
> and the reserves are extended only if that fails.
>
> With this change, the test was stable. Highatomic reserves
> were built up, but to a normal level. No highatomic failures
> were seen.
>
> This is similar to the patch proposed in [1] by Zhiguo Jiang,
> but re-arranged a bit.
>
> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Link: https://lore.kernel.org/all/20231122013925.1507-1-justinjiang@vivo.com/ [1]
> Fixes: 44042b4498728 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")

Makes sense to me and looks ok. Thanks.

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
> mm/page_alloc.c | 30 +++++++++++++++++++++++-------
> 1 file changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2d4b6f1a554e..57e17a15dae5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -243,6 +243,8 @@ unsigned int pageblock_order __read_mostly;
>
>  static void __free_pages_ok(struct page *page, unsigned int order,
>  			    fpi_t fpi_flags);
> +static void reserve_highatomic_pageblock(struct page *page, int order,
> +					 struct zone *zone);
>
>  /*
>   * results with 256, 32 in the lowmem_reserve sysctl:
> @@ -3275,6 +3277,13 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
>  		spin_unlock_irqrestore(&zone->lock, flags);
>  	} while (check_new_pages(page, order));
>
> +	/*
> +	 * If this is a high-order atomic allocation then check
> +	 * if the pageblock should be reserved for the future
> +	 */
> +	if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
> +		reserve_highatomic_pageblock(page, order, zone);
> +
>  	__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
>  	zone_statistics(preferred_zone, zone, 1);
>
> @@ -3346,6 +3355,20 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
>  			int batch = nr_pcp_alloc(pcp, zone, order);
>  			int alloced;
>
> +			/*
> +			 * Don't refill the list for a higher order atomic
> +			 * allocation under memory pressure, as this would
> +			 * not build up any HIGHATOMIC reserves, which
> +			 * might be needed soon.
> +			 *
> +			 * Instead, direct it towards the reserves by
> +			 * returning NULL, which will make the caller fall
> +			 * back to rmqueue_buddy. This will try to use the
> +			 * reserves first and grow them if needed.
> +			 */
> +			if (alloc_flags & ALLOC_HIGHATOMIC)
> +				return NULL;
> +
>  			alloced = rmqueue_bulk(zone, order,
>  					batch, list,
>  					migratetype, alloc_flags);
> @@ -3961,13 +3984,6 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>  		if (page) {
>  			prep_new_page(page, order, gfp_mask, alloc_flags);
>
> -			/*
> -			 * If this is a high-order atomic allocation then check
> -			 * if the pageblock should be reserved for the future
> -			 */
> -			if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
> -				reserve_highatomic_pageblock(page, order, zone);
> -
>  			return page;
>  		} else {
>  			if (cond_accept_memory(zone, order, alloc_flags))
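
For readers who want to see the accounting effect described in the changelog, here is a rough userspace toy model. It is not kernel code and not part of the patch: every name in it (toy_zone, toy_reserve_block, toy_simulate, TOY_PAGES_PER_BLOCK) is made up for illustration, the 512-page pageblock is an assumption, and the 1% cap is simplified to a count of pageblocks. It only mimics the bookkeeping: before the patch, each ALLOC_HIGHATOMIC allocation served through the PCP list converts one more pageblock; after it, a PCP hit leaves the reserve alone and the buddy path grows the reserve only when the existing reserve cannot serve the request.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pageblock size in pages (hypothetical, for illustration only). */
#define TOY_PAGES_PER_BLOCK 512UL

enum toy_policy { TOY_BEFORE_PATCH, TOY_AFTER_PATCH };

/* Hypothetical stand-in for the per-zone highatomic accounting. */
struct toy_zone {
	unsigned long nr_blocks;	/* total pageblocks in the zone */
	unsigned long reserved_blocks;	/* pageblocks converted to the reserve */
	unsigned long reserved_free;	/* free pages inside reserved blocks */
};

/* Convert one more pageblock, capped at roughly 1% of the zone. */
static void toy_reserve_block(struct toy_zone *z, unsigned long used)
{
	unsigned long cap = z->nr_blocks / 100;

	if (z->reserved_blocks >= cap)
		return;
	z->reserved_blocks++;
	z->reserved_free += TOY_PAGES_PER_BLOCK - used;
}

/*
 * Run nr_allocs high-order atomic allocations and return how many
 * pageblocks end up reserved. pcp_has_pages says whether the PCP list
 * can serve each request (only relevant after the patch in this model).
 */
static unsigned long toy_simulate(enum toy_policy policy,
				  unsigned long nr_blocks,
				  unsigned long nr_allocs,
				  unsigned int order, bool pcp_has_pages)
{
	struct toy_zone z = { .nr_blocks = nr_blocks };
	unsigned long pages = 1UL << order;
	unsigned long i;

	assert(pages <= TOY_PAGES_PER_BLOCK);

	for (i = 0; i < nr_allocs; i++) {
		if (policy == TOY_BEFORE_PATCH) {
			/*
			 * The PCP page is never in a reserved block, so
			 * the reserve grows on every single allocation.
			 */
			toy_reserve_block(&z, pages);
			continue;
		}
		if (pcp_has_pages)
			continue;	/* PCP hit: reserve untouched */
		if (z.reserved_free >= pages)
			z.reserved_free -= pages;	/* served from reserve */
		else
			toy_reserve_block(&z, pages);	/* grow only on a miss */
	}
	return z.reserved_blocks;
}
```

Under these assumptions, with a 10000-pageblock zone and order-3 requests, the "before" policy climbs one block per allocation until it hits the cap, while the "after" policy stays at zero on PCP hits and adds a block only once the previous one is used up.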