From: Michal Hocko <mhocko@suse.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
	David Rientjes <rientjes@google.com>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	Joshua Hahn <joshua.hahnjy@gmail.com>,
	Pedro Falcato <pfalcato@suse.de>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-unstable v3 1/3] mm/page_alloc: ignore the exact initial compaction result
Date: Tue, 6 Jan 2026 14:51:21 +0100
Message-ID: <aV0TWde-Pu-8TBT8@tiehlicka>
In-Reply-To: <20260106-thp-thisnode-tweak-v3-1-f5d67c21a193@suse.cz>
On Tue 06-01-26 12:52:36, Vlastimil Babka wrote:
> For allocations that are of costly order and __GFP_NORETRY (and can
> perform compaction) we attempt direct compaction first. If that fails,
> we continue with a single round of direct reclaim+compaction (as for
> other __GFP_NORETRY allocations, except the compaction is of lower
> priority), with two exceptions that fail immediately:
>
> - __GFP_THISNODE is specified, to prevent zone_reclaim_mode-like
> behavior for e.g. THP page faults
>
> - compaction failed because it was deferred (i.e. has been failing
> recently so further attempts are not done for a while) or skipped,
> which means there are insufficient free base pages to defragment to
> begin with
>
> Upon closer inspection, the second condition rests on somewhat flawed
> reasoning. If there are not enough base pages and reclaim could create
> them, we instead fail. When there are enough base pages and compaction
> has already run and failed, we proceed and hope that reclaim and the
> subsequent compaction attempt will succeed. But it's unclear why they
> should and whether it will be as inexpensive as intended.
>
> It might therefore make more sense to just fail unconditionally after
> the initial compaction attempt. However, that would change the
> semantics of __GFP_NORETRY, which are to attempt reclaim at least once.
>
> Alternatively we can remove the compaction result checks and proceed
> with the single reclaim and (lower priority) compaction attempt, leaving
> only the __GFP_THISNODE exception for failing immediately.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Thanks!
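
For readers following along, the change in behavior described in the
changelog can be modeled in userspace as below. This is only a sketch
under stated assumptions: the MODEL_* flags, the enum and both helper
functions are hypothetical stand-ins, not kernel definitions, and the
model covers nothing but the costly-order __GFP_NORETRY branch that the
quoted hunk touches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real flags live in the kernel headers. */
#define MODEL_GFP_NORETRY  (1u << 0)
#define MODEL_GFP_THISNODE (1u << 1)

/* Simplified outcome of the initial direct compaction attempt. */
enum model_compact_result {
	MODEL_COMPACT_SKIPPED,        /* too few free base pages to compact */
	MODEL_COMPACT_DEFERRED,       /* failed recently, so not attempted */
	MODEL_COMPACT_RAN_AND_FAILED, /* catch-all for any other failure */
};

/*
 * Before the patch: returns true when the allocation gives up right
 * after the initial compaction (the "goto nopage" cases).
 */
static bool model_fail_early_old(unsigned int gfp, bool costly_order,
				 enum model_compact_result result)
{
	if (!(costly_order && (gfp & MODEL_GFP_NORETRY)))
		return false;	/* branch not taken; outside this model */

	if (result == MODEL_COMPACT_SKIPPED ||
	    result == MODEL_COMPACT_DEFERRED)
		return true;

	return (gfp & MODEL_GFP_THISNODE) != 0;
}

/* After the patch: the compaction result no longer matters. */
static bool model_fail_early_new(unsigned int gfp, bool costly_order)
{
	if (!(costly_order && (gfp & MODEL_GFP_NORETRY)))
		return false;	/* branch not taken; outside this model */

	return (gfp & MODEL_GFP_THISNODE) != 0;
}

/* Spot checks of the cases discussed in the changelog. */
static void model_self_check(void)
{
	unsigned int noretry = MODEL_GFP_NORETRY;
	unsigned int local = MODEL_GFP_NORETRY | MODEL_GFP_THISNODE;

	/* Skipped/deferred compaction used to fail; now reclaim is tried. */
	assert(model_fail_early_old(noretry, true, MODEL_COMPACT_SKIPPED));
	assert(model_fail_early_old(noretry, true, MODEL_COMPACT_DEFERRED));
	assert(!model_fail_early_new(noretry, true));

	/* Other compaction failures proceeded before and still do. */
	assert(!model_fail_early_old(noretry, true,
				     MODEL_COMPACT_RAN_AND_FAILED));

	/* __GFP_THISNODE keeps failing immediately in both versions. */
	assert(model_fail_early_old(local, true,
				    MODEL_COMPACT_RAN_AND_FAILED));
	assert(model_fail_early_new(local, true));
}
```

Calling model_self_check() from a small main() exercises the cases. In
this model the only inputs whose outcome changes are skipped or deferred
compaction without __GFP_THISNODE, which now go on to the single round
of reclaim and lower priority compaction.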
> ---
> mm/page_alloc.c | 34 ++++++----------------------------
> 1 file changed, 6 insertions(+), 28 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ac8a12076b00..b06b1cb01e0e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4805,44 +4805,22 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * includes some THP page fault allocations
> */
> if (costly_order && (gfp_mask & __GFP_NORETRY)) {
> - /*
> - * If allocating entire pageblock(s) and compaction
> - * failed because all zones are below low watermarks
> - * or is prohibited because it recently failed at this
> - * order, fail immediately unless the allocator has
> - * requested compaction and reclaim retry.
> - *
> - * Reclaim is
> - * - potentially very expensive because zones are far
> - * below their low watermarks or this is part of very
> - * bursty high order allocations,
> - * - not guaranteed to help because isolate_freepages()
> - * may not iterate over freed pages as part of its
> - * linear scan, and
> - * - unlikely to make entire pageblocks free on its
> - * own.
> - */
> - if (compact_result == COMPACT_SKIPPED ||
> - compact_result == COMPACT_DEFERRED)
> - goto nopage;
> -
> /*
> * THP page faults may attempt local node only first,
> * but are then allowed to only compact, not reclaim,
> * see alloc_pages_mpol().
> *
> - * Compaction can fail for other reasons than those
> - * checked above and we don't want such THP allocations
> - * to put reclaim pressure on a single node in a
> - * situation where other nodes might have plenty of
> - * available memory.
> + * Compaction has failed above and we don't want such
> + * THP allocations to put reclaim pressure on a single
> + * node in a situation where other nodes might have
> + * plenty of available memory.
> */
> if (gfp_mask & __GFP_THISNODE)
> goto nopage;
>
> /*
> - * Looks like reclaim/compaction is worth trying, but
> - * sync compaction could be very expensive, so keep
> + * Proceed with single round of reclaim/compaction, but
> + * since sync compaction could be very expensive, keep
> * using async compaction.
> */
> compact_priority = INIT_COMPACT_PRIORITY;
>
> --
> 2.52.0
--
Michal Hocko
SUSE Labs