From: Michal Hocko <mhocko@suse.com>
To: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Baoquan He <bhe@redhat.com>
Subject: Re: [RFC 6/7] mm/vmalloc: Support non-blocking GFP flags in __vmalloc_area_node()
Date: Mon, 7 Jul 2025 09:13:04 +0200
Message-ID: <aGtzgOXdhAAOTBhs@tiehlicka>
In-Reply-To: <20250704152537.55724-7-urezki@gmail.com>

On Fri 04-07-25 17:25:36, Uladzislau Rezki wrote:
> This patch makes __vmalloc_area_node() to correctly handle non-blocking
> allocation requests, such as GFP_ATOMIC and GFP_NOWAIT. Main changes:
>
> - nested_gfp flag follows the same non-blocking constraints
> as the primary gfp_mask, ensuring consistency and avoiding
> sleeping allocations in atomic contexts.
>
> - if blocking is not allowed, __GFP_NOFAIL is forcibly cleared
> and warning is issued if it was set, since __GFP_NOFAIL is
> incompatible with non-blocking contexts;
>
> - Add a __GFP_HIGHMEM to gfp_mask only for blocking requests
> if there are no DMA constraints.
>
> - in non-blocking mode we use memalloc_noreclaim_save/restore()
> to prevent reclaim related operations that may sleep while
> setting up page tables or mapping pages.
>
> This is particularly important for page table allocations that
> internally use GFP_PGTABLE_KERNEL, which may sleep unless such
> scope restrictions are applied. For example:
>
> <snip>
> #define GFP_PGTABLE_KERNEL (GFP_KERNEL | __GFP_ZERO)
>
> __pte_alloc_kernel()
> pte_alloc_one_kernel(&init_mm);
> pagetable_alloc_noprof(GFP_PGTABLE_KERNEL & ~__GFP_HIGHMEM, 0);
> <snip>

The changelog doesn't explain the actual implementation, and that is
really crucial here. You rely on memalloc_noreclaim_save() (i.e.
PF_MEMALLOC) to never trigger memory reclaim, but you do not explain
how you avoid the biggest caveat of this interface. Let me
quote the documentation:

* Users of this scope have to be extremely careful to not deplete the reserves
* completely and implement a throttling mechanism which controls the
* consumption of the reserve based on the amount of freed memory. Usage of a
* pre-allocated pool (e.g. mempool) should be always considered before using
* this scope.

Unless I am missing something, _any_ vmalloc(GFP_NOWAIT|GFP_ATOMIC) user
would get practically unbounded access to the whole available memory. This
is not really acceptable.
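
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. It assumes a single zone with a flat "min" reserve and models
a PF_MEMALLOC-style scope as a boolean that skips the watermark check. The
names toy_zone, toy_alloc_page and toy_alloc_many are hypothetical and do
not exist in the kernel; the real page allocator is considerably more
involved (for instance, GFP_ATOMIC alone already gets partial access to
the reserves, which this model ignores).

```c
#include <stdbool.h>

/* Toy model only: illustrative names, not kernel symbols. */
struct toy_zone {
	unsigned long free_pages;	/* pages currently free */
	unsigned long min_watermark;	/* reserve that normal requests must leave alone */
};

/*
 * Try to take one page. A normal non-blocking request fails once it would
 * cut into the reserve. A request issued inside a PF_MEMALLOC-like scope
 * (no_watermarks == true) skips the watermark check entirely.
 */
static bool toy_alloc_page(struct toy_zone *z, bool no_watermarks)
{
	if (z->free_pages == 0)
		return false;
	if (!no_watermarks && z->free_pages <= z->min_watermark)
		return false;
	z->free_pages--;
	return true;
}

/* Request up to nr pages and return how many were actually handed out. */
static unsigned long toy_alloc_many(struct toy_zone *z, unsigned long nr,
				    bool no_watermarks)
{
	unsigned long got = 0;

	while (got < nr && toy_alloc_page(z, no_watermarks))
		got++;
	return got;
}
```

In this model, a caller that asks for more pages than are free stops at the
reserve when no_watermarks is false, but drains the zone to zero when it is
true. Nothing in the second case throttles the caller, which is the
situation the quoted documentation warns about.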
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
> mm/vmalloc.c | 30 +++++++++++++++++++++++++-----
> 1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 2eaff0575a9e..fe1699e01e02 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3711,7 +3711,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> pgprot_t prot, unsigned int page_shift,
> int node)
> {
> - const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
> + gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
> bool nofail = gfp_mask & __GFP_NOFAIL;
> unsigned long addr = (unsigned long)area->addr;
> unsigned long size = get_vm_area_size(area);
> @@ -3719,12 +3719,28 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> unsigned int nr_small_pages = size >> PAGE_SHIFT;
> unsigned int page_order;
> unsigned int flags;
> + bool noblock;
> int ret;
>
> array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
> + noblock = !gfpflags_allow_blocking(gfp_mask);
>
> - if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
> - gfp_mask |= __GFP_HIGHMEM;
> + if (noblock) {
> + /* __GFP_NOFAIL is incompatible with non-blocking contexts. */
> + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> + gfp_mask &= ~__GFP_NOFAIL;
> +
> + /*
> + * In non-sleeping contexts, ensure nested allocations follow
> + * same non-blocking rules.
> + */
> + nested_gfp = gfp_mask | __GFP_ZERO;
> + nofail = false;
> + } else {
> + /* Allow highmem allocations if there are no DMA constraints. */
> + if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
> + gfp_mask |= __GFP_HIGHMEM;
> + }
>
> /* Please note that the recursion is strictly bounded. */
> if (array_size > PAGE_SIZE) {
> @@ -3788,7 +3804,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> * page tables allocations ignore external gfp mask, enforce it
> * by the scope API
> */
> - if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> + if (noblock)
> + flags = memalloc_noreclaim_save();
> + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> flags = memalloc_nofs_save();
> else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
> flags = memalloc_noio_save();
> @@ -3800,7 +3818,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> schedule_timeout_uninterruptible(1);
> } while (nofail && (ret < 0));
>
> - if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> + if (noblock)
> + memalloc_noreclaim_restore(flags);
> + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> memalloc_nofs_restore(flags);
> else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
> memalloc_noio_restore(flags);
> --
> 2.39.5
>
--
Michal Hocko
SUSE Labs