From: Michal Hocko <mhocko@suse.com>
To: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Baoquan He <bhe@redhat.com>
Subject: Re: [RFC 6/7] mm/vmalloc: Support non-blocking GFP flags in __vmalloc_area_node()
Date: Mon, 7 Jul 2025 09:13:04 +0200
Message-ID: <aGtzgOXdhAAOTBhs@tiehlicka>
In-Reply-To: <20250704152537.55724-7-urezki@gmail.com>

On Fri 04-07-25 17:25:36, Uladzislau Rezki wrote:
> This patch makes __vmalloc_area_node() to correctly handle non-blocking
> allocation requests, such as GFP_ATOMIC and GFP_NOWAIT. Main changes:
>
> - nested_gfp flag follows the same non-blocking constraints
> as the primary gfp_mask, ensuring consistency and avoiding
> sleeping allocations in atomic contexts.
>
> - if blocking is not allowed, __GFP_NOFAIL is forcibly cleared
> and warning is issued if it was set, since __GFP_NOFAIL is
> incompatible with non-blocking contexts;
>
> - Add a __GFP_HIGHMEM to gfp_mask only for blocking requests
> if there are no DMA constraints.
>
> - in non-blocking mode we use memalloc_noreclaim_save/restore()
> to prevent reclaim related operations that may sleep while
> setting up page tables or mapping pages.
>
> This is particularly important for page table allocations that
> internally use GFP_PGTABLE_KERNEL, which may sleep unless such
> scope restrictions are applied. For example:
>
> <snip>
> #define GFP_PGTABLE_KERNEL (GFP_KERNEL | __GFP_ZERO)
>
> __pte_alloc_kernel()
> pte_alloc_one_kernel(&init_mm);
> pagetable_alloc_noprof(GFP_PGTABLE_KERNEL & ~__GFP_HIGHMEM, 0);
> <snip>

The changelog doesn't explain the actual implementation, and that is
really crucial here. You rely on memalloc_noreclaim_save() (i.e.
PF_MEMALLOC) to never trigger memory reclaim, but you do not explain
how you avoid the biggest caveat of this interface. Let me quote the
documentation:

* Users of this scope have to be extremely careful to not deplete the reserves
* completely and implement a throttling mechanism which controls the
* consumption of the reserve based on the amount of freed memory. Usage of a
* pre-allocated pool (e.g. mempool) should be always considered before using
* this scope.

Unless I am missing something, _any_ vmalloc(GFP_NOWAIT|GFP_ATOMIC) user
would get practically unbounded access to the whole available memory. This
is not really acceptable.
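
To make the concern concrete, here is a small userspace toy model. It is
not kernel code: toy_zone, toy_alloc() and toy_drain() are hypothetical
names invented for this illustration, and the single min_wmark field is a
deliberate simplification of the real watermark logic. It only assumes that
a task inside a PF_MEMALLOC scope skips the watermark check, and shows how
far an unthrottled caller can then drain free memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_zone {
	size_t free_pages;	/* pages currently free */
	size_t min_wmark;	/* reserve ordinary requests must leave alone */
};

/*
 * Try to take nr pages from the zone. When no_watermarks is true
 * (modelling a caller inside a PF_MEMALLOC scope), the reserve check
 * is skipped and only the raw free page count limits the request.
 */
static bool toy_alloc(struct toy_zone *z, size_t nr, bool no_watermarks)
{
	assert(z != NULL);

	if (nr > z->free_pages)
		return false;
	if (!no_watermarks && z->free_pages - nr < z->min_wmark)
		return false;

	z->free_pages -= nr;
	return true;
}

/* Allocate single pages until the first failure; return the count. */
static size_t toy_drain(struct toy_zone *z, bool no_watermarks)
{
	size_t got = 0;

	while (toy_alloc(z, 1, no_watermarks))
		got++;
	return got;
}
```

With free_pages = 100 and min_wmark = 20, toy_drain() stops after 80 pages
for an ordinary caller and leaves the reserve intact, while the same loop
with no_watermarks set takes all 100 pages. Nothing in the model throttles
the second caller, which is the situation the quoted documentation warns
about.
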
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
> mm/vmalloc.c | 30 +++++++++++++++++++++++++-----
> 1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 2eaff0575a9e..fe1699e01e02 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3711,7 +3711,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> pgprot_t prot, unsigned int page_shift,
> int node)
> {
> - const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
> + gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
> bool nofail = gfp_mask & __GFP_NOFAIL;
> unsigned long addr = (unsigned long)area->addr;
> unsigned long size = get_vm_area_size(area);
> @@ -3719,12 +3719,28 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> unsigned int nr_small_pages = size >> PAGE_SHIFT;
> unsigned int page_order;
> unsigned int flags;
> + bool noblock;
> int ret;
>
> array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
> + noblock = !gfpflags_allow_blocking(gfp_mask);
>
> - if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
> - gfp_mask |= __GFP_HIGHMEM;
> + if (noblock) {
> + /* __GFP_NOFAIL is incompatible with non-blocking contexts. */
> + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> + gfp_mask &= ~__GFP_NOFAIL;
> +
> + /*
> + * In non-sleeping contexts, ensure nested allocations follow
> + * same non-blocking rules.
> + */
> + nested_gfp = gfp_mask | __GFP_ZERO;
> + nofail = false;
> + } else {
> + /* Allow highmem allocations if there are no DMA constraints. */
> + if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
> + gfp_mask |= __GFP_HIGHMEM;
> + }
>
> /* Please note that the recursion is strictly bounded. */
> if (array_size > PAGE_SIZE) {
> @@ -3788,7 +3804,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> * page tables allocations ignore external gfp mask, enforce it
> * by the scope API
> */
> - if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> + if (noblock)
> + flags = memalloc_noreclaim_save();
> + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> flags = memalloc_nofs_save();
> else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
> flags = memalloc_noio_save();
> @@ -3800,7 +3818,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> schedule_timeout_uninterruptible(1);
> } while (nofail && (ret < 0));
>
> - if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> + if (noblock)
> + memalloc_noreclaim_restore(flags);
> + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> memalloc_nofs_restore(flags);
> else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
> memalloc_noio_restore(flags);
> --
> 2.39.5
>
--
Michal Hocko
SUSE Labs