linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>, Baoquan He <bhe@redhat.com>
Subject: Re: [RFC 6/7] mm/vmalloc: Support non-blocking GFP flags in __vmalloc_area_node()
Date: Mon, 7 Jul 2025 09:13:04 +0200
Message-ID: <aGtzgOXdhAAOTBhs@tiehlicka>
In-Reply-To: <20250704152537.55724-7-urezki@gmail.com>

On Fri 04-07-25 17:25:36, Uladzislau Rezki wrote:
> This patch makes __vmalloc_area_node() to correctly handle non-blocking
> allocation requests, such as GFP_ATOMIC and GFP_NOWAIT. Main changes:
> 
> - nested_gfp flag follows the same non-blocking constraints
>   as the primary gfp_mask, ensuring consistency and avoiding
>   sleeping allocations in atomic contexts.
> 
> - if blocking is not allowed, __GFP_NOFAIL is forcibly cleared
>   and warning is issued if it was set, since __GFP_NOFAIL is
>   incompatible with non-blocking contexts;
> 
> - Add a __GFP_HIGHMEM to gfp_mask only for blocking requests
>   if there are no DMA constraints.
> 
> - in non-blocking mode we use memalloc_noreclaim_save/restore()
>   to prevent reclaim related operations that may sleep while
>   setting up page tables or mapping pages.
> 
> This is particularly important for page table allocations that
> internally use GFP_PGTABLE_KERNEL, which may sleep unless such
> scope restrictions are applied. For example:
> 
> <snip>
>     #define GFP_PGTABLE_KERNEL (GFP_KERNEL | __GFP_ZERO)
> 
>     __pte_alloc_kernel()
>         pte_alloc_one_kernel(&init_mm);
>             pagetable_alloc_noprof(GFP_PGTABLE_KERNEL & ~__GFP_HIGHMEM, 0);
> <snip>

The changelog doesn't explain the actual implementation, and that is
really crucial here. You rely on memalloc_noreclaim_save (i.e.
PF_MEMALLOC) to never trigger memory reclaim, but you do not explain
how you avoid the biggest caveat of this interface. Let me quote the
documentation:
 * Users of this scope have to be extremely careful to not deplete the reserves
 * completely and implement a throttling mechanism which controls the
 * consumption of the reserve based on the amount of freed memory. Usage of a
 * pre-allocated pool (e.g. mempool) should be always considered before using
 * this scope.

Unless I am missing something, _any_ vmalloc(GFP_NOWAIT|GFP_ATOMIC) user
would get practically unbounded access to all available memory,
including the reserves. This is not really acceptable.
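
To make the concern concrete, here is a minimal userspace sketch. It is a
toy model and not kernel code: struct toy_zone, toy_alloc_page() and
toy_alloc_many() are hypothetical names, and the single min_reserve
threshold is a simplifying assumption standing in for the allocator's
watermarks. It only shows that a caller inside a noreclaim-style scope can
drain the reserve that ordinary callers are kept away from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: names and numbers are illustrative, not kernel API. */
struct toy_zone {
	unsigned long free_pages;   /* pages currently free */
	unsigned long min_reserve;  /* pages held back for emergencies */
};

/*
 * Allocate one page. A normal request must leave min_reserve untouched.
 * A request made inside a "noreclaim" scope ignores the reserve, which is
 * the assumption this model makes about PF_MEMALLOC-like behaviour.
 */
static bool toy_alloc_page(struct toy_zone *z, bool noreclaim_scope)
{
	unsigned long floor = noreclaim_scope ? 0 : z->min_reserve;

	if (z->free_pages <= floor)
		return false;
	z->free_pages--;
	return true;
}

/* Try to allocate nr pages and return how many allocations succeeded. */
static unsigned long toy_alloc_many(struct toy_zone *z, unsigned long nr,
				    bool noreclaim_scope)
{
	unsigned long got = 0;

	assert(z != NULL);
	while (got < nr && toy_alloc_page(z, noreclaim_scope))
		got++;
	return got;
}
```

In this model nothing bounds how much a scoped caller takes, which is why
the quoted documentation asks for throttling or a pre-allocated pool.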

> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 2eaff0575a9e..fe1699e01e02 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3711,7 +3711,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  				 pgprot_t prot, unsigned int page_shift,
>  				 int node)
>  {
> -	const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
> +	gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
>  	bool nofail = gfp_mask & __GFP_NOFAIL;
>  	unsigned long addr = (unsigned long)area->addr;
>  	unsigned long size = get_vm_area_size(area);
> @@ -3719,12 +3719,28 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  	unsigned int nr_small_pages = size >> PAGE_SHIFT;
>  	unsigned int page_order;
>  	unsigned int flags;
> +	bool noblock;
>  	int ret;
>  
>  	array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
> +	noblock = !gfpflags_allow_blocking(gfp_mask);
>  
> -	if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
> -		gfp_mask |= __GFP_HIGHMEM;
> +	if (noblock) {
> +		/* __GFP_NOFAIL is incompatible with non-blocking contexts. */
> +		WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> +		gfp_mask &= ~__GFP_NOFAIL;
> +
> +		/*
> +		 * In non-sleeping contexts, ensure nested allocations follow
> +		 * same non-blocking rules.
> +		 */
> +		nested_gfp = gfp_mask | __GFP_ZERO;
> +		nofail = false;
> +	} else {
> +		/* Allow highmem allocations if there are no DMA constraints. */
> +		if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
> +			gfp_mask |= __GFP_HIGHMEM;
> +	}
>  
>  	/* Please note that the recursion is strictly bounded. */
>  	if (array_size > PAGE_SIZE) {
> @@ -3788,7 +3804,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  	 * page tables allocations ignore external gfp mask, enforce it
>  	 * by the scope API
>  	 */
> -	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> +	if (noblock)
> +		flags = memalloc_noreclaim_save();
> +	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
>  		flags = memalloc_nofs_save();
>  	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
>  		flags = memalloc_noio_save();
> @@ -3800,7 +3818,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  			schedule_timeout_uninterruptible(1);
>  	} while (nofail && (ret < 0));
>  
> -	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> +	if (noblock)
> +		memalloc_noreclaim_restore(flags);
> +	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
>  		memalloc_nofs_restore(flags);
>  	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
>  		memalloc_noio_restore(flags);
> -- 
> 2.39.5
> 
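
For readers following the flag handling in the hunk above, the branch can
be modelled in userspace roughly as follows. This is a sketch under stated
assumptions: the TOY_* bit values are made up, TOY_GFP_RECLAIM_MASK is a
reduced stand-in for the kernel's GFP_RECLAIM_MASK, and toy_adjust() is a
hypothetical helper that records the warning in a field instead of calling
WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration; the kernel's values differ. */
#define TOY_GFP_DIRECT_RECLAIM 0x01u
#define TOY_GFP_NOFAIL         0x02u
#define TOY_GFP_ZERO           0x04u
#define TOY_GFP_HIGHMEM        0x08u
#define TOY_GFP_DMA            0x10u
#define TOY_GFP_DMA32          0x20u
/* Reduced stand-in: the real GFP_RECLAIM_MASK covers more flags. */
#define TOY_GFP_RECLAIM_MASK   (TOY_GFP_DIRECT_RECLAIM | TOY_GFP_NOFAIL)

struct toy_masks {
	unsigned int gfp;     /* adjusted primary mask */
	unsigned int nested;  /* mask for nested allocations */
	bool noblock;         /* caller may not sleep */
	bool nofail;          /* retry loop enabled */
	bool warned;          /* NOFAIL was requested without blocking */
};

static struct toy_masks toy_adjust(unsigned int gfp_mask)
{
	struct toy_masks m = {
		.nested = (gfp_mask & TOY_GFP_RECLAIM_MASK) | TOY_GFP_ZERO,
		.nofail = (gfp_mask & TOY_GFP_NOFAIL) != 0,
		.noblock = !(gfp_mask & TOY_GFP_DIRECT_RECLAIM),
		.warned = false,
	};

	if (m.noblock) {
		/* NOFAIL cannot be honoured without sleeping: warn and drop. */
		m.warned = (gfp_mask & TOY_GFP_NOFAIL) != 0;
		gfp_mask &= ~TOY_GFP_NOFAIL;
		/* Nested allocations follow the same non-blocking mask. */
		m.nested = gfp_mask | TOY_GFP_ZERO;
		m.nofail = false;
	} else if (!(gfp_mask & (TOY_GFP_DMA | TOY_GFP_DMA32))) {
		/* Highmem is added only for blocking, non-DMA requests. */
		gfp_mask |= TOY_GFP_HIGHMEM;
	}

	m.gfp = gfp_mask;
	/* Post-condition: a non-blocking mask never carries NOFAIL. */
	assert(!m.noblock || !(m.gfp & TOY_GFP_NOFAIL));
	return m;
}
```

The model covers only the mask computation. It says nothing about what the
page table allocations do afterwards, which is where the scope API comes in.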

-- 
Michal Hocko
SUSE Labs



Thread overview: 25+ messages
2025-07-04 15:25 [RFC 0/7] vmallloc and non-blocking GFPs Uladzislau Rezki (Sony)
2025-07-04 15:25 ` [RFC 1/7] lib/test_vmalloc: Add non-block-alloc-test case Uladzislau Rezki (Sony)
2025-07-08  5:59   ` [External] " Adrian Huang12
2025-07-08  8:29     ` Uladzislau Rezki
2025-07-04 15:25 ` [RFC 2/7] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area() Uladzislau Rezki (Sony)
2025-07-07  7:11   ` Michal Hocko
2025-07-08 12:34     ` Uladzislau Rezki
2025-07-08 15:17       ` Michal Hocko
2025-07-08 16:45         ` Uladzislau Rezki
2025-07-04 15:25 ` [RFC 3/7] mm/vmalloc: Avoid cond_resched() when blocking is not permitted Uladzislau Rezki (Sony)
2025-07-07  7:11   ` Michal Hocko
2025-07-08 12:29     ` Uladzislau Rezki
2025-07-04 15:25 ` [RFC 4/7] mm/kasan, mm/vmalloc: Respect GFP flags in kasan_populate_vmalloc() Uladzislau Rezki (Sony)
2025-07-07  1:47   ` Baoquan He
2025-07-08  1:15     ` Baoquan He
2025-07-08  8:30       ` Uladzislau Rezki
2025-07-04 15:25 ` [RFC 5/7] mm/vmalloc: Defer freeing partly initialized vm_struct Uladzislau Rezki (Sony)
2025-07-04 15:25 ` [RFC 6/7] mm/vmalloc: Support non-blocking GFP flags in __vmalloc_area_node() Uladzislau Rezki (Sony)
2025-07-07  7:13   ` Michal Hocko [this message]
2025-07-08 12:27     ` Uladzislau Rezki
2025-07-08 15:22       ` Michal Hocko
2025-07-09 11:20         ` Uladzislau Rezki
2025-07-08 15:47   ` Michal Hocko
2025-07-09 13:45     ` Uladzislau Rezki
2025-07-04 15:25 ` [RFC 7/7] mm: Drop __GFP_DIRECT_RECLAIM flag if PF_MEMALLOC is set Uladzislau Rezki (Sony)
