linux-mm.kvack.org archive mirror
* [PATCH v2 00/10] __vmalloc() and no-block support(v2)
@ 2025-09-15 13:40 Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 01/10] lib/test_vmalloc: add no_block_alloc_test case Uladzislau Rezki (Sony)
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton; +Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki

This is v2; I do not count the RFC version. It is based on Linux 6.17-rc6.

https://lore.kernel.org/all/20250704152537.55724-1-urezki@gmail.com/
https://lkml.org/lkml/2025/8/7/332

This series makes __vmalloc() support non-blocking flags such as GFP_ATOMIC
and GFP_NOWAIT; vmalloc_huge() is not supported yet.

v1 -> v2:
 - plumb gfp to KMSAN for its internal allocations;
 - update documentation of __vmalloc_node_range();
 - instead of dropping the direct reclaim flag, check PF_MEMALLOC in
   might_alloc();
 - dropped the "mm/vmalloc: Remove cond_resched() in vm_area_alloc_pages()"
   patch and kept the cond_resched() points; more investigation is needed
   before dropping them.

Uladzislau Rezki (Sony) (10):
  lib/test_vmalloc: add no_block_alloc_test case
  lib/test_vmalloc: Remove xfail condition check
  mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area()
  mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  mm/vmalloc: Defer freeing partly initialized vm_struct
  mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node()
  mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc()
  kmsan: Remove hard-coded GFP_KERNEL flags
  mm: Skip might_alloc() warnings when PF_MEMALLOC is set
  mm/vmalloc: Update __vmalloc_node_range() documentation

 include/linux/kmsan.h    |   6 +-
 include/linux/sched/mm.h |   3 +-
 include/linux/vmalloc.h  |   8 +-
 lib/test_vmalloc.c       |  28 ++++++-
 mm/internal.h            |   4 +-
 mm/kasan/shadow.c        |  12 +--
 mm/kmsan/shadow.c        |   6 +-
 mm/percpu-vm.c           |   2 +-
 mm/vmalloc.c             | 158 ++++++++++++++++++++++++++++++---------
 9 files changed, 169 insertions(+), 58 deletions(-)

-- 
2.47.3



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 01/10] lib/test_vmalloc: add no_block_alloc_test case
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 02/10] lib/test_vmalloc: Remove xfail condition check Uladzislau Rezki (Sony)
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton; +Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki

Introduce a new test case "no_block_alloc_test" that verifies
non-blocking allocations using __vmalloc() with GFP_ATOMIC and
GFP_NOWAIT flags.

It is recommended to build the kernel with CONFIG_DEBUG_ATOMIC_SLEEP
enabled to help catch "sleeping while atomic" issues. This test
ensures that memory allocation logic under atomic constraints
does not inadvertently sleep.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 lib/test_vmalloc.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index 2815658ccc37..aae5f4910aff 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -54,6 +54,7 @@ __param(int, run_test_mask, 7,
 		"\t\tid: 256,  name: kvfree_rcu_1_arg_vmalloc_test\n"
 		"\t\tid: 512,  name: kvfree_rcu_2_arg_vmalloc_test\n"
 		"\t\tid: 1024, name: vm_map_ram_test\n"
+		"\t\tid: 2048, name: no_block_alloc_test\n"
 		/* Add a new test case description here. */
 );
 
@@ -283,6 +284,30 @@ static int fix_size_alloc_test(void)
 	return 0;
 }
 
+static int no_block_alloc_test(void)
+{
+	void *ptr;
+	int i;
+
+	for (i = 0; i < test_loop_count; i++) {
+		bool use_atomic = !!(get_random_u8() % 2);
+		gfp_t gfp = use_atomic ? GFP_ATOMIC : GFP_NOWAIT;
+		unsigned long size = (nr_pages > 0 ? nr_pages : 1) * PAGE_SIZE;
+
+		preempt_disable();
+		ptr = __vmalloc(size, gfp);
+		preempt_enable();
+
+		if (!ptr)
+			return -1;
+
+		*((__u8 *)ptr) = 0;
+		vfree(ptr);
+	}
+
+	return 0;
+}
+
 static int
 pcpu_alloc_test(void)
 {
@@ -411,6 +436,7 @@ static struct test_case_desc test_case_array[] = {
 	{ "kvfree_rcu_1_arg_vmalloc_test", kvfree_rcu_1_arg_vmalloc_test, },
 	{ "kvfree_rcu_2_arg_vmalloc_test", kvfree_rcu_2_arg_vmalloc_test, },
 	{ "vm_map_ram_test", vm_map_ram_test, },
+	{ "no_block_alloc_test", no_block_alloc_test, true },
 	/* Add a new test case here. */
 };
 
-- 
2.47.3




* [PATCH v2 02/10] lib/test_vmalloc: Remove xfail condition check
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 01/10] lib/test_vmalloc: add no_block_alloc_test case Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 03/10] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area() Uladzislau Rezki (Sony)
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton; +Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki

A test marked with "xfail = true" is expected to fail, but that does
not mean it is guaranteed to fail. Remove the "xfail" condition check
so that a test which passes is counted as passed even when it is
marked "xfail".
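The accounting after this change can be sketched as a small userspace C model. The counter names mirror the test driver; the "test_failed" counter and the helper itself are illustrative, not the kernel's actual code:

```c
#include <assert.h>
#include <stdbool.h>

struct test_counters {
	int test_passed;
	int test_xfailed;
	int test_failed;
};

/*
 * Model of the accounting in test_func() after this patch: a test
 * that returns 0 is counted as passed regardless of its "xfail"
 * marking; a failing test marked xfail is counted as expectedly
 * failed; anything else is a plain failure.
 */
static void account(struct test_counters *t, int ret, bool xfail)
{
	if (!ret)
		t->test_passed++;
	else if (xfail)
		t->test_xfailed++;
	else
		t->test_failed++;
}
```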

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 lib/test_vmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index aae5f4910aff..6521c05c7816 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -500,7 +500,7 @@ static int test_func(void *private)
 		for (j = 0; j < test_repeat_count; j++) {
 			ret = test_case_array[index].test_func();
 
-			if (!ret && !test_case_array[index].xfail)
+			if (!ret)
 				t->data[index].test_passed++;
 			else if (ret && test_case_array[index].xfail)
 				t->data[index].test_xfailed++;
-- 
2.47.3




* [PATCH v2 03/10] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area()
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 01/10] lib/test_vmalloc: add no_block_alloc_test case Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 02/10] lib/test_vmalloc: Remove xfail condition check Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-18  2:56   ` Baoquan He
  2025-09-15 13:40 ` [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted Uladzislau Rezki (Sony)
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton
  Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki, Michal Hocko

alloc_vmap_area() currently assumes that sleeping is allowed during
allocation. This is not true for callers which pass non-blocking
GFP flags, such as GFP_ATOMIC or GFP_NOWAIT.

This patch adds logic to detect whether the given gfp_mask permits
blocking. It avoids invoking might_sleep() and does not fall back to
the reclaim path when blocking is not allowed.

This makes alloc_vmap_area() safer for use in non-sleeping contexts,
where previously it could sleep unexpectedly and trigger warnings.

This is a preparation and adjustment step for allowing both GFP_ATOMIC
and GFP_NOWAIT allocations later in this series.
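The decision the patch makes can be modeled in plain userspace C. The bit values below are illustrative stand-ins, not the kernel's real GFP encodings; only the shape of the predicate (non-blocking means the direct-reclaim bit is absent) matches the kernel's gfpflags_allow_blocking():

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Illustrative bit values; the kernel's real encodings differ. */
#define __GFP_DIRECT_RECLAIM	0x400u
#define __GFP_KSWAPD_RECLAIM	0x800u
#define __GFP_HIGH		0x020u

#define GFP_KERNEL	(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_ATOMIC	(__GFP_KSWAPD_RECLAIM | __GFP_HIGH)
#define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)

/*
 * Mirrors the kernel predicate: blocking is allowed only when the
 * caller permits direct reclaim.
 */
static bool gfpflags_allow_blocking(gfp_t gfp_mask)
{
	return gfp_mask & __GFP_DIRECT_RECLAIM;
}
```

With this predicate, GFP_KERNEL callers keep the reclaim fallback while GFP_ATOMIC/GFP_NOWAIT callers fail the allocation instead of sleeping.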

Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5edd536ba9d2..49a0f81930a8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2017,6 +2017,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	unsigned long freed;
 	unsigned long addr;
 	unsigned int vn_id;
+	bool allow_block;
 	int purged = 0;
 	int ret;
 
@@ -2028,7 +2029,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
 	/* Only reclaim behaviour flags are relevant. */
 	gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
-	might_sleep();
+	allow_block = gfpflags_allow_blocking(gfp_mask);
+	might_sleep_if(allow_block);
 
 	/*
 	 * If a VA is obtained from a global heap(if it fails here)
@@ -2065,8 +2067,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * If an allocation fails, the error value is
 	 * returned. Therefore trigger the overflow path.
 	 */
-	if (IS_ERR_VALUE(addr))
-		goto overflow;
+	if (IS_ERR_VALUE(addr)) {
+		if (allow_block)
+			goto overflow;
+
+		/*
+		 * We can not trigger any reclaim logic because
+		 * sleeping is not allowed, thus fail an allocation.
+		 */
+		goto out_free_va;
+	}
 
 	va->va_start = addr;
 	va->va_end = addr + size;
@@ -2116,6 +2126,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 		pr_warn("vmalloc_node_range for size %lu failed: Address range restricted to %#lx - %#lx\n",
 				size, vstart, vend);
 
+out_free_va:
 	kmem_cache_free(vmap_area_cachep, va);
 	return ERR_PTR(-EBUSY);
 }
-- 
2.47.3




* [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (2 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 03/10] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area() Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-15 17:11   ` Michal Hocko
  2025-09-18  2:57   ` Baoquan He
  2025-09-15 13:40 ` [PATCH v2 05/10] mm/vmalloc: Defer freeing partly initialized vm_struct Uladzislau Rezki (Sony)
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton; +Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki

vm_area_alloc_pages() contains the only voluntary reschedule points
along vmalloc() allocation path. They are needed to ensure forward
progress on PREEMPT_NONE kernels under contention for vmap metadata
(e.g. alloc_vmap_area()).

However, yielding should only be done if the given GFP flags allow
blocking. This patch avoids calling cond_resched() when the allocation
context is non-blocking (GFP_ATOMIC, GFP_NOWAIT).

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 49a0f81930a8..b77e8be75f10 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3633,7 +3633,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 							pages + nr_allocated);
 
 			nr_allocated += nr;
-			cond_resched();
+
+			if (gfpflags_allow_blocking(gfp))
+				cond_resched();
 
 			/*
 			 * If zero or pages were obtained partly,
@@ -3675,7 +3677,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		for (i = 0; i < (1U << order); i++)
 			pages[nr_allocated + i] = page + i;
 
-		cond_resched();
+		if (gfpflags_allow_blocking(gfp))
+			cond_resched();
+
 		nr_allocated += 1U << order;
 	}
 
-- 
2.47.3




* [PATCH v2 05/10] mm/vmalloc: Defer freeing partly initialized vm_struct
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (3 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-18  2:59   ` Baoquan He
  2025-09-15 13:40 ` [PATCH v2 06/10] mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node() Uladzislau Rezki (Sony)
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton
  Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki, Michal Hocko

__vmalloc_area_node() may call free_vmap_area() or vfree() on
error paths, both of which can sleep. This becomes problematic
if the function is invoked from an atomic context, such as when
GFP_ATOMIC or GFP_NOWAIT is passed via gfp_mask.

To fix this, unify error paths and defer the cleanup of partly
initialized vm_struct objects to a workqueue. This ensures that
freeing happens in a process context and avoids invalid sleeps
in atomic regions.
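The deferral pattern (a lock-free push onto a pending list, kicking the worker only on the empty-to-non-empty transition) can be sketched as a single-threaded userspace model. In the kernel, llist_add() returns true exactly on that transition; the structures and the work counter here are simplified stand-ins:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct llnode {
	struct llnode *next;
};

struct llist_head {
	struct llnode *first;
};

static int work_scheduled; /* counts schedule_work() calls in this model */

/*
 * Push a node; return true if the list was empty beforehand,
 * i.e. the caller should kick the worker.
 */
static bool llist_add(struct llnode *n, struct llist_head *h)
{
	bool was_empty = (h->first == NULL);

	n->next = h->first;
	h->first = n;
	return was_empty;
}

static void defer_cleanup(struct llnode *n, struct llist_head *pending)
{
	if (llist_add(n, pending))
		work_scheduled++; /* stands in for schedule_work() */
}
```

The worker later drains the whole list at once (llist_del_all() in the kernel), so multiple deferred areas cost a single work item.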

Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/vmalloc.h |  6 +++++-
 mm/vmalloc.c            | 34 +++++++++++++++++++++++++++++++---
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 2759dac6be44..97252078a3dc 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -50,7 +50,11 @@ struct iov_iter;		/* in uio.h */
 #endif
 
 struct vm_struct {
-	struct vm_struct	*next;
+	union {
+		struct vm_struct *next;	  /* Early registration of vm_areas. */
+		struct llist_node llnode; /* Asynchronous freeing on error paths. */
+	};
+
 	void			*addr;
 	unsigned long		size;
 	unsigned long		flags;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b77e8be75f10..e61e62872372 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3686,6 +3686,35 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	return nr_allocated;
 }
 
+static LLIST_HEAD(pending_vm_area_cleanup);
+static void cleanup_vm_area_work(struct work_struct *work)
+{
+	struct vm_struct *area, *tmp;
+	struct llist_node *head;
+
+	head = llist_del_all(&pending_vm_area_cleanup);
+	if (!head)
+		return;
+
+	llist_for_each_entry_safe(area, tmp, head, llnode) {
+		if (!area->pages)
+			free_vm_area(area);
+		else
+			vfree(area->addr);
+	}
+}
+
+/*
+ * Helper for __vmalloc_area_node() to defer cleanup
+ * of partially initialized vm_struct in error paths.
+ */
+static DECLARE_WORK(cleanup_vm_area, cleanup_vm_area_work);
+static void defer_vm_area_cleanup(struct vm_struct *area)
+{
+	if (llist_add(&area->llnode, &pending_vm_area_cleanup))
+		schedule_work(&cleanup_vm_area);
+}
+
 static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 				 pgprot_t prot, unsigned int page_shift,
 				 int node)
@@ -3717,8 +3746,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		warn_alloc(gfp_mask, NULL,
 			"vmalloc error: size %lu, failed to allocated page array size %lu",
 			nr_small_pages * PAGE_SIZE, array_size);
-		free_vm_area(area);
-		return NULL;
+		goto fail;
 	}
 
 	set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
@@ -3795,7 +3823,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return area->addr;
 
 fail:
-	vfree(area->addr);
+	defer_vm_area_cleanup(area);
 	return NULL;
 }
 
-- 
2.47.3




* [PATCH v2 06/10] mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node()
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (4 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 05/10] mm/vmalloc: Defer freeing partly initialized vm_struct Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-18  3:01   ` Baoquan He
  2025-09-15 13:40 ` [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc() Uladzislau Rezki (Sony)
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton
  Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki, Michal Hocko

Make __vmalloc_area_node() respect non-blocking GFP masks such
as GFP_ATOMIC and GFP_NOWAIT.

- Add memalloc_apply_gfp_scope()/memalloc_restore_scope()
  helpers to apply a proper scope.
- Apply memalloc_apply_gfp_scope()/memalloc_restore_scope()
  around vmap_pages_range() for page table setup.
- Set "nofail" to false if a non-blocking mask is used, as
  they are mutually exclusive.

This is particularly important for page table allocations that
internally use GFP_PGTABLE_KERNEL, which may sleep unless such
scope restrictions are applied. For example:

<snip>
__pte_alloc_kernel()
  pte_alloc_one_kernel(&init_mm);
    pagetable_alloc_noprof(GFP_PGTABLE_KERNEL & ~__GFP_HIGHMEM, 0);
<snip>

Note: in most cases, PTE entries are established only up to the
level required by current vmap space usage, meaning the page tables
are typically fully populated during the mapping process.
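The scope mapping performed by memalloc_apply_gfp_scope() can be modeled in userspace C. The flag bits, the task-flags word, and the cookie encoding below are all illustrative inventions; what this sketch preserves is the precedence (non-blocking wins over NOFS, which wins over NOIO) and the contract that a zero cookie means "no scope applied, nothing to restore":

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Illustrative bits; kernel values differ. */
#define __GFP_DIRECT_RECLAIM	0x1u
#define __GFP_IO		0x2u
#define __GFP_FS		0x4u

#define PF_MEMALLOC		0x10u
#define PF_MEMALLOC_NOFS	0x20u
#define PF_MEMALLOC_NOIO	0x40u

static unsigned int current_flags; /* stands in for current->flags */

static unsigned int apply_gfp_scope(gfp_t gfp)
{
	unsigned int old = current_flags;

	if (!(gfp & __GFP_DIRECT_RECLAIM))
		current_flags |= PF_MEMALLOC;
	else if ((gfp & (__GFP_FS | __GFP_IO)) == __GFP_IO)
		current_flags |= PF_MEMALLOC_NOFS;
	else if ((gfp & (__GFP_FS | __GFP_IO)) == 0)
		current_flags |= PF_MEMALLOC_NOIO;
	else
		return 0; /* full GFP_KERNEL-style mask: no scope applied */

	/* Non-zero cookie carrying the old flags (marker bit is invented). */
	return old | 0x80000000u;
}

static void restore_scope(unsigned int cookie)
{
	if (cookie)
		current_flags = cookie & ~0x80000000u;
}
```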

Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/vmalloc.h |  2 ++
 mm/vmalloc.c            | 52 +++++++++++++++++++++++++++++++++--------
 2 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 97252078a3dc..dcbcbfa842ae 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -326,4 +326,6 @@ bool vmalloc_dump_obj(void *object);
 static inline bool vmalloc_dump_obj(void *object) { return false; }
 #endif
 
+unsigned int memalloc_apply_gfp_scope(gfp_t gfp_mask);
+void memalloc_restore_scope(unsigned int flags);
 #endif /* _LINUX_VMALLOC_H */
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e61e62872372..5e01c6ac4aca 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3715,6 +3715,42 @@ static void defer_vm_area_cleanup(struct vm_struct *area)
 		schedule_work(&cleanup_vm_area);
 }
 
+/*
+ * Page tables allocations ignore external GFP. Enforces it by
+ * the memalloc scope API. It is used by vmalloc internals and
+ * KASAN shadow population only.
+ *
+ * GFP to scope mapping:
+ *
+ * non-blocking (no __GFP_DIRECT_RECLAIM) - memalloc_noreclaim_save()
+ * GFP_NOFS - memalloc_nofs_save()
+ * GFP_NOIO - memalloc_noio_save()
+ *
+ * Returns a flag cookie to pair with restore.
+ */
+unsigned int
+memalloc_apply_gfp_scope(gfp_t gfp_mask)
+{
+	unsigned int flags = 0;
+
+	if (!gfpflags_allow_blocking(gfp_mask))
+		flags = memalloc_noreclaim_save();
+	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+		flags = memalloc_nofs_save();
+	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+		flags = memalloc_noio_save();
+
+	/* 0 - no scope applied. */
+	return flags;
+}
+
+void
+memalloc_restore_scope(unsigned int flags)
+{
+	if (flags)
+		memalloc_flags_restore(flags);
+}
+
 static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 				 pgprot_t prot, unsigned int page_shift,
 				 int node)
@@ -3731,6 +3767,10 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 
 	array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
 
+	/* __GFP_NOFAIL and "noblock" flags are mutually exclusive. */
+	if (!gfpflags_allow_blocking(gfp_mask))
+		nofail = false;
+
 	if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
 		gfp_mask |= __GFP_HIGHMEM;
 
@@ -3796,22 +3836,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	 * page tables allocations ignore external gfp mask, enforce it
 	 * by the scope API
 	 */
-	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
-		flags = memalloc_nofs_save();
-	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
-		flags = memalloc_noio_save();
-
+	flags = memalloc_apply_gfp_scope(gfp_mask);
 	do {
 		ret = vmap_pages_range(addr, addr + size, prot, area->pages,
 			page_shift);
 		if (nofail && (ret < 0))
 			schedule_timeout_uninterruptible(1);
 	} while (nofail && (ret < 0));
-
-	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
-		memalloc_nofs_restore(flags);
-	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
-		memalloc_noio_restore(flags);
+	memalloc_restore_scope(flags);
 
 	if (ret < 0) {
 		warn_alloc(gfp_mask, NULL,
-- 
2.47.3




* [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc()
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (5 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 06/10] mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node() Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-18  3:02   ` Baoquan He
  2025-09-18 14:56   ` Andrey Ryabinin
  2025-09-15 13:40 ` [PATCH v2 08/10] kmsan: Remove hard-coded GFP_KERNEL flags Uladzislau Rezki (Sony)
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton
  Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki, Andrey Ryabinin,
	Alexander Potapenko

A "gfp_mask" is already passed to kasan_populate_vmalloc() as
an argument to respect GFPs from callers and KASAN uses it for
its internal allocations.

But the apply_to_page_range() function ignores GFP flags due to a
hard-coded mask.

Wrap the call with memalloc_apply_gfp_scope()/memalloc_restore_scope()
so that non-blocking GFP flags (GFP_ATOMIC, GFP_NOWAIT) are respected.

Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/kasan/shadow.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 11d472a5c4e8..c6643a72d9f6 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -377,18 +377,10 @@ static int __kasan_populate_vmalloc(unsigned long start, unsigned long end, gfp_
 		 * page tables allocations ignore external gfp mask, enforce it
 		 * by the scope API
 		 */
-		if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
-			flags = memalloc_nofs_save();
-		else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
-			flags = memalloc_noio_save();
-
+		flags = memalloc_apply_gfp_scope(gfp_mask);
 		ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
 					  kasan_populate_vmalloc_pte, &data);
-
-		if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
-			memalloc_nofs_restore(flags);
-		else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
-			memalloc_noio_restore(flags);
+		memalloc_restore_scope(flags);
 
 		___free_pages_bulk(data.pages, nr_pages);
 		if (ret)
-- 
2.47.3




* [PATCH v2 08/10] kmsan: Remove hard-coded GFP_KERNEL flags
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (6 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc() Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set Uladzislau Rezki (Sony)
  2025-09-15 13:40 ` [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation Uladzislau Rezki (Sony)
  9 siblings, 0 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton
  Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki,
	Alexander Potapenko, Marco Elver

kmsan_vmap_pages_range_noflush() allocates its temporary s_pages/o_pages
arrays with GFP_KERNEL, which may sleep. This is inconsistent with
vmalloc(), which will support non-blocking requests later in this series.

Plumb gfp_mask through kmsan_vmap_pages_range_noflush() so that it can
be used for KMSAN's internal allocations.

Please note that the subsequent __vmap_pages_range_noflush() still uses
GFP_KERNEL and can sleep. If a caller runs under reclaim constraints
where sleeping is forbidden, it must establish the appropriate memalloc
scope.

Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/kmsan.h |  6 ++++--
 mm/internal.h         |  4 ++--
 mm/kmsan/shadow.c     |  6 +++---
 mm/percpu-vm.c        |  2 +-
 mm/vmalloc.c          | 26 +++++++++++++++++---------
 5 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index 2b1432cc16d5..e4b34e7a3b11 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -133,6 +133,7 @@ void kmsan_kfree_large(const void *ptr);
  * @prot:	page protection flags used for vmap.
  * @pages:	array of pages.
  * @page_shift:	page_shift passed to vmap_range_noflush().
+ * @gfp_mask:	gfp_mask to use internally.
  *
  * KMSAN maps shadow and origin pages of @pages into contiguous ranges in
  * vmalloc metadata address range. Returns 0 on success, callers must check
@@ -142,7 +143,8 @@ int __must_check kmsan_vmap_pages_range_noflush(unsigned long start,
 						unsigned long end,
 						pgprot_t prot,
 						struct page **pages,
-						unsigned int page_shift);
+						unsigned int page_shift,
+						gfp_t gfp_mask);
 
 /**
  * kmsan_vunmap_kernel_range_noflush() - Notify KMSAN about a vunmap.
@@ -348,7 +350,7 @@ static inline void kmsan_kfree_large(const void *ptr)
 
 static inline int __must_check kmsan_vmap_pages_range_noflush(
 	unsigned long start, unsigned long end, pgprot_t prot,
-	struct page **pages, unsigned int page_shift)
+	struct page **pages, unsigned int page_shift, gfp_t gfp_mask)
 {
 	return 0;
 }
diff --git a/mm/internal.h b/mm/internal.h
index 45b725c3dc03..5f3486c1cb83 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1359,7 +1359,7 @@ size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
 #ifdef CONFIG_MMU
 void __init vmalloc_init(void);
 int __must_check vmap_pages_range_noflush(unsigned long addr, unsigned long end,
-                pgprot_t prot, struct page **pages, unsigned int page_shift);
+	pgprot_t prot, struct page **pages, unsigned int page_shift, gfp_t gfp_mask);
 unsigned int get_vm_area_page_order(struct vm_struct *vm);
 #else
 static inline void vmalloc_init(void)
@@ -1368,7 +1368,7 @@ static inline void vmalloc_init(void)
 
 static inline
 int __must_check vmap_pages_range_noflush(unsigned long addr, unsigned long end,
-                pgprot_t prot, struct page **pages, unsigned int page_shift)
+	pgprot_t prot, struct page **pages, unsigned int page_shift, gfp_t gfp_mask)
 {
 	return -EINVAL;
 }
diff --git a/mm/kmsan/shadow.c b/mm/kmsan/shadow.c
index 54f3c3c962f0..3cd733663100 100644
--- a/mm/kmsan/shadow.c
+++ b/mm/kmsan/shadow.c
@@ -215,7 +215,7 @@ void kmsan_free_page(struct page *page, unsigned int order)
 
 int kmsan_vmap_pages_range_noflush(unsigned long start, unsigned long end,
 				   pgprot_t prot, struct page **pages,
-				   unsigned int page_shift)
+				   unsigned int page_shift, gfp_t gfp_mask)
 {
 	unsigned long shadow_start, origin_start, shadow_end, origin_end;
 	struct page **s_pages, **o_pages;
@@ -230,8 +230,8 @@ int kmsan_vmap_pages_range_noflush(unsigned long start, unsigned long end,
 		return 0;
 
 	nr = (end - start) / PAGE_SIZE;
-	s_pages = kcalloc(nr, sizeof(*s_pages), GFP_KERNEL);
-	o_pages = kcalloc(nr, sizeof(*o_pages), GFP_KERNEL);
+	s_pages = kcalloc(nr, sizeof(*s_pages), gfp_mask);
+	o_pages = kcalloc(nr, sizeof(*o_pages), gfp_mask);
 	if (!s_pages || !o_pages) {
 		err = -ENOMEM;
 		goto ret;
diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
index cd69caf6aa8d..4f5937090590 100644
--- a/mm/percpu-vm.c
+++ b/mm/percpu-vm.c
@@ -194,7 +194,7 @@ static int __pcpu_map_pages(unsigned long addr, struct page **pages,
 			    int nr_pages)
 {
 	return vmap_pages_range_noflush(addr, addr + (nr_pages << PAGE_SHIFT),
-					PAGE_KERNEL, pages, PAGE_SHIFT);
+			PAGE_KERNEL, pages, PAGE_SHIFT, GFP_KERNEL);
 }
 
 /**
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5e01c6ac4aca..2d4e22dd04f7 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -671,16 +671,28 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 }
 
 int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
-		pgprot_t prot, struct page **pages, unsigned int page_shift)
+		pgprot_t prot, struct page **pages, unsigned int page_shift,
+		gfp_t gfp_mask)
 {
 	int ret = kmsan_vmap_pages_range_noflush(addr, end, prot, pages,
-						 page_shift);
+						page_shift, gfp_mask);
 
 	if (ret)
 		return ret;
 	return __vmap_pages_range_noflush(addr, end, prot, pages, page_shift);
 }
 
+static int __vmap_pages_range(unsigned long addr, unsigned long end,
+		pgprot_t prot, struct page **pages, unsigned int page_shift,
+		gfp_t gfp_mask)
+{
+	int err;
+
+	err = vmap_pages_range_noflush(addr, end, prot, pages, page_shift, gfp_mask);
+	flush_cache_vmap(addr, end);
+	return err;
+}
+
 /**
  * vmap_pages_range - map pages to a kernel virtual address
  * @addr: start of the VM area to map
@@ -696,11 +708,7 @@ int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 int vmap_pages_range(unsigned long addr, unsigned long end,
 		pgprot_t prot, struct page **pages, unsigned int page_shift)
 {
-	int err;
-
-	err = vmap_pages_range_noflush(addr, end, prot, pages, page_shift);
-	flush_cache_vmap(addr, end);
-	return err;
+	return __vmap_pages_range(addr, end, prot, pages, page_shift, GFP_KERNEL);
 }
 
 static int check_sparse_vm_area(struct vm_struct *area, unsigned long start,
@@ -3838,8 +3846,8 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	 */
 	flags = memalloc_apply_gfp_scope(gfp_mask);
 	do {
-		ret = vmap_pages_range(addr, addr + size, prot, area->pages,
-			page_shift);
+		ret = __vmap_pages_range(addr, addr + size, prot, area->pages,
+				page_shift, nested_gfp);
 		if (nofail && (ret < 0))
 			schedule_timeout_uninterruptible(1);
 	} while (nofail && (ret < 0));
-- 
2.47.3




* [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (7 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 08/10] kmsan: Remove hard-coded GFP_KERNEL flags Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-15 17:16   ` Michal Hocko
  2025-09-15 13:40 ` [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation Uladzislau Rezki (Sony)
  9 siblings, 1 reply; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton; +Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki

might_alloc() catches invalid blocking allocations in contexts
where sleeping is not allowed.

However when PF_MEMALLOC is set, the page allocator already skips
reclaim and other blocking paths. In such cases, a blocking gfp_mask
does not actually lead to blocking, so triggering might_alloc() splats
is misleading.

Adjust might_alloc() to skip warnings when the current task has
PF_MEMALLOC set, matching the allocator's actual blocking behaviour.
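The adjusted condition can be modeled in userspace C. The flag values are illustrative; only the logic mirrors the patch: warn about potential sleeping only when the mask allows blocking and the task is not already in PF_MEMALLOC context:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

#define __GFP_DIRECT_RECLAIM	0x1u /* illustrative value */
#define PF_MEMALLOC		0x2u /* illustrative value */

/* Model of the might_sleep_if() condition after this patch. */
static bool would_warn(gfp_t gfp_mask, unsigned int task_flags)
{
	return (gfp_mask & __GFP_DIRECT_RECLAIM) &&
	       !(task_flags & PF_MEMALLOC);
}
```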

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/sched/mm.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 2201da0afecc..dc2d3cab32ef 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -318,7 +318,8 @@ static inline void might_alloc(gfp_t gfp_mask)
 	fs_reclaim_acquire(gfp_mask);
 	fs_reclaim_release(gfp_mask);
 
-	might_sleep_if(gfpflags_allow_blocking(gfp_mask));
+	might_sleep_if(gfpflags_allow_blocking(gfp_mask) &&
+		!(current->flags & PF_MEMALLOC));
 }
 
 /**
-- 
2.47.3




* [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation
  2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
                   ` (8 preceding siblings ...)
  2025-09-15 13:40 ` [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set Uladzislau Rezki (Sony)
@ 2025-09-15 13:40 ` Uladzislau Rezki (Sony)
  2025-09-15 17:13   ` Michal Hocko
  2025-09-16  0:34   ` kernel test robot
  9 siblings, 2 replies; 26+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-09-15 13:40 UTC (permalink / raw)
  To: linux-mm, Andrew Morton; +Cc: Michal Hocko, Baoquan He, LKML, Uladzislau Rezki

__vmalloc() function now supports non-blocking flags such as
GFP_ATOMIC and GFP_NOWAIT. Update the documentation accordingly.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 2d4e22dd04f7..e56d576b46c8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3880,19 +3880,20 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
  * @caller:		  caller's return address
  *
  * Allocate enough pages to cover @size from the page level
- * allocator with @gfp_mask flags. Please note that the full set of gfp
- * flags are not supported. GFP_KERNEL, GFP_NOFS and GFP_NOIO are all
- * supported.
- * Zone modifiers are not supported. From the reclaim modifiers
- * __GFP_DIRECT_RECLAIM is required (aka GFP_NOWAIT is not supported)
- * and only __GFP_NOFAIL is supported (i.e. __GFP_NORETRY and
- * __GFP_RETRY_MAYFAIL are not supported).
+ * allocator with @gfp_mask flags and map them into contiguous
+ * virtual range with protection @prot.
  *
- * __GFP_NOWARN can be used to suppress failures messages.
+ * Supported GFP classes: %GFP_KERNEL, %GFP_ATOMIC, %GFP_NOWAIT,
+ * %GFP_NOFS and %GFP_NOIO. Zone modifiers are not supported.
+ * Please note %GFP_ATOMIC and %GFP_NOWAIT are supported only
+ * by __vmalloc().
+
+ * Retry modifiers: only %__GFP_NOFAIL is supported; %__GFP_NORETRY
+ * and %__GFP_RETRY_MAYFAIL are not supported.
  *
- * Map them into contiguous kernel virtual space, using a pagetable
- * protection of @prot.
+ * %__GFP_NOWARN can be used to suppress failure messages.
  *
+ * Can not be called from interrupt nor NMI contexts.
  * Return: the address of the area or %NULL on failure
  */
 void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  2025-09-15 13:40 ` [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted Uladzislau Rezki (Sony)
@ 2025-09-15 17:11   ` Michal Hocko
  2025-09-16 15:28     ` Uladzislau Rezki
  2025-09-18  2:57   ` Baoquan He
  1 sibling, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2025-09-15 17:11 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony); +Cc: linux-mm, Andrew Morton, Baoquan He, LKML

On Mon 15-09-25 15:40:34, Uladzislau Rezki wrote:
> vm_area_alloc_pages() contains the only voluntary reschedule points
> along vmalloc() allocation path. They are needed to ensure forward
> progress on PREEMPT_NONE kernels under contention for vmap metadata
> (e.g. alloc_vmap_area()).
> 
> However, yielding should only be done if the given GFP flags allow
> blocking. This patch avoids calling cond_resched() when allocation
> context is non-blocking (GFP_ATOMIC, GFP_NOWAIT).

We do have cond_resched() in the page allocator path, right?
So unless I am missing something we can safely drop these. I thought we
have discussed this already.

> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 49a0f81930a8..b77e8be75f10 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3633,7 +3633,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  							pages + nr_allocated);
>  
>  			nr_allocated += nr;
> -			cond_resched();
> +
> +			if (gfpflags_allow_blocking(gfp))
> +				cond_resched();
>  
>  			/*
>  			 * If zero or pages were obtained partly,
> @@ -3675,7 +3677,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  		for (i = 0; i < (1U << order); i++)
>  			pages[nr_allocated + i] = page + i;
>  
> -		cond_resched();
> +		if (gfpflags_allow_blocking(gfp))
> +			cond_resched();
> +
>  		nr_allocated += 1U << order;
>  	}
>  
> -- 
> 2.47.3

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation
  2025-09-15 13:40 ` [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation Uladzislau Rezki (Sony)
@ 2025-09-15 17:13   ` Michal Hocko
  2025-09-16 15:34     ` Uladzislau Rezki
  2025-09-16  0:34   ` kernel test robot
  1 sibling, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2025-09-15 17:13 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony); +Cc: linux-mm, Andrew Morton, Baoquan He, LKML

On Mon 15-09-25 15:40:40, Uladzislau Rezki wrote:
> __vmalloc() function now supports non-blocking flags such as
> GFP_ATOMIC and GFP_NOWAIT. Update the documentation accordingly.
> 
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

I would just fold this into the patch which adds the support. We also
need kvmalloc doc update.
Anyway
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/vmalloc.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 2d4e22dd04f7..e56d576b46c8 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3880,19 +3880,20 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>   * @caller:		  caller's return address
>   *
>   * Allocate enough pages to cover @size from the page level
> - * allocator with @gfp_mask flags. Please note that the full set of gfp
> - * flags are not supported. GFP_KERNEL, GFP_NOFS and GFP_NOIO are all
> - * supported.
> - * Zone modifiers are not supported. From the reclaim modifiers
> - * __GFP_DIRECT_RECLAIM is required (aka GFP_NOWAIT is not supported)
> - * and only __GFP_NOFAIL is supported (i.e. __GFP_NORETRY and
> - * __GFP_RETRY_MAYFAIL are not supported).
> + * allocator with @gfp_mask flags and map them into contiguous
> + * virtual range with protection @prot.
>   *
> - * __GFP_NOWARN can be used to suppress failures messages.
> + * Supported GFP classes: %GFP_KERNEL, %GFP_ATOMIC, %GFP_NOWAIT,
> + * %GFP_NOFS and %GFP_NOIO. Zone modifiers are not supported.
> + * Please note %GFP_ATOMIC and %GFP_NOWAIT are supported only
> + * by __vmalloc().
> +
> + * Retry modifiers: only %__GFP_NOFAIL is supported; %__GFP_NORETRY
> + * and %__GFP_RETRY_MAYFAIL are not supported.
>   *
> - * Map them into contiguous kernel virtual space, using a pagetable
> - * protection of @prot.
> + * %__GFP_NOWARN can be used to suppress failure messages.
>   *
> + * Can not be called from interrupt nor NMI contexts.
>   * Return: the address of the area or %NULL on failure
>   */
>  void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> -- 
> 2.47.3

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set
  2025-09-15 13:40 ` [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set Uladzislau Rezki (Sony)
@ 2025-09-15 17:16   ` Michal Hocko
  2025-09-16 15:23     ` Uladzislau Rezki
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2025-09-15 17:16 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony); +Cc: linux-mm, Andrew Morton, Baoquan He, LKML

On Mon 15-09-25 15:40:39, Uladzislau Rezki wrote:
> might_alloc() catches invalid blocking allocations in contexts
> where sleeping is not allowed.
> 
> However when PF_MEMALLOC is set, the page allocator already skips
> reclaim and other blocking paths. In such cases, a blocking gfp_mask
> does not actually lead to blocking, so triggering might_alloc() splats
> is misleading.
> 
> Adjust might_alloc() to skip warnings when the current task has
> PF_MEMALLOC set, matching the allocator's actual blocking behaviour.
> 
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

I would probably just bail out early for PF_MEMALLOC to not meddle with
the might_sleep_if() condition, as it seems to read better, but I do not insist.
Acked-by: Michal Hocko <mhocko@suse.com>
Thanks

> ---
>  include/linux/sched/mm.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> index 2201da0afecc..dc2d3cab32ef 100644
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -318,7 +318,8 @@ static inline void might_alloc(gfp_t gfp_mask)
>  	fs_reclaim_acquire(gfp_mask);
>  	fs_reclaim_release(gfp_mask);
>  
> -	might_sleep_if(gfpflags_allow_blocking(gfp_mask));
> +	might_sleep_if(gfpflags_allow_blocking(gfp_mask) &&
> +		!(current->flags & PF_MEMALLOC));
>  }
>  
>  /**
> -- 
> 2.47.3

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation
  2025-09-15 13:40 ` [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation Uladzislau Rezki (Sony)
  2025-09-15 17:13   ` Michal Hocko
@ 2025-09-16  0:34   ` kernel test robot
  1 sibling, 0 replies; 26+ messages in thread
From: kernel test robot @ 2025-09-16  0:34 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony), linux-mm, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, Michal Hocko,
	Baoquan He, LKML, Uladzislau Rezki

Hi Uladzislau,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.17-rc6 next-20250912]
[cannot apply to dennis-percpu/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Uladzislau-Rezki-Sony/lib-test_vmalloc-add-no_block_alloc_test-case/20250915-214352
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250915134041.151462-11-urezki%40gmail.com
patch subject: [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation
config: alpha-allnoconfig (https://download.01.org/0day-ci/archive/20250916/202509160821.gR75Zhnh-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250916/202509160821.gR75Zhnh-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202509160821.gR75Zhnh-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> Warning: mm/vmalloc.c:3889 bad line: 

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set
  2025-09-15 17:16   ` Michal Hocko
@ 2025-09-16 15:23     ` Uladzislau Rezki
  0 siblings, 0 replies; 26+ messages in thread
From: Uladzislau Rezki @ 2025-09-16 15:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Uladzislau Rezki (Sony), linux-mm, Andrew Morton, Baoquan He,
	LKML

On Mon, Sep 15, 2025 at 07:16:50PM +0200, Michal Hocko wrote:
> On Mon 15-09-25 15:40:39, Uladzislau Rezki wrote:
> > might_alloc() catches invalid blocking allocations in contexts
> > where sleeping is not allowed.
> > 
> > However when PF_MEMALLOC is set, the page allocator already skips
> > reclaim and other blocking paths. In such cases, a blocking gfp_mask
> > does not actually lead to blocking, so triggering might_alloc() splats
> > is misleading.
> > 
> > Adjust might_alloc() to skip warnings when the current task has
> > PF_MEMALLOC set, matching the allocator's actual blocking behaviour.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> 
> I would probably just bail out early for PF_MEMALLOC to not meddle with
> the might_sleep_if() condition, as it seems to read better, but I do not insist.
> Acked-by: Michal Hocko <mhocko@suse.com>
>
Thank you, I will apply it and place the check into a separate "if"
condition.

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  2025-09-15 17:11   ` Michal Hocko
@ 2025-09-16 15:28     ` Uladzislau Rezki
  2025-09-16 18:08       ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Uladzislau Rezki @ 2025-09-16 15:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Uladzislau Rezki (Sony), linux-mm, Andrew Morton, Baoquan He,
	LKML

On Mon, Sep 15, 2025 at 07:11:27PM +0200, Michal Hocko wrote:
> On Mon 15-09-25 15:40:34, Uladzislau Rezki wrote:
> > vm_area_alloc_pages() contains the only voluntary reschedule points
> > along vmalloc() allocation path. They are needed to ensure forward
> > progress on PREEMPT_NONE kernels under contention for vmap metadata
> > (e.g. alloc_vmap_area()).
> > 
> > However, yielding should only be done if the given GFP flags allow
> > blocking. This patch avoids calling cond_resched() when allocation
> > context is non-blocking (GFP_ATOMIC, GFP_NOWAIT).
> 
> We do have cond_resched() in the page allocator path, right?
> So unless I am missing something we can safely drop these. I thought we
> have discussed this already.
> 
Yes, we discussed this. I did some tests with cond_resched() dropped on
a !PREEMPT kernel, and I can trigger soft-lockups under really heavy
stress load.

I prefer to keep them for now, for consistency. I need some time to
investigate it more. As I noted in the commit message, the vmalloc()
path only has those two resched points. Probably I need to move
them to another place later.

As for the page allocator, its cond_resched() is in a slow path which
I do not hit in my stress setup.

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation
  2025-09-15 17:13   ` Michal Hocko
@ 2025-09-16 15:34     ` Uladzislau Rezki
  0 siblings, 0 replies; 26+ messages in thread
From: Uladzislau Rezki @ 2025-09-16 15:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Uladzislau Rezki (Sony), linux-mm, Andrew Morton, Baoquan He,
	LKML

On Mon, Sep 15, 2025 at 07:13:01PM +0200, Michal Hocko wrote:
> On Mon 15-09-25 15:40:40, Uladzislau Rezki wrote:
> > __vmalloc() function now supports non-blocking flags such as
> > GFP_ATOMIC and GFP_NOWAIT. Update the documentation accordingly.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> 
> I would just fold this into the patch which adds the support. We also
> need kvmalloc doc update.
> Anyway
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
Thank you. Applied.

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  2025-09-16 15:28     ` Uladzislau Rezki
@ 2025-09-16 18:08       ` Michal Hocko
  2025-09-17  5:22         ` Uladzislau Rezki
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2025-09-16 18:08 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: linux-mm, Andrew Morton, Baoquan He, LKML

On Tue 16-09-25 17:28:36, Uladzislau Rezki wrote:
> On Mon, Sep 15, 2025 at 07:11:27PM +0200, Michal Hocko wrote:
> > On Mon 15-09-25 15:40:34, Uladzislau Rezki wrote:
> > > vm_area_alloc_pages() contains the only voluntary reschedule points
> > > along vmalloc() allocation path. They are needed to ensure forward
> > > progress on PREEMPT_NONE kernels under contention for vmap metadata
> > > (e.g. alloc_vmap_area()).
> > > 
> > > However, yielding should only be done if the given GFP flags allow
> > > blocking. This patch avoids calling cond_resched() when allocation
> > > context is non-blocking (GFP_ATOMIC, GFP_NOWAIT).
> > 
> > We do have cond_resched() in the page allocator path, right?
> > So unless I am missing something we can safely drop these. I thought we
> > have discussed this already.
> > 
> Yes, we discussed this. I did some tests with cond_resched() dropped on
> a !PREEMPT kernel, and I can trigger soft-lockups under really heavy
> stress load.
> 
> I prefer to keep them for now, for consistency. I need some time to
> investigate it more. As I noted in the commit message, the vmalloc()
> path only has those two resched points. Probably I need to move
> them to another place later.
> 
> As for the page allocator, its cond_resched() is in a slow path which
> I do not hit in my stress setup.

OK, so the fast path can trigger the soft lockup? If yes please mention
that in the changelog so that we know why this is needed. With that
included feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  2025-09-16 18:08       ` Michal Hocko
@ 2025-09-17  5:22         ` Uladzislau Rezki
  0 siblings, 0 replies; 26+ messages in thread
From: Uladzislau Rezki @ 2025-09-17  5:22 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Uladzislau Rezki, linux-mm, Andrew Morton, Baoquan He, LKML

On Tue, Sep 16, 2025 at 08:08:18PM +0200, Michal Hocko wrote:
> On Tue 16-09-25 17:28:36, Uladzislau Rezki wrote:
> > On Mon, Sep 15, 2025 at 07:11:27PM +0200, Michal Hocko wrote:
> > > On Mon 15-09-25 15:40:34, Uladzislau Rezki wrote:
> > > > vm_area_alloc_pages() contains the only voluntary reschedule points
> > > > along vmalloc() allocation path. They are needed to ensure forward
> > > > progress on PREEMPT_NONE kernels under contention for vmap metadata
> > > > (e.g. alloc_vmap_area()).
> > > > 
> > > > However, yielding should only be done if the given GFP flags allow
> > > > blocking. This patch avoids calling cond_resched() when allocation
> > > > context is non-blocking (GFP_ATOMIC, GFP_NOWAIT).
> > > 
> > > We do have cond_resched() in the page allocator path, right?
> > > So unless I am missing something we can safely drop these. I thought we
> > > have discussed this already.
> > > 
> > Yes, we discussed this. I did some tests with cond_resched() dropped on
> > a !PREEMPT kernel, and I can trigger soft-lockups under really heavy
> > stress load.
> > 
> > I prefer to keep them for now, for consistency. I need some time to
> > investigate it more. As I noted in the commit message, the vmalloc()
> > path only has those two resched points. Probably I need to move
> > them to another place later.
> > 
> > As for the page allocator, its cond_resched() is in a slow path which
> > I do not hit in my stress setup.
> 
> OK, so the fast path can trigger the soft lockup? If yes please mention
> that in the changelog so that we know why this is needed. With that
> included feel free to add
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
We, in vmalloc(), also have a slow path. Those two points seem to help.
I will move them later to alloc_vmap_area(), after the slow path serves
a request.

Thank you!

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 03/10] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area()
  2025-09-15 13:40 ` [PATCH v2 03/10] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area() Uladzislau Rezki (Sony)
@ 2025-09-18  2:56   ` Baoquan He
  0 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2025-09-18  2:56 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, Michal Hocko, LKML, Michal Hocko

On 09/15/25 at 03:40pm, Uladzislau Rezki (Sony) wrote:
> alloc_vmap_area() currently assumes that sleeping is allowed during
> allocation. This is not true for callers which pass non-blocking
> GFP flags, such as GFP_ATOMIC or GFP_NOWAIT.
> 
> This patch adds logic to detect whether the given gfp_mask permits
> blocking. It avoids invoking might_sleep() or falling back to reclaim
> path if blocking is not allowed.
> 
> This makes alloc_vmap_area() safer for use in non-sleeping contexts,
> where previously it could hit unexpected sleeps, trigger warnings.
> 
> It is a preparation and adjustment step to later allow both GFP_ATOMIC
> and GFP_NOWAIT allocations in this series.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)

LGTM,

Reviewed-by: Baoquan He <bhe@redhat.com>

> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 5edd536ba9d2..49a0f81930a8 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2017,6 +2017,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	unsigned long freed;
>  	unsigned long addr;
>  	unsigned int vn_id;
> +	bool allow_block;
>  	int purged = 0;
>  	int ret;
>  
> @@ -2028,7 +2029,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  
>  	/* Only reclaim behaviour flags are relevant. */
>  	gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
> -	might_sleep();
> +	allow_block = gfpflags_allow_blocking(gfp_mask);
> +	might_sleep_if(allow_block);
>  
>  	/*
>  	 * If a VA is obtained from a global heap(if it fails here)
> @@ -2065,8 +2067,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	 * If an allocation fails, the error value is
>  	 * returned. Therefore trigger the overflow path.
>  	 */
> -	if (IS_ERR_VALUE(addr))
> -		goto overflow;
> +	if (IS_ERR_VALUE(addr)) {
> +		if (allow_block)
> +			goto overflow;
> +
> +		/*
> +		 * We can not trigger any reclaim logic because
> +		 * sleeping is not allowed, thus fail an allocation.
> +		 */
> +		goto out_free_va;
> +	}
>  
>  	va->va_start = addr;
>  	va->va_end = addr + size;
> @@ -2116,6 +2126,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  		pr_warn("vmalloc_node_range for size %lu failed: Address range restricted to %#lx - %#lx\n",
>  				size, vstart, vend);
>  
> +out_free_va:
>  	kmem_cache_free(vmap_area_cachep, va);
>  	return ERR_PTR(-EBUSY);
>  }
> -- 
> 2.47.3
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted
  2025-09-15 13:40 ` [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted Uladzislau Rezki (Sony)
  2025-09-15 17:11   ` Michal Hocko
@ 2025-09-18  2:57   ` Baoquan He
  1 sibling, 0 replies; 26+ messages in thread
From: Baoquan He @ 2025-09-18  2:57 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony); +Cc: linux-mm, Andrew Morton, Michal Hocko, LKML

On 09/15/25 at 03:40pm, Uladzislau Rezki (Sony) wrote:
> vm_area_alloc_pages() contains the only voluntary reschedule points
> along vmalloc() allocation path. They are needed to ensure forward
> progress on PREEMPT_NONE kernels under contention for vmap metadata
> (e.g. alloc_vmap_area()).
> 
> However, yielding should only be done if the given GFP flags allow
> blocking. This patch avoids calling cond_resched() when allocation
> context is non-blocking(GFP_ATOMIC, GFP_NOWAIT).
> 
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Reviewed-by: Baoquan He <bhe@redhat.com>

> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 49a0f81930a8..b77e8be75f10 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3633,7 +3633,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  							pages + nr_allocated);
>  
>  			nr_allocated += nr;
> -			cond_resched();
> +
> +			if (gfpflags_allow_blocking(gfp))
> +				cond_resched();
>  
>  			/*
>  			 * If zero or pages were obtained partly,
> @@ -3675,7 +3677,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  		for (i = 0; i < (1U << order); i++)
>  			pages[nr_allocated + i] = page + i;
>  
> -		cond_resched();
> +		if (gfpflags_allow_blocking(gfp))
> +			cond_resched();
> +
>  		nr_allocated += 1U << order;
>  	}
>  
> -- 
> 2.47.3
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/10] mm/vmalloc: Defer freeing partly initialized vm_struct
  2025-09-15 13:40 ` [PATCH v2 05/10] mm/vmalloc: Defer freeing partly initialized vm_struct Uladzislau Rezki (Sony)
@ 2025-09-18  2:59   ` Baoquan He
  0 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2025-09-18  2:59 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, Michal Hocko, LKML, Michal Hocko

On 09/15/25 at 03:40pm, Uladzislau Rezki (Sony) wrote:
> __vmalloc_area_node() may call free_vmap_area() or vfree() on
> error paths, both of which can sleep. This becomes problematic
> if the function is invoked from an atomic context, such as when
> GFP_ATOMIC or GFP_NOWAIT is passed via gfp_mask.
> 
> To fix this, unify error paths and defer the cleanup of partly
> initialized vm_struct objects to a workqueue. This ensures that
> freeing happens in a process context and avoids invalid sleeps
> in atomic regions.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  include/linux/vmalloc.h |  6 +++++-
>  mm/vmalloc.c            | 34 +++++++++++++++++++++++++++++++---
>  2 files changed, 36 insertions(+), 4 deletions(-)

Reviewed-by: Baoquan He <bhe@redhat.com>

> 
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 2759dac6be44..97252078a3dc 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -50,7 +50,11 @@ struct iov_iter;		/* in uio.h */
>  #endif
>  
>  struct vm_struct {
> -	struct vm_struct	*next;
> +	union {
> +		struct vm_struct *next;	  /* Early registration of vm_areas. */
> +		struct llist_node llnode; /* Asynchronous freeing on error paths. */
> +	};
> +
>  	void			*addr;
>  	unsigned long		size;
>  	unsigned long		flags;
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index b77e8be75f10..e61e62872372 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3686,6 +3686,35 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  	return nr_allocated;
>  }
>  
> +static LLIST_HEAD(pending_vm_area_cleanup);
> +static void cleanup_vm_area_work(struct work_struct *work)
> +{
> +	struct vm_struct *area, *tmp;
> +	struct llist_node *head;
> +
> +	head = llist_del_all(&pending_vm_area_cleanup);
> +	if (!head)
> +		return;
> +
> +	llist_for_each_entry_safe(area, tmp, head, llnode) {
> +		if (!area->pages)
> +			free_vm_area(area);
> +		else
> +			vfree(area->addr);
> +	}
> +}
> +
> +/*
> + * Helper for __vmalloc_area_node() to defer cleanup
> + * of partially initialized vm_struct in error paths.
> + */
> +static DECLARE_WORK(cleanup_vm_area, cleanup_vm_area_work);
> +static void defer_vm_area_cleanup(struct vm_struct *area)
> +{
> +	if (llist_add(&area->llnode, &pending_vm_area_cleanup))
> +		schedule_work(&cleanup_vm_area);
> +}
> +
>  static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  				 pgprot_t prot, unsigned int page_shift,
>  				 int node)
> @@ -3717,8 +3746,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  		warn_alloc(gfp_mask, NULL,
>  			"vmalloc error: size %lu, failed to allocated page array size %lu",
>  			nr_small_pages * PAGE_SIZE, array_size);
> -		free_vm_area(area);
> -		return NULL;
> +		goto fail;
>  	}
>  
>  	set_vm_area_page_order(area, page_shift - PAGE_SHIFT);
> @@ -3795,7 +3823,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  	return area->addr;
>  
>  fail:
> -	vfree(area->addr);
> +	defer_vm_area_cleanup(area);
>  	return NULL;
>  }
>  
> -- 
> 2.47.3
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 06/10] mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node()
  2025-09-15 13:40 ` [PATCH v2 06/10] mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node() Uladzislau Rezki (Sony)
@ 2025-09-18  3:01   ` Baoquan He
  0 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2025-09-18  3:01 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, Michal Hocko, LKML, Michal Hocko

On 09/15/25 at 03:40pm, Uladzislau Rezki (Sony) wrote:
> Make __vmalloc_area_node() respect non-blocking GFP masks such
> as GFP_ATOMIC and GFP_NOWAIT.
> 
> - Add memalloc_apply_gfp_scope()/memalloc_restore_scope()
>   helpers to apply a proper scope.
> - Apply memalloc_apply_gfp_scope()/memalloc_restore_scope()
>   around vmap_pages_range() for page table setup.
> - Set "nofail" to false if a non-blocking mask is used, as
>   they are mutually exclusive.
> 
> This is particularly important for page table allocations that
> internally use GFP_PGTABLE_KERNEL, which may sleep unless such
> scope restrictions are applied. For example:
> 
> <snip>
> __pte_alloc_kernel()
>   pte_alloc_one_kernel(&init_mm);
>     pagetable_alloc_noprof(GFP_PGTABLE_KERNEL & ~__GFP_HIGHMEM, 0);
> <snip>
> 
> Note: in most cases, PTE entries are established only up to the
> level required by current vmap space usage, meaning the page tables
> are typically fully populated during the mapping process.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  include/linux/vmalloc.h |  2 ++
>  mm/vmalloc.c            | 52 +++++++++++++++++++++++++++++++++--------
>  2 files changed, 44 insertions(+), 10 deletions(-)

Reviewed-by: Baoquan He <bhe@redhat.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc()
  2025-09-15 13:40 ` [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc() Uladzislau Rezki (Sony)
@ 2025-09-18  3:02   ` Baoquan He
  2025-09-18 14:56   ` Andrey Ryabinin
  1 sibling, 0 replies; 26+ messages in thread
From: Baoquan He @ 2025-09-18  3:02 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, Michal Hocko, LKML, Andrey Ryabinin,
	Alexander Potapenko

On 09/15/25 at 03:40pm, Uladzislau Rezki (Sony) wrote:
> A "gfp_mask" is already passed to kasan_populate_vmalloc() as
> an argument to respect GFPs from callers and KASAN uses it for
> its internal allocations.
> 
> But the apply_to_page_range() function ignores GFP flags due to a
> hard-coded mask.
> 
> Wrap the call with memalloc_apply_gfp_scope()/memalloc_restore_scope()
> so that non-blocking GFP flags (GFP_ATOMIC, GFP_NOWAIT) are respected.
> 
> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
> Cc: Alexander Potapenko <glider@google.com>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/kasan/shadow.c | 12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)

Reviewed-by: Baoquan He <bhe@redhat.com>

> 
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 11d472a5c4e8..c6643a72d9f6 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -377,18 +377,10 @@ static int __kasan_populate_vmalloc(unsigned long start, unsigned long end, gfp_
>  		 * page tables allocations ignore external gfp mask, enforce it
>  		 * by the scope API
>  		 */
> -		if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> -			flags = memalloc_nofs_save();
> -		else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
> -			flags = memalloc_noio_save();
> -
> +		flags = memalloc_apply_gfp_scope(gfp_mask);
>  		ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
>  					  kasan_populate_vmalloc_pte, &data);
> -
> -		if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> -			memalloc_nofs_restore(flags);
> -		else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
> -			memalloc_noio_restore(flags);
> +		memalloc_restore_scope(flags);
>  
>  		___free_pages_bulk(data.pages, nr_pages);
>  		if (ret)
> -- 
> 2.47.3
> 




* Re: [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc()
  2025-09-15 13:40 ` [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc() Uladzislau Rezki (Sony)
  2025-09-18  3:02   ` Baoquan He
@ 2025-09-18 14:56   ` Andrey Ryabinin
  1 sibling, 0 replies; 26+ messages in thread
From: Andrey Ryabinin @ 2025-09-18 14:56 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony), linux-mm, Andrew Morton
  Cc: Michal Hocko, Baoquan He, LKML, Alexander Potapenko

On 9/15/25 3:40 PM, Uladzislau Rezki (Sony) wrote:
> A "gfp_mask" is already passed to kasan_populate_vmalloc() as
> an argument to respect GFPs from callers and KASAN uses it for
> its internal allocations.
> 
> But the apply_to_page_range() function ignores GFP flags due to a
> hard-coded mask.
> 
> Wrap the call with memalloc_apply_gfp_scope()/memalloc_restore_scope()
> so that non-blocking GFP flags (GFP_ATOMIC, GFP_NOWAIT) are respected.
> 
> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
> Cc: Alexander Potapenko <glider@google.com>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---

Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>




end of thread, other threads:[~2025-09-18 14:56 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-15 13:40 [PATCH v2 00/10] __vmalloc() and no-block support(v2) Uladzislau Rezki (Sony)
2025-09-15 13:40 ` [PATCH v2 01/10] lib/test_vmalloc: add no_block_alloc_test case Uladzislau Rezki (Sony)
2025-09-15 13:40 ` [PATCH v2 02/10] lib/test_vmalloc: Remove xfail condition check Uladzislau Rezki (Sony)
2025-09-15 13:40 ` [PATCH v2 03/10] mm/vmalloc: Support non-blocking GFP flags in alloc_vmap_area() Uladzislau Rezki (Sony)
2025-09-18  2:56   ` Baoquan He
2025-09-15 13:40 ` [PATCH v2 04/10] mm/vmalloc: Avoid cond_resched() when blocking is not permitted Uladzislau Rezki (Sony)
2025-09-15 17:11   ` Michal Hocko
2025-09-16 15:28     ` Uladzislau Rezki
2025-09-16 18:08       ` Michal Hocko
2025-09-17  5:22         ` Uladzislau Rezki
2025-09-18  2:57   ` Baoquan He
2025-09-15 13:40 ` [PATCH v2 05/10] mm/vmalloc: Defer freeing partly initialized vm_struct Uladzislau Rezki (Sony)
2025-09-18  2:59   ` Baoquan He
2025-09-15 13:40 ` [PATCH v2 06/10] mm/vmalloc: Handle non-blocking GFP in __vmalloc_area_node() Uladzislau Rezki (Sony)
2025-09-18  3:01   ` Baoquan He
2025-09-15 13:40 ` [PATCH v2 07/10] mm/kasan: Support non-blocking GFP in kasan_populate_vmalloc() Uladzislau Rezki (Sony)
2025-09-18  3:02   ` Baoquan He
2025-09-18 14:56   ` Andrey Ryabinin
2025-09-15 13:40 ` [PATCH v2 08/10] kmsan: Remove hard-coded GFP_KERNEL flags Uladzislau Rezki (Sony)
2025-09-15 13:40 ` [PATCH v2 09/10] mm: Skip might_alloc() warnings when PF_MEMALLOC is set Uladzislau Rezki (Sony)
2025-09-15 17:16   ` Michal Hocko
2025-09-16 15:23     ` Uladzislau Rezki
2025-09-15 13:40 ` [PATCH v2 10/10] mm/vmalloc: Update __vmalloc_node_range() documentation Uladzislau Rezki (Sony)
2025-09-15 17:13   ` Michal Hocko
2025-09-16 15:34     ` Uladzislau Rezki
2025-09-16  0:34   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).