* Re: [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support
2026-03-24 13:26 ` [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support Muhammad Usama Anjum
@ 2026-04-10 18:10 ` Catalin Marinas
2026-04-16 9:10 ` David Hildenbrand
2026-04-22 13:21 ` Ryan Roberts
2 siblings, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2026-04-10 18:10 UTC (permalink / raw)
To: Muhammad Usama Anjum
Cc: Arnd Bergmann, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
Mel Gorman, Valentin Schneider, Kees Cook, Andrew Morton,
David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Uladzislau Rezki, linux-arch, linux-kernel, linux-mm,
Andrey Konovalov, Marco Elver, Vincenzo Frascino,
Peter Collingbourne, Will Deacon, Ryan.Roberts, david.hildenbrand
On Tue, Mar 24, 2026 at 01:26:27PM +0000, Muhammad Usama Anjum wrote:
> For allocations that will be accessed only with match-all pointers
> (e.g., kernel stacks), setting tags is wasted work. If the caller
> already set __GFP_SKIP_KASAN, don’t skip zeroing the pages and
> don’t set KASAN_VMALLOC_PROT_NORMAL so kasan_unpoison_vmalloc()
> returns early without tagging.
>
> Before this patch, __GFP_SKIP_KASAN wasn't being used with vmalloc
> APIs. So it wasn't being checked. Now its being checked and acted
> upon. Other KASAN modes are unchanged because __GFP_SKIP_KASAN isn't
> defined there.
>
> This is a preparatory patch for optimizing kernel stack allocations.
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
> ---
> Changes since v1:
> - Simplify skip conditions based on the fact that __GFP_SKIP_KASAN
> is zero in non-hw-tags mode.
> - Add __GFP_SKIP_KASAN to GFP_VMALLOC_SUPPORTED list of flags
> ---
> mm/vmalloc.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c607307c657a6..69ae205effb46 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3939,7 +3939,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> __GFP_NOFAIL | __GFP_ZERO |\
> __GFP_NORETRY | __GFP_RETRY_MAYFAIL |\
> GFP_NOFS | GFP_NOIO | GFP_KERNEL_ACCOUNT |\
> - GFP_USER | __GFP_NOLOCKDEP)
> + GFP_USER | __GFP_NOLOCKDEP | __GFP_SKIP_KASAN)
>
> static gfp_t vmalloc_fix_flags(gfp_t flags)
> {
> @@ -3980,6 +3980,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
> *
> * %__GFP_NOWARN can be used to suppress failure messages.
> *
> + * %__GFP_SKIP_KASAN can be used to skip poisoning
> + *
> * Can not be called from interrupt nor NMI contexts.
> * Return: the address of the area or %NULL on failure
> */
> @@ -4041,7 +4043,9 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> * kasan_unpoison_vmalloc().
> */
> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
> - if (kasan_hw_tags_enabled()) {
> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
> +
> + if (kasan_hw_tags_enabled() && !skip_kasan) {
> /*
> * Modify protection bits to allow tagging.
> * This must be done before mapping.
> @@ -4057,7 +4061,8 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> }
>
> /* Take note that the mapping is PAGE_KERNEL. */
> - kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
> + if (!skip_kasan)
> + kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
> }
In the cover letter, you said that __GFP_SKIP_KASAN is only meant for
KASAN_HW_TAGS. IIUC, here you skip passing KASAN_VMALLOC_PROT_NORMAL
even for KASAN_SW_TAGS. The flag is used in mm/kasan/shadow.c.
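To make the question concrete, here is a minimal userspace model (not kernel code) of the v2 hunk. It assumes, as the v2 changelog states, that __GFP_SKIP_KASAN is zero in non-hw-tags mode. The MODEL_* constants and model_v2_kasan_flags() are hypothetical names with illustrative values, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
#define MODEL_SKIP_KASAN_BIT 0x1u /* what the flag expands to with HW_TAGS */
#define MODEL_PROT_NORMAL    0x2u /* stands in for KASAN_VMALLOC_PROT_NORMAL */

/*
 * Model of the v2 hunk for a PAGE_KERNEL mapping. skip_flag is what
 * __GFP_SKIP_KASAN expands to in the modelled build: the bit for
 * HW_TAGS, 0 for every other mode (assumption taken from the changelog).
 * Returns the kasan_flags that would be handed to the unpoison step.
 */
static unsigned int model_v2_kasan_flags(unsigned int gfp_mask,
					 unsigned int skip_flag)
{
	bool skip_kasan;
	unsigned int kasan_flags = 0;

	assert(skip_flag == 0 || skip_flag == MODEL_SKIP_KASAN_BIT);

	skip_kasan = (gfp_mask & skip_flag) != 0;
	if (!skip_kasan)
		kasan_flags |= MODEL_PROT_NORMAL;

	return kasan_flags;
}
```

If that assumption holds, skip_kasan can never be true under KASAN_SW_TAGS and the flag is still passed. If __GFP_SKIP_KASAN were non-zero there, the SW_TAGS case would lose KASAN_VMALLOC_PROT_NORMAL just like the HW_TAGS case.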
--
Catalin
* Re: [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support
2026-03-24 13:26 ` [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support Muhammad Usama Anjum
2026-04-10 18:10 ` Catalin Marinas
@ 2026-04-16 9:10 ` David Hildenbrand
2026-04-22 13:21 ` Ryan Roberts
2 siblings, 0 replies; 18+ messages in thread
From: David Hildenbrand @ 2026-04-16 9:10 UTC (permalink / raw)
To: Muhammad Usama Anjum, Arnd Bergmann, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
Ben Segall, Mel Gorman, Valentin Schneider, Kees Cook,
Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, linux-arch,
linux-kernel, linux-mm, Andrey Konovalov, Marco Elver,
Vincenzo Frascino, Peter Collingbourne, Catalin Marinas,
Will Deacon, Ryan.Roberts
On 3/24/26 14:26, Muhammad Usama Anjum wrote:
> For allocations that will be accessed only with match-all pointers
> (e.g., kernel stacks), setting tags is wasted work. If the caller
> already set __GFP_SKIP_KASAN, don’t skip zeroing the pages and
> don’t set KASAN_VMALLOC_PROT_NORMAL so kasan_unpoison_vmalloc()
> returns early without tagging.
>
> Before this patch, __GFP_SKIP_KASAN wasn't being used with vmalloc
> APIs. So it wasn't being checked. Now its being checked and acted
> upon. Other KASAN modes are unchanged because __GFP_SKIP_KASAN isn't
> defined there.
>
> This is a preparatory patch for optimizing kernel stack allocations.
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
> ---
> Changes since v1:
> - Simplify skip conditions based on the fact that __GFP_SKIP_KASAN
> is zero in non-hw-tags mode.
> - Add __GFP_SKIP_KASAN to GFP_VMALLOC_SUPPORTED list of flags
> ---
> mm/vmalloc.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c607307c657a6..69ae205effb46 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3939,7 +3939,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> __GFP_NOFAIL | __GFP_ZERO |\
> __GFP_NORETRY | __GFP_RETRY_MAYFAIL |\
> GFP_NOFS | GFP_NOIO | GFP_KERNEL_ACCOUNT |\
> - GFP_USER | __GFP_NOLOCKDEP)
> + GFP_USER | __GFP_NOLOCKDEP | __GFP_SKIP_KASAN)
>
> static gfp_t vmalloc_fix_flags(gfp_t flags)
> {
> @@ -3980,6 +3980,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
> *
> * %__GFP_NOWARN can be used to suppress failure messages.
> *
> + * %__GFP_SKIP_KASAN can be used to skip poisoning
> + *
> * Can not be called from interrupt nor NMI contexts.
> * Return: the address of the area or %NULL on failure
> */
> @@ -4041,7 +4043,9 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> * kasan_unpoison_vmalloc().
> */
> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
> - if (kasan_hw_tags_enabled()) {
> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
> +
> + if (kasan_hw_tags_enabled() && !skip_kasan) {
This code gets ever more ugly. :)
After I spotted the horrible ___GFP_SKIP_ZERO that shouldn't even exist,
I thought about teaching vmalloc.c to use a sub-allocator interface to
the buddy instead, where we would essentially say "leave zeroing and
KASAN to the sub-allocator": vmalloc.
Then, we'd get rid of ___GFP_SKIP_ZERO and just use __GFP_SKIP_KASAN to
decide ourselves here what to do with KASAN.
I tried to implement that, but that SW KASAN / !KASAN handling messes
with my brain. :)
In particular, the order for HW KASAN is currently:
a) Allocate pages *and map them*.
b) Zero the pages
That means that we temporarily have unzeroed pages mapped there. I don't
know if that's problematic, but it's one of the differences from the SW KASAN
/ !KASAN handling here.
--
Cheers,
David
* Re: [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support
2026-03-24 13:26 ` [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support Muhammad Usama Anjum
2026-04-10 18:10 ` Catalin Marinas
2026-04-16 9:10 ` David Hildenbrand
@ 2026-04-22 13:21 ` Ryan Roberts
2026-04-22 14:23 ` Dev Jain
2 siblings, 1 reply; 18+ messages in thread
From: Ryan Roberts @ 2026-04-22 13:21 UTC (permalink / raw)
To: Muhammad Usama Anjum, Arnd Bergmann, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
Ben Segall, Mel Gorman, Valentin Schneider, Kees Cook,
Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, linux-arch,
linux-kernel, linux-mm, Andrey Konovalov, Marco Elver,
Vincenzo Frascino, Peter Collingbourne, Catalin Marinas,
Will Deacon, david.hildenbrand
On 24/03/2026 13:26, Muhammad Usama Anjum wrote:
> For allocations that will be accessed only with match-all pointers
> (e.g., kernel stacks), setting tags is wasted work. If the caller
> already set __GFP_SKIP_KASAN, don’t skip zeroing the pages and
> don’t set KASAN_VMALLOC_PROT_NORMAL so kasan_unpoison_vmalloc()
> returns early without tagging.
>
> Before this patch, __GFP_SKIP_KASAN wasn't being used with vmalloc
> APIs. So it wasn't being checked. Now its being checked and acted
> upon. Other KASAN modes are unchanged because __GFP_SKIP_KASAN isn't
> defined there.
>
> This is a preparatory patch for optimizing kernel stack allocations.
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
> ---
> Changes since v1:
> - Simplify skip conditions based on the fact that __GFP_SKIP_KASAN
> is zero in non-hw-tags mode.
> - Add __GFP_SKIP_KASAN to GFP_VMALLOC_SUPPORTED list of flags
> ---
> mm/vmalloc.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c607307c657a6..69ae205effb46 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3939,7 +3939,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> __GFP_NOFAIL | __GFP_ZERO |\
> __GFP_NORETRY | __GFP_RETRY_MAYFAIL |\
> GFP_NOFS | GFP_NOIO | GFP_KERNEL_ACCOUNT |\
> - GFP_USER | __GFP_NOLOCKDEP)
> + GFP_USER | __GFP_NOLOCKDEP | __GFP_SKIP_KASAN)
>
> static gfp_t vmalloc_fix_flags(gfp_t flags)
> {
> @@ -3980,6 +3980,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
> *
> * %__GFP_NOWARN can be used to suppress failure messages.
> *
> + * %__GFP_SKIP_KASAN can be used to skip poisoning
You mean skip *un*poisoning, I think? But you would only want this to apply to
the actual pages mapped by vmalloc. You wouldn't want to skip unpoisoning for
any allocated metadata; I think that is currently possible since the gfp_flags
that are passed into __vmalloc_node_range_noprof() are passed down to
__get_vm_area_node() unmodified. You probably want to explicitly ensure
__GFP_SKIP_KASAN is clear for that internal call?
> + *
> * Can not be called from interrupt nor NMI contexts.
> * Return: the address of the area or %NULL on failure
> */
> @@ -4041,7 +4043,9 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> * kasan_unpoison_vmalloc().
> */
> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
> - if (kasan_hw_tags_enabled()) {
> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
> +
> + if (kasan_hw_tags_enabled() && !skip_kasan) {
> /*
> * Modify protection bits to allow tagging.
> * This must be done before mapping.
> @@ -4057,7 +4061,8 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> }
>
> /* Take note that the mapping is PAGE_KERNEL. */
> - kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
> + if (!skip_kasan)
> + kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
It's pretty ugly to use the absence of this flag to rely on
kasan_unpoison_vmalloc() not unpoisoning. Perhaps it is preferable to just not
call kasan_unpoison_vmalloc() for the skip_kasan case?
> }
>
> /* Allocate physical pages and map them into vmalloc space. */
Perhaps something like this would work:
---8<---
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c31a8615a8328..c340db141df57 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3979,6 +3979,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
* under moderate memory pressure.
*
* %__GFP_NOWARN can be used to suppress failure messages.
+ *
+ * %__GFP_SKIP_KASAN skip unpoisoning of mapped pages (when prot=PAGE_KERNEL).
*
* Can not be called from interrupt nor NMI contexts.
* Return: the address of the area or %NULL on failure
@@ -3993,6 +3995,9 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
kasan_vmalloc_flags_t kasan_flags = KASAN_VMALLOC_NONE;
unsigned long original_align = align;
unsigned int shift = PAGE_SHIFT;
+ bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
+
+ gfp_mask &= ~__GFP_SKIP_KASAN;
if (WARN_ON_ONCE(!size))
return NULL;
@@ -4041,7 +4046,7 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
* kasan_unpoison_vmalloc().
*/
if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
- if (kasan_hw_tags_enabled()) {
+ if (kasan_hw_tags_enabled() && !skip_kasan) {
/*
* Modify protection bits to allow tagging.
* This must be done before mapping.
@@ -4054,6 +4059,12 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
* poisoned and zeroed by kasan_unpoison_vmalloc().
*/
gfp_mask |= __GFP_SKIP_KASAN | __GFP_SKIP_ZERO;
+ } else if (skip_kasan) {
+ /*
+ * Skip page_alloc unpoisoning physical pages backing
+ * VM_ALLOC mapping, as requested by caller.
+ */
+ gfp_mask |= __GFP_SKIP_KASAN;
}
/* Take note that the mapping is PAGE_KERNEL. */
@@ -4078,7 +4089,8 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
(gfp_mask & __GFP_SKIP_ZERO))
kasan_flags |= KASAN_VMALLOC_INIT;
/* KASAN_VMALLOC_PROT_NORMAL already set if required. */
- area->addr = kasan_unpoison_vmalloc(area->addr, size, kasan_flags);
+ if (!skip_kasan)
+ area->addr = kasan_unpoison_vmalloc(area->addr, size, kasan_flags);
/*
* In this function, newly allocated vm_struct has VM_UNINITIALIZED
---8<---
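The flag handling in the diff above can be summarised with a small userspace model. This is only a sketch of the intent, not kernel code: the MODEL_* constants, struct model_plan and model_plan_alloc() are hypothetical names, and it assumes the vm_struct metadata is allocated with gfp_mask before the PAGE_KERNEL block adjusts it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins with illustrative values. */
#define MODEL_GFP_SKIP_KASAN 0x1u
#define MODEL_GFP_SKIP_ZERO  0x2u

struct model_plan {
	unsigned int area_gfp;  /* flags seen by the metadata allocation */
	unsigned int pages_gfp; /* flags seen by page_alloc for backing pages */
	bool call_unpoison;     /* would kasan_unpoison_vmalloc() be called? */
};

static struct model_plan model_plan_alloc(unsigned int gfp_mask, bool hw_tags,
					  bool prot_is_page_kernel)
{
	struct model_plan plan;
	bool skip_kasan = (gfp_mask & MODEL_GFP_SKIP_KASAN) != 0;

	/* Remember the request, then hide it from the metadata allocation. */
	gfp_mask &= ~MODEL_GFP_SKIP_KASAN;
	plan.area_gfp = gfp_mask;
	assert(!(plan.area_gfp & MODEL_GFP_SKIP_KASAN));

	if (prot_is_page_kernel) {
		if (hw_tags && !skip_kasan) {
			/* vmalloc tags and zeroes later, so page_alloc skips both. */
			gfp_mask |= MODEL_GFP_SKIP_KASAN | MODEL_GFP_SKIP_ZERO;
		} else if (skip_kasan) {
			/* Caller asked for no unpoisoning; page_alloc still zeroes. */
			gfp_mask |= MODEL_GFP_SKIP_KASAN;
		}
	}
	plan.pages_gfp = gfp_mask;
	plan.call_unpoison = !skip_kasan;

	return plan;
}
```

For example, model_plan_alloc(MODEL_GFP_SKIP_KASAN, true, true) leaves area_gfp without the skip flag, passes only MODEL_GFP_SKIP_KASAN on for the backing pages, and does not call the unpoison step.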
Thanks,
Ryan
* Re: [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support
2026-04-22 13:21 ` Ryan Roberts
@ 2026-04-22 14:23 ` Dev Jain
2026-04-22 14:38 ` Ryan Roberts
0 siblings, 1 reply; 18+ messages in thread
From: Dev Jain @ 2026-04-22 14:23 UTC (permalink / raw)
To: Ryan Roberts, Muhammad Usama Anjum, Arnd Bergmann, Ingo Molnar,
Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
Kees Cook, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, linux-arch,
linux-kernel, linux-mm, Andrey Konovalov, Marco Elver,
Vincenzo Frascino, Peter Collingbourne, Catalin Marinas,
Will Deacon, david.hildenbrand
On 22/04/26 6:51 pm, Ryan Roberts wrote:
> On 24/03/2026 13:26, Muhammad Usama Anjum wrote:
>> For allocations that will be accessed only with match-all pointers
>> (e.g., kernel stacks), setting tags is wasted work. If the caller
>> already set __GFP_SKIP_KASAN, don’t skip zeroing the pages and
>> don’t set KASAN_VMALLOC_PROT_NORMAL so kasan_unpoison_vmalloc()
>> returns early without tagging.
>>
>> Before this patch, __GFP_SKIP_KASAN wasn't being used with vmalloc
>> APIs. So it wasn't being checked. Now its being checked and acted
>> upon. Other KASAN modes are unchanged because __GFP_SKIP_KASAN isn't
>> defined there.
>>
>> This is a preparatory patch for optimizing kernel stack allocations.
>>
>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
>> ---
>> Changes since v1:
>> - Simplify skip conditions based on the fact that __GFP_SKIP_KASAN
>> is zero in non-hw-tags mode.
>> - Add __GFP_SKIP_KASAN to GFP_VMALLOC_SUPPORTED list of flags
>> ---
>> mm/vmalloc.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index c607307c657a6..69ae205effb46 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -3939,7 +3939,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>> __GFP_NOFAIL | __GFP_ZERO |\
>> __GFP_NORETRY | __GFP_RETRY_MAYFAIL |\
>> GFP_NOFS | GFP_NOIO | GFP_KERNEL_ACCOUNT |\
>> - GFP_USER | __GFP_NOLOCKDEP)
>> + GFP_USER | __GFP_NOLOCKDEP | __GFP_SKIP_KASAN)
>>
>> static gfp_t vmalloc_fix_flags(gfp_t flags)
>> {
>> @@ -3980,6 +3980,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
>> *
>> * %__GFP_NOWARN can be used to suppress failure messages.
>> *
>> + * %__GFP_SKIP_KASAN can be used to skip poisoning
>
> You mean skip *un*poisoning, I think? But you would only want this to apply to
> the actual pages mapped by vmalloc. You wouldn't want to skip unpoisoning for
> any allocated metadata; I think that is currently possible since the gfp_flags
> that are passed into __vmalloc_node_range_noprof() are passed down to
> __get_vm_area_node() unmodified. You probably want to explicitly ensure
> __GFP_SKIP_KASAN is clear for that internal call?
>
>> + *
>> * Can not be called from interrupt nor NMI contexts.
>> * Return: the address of the area or %NULL on failure
>> */
>> @@ -4041,7 +4043,9 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
>> * kasan_unpoison_vmalloc().
>> */
>> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
>> - if (kasan_hw_tags_enabled()) {
>> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
>> +
>> + if (kasan_hw_tags_enabled() && !skip_kasan) {
>> /*
>> * Modify protection bits to allow tagging.
>> * This must be done before mapping.
>> @@ -4057,7 +4061,8 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
>> }
>>
>> /* Take note that the mapping is PAGE_KERNEL. */
>> - kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
>> + if (!skip_kasan)
>> + kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
>
> It's pretty ugly to use the absence of this flag to rely on
> kasan_unpoison_vmalloc() not unpoisoning. Perhaps it is preferable to just not
> call kasan_unpoison_vmalloc() for the skip_kasan case?
>
>> }
>>
>> /* Allocate physical pages and map them into vmalloc space. */
>
> Perhaps something like this would work:
>
> ---8<---
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c31a8615a8328..c340db141df57 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3979,6 +3979,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
> * under moderate memory pressure.
> *
> * %__GFP_NOWARN can be used to suppress failure messages.
> + *
> + * %__GFP_SKIP_KASAN skip unpoisoning of mapped pages (when prot=PAGE_KERNEL).
> *
> * Can not be called from interrupt nor NMI contexts.
> * Return: the address of the area or %NULL on failure
> @@ -3993,6 +3995,9 @@ void *__vmalloc_node_range_noprof(unsigned long size,
> unsigned long align,
> kasan_vmalloc_flags_t kasan_flags = KASAN_VMALLOC_NONE;
> unsigned long original_align = align;
> unsigned int shift = PAGE_SHIFT;
> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
> +
> + gfp_mask &= ~__GFP_SKIP_KASAN;
Okay so this is so that metadata allocation can keep using normal
page allocator side unpoisoning.
> if (WARN_ON_ONCE(!size))
> return NULL;
> @@ -4041,7 +4046,7 @@ void *__vmalloc_node_range_noprof(unsigned long size,
> unsigned long align,
> * kasan_unpoison_vmalloc().
> */
> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
> - if (kasan_hw_tags_enabled()) {
> + if (kasan_hw_tags_enabled() && !skip_kasan) {
Why do we want to elide GFP_SKIP_ZERO (set below) in this case?
> /*
> * Modify protection bits to allow tagging.
> * This must be done before mapping.
> @@ -4054,6 +4059,12 @@ void *__vmalloc_node_range_noprof(unsigned long size,
> unsigned long align,
> * poisoned and zeroed by kasan_unpoison_vmalloc().
> */
> gfp_mask |= __GFP_SKIP_KASAN | __GFP_SKIP_ZERO;
> + } else if (skip_kasan) {
> + /*
> + * Skip page_alloc unpoisoning physical pages backing
> + * VM_ALLOC mapping, as requested by caller.
> + */
> + gfp_mask |= __GFP_SKIP_KASAN;
> }
> /* Take note that the mapping is PAGE_KERNEL. */
> @@ -4078,7 +4089,8 @@ void *__vmalloc_node_range_noprof(unsigned long size,
> unsigned long align,
> (gfp_mask & __GFP_SKIP_ZERO))
> kasan_flags |= KASAN_VMALLOC_INIT;
> /* KASAN_VMALLOC_PROT_NORMAL already set if required. */
> - area->addr = kasan_unpoison_vmalloc(area->addr, size, kasan_flags);
> + if (!skip_kasan)
> + area->addr = kasan_unpoison_vmalloc(area->addr, size, kasan_flags);
I really think we should do some decoupling here - GFP_SKIP_KASAN means,
"skip KASAN when going through page allocator". Now we reuse this flag
to skip vmalloc unpoisoning.
Some code path using GFP_SKIP_KASAN (which is highly likely given that
GFP_HIGHUSER_MOVABLE has this) and also using vmalloc() will unintentionally
also skip vmalloc unpoisoning.
I think we are doing patch 1 because of patch 2 - so in patch 2, perhaps
instead of calling __vmalloc_node we can call __vmalloc_node_range_noprof and
shift this "skip vmalloc unpoisoning" functionality into vmalloc flags instead?
Perhaps this won't work for the nommu case (__vmalloc_node has two definitions),
just a line of thought.
> /*
> * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>
> ---8<---
>
> Thanks,
> Ryan
>
>
* Re: [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support
2026-04-22 14:23 ` Dev Jain
@ 2026-04-22 14:38 ` Ryan Roberts
2026-04-22 15:59 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 18+ messages in thread
From: Ryan Roberts @ 2026-04-22 14:38 UTC (permalink / raw)
To: Dev Jain, Muhammad Usama Anjum, Arnd Bergmann, Ingo Molnar,
Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
Kees Cook, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, linux-arch,
linux-kernel, linux-mm, Andrey Konovalov, Marco Elver,
Vincenzo Frascino, Peter Collingbourne, Catalin Marinas,
Will Deacon, david.hildenbrand
On 22/04/2026 15:23, Dev Jain wrote:
>
>
> On 22/04/26 6:51 pm, Ryan Roberts wrote:
>> On 24/03/2026 13:26, Muhammad Usama Anjum wrote:
>>> For allocations that will be accessed only with match-all pointers
>>> (e.g., kernel stacks), setting tags is wasted work. If the caller
>>> already set __GFP_SKIP_KASAN, don’t skip zeroing the pages and
>>> don’t set KASAN_VMALLOC_PROT_NORMAL so kasan_unpoison_vmalloc()
>>> returns early without tagging.
>>>
>>> Before this patch, __GFP_SKIP_KASAN wasn't being used with vmalloc
>>> APIs. So it wasn't being checked. Now its being checked and acted
>>> upon. Other KASAN modes are unchanged because __GFP_SKIP_KASAN isn't
>>> defined there.
>>>
>>> This is a preparatory patch for optimizing kernel stack allocations.
>>>
>>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
>>> ---
>>> Changes since v1:
>>> - Simplify skip conditions based on the fact that __GFP_SKIP_KASAN
>>> is zero in non-hw-tags mode.
>>> - Add __GFP_SKIP_KASAN to GFP_VMALLOC_SUPPORTED list of flags
>>> ---
>>> mm/vmalloc.c | 11 ++++++++---
>>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index c607307c657a6..69ae205effb46 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3939,7 +3939,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>>> __GFP_NOFAIL | __GFP_ZERO |\
>>> __GFP_NORETRY | __GFP_RETRY_MAYFAIL |\
>>> GFP_NOFS | GFP_NOIO | GFP_KERNEL_ACCOUNT |\
>>> - GFP_USER | __GFP_NOLOCKDEP)
>>> + GFP_USER | __GFP_NOLOCKDEP | __GFP_SKIP_KASAN)
>>>
>>> static gfp_t vmalloc_fix_flags(gfp_t flags)
>>> {
>>> @@ -3980,6 +3980,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
>>> *
>>> * %__GFP_NOWARN can be used to suppress failure messages.
>>> *
>>> + * %__GFP_SKIP_KASAN can be used to skip poisoning
>>
>> You mean skip *un*poisoning, I think? But you would only want this to apply to
>> the actual pages mapped by vmalloc. You wouldn't want to skip unpoisoning for
>> any allocated metadata; I think that is currently possible since the gfp_flags
>> that are passed into __vmalloc_node_range_noprof() are passed down to
>> __get_vm_area_node() unmodified. You probably want to explicitly ensure
>> __GFP_SKIP_KASAN is clear for that internal call?
>>
>>> + *
>>> * Can not be called from interrupt nor NMI contexts.
>>> * Return: the address of the area or %NULL on failure
>>> */
>>> @@ -4041,7 +4043,9 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
>>> * kasan_unpoison_vmalloc().
>>> */
>>> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
>>> - if (kasan_hw_tags_enabled()) {
>>> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
>>> +
>>> + if (kasan_hw_tags_enabled() && !skip_kasan) {
>>> /*
>>> * Modify protection bits to allow tagging.
>>> * This must be done before mapping.
>>> @@ -4057,7 +4061,8 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
>>> }
>>>
>>> /* Take note that the mapping is PAGE_KERNEL. */
>>> - kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
>>> + if (!skip_kasan)
>>> + kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
>>
>> It's pretty ugly to use the absence of this flag to rely on
>> kasan_unpoison_vmalloc() not unpoisoning. Perhaps it is preferable to just not
>> call kasan_unpoison_vmalloc() for the skip_kasan case?
>>
>>> }
>>>
>>> /* Allocate physical pages and map them into vmalloc space. */
>>
>> Perhaps something like this would work:
>>
>> ---8<---
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index c31a8615a8328..c340db141df57 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -3979,6 +3979,8 @@ static gfp_t vmalloc_fix_flags(gfp_t flags)
>> * under moderate memory pressure.
>> *
>> * %__GFP_NOWARN can be used to suppress failure messages.
>> + *
>> + * %__GFP_SKIP_KASAN skip unpoisoning of mapped pages (when prot=PAGE_KERNEL).
>> *
>> * Can not be called from interrupt nor NMI contexts.
>> * Return: the address of the area or %NULL on failure
>> @@ -3993,6 +3995,9 @@ void *__vmalloc_node_range_noprof(unsigned long size,
>> unsigned long align,
>> kasan_vmalloc_flags_t kasan_flags = KASAN_VMALLOC_NONE;
>> unsigned long original_align = align;
>> unsigned int shift = PAGE_SHIFT;
>> + bool skip_kasan = gfp_mask & __GFP_SKIP_KASAN;
>> +
>> + gfp_mask &= ~__GFP_SKIP_KASAN;
>
> Okay so this is so that metadata allocation can keep using normal
> page allocator side unpoisoning.
Yes.
>
>> if (WARN_ON_ONCE(!size))
>> return NULL;
>> @@ -4041,7 +4046,7 @@ void *__vmalloc_node_range_noprof(unsigned long size,
>> unsigned long align,
>> * kasan_unpoison_vmalloc().
>> */
>> if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
>> - if (kasan_hw_tags_enabled()) {
>> + if (kasan_hw_tags_enabled() && !skip_kasan) {
>
> Why do we want to elide GFP_SKIP_ZERO (set below) in this case?
You mean why do we want to skip initializing the allocated memory to zero for
the case where kasan HW_TAGS is enabled and we are not skipping kasan unpoisoning?
Because setting tags at the same time as zeroing the memory is less expensive
than doing them both as separate operations. So we tell page_alloc not to bother
zeroing the memory and kasan_unpoison_vmalloc() does it at the same time as
setting the tags instead. See kasan_unpoison() which ultimately calls
mte_set_mem_tag_range().
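As a rough illustration of why the combined operation is cheaper, here is a userspace model (not the arm64 implementation) in which one loop over 16-byte granules both stores the tag and, if asked, zeroes the data. MODEL_GRANULE and model_set_tag_range() are hypothetical names, and the separate tags[] array only stands in for tag storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_GRANULE 16u /* bytes covered by one tag in this model */

/*
 * Walk the range once: set the tag of every granule and, when init is
 * non-zero, zero the granule's data in the same step. Doing the zeroing
 * separately would require a second pass over the same memory.
 */
static void model_set_tag_range(unsigned char *mem, unsigned char *tags,
				size_t size, unsigned char tag, int init)
{
	size_t g;

	assert(size % MODEL_GRANULE == 0);

	for (g = 0; g < size / MODEL_GRANULE; g++) {
		tags[g] = tag;
		if (init)
			memset(mem + g * MODEL_GRANULE, 0, MODEL_GRANULE);
	}
}
```

With init set, the caller no longer needs page_alloc to zero the pages first, which is what __GFP_SKIP_ZERO expresses in the hunk above.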
>
>> /*
>> * Modify protection bits to allow tagging.
>> * This must be done before mapping.
>> @@ -4054,6 +4059,12 @@ void *__vmalloc_node_range_noprof(unsigned long size,
>> unsigned long align,
>> * poisoned and zeroed by kasan_unpoison_vmalloc().
>> */
>> gfp_mask |= __GFP_SKIP_KASAN | __GFP_SKIP_ZERO;
>> + } else if (skip_kasan) {
>> + /*
>> + * Skip page_alloc unpoisoning physical pages backing
>> + * VM_ALLOC mapping, as requested by caller.
>> + */
>> + gfp_mask |= __GFP_SKIP_KASAN;
>> }
>> /* Take note that the mapping is PAGE_KERNEL. */
>> @@ -4078,7 +4089,8 @@ void *__vmalloc_node_range_noprof(unsigned long size,
>> unsigned long align,
>> (gfp_mask & __GFP_SKIP_ZERO))
>> kasan_flags |= KASAN_VMALLOC_INIT;
>> /* KASAN_VMALLOC_PROT_NORMAL already set if required. */
>> - area->addr = kasan_unpoison_vmalloc(area->addr, size, kasan_flags);
>> + if (!skip_kasan)
>> + area->addr = kasan_unpoison_vmalloc(area->addr, size, kasan_flags);
>
> I really think we should do some decoupling here - GFP_SKIP_KASAN means,
> "skip KASAN when going through page allocator". Now we reuse this flag
> to skip vmalloc unpoisoning.
>
> Some code path using GFP_SKIP_KASAN (which is highly likely given that
> GFP_HIGHUSER_MOVABLE has this) and also using vmalloc() will unintentionally
> also skip vmalloc unpoisoning.
If a caller wants to vmalloc() memory with GFP_HIGHUSER_MOVABLE (which seems
HIGHLY suspect to me) then surely leaving the memory poisoned is *exactly* what
they expect?
>
> I think we are doing patch 1 because of patch 2 - so in patch 2, perhaps
> instead of calling __vmalloc_node we can call __vmalloc_node_range_noprof and
> shift this "skip vmalloc unpoisoning" functionality into vmalloc flags instead?
This is exactly how Usama was doing it in v1. I suggested we should just reuse
the existing flag since it already provides the semantic we want and is less
confusing than introducing a new flag.
I know David is keen to do a wider rework and remove/rename/change the semantics
of __GFP_SKIP_KASAN, but I'm hoping that if we just continue to use the existing
flag and its semantics for vmalloc then there is no reason why this series can't
be merged independently of that wider rework.
Thanks,
Ryan
> Perhaps this won't work for the nommu case (__vmalloc_node has two definitions),
> just a line of thought.
>
>
>> /*
>> * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>>
>> ---8<---
>>
>> Thanks,
>> Ryan
>>
>>
>
* Re: [PATCH v2 1/3] vmalloc: add __GFP_SKIP_KASAN support
2026-04-22 14:38 ` Ryan Roberts
@ 2026-04-22 15:59 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 18+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-22 15:59 UTC (permalink / raw)
To: Ryan Roberts, Dev Jain, Muhammad Usama Anjum, Arnd Bergmann,
Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, Kees Cook, Andrew Morton, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Uladzislau Rezki, linux-arch,
linux-kernel, linux-mm, Andrey Konovalov, Marco Elver,
Vincenzo Frascino, Peter Collingbourne, Catalin Marinas,
Will Deacon, david.hildenbrand
>> I think we are doing patch 1 because of patch 2 - so in patch 2, perhaps
>> instead of calling __vmalloc_node we can call __vmalloc_node_range_noprof and
>> shift this "skip vmalloc unpoisoning" functionality into vmalloc flags instead?
>
> This is exactly how Usama was doing it in v1. I suggested we should just reuse
> the existing flag since it already provides the semantic we want and is less
> confusing than introducing a new flag.
>
> I know David is keen to do a wider rework and remove/rename/change the semantics
> of __GFP_SKIP_KASAN, but I'm hoping that if we just continue to use the existing
> flag and its semantics for vmalloc then there is no reason why this series can't
> be merged independently of that wider rework.
Independent of how the flag will be called, I think it will have the same
semantics. How we'll implement that internally is a different question.
So I agree that adding __GFP_SKIP_KASAN support here is the right approach for
the time being.
--
Cheers,
David