* [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path
@ 2026-04-25 7:07 Mingyu Wang
2026-04-27 7:55 ` Muchun Song
0 siblings, 1 reply; 8+ messages in thread
From: Mingyu Wang @ 2026-04-25 7:07 UTC (permalink / raw)
To: muchun.song; +Cc: Liam.Howlett, akpm, linux-mm, linux-kernel, Mingyu Wang
While fuzzing with Syzkaller and fault injection (failslab) enabled,
I observed a persistent resv_map memory leak in the hugetlb mmap error path.
BUG: memory leak
unreferenced object 0xffff888110b92400 (size 512):
comm "syz.0.5386", pid 20390, jiffies 4298157188
backtrace:
__kmalloc_cache_noprof+0x509/0x6e0
resv_map_alloc+0x47/0x3a0
hugetlb_reserve_pages+0x758/0x1220
hugetlbfs_file_mmap_prepare+0x492/0x790
__mmap_region+0x1ae6/0x29f0
This is a regression introduced by the recent VMA iterator and mmap region
refactoring, which decoupled mmap preparation from VMA completion.
In `__mmap_region()`, `call_mmap_prepare()` triggers `hugetlbfs_file_mmap_prepare()`,
which successfully allocates the `resv_map` and registers a `success_hook`
in `desc->action`.
If `__mmap_new_vma()` subsequently fails (e.g., `vma_iter_prealloc()`
returns -ENOMEM due to failslab), the code jumps to `abort_munmap`.
However, the `desc` structure is completely discarded without invoking
any cleanup. The newly allocated empty VMA is freed, but since
`set_vma_user_defined_fields()` was never reached, `vm_area_free()`
doesn't call `hugetlb_vm_close()`. Thus, the `resv_map` is permanently leaked.
This RFC proposes adding an `abort_hook` to `struct mmap_action`
so that subsystems can properly clean up resources allocated during the
`mmap_prepare` phase if VMA creation fails.
Any feedback on whether this architectural approach is correct, or how to
properly implement the hugetlb unreserve rollback, would be highly appreciated.
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
fs/hugetlbfs/inode.c | 9 +++++++++
include/linux/mm_types.h | 2 ++
mm/vma.c | 4 ++++
3 files changed, 15 insertions(+)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 8b05bec08e04..002bb6d9ca23 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -102,6 +102,14 @@ static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma)
return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma);
}
+static void hugetlb_file_mmap_prepare_abort(struct vm_area_desc *desc)
+{
+ /*
+ * TODO: Implement the proper rollback for hugetlb_reserve_pages()
+ * and drop the resv_map reference held in the desc here.
+ */
+}
+
static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc)
{
struct file *file = desc->file;
@@ -172,6 +180,7 @@ static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc)
if (!ret) {
/* Allocate the VMA lock after we set it up. */
desc->action.success_hook = hugetlb_file_mmap_prepare_success;
+ desc->action.abort_hook = hugetlb_file_mmap_prepare_abort;
/*
* We cannot permit the rmap finding this VMA in the time
* between the VMA being inserted into the VMA tree and the
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index a308e2c23b82..9320f6699fa9 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -861,6 +861,8 @@ struct mmap_action {
* it is not valid to clear the error here.
*/
int (*error_hook)(int err);
+
+ void (*abort_hook)(struct vm_area_desc *desc);
/*
* This should be set in rare instances where the operation required
diff --git a/mm/vma.c b/mm/vma.c
index 377321b48734..d64cea5b4335 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2799,6 +2799,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
*/
if (map.file_doesnt_need_get)
fput(map.file);
+
+ if (have_mmap_prepare && desc.action.abort_hook)
+ desc.action.abort_hook(&desc);
+
vms_abort_munmap_vmas(&map.vms, &map.mas_detach);
return error;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-25 7:07 [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path Mingyu Wang @ 2026-04-27 7:55 ` Muchun Song 2026-04-27 8:17 ` David Hildenbrand (Arm) 2026-04-27 14:18 ` Lorenzo Stoakes 0 siblings, 2 replies; 8+ messages in thread From: Muchun Song @ 2026-04-27 7:55 UTC (permalink / raw) To: Mingyu Wang Cc: Liam.Howlett, akpm, linux-mm, linux-kernel, ljs, vbabka, jannh, pfalcato, osalvador, david > On Apr 25, 2026, at 15:07, Mingyu Wang <25181214217@stu.xidian.edu.cn> wrote: > > While fuzzing with Syzkaller and fault injection (failslab) enabled, > I observed a persistent resv_map memory leak in the hugetlb mmap error path. > > BUG: memory leak > unreferenced object 0xffff888110b92400 (size 512): > comm "syz.0.5386", pid 20390, jiffies 4298157188 > backtrace: > __kmalloc_cache_noprof+0x509/0x6e0 > resv_map_alloc+0x47/0x3a0 > hugetlb_reserve_pages+0x758/0x1220 > hugetlbfs_file_mmap_prepare+0x492/0x790 > __mmap_region+0x1ae6/0x29f0 > > This is a regression introduced by the recent VMA iterator and mmap region > refactoring, which decoupled mmap preparation from VMA completion. > > In `__mmap_region()`, `call_mmap_prepare()` triggers `hugetlbfs_file_mmap_prepare()`, > which successfully allocates the `resv_map` and registers a `success_hook` > in `desc->action`. > > If `__mmap_new_vma()` subsequently fails (e.g., `vma_iter_prealloc()` > returns -ENOMEM due to failslab), the code jumps to `abort_munmap`. > However, the `desc` structure is completely discarded without invoking > any cleanup. The newly allocated empty VMA is freed, but since > `set_vma_user_defined_fields()` was never reached, `vm_area_free()` > doesn't call `hugetlb_vm_close()`. Thus, the `resv_map` is permanently leaked. > > This RFC proposes adding an `abort_hook` to `struct mmap_action` > so that subsystems can properly clean up resources allocated during the > `mmap_prepare` phase if VMA creation fails. > > Any feedback on whether this architectural approach is correct, or how to > properly implement the hugetlb unreserve rollback, would be highly appreciated. Please use ./scripts/get_maintainer.pl to get full mail list for Cc/To since it is not only related to HugeTLB subsystem. It will also consider the author of commit introducing the problem. > > Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn> > --- > fs/hugetlbfs/inode.c | 9 +++++++++ > include/linux/mm_types.h | 2 ++ > mm/vma.c | 4 ++++ > 3 files changed, 15 insertions(+) > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > index 8b05bec08e04..002bb6d9ca23 100644 > --- a/fs/hugetlbfs/inode.c > +++ b/fs/hugetlbfs/inode.c > @@ -102,6 +102,14 @@ static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma) > return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma); > } > > +static void hugetlb_file_mmap_prepare_abort(struct vm_area_desc *desc) > +{ > + /* > + * TODO: Implement the proper rollback for hugetlb_reserve_pages() > + * and drop the resv_map reference held in the desc here. > + */ > +} > + > static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > { > struct file *file = desc->file; > @@ -172,6 +180,7 @@ static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > if (!ret) { > /* Allocate the VMA lock after we set it up. */ > desc->action.success_hook = hugetlb_file_mmap_prepare_success; > + desc->action.abort_hook = hugetlb_file_mmap_prepare_abort; > /* > * We cannot permit the rmap finding this VMA in the time > * between the VMA being inserted into the VMA tree and the > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index a308e2c23b82..9320f6699fa9 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -861,6 +861,8 @@ struct mmap_action { > * it is not valid to clear the error here. > */ > int (*error_hook)(int err); > + > + void (*abort_hook)(struct vm_area_desc *desc); At least for me, it is not good name to distinguish it from error_hook. abort_mmap_prepare? I am not sure if it is a good solution, Cc other MM maintainers as well. Muchun, Thanks. > > /* > * This should be set in rare instances where the operation required > diff --git a/mm/vma.c b/mm/vma.c > index 377321b48734..d64cea5b4335 100644 > --- a/mm/vma.c > +++ b/mm/vma.c > @@ -2799,6 +2799,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > */ > if (map.file_doesnt_need_get) > fput(map.file); > + > + if (have_mmap_prepare && desc.action.abort_hook) > + desc.action.abort_hook(&desc); > + > vms_abort_munmap_vmas(&map.vms, &map.mas_detach); > return error; > } > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-27 7:55 ` Muchun Song @ 2026-04-27 8:17 ` David Hildenbrand (Arm) 2026-04-27 11:14 ` 王明煜 2026-04-27 14:18 ` Lorenzo Stoakes 1 sibling, 1 reply; 8+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-27 8:17 UTC (permalink / raw) To: Muchun Song, Mingyu Wang Cc: Liam.Howlett, akpm, linux-mm, linux-kernel, ljs, vbabka, jannh, pfalcato, osalvador >> /* >> * We cannot permit the rmap finding this VMA in the time >> * between the VMA being inserted into the VMA tree and the >> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h >> index a308e2c23b82..9320f6699fa9 100644 >> --- a/include/linux/mm_types.h >> +++ b/include/linux/mm_types.h >> @@ -861,6 +861,8 @@ struct mmap_action { >> * it is not valid to clear the error here. >> */ >> int (*error_hook)(int err); >> + >> + void (*abort_hook)(struct vm_area_desc *desc); > > At least for me, it is not good name to distinguish it from error_hook. > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > MM maintainers as well. > I'm sure Lorenzo will provide feedback here :) -- Cheers, David ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-27 8:17 ` David Hildenbrand (Arm) @ 2026-04-27 11:14 ` 王明煜 0 siblings, 0 replies; 8+ messages in thread From: 王明煜 @ 2026-04-27 11:14 UTC (permalink / raw) To: David Hildenbrand (Arm), muchun.song Cc: Muchun Song, Liam.Howlett, akpm, linux-mm, linux-kernel, vbabka, jannh, pfalcato, osalvador, ljs Hi Muchun and David, Thanks a lot for taking the time to review! > At least for me, it is not good name to distinguish it from error_hook. > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > MM maintainers as well. I completely agree. `abort_mmap_prepare` is indeed much clearer and more specific to its purpose. I will update the naming in the v2 patch. > Please use ./scripts/get_maintainer.pl to get full mail list for Cc/To since > it is not only related to HugeTLB subsystem. I apologize for the missing CCs. I did generate the full list using `get_maintainer.pl`, but the SMTP server of Xidian University drops emails with too many foreign recipients due to strict anti-spam policies. I had to heavily trim the CC list just to get the RFC patch delivered. I will try to use an alternative email or chunk the CCs to ensure all relevant authors are included in the next version. I will hold off on sending the v2 patch until Lorenzo shares his thoughts on whether this architectural approach is the right way to go. Best regards, Mingyu > -----原始邮件----- > 发件人: "David Hildenbrand (Arm)" <david@kernel.org> > 发送时间:2026-04-27 16:17:23 (星期一) > 收件人: "Muchun Song" <muchun.song@linux.dev>, "Mingyu Wang" <25181214217@stu.xidian.edu.cn> > 抄送: Liam.Howlett@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, ljs@kernel.org, vbabka@kernel.org, jannh@google.com, pfalcato@suse.de, osalvador@suse.de > 主题: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path > > > >> /* > >> * We cannot permit the rmap finding this VMA in the time > >> * between the VMA being inserted into the VMA tree and the > >> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > >> index a308e2c23b82..9320f6699fa9 100644 > >> --- a/include/linux/mm_types.h > >> +++ b/include/linux/mm_types.h > >> @@ -861,6 +861,8 @@ struct mmap_action { > >> * it is not valid to clear the error here. > >> */ > >> int (*error_hook)(int err); > >> + > >> + void (*abort_hook)(struct vm_area_desc *desc); > > > > At least for me, it is not good name to distinguish it from error_hook. > > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > > MM maintainers as well. > > > > I'm sure Lorenzo will provide feedback here :) > > -- > Cheers, > > David ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-27 7:55 ` Muchun Song 2026-04-27 8:17 ` David Hildenbrand (Arm) @ 2026-04-27 14:18 ` Lorenzo Stoakes 2026-04-27 14:39 ` 王明煜 1 sibling, 1 reply; 8+ messages in thread From: Lorenzo Stoakes @ 2026-04-27 14:18 UTC (permalink / raw) To: Muchun Song Cc: Mingyu Wang, Liam.Howlett, akpm, linux-mm, linux-kernel, vbabka, jannh, pfalcato, osalvador, david On Mon, Apr 27, 2026 at 03:55:00PM +0800, Muchun Song wrote: > > > On Apr 25, 2026, at 15:07, Mingyu Wang <25181214217@stu.xidian.edu.cn> wrote: > > > > While fuzzing with Syzkaller and fault injection (failslab) enabled, > > I observed a persistent resv_map memory leak in the hugetlb mmap error path. > > > > BUG: memory leak > > unreferenced object 0xffff888110b92400 (size 512): > > comm "syz.0.5386", pid 20390, jiffies 4298157188 > > backtrace: > > __kmalloc_cache_noprof+0x509/0x6e0 > > resv_map_alloc+0x47/0x3a0 > > hugetlb_reserve_pages+0x758/0x1220 > > hugetlbfs_file_mmap_prepare+0x492/0x790 > > __mmap_region+0x1ae6/0x29f0 > > > > This is a regression introduced by the recent VMA iterator and mmap region > > refactoring, which decoupled mmap preparation from VMA completion. > > > > In `__mmap_region()`, `call_mmap_prepare()` triggers `hugetlbfs_file_mmap_prepare()`, > > which successfully allocates the `resv_map` and registers a `success_hook` > > in `desc->action`. > > > > If `__mmap_new_vma()` subsequently fails (e.g., `vma_iter_prealloc()` > > returns -ENOMEM due to failslab), the code jumps to `abort_munmap`. > > However, the `desc` structure is completely discarded without invoking > > any cleanup. The newly allocated empty VMA is freed, but since > > `set_vma_user_defined_fields()` was never reached, `vm_area_free()` > > doesn't call `hugetlb_vm_close()`. Thus, the `resv_map` is permanently leaked. > > > > This RFC proposes adding an `abort_hook` to `struct mmap_action` > > so that subsystems can properly clean up resources allocated during the > > `mmap_prepare` phase if VMA creation fails. Yeah sorry, this is a general problem that I addressed already with the vm_ops->mapped callback. I'll come up with a patch to fix this up for hugetlb, thanks for highlighting this. I plan to get rid of the success hook in any case, it's only because hugetlb is doing something really... not great with the 'VMA lock' (really hugetlb VMA lock I suppose) that we need a VMA pointer at the point. > > > > Any feedback on whether this architectural approach is correct, or how to > > properly implement the hugetlb unreserve rollback, would be highly appreciated. > > Please use ./scripts/get_maintainer.pl to get full mail list for Cc/To since > it is not only related to HugeTLB subsystem. It will also consider the author > of commit introducing the problem. Please do do that, I wrote this whole thing, so y'know :) Especially at the moment I am very very busy with LSF coming up so you need to make sure you include me in the mail. > > > > > Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn> > > --- > > fs/hugetlbfs/inode.c | 9 +++++++++ > > include/linux/mm_types.h | 2 ++ > > mm/vma.c | 4 ++++ > > 3 files changed, 15 insertions(+) > > > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > > index 8b05bec08e04..002bb6d9ca23 100644 > > --- a/fs/hugetlbfs/inode.c > > +++ b/fs/hugetlbfs/inode.c > > @@ -102,6 +102,14 @@ static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma) > > return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma); > > } > > > > +static void hugetlb_file_mmap_prepare_abort(struct vm_area_desc *desc) > > +{ > > + /* > > + * TODO: Implement the proper rollback for hugetlb_reserve_pages() > > + * and drop the resv_map reference held in the desc here. > > + */ > > +} > > + > > static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > { > > struct file *file = desc->file; > > @@ -172,6 +180,7 @@ static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > if (!ret) { > > /* Allocate the VMA lock after we set it up. */ > > desc->action.success_hook = hugetlb_file_mmap_prepare_success; > > + desc->action.abort_hook = hugetlb_file_mmap_prepare_abort; > > /* > > * We cannot permit the rmap finding this VMA in the time > > * between the VMA being inserted into the VMA tree and the > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > > index a308e2c23b82..9320f6699fa9 100644 > > --- a/include/linux/mm_types.h > > +++ b/include/linux/mm_types.h > > @@ -861,6 +861,8 @@ struct mmap_action { > > * it is not valid to clear the error here. > > */ > > int (*error_hook)(int err); > > + > > + void (*abort_hook)(struct vm_area_desc *desc); > > At least for me, it is not good name to distinguish it from error_hook. > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > MM maintainers as well. Yeah no we definitely don't want this, I plan to eliminate these hooks eventually. Really intend on adding no others, these were to work around effectively legacy issues in mmap callbacks. > > Muchun, > Thanks. > > > > > /* > > * This should be set in rare instances where the operation required > > diff --git a/mm/vma.c b/mm/vma.c > > index 377321b48734..d64cea5b4335 100644 > > --- a/mm/vma.c > > +++ b/mm/vma.c > > @@ -2799,6 +2799,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > > */ > > if (map.file_doesnt_need_get) > > fput(map.file); > > + > > + if (have_mmap_prepare && desc.action.abort_hook) > > + desc.action.abort_hook(&desc); > > + > > vms_abort_munmap_vmas(&map.vms, &map.mas_detach); > > return error; > > } > > -- > > 2.34.1 > > > Thanks, Lorenzo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-27 14:18 ` Lorenzo Stoakes @ 2026-04-27 14:39 ` 王明煜 2026-04-27 15:20 ` Lorenzo Stoakes 0 siblings, 1 reply; 8+ messages in thread From: 王明煜 @ 2026-04-27 14:39 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Muchun Song, Liam.Howlett, akpm, linux-mm, linux-kernel, vbabka, jannh, pfalcato, osalvador, david, ljs Hi Lorenzo, Thanks for the prompt response and for taking over the fix! I completely understand the architectural direction to eliminate the hooks, which makes perfect sense. I'll watch out for your patch. Also, my sincere apologies for omitting you from the CC list originally. My university's SMTP server blocked the email due to the long recipient list from `get_maintainer.pl`, forcing me to trim it heavily just to get the report out. I'll make sure to explicitly include the original authors in the future. Good luck with the upcoming LSF/MM! Best regards, Mingyu > -----原始邮件----- > 发件人: "Lorenzo Stoakes" <ljs@kernel.org> > 发送时间:2026-04-27 22:18:34 (星期一) > 收件人: "Muchun Song" <muchun.song@linux.dev> > 抄送: "Mingyu Wang" <25181214217@stu.xidian.edu.cn>, Liam.Howlett@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, vbabka@kernel.org, jannh@google.com, pfalcato@suse.de, osalvador@suse.de, david@kernel.org > 主题: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path > > > On Mon, Apr 27, 2026 at 03:55:00PM +0800, Muchun Song wrote: > > > > > On Apr 25, 2026, at 15:07, Mingyu Wang <25181214217@stu.xidian.edu.cn> wrote: > > > > > > While fuzzing with Syzkaller and fault injection (failslab) enabled, > > > I observed a persistent resv_map memory leak in the hugetlb mmap error path. > > > > > > BUG: memory leak > > > unreferenced object 0xffff888110b92400 (size 512): > > > comm "syz.0.5386", pid 20390, jiffies 4298157188 > > > backtrace: > > > __kmalloc_cache_noprof+0x509/0x6e0 > > > resv_map_alloc+0x47/0x3a0 > > > hugetlb_reserve_pages+0x758/0x1220 > > > hugetlbfs_file_mmap_prepare+0x492/0x790 > > > __mmap_region+0x1ae6/0x29f0 > > > > > > This is a regression introduced by the recent VMA iterator and mmap region > > > refactoring, which decoupled mmap preparation from VMA completion. > > > > > > In `__mmap_region()`, `call_mmap_prepare()` triggers `hugetlbfs_file_mmap_prepare()`, > > > which successfully allocates the `resv_map` and registers a `success_hook` > > > in `desc->action`. > > > > > > If `__mmap_new_vma()` subsequently fails (e.g., `vma_iter_prealloc()` > > > returns -ENOMEM due to failslab), the code jumps to `abort_munmap`. > > > However, the `desc` structure is completely discarded without invoking > > > any cleanup. The newly allocated empty VMA is freed, but since > > > `set_vma_user_defined_fields()` was never reached, `vm_area_free()` > > > doesn't call `hugetlb_vm_close()`. Thus, the `resv_map` is permanently leaked. > > > > > > This RFC proposes adding an `abort_hook` to `struct mmap_action` > > > so that subsystems can properly clean up resources allocated during the > > > `mmap_prepare` phase if VMA creation fails. > > Yeah sorry, this is a general problem that I addressed already with the > vm_ops->mapped callback. > > I'll come up with a patch to fix this up for hugetlb, thanks for highlighting > this. > > I plan to get rid of the success hook in any case, it's only because hugetlb is > doing something really... not great with the 'VMA lock' (really hugetlb VMA lock > I suppose) that we need a VMA pointer at the point. > > > > > > > Any feedback on whether this architectural approach is correct, or how to > > > properly implement the hugetlb unreserve rollback, would be highly appreciated. > > > > Please use ./scripts/get_maintainer.pl to get full mail list for Cc/To since > > it is not only related to HugeTLB subsystem. It will also consider the author > > of commit introducing the problem. > > Please do do that, I wrote this whole thing, so y'know :) > > Especially at the moment I am very very busy with LSF coming up so you need to > make sure you include me in the mail. > > > > > > > > > Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn> > > > --- > > > fs/hugetlbfs/inode.c | 9 +++++++++ > > > include/linux/mm_types.h | 2 ++ > > > mm/vma.c | 4 ++++ > > > 3 files changed, 15 insertions(+) > > > > > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > > > index 8b05bec08e04..002bb6d9ca23 100644 > > > --- a/fs/hugetlbfs/inode.c > > > +++ b/fs/hugetlbfs/inode.c > > > @@ -102,6 +102,14 @@ static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma) > > > return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma); > > > } > > > > > > +static void hugetlb_file_mmap_prepare_abort(struct vm_area_desc *desc) > > > +{ > > > + /* > > > + * TODO: Implement the proper rollback for hugetlb_reserve_pages() > > > + * and drop the resv_map reference held in the desc here. > > > + */ > > > +} > > > + > > > static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > > { > > > struct file *file = desc->file; > > > @@ -172,6 +180,7 @@ static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > > if (!ret) { > > > /* Allocate the VMA lock after we set it up. */ > > > desc->action.success_hook = hugetlb_file_mmap_prepare_success; > > > + desc->action.abort_hook = hugetlb_file_mmap_prepare_abort; > > > /* > > > * We cannot permit the rmap finding this VMA in the time > > > * between the VMA being inserted into the VMA tree and the > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > > > index a308e2c23b82..9320f6699fa9 100644 > > > --- a/include/linux/mm_types.h > > > +++ b/include/linux/mm_types.h > > > @@ -861,6 +861,8 @@ struct mmap_action { > > > * it is not valid to clear the error here. > > > */ > > > int (*error_hook)(int err); > > > + > > > + void (*abort_hook)(struct vm_area_desc *desc); > > > > At least for me, it is not good name to distinguish it from error_hook. > > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > > MM maintainers as well. > > Yeah no we definitely don't want this, I plan to eliminate these hooks > eventually. > > Really intend on adding no others, these were to work around effectively legacy > issues in mmap callbacks. > > > > > Muchun, > > Thanks. > > > > > > > > /* > > > * This should be set in rare instances where the operation required > > > diff --git a/mm/vma.c b/mm/vma.c > > > index 377321b48734..d64cea5b4335 100644 > > > --- a/mm/vma.c > > > +++ b/mm/vma.c > > > @@ -2799,6 +2799,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > > > */ > > > if (map.file_doesnt_need_get) > > > fput(map.file); > > > + > > > + if (have_mmap_prepare && desc.action.abort_hook) > > > + desc.action.abort_hook(&desc); > > > + > > > vms_abort_munmap_vmas(&map.vms, &map.mas_detach); > > > return error; > > > } > > > -- > > > 2.34.1 > > > > > > > Thanks, Lorenzo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-27 14:39 ` 王明煜 @ 2026-04-27 15:20 ` Lorenzo Stoakes 2026-04-27 16:13 ` 王明煜 0 siblings, 1 reply; 8+ messages in thread From: Lorenzo Stoakes @ 2026-04-27 15:20 UTC (permalink / raw) To: 王明煜 Cc: Muchun Song, Liam.Howlett, akpm, linux-mm, linux-kernel, vbabka, jannh, pfalcato, osalvador, david On Mon, Apr 27, 2026 at 10:39:37PM +0800, 王明煜 wrote: > Hi Lorenzo, > > Thanks for the prompt response and for taking over the fix! > > I completely understand the architectural direction to eliminate the hooks, which makes perfect sense. I'll watch out for your patch. > > Also, my sincere apologies for omitting you from the CC list originally. My university's SMTP server blocked the email due to the long recipient list from `get_maintainer.pl`, forcing me to trim it heavily just to get the report out. I'll make sure to explicitly include the original authors in the future. Ugh sorry to hear! > > Good luck with the upcoming LSF/MM! Thanks! I will come back to this, it's not going to be straightforward, hugetlbfs is a... pain. But thanks again for raising this, there is very much an issue here that I need to address. vm_ops->mapped won't do here, as hugetlbfs (of course) requires data that the interface (intentionally) doesn't provide. I may have to revert this (ugh), because mmap_prepare should be idempotent and this is just an error on my part converting things a little too one-to-one. Anyway, shall return to this! :) > > Best regards, > Mingyu Thanks, Lorenzo > > > > -----原始邮件----- > > 发件人: "Lorenzo Stoakes" <ljs@kernel.org> > > 发送时间:2026-04-27 22:18:34 (星期一) > > 收件人: "Muchun Song" <muchun.song@linux.dev> > > 抄送: "Mingyu Wang" <25181214217@stu.xidian.edu.cn>, Liam.Howlett@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, vbabka@kernel.org, jannh@google.com, pfalcato@suse.de, osalvador@suse.de, david@kernel.org > > 主题: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path > > > > > > On Mon, Apr 27, 2026 at 03:55:00PM +0800, Muchun Song wrote: > > > > > > > On Apr 25, 2026, at 15:07, Mingyu Wang <25181214217@stu.xidian.edu.cn> wrote: > > > > > > > > While fuzzing with Syzkaller and fault injection (failslab) enabled, > > > > I observed a persistent resv_map memory leak in the hugetlb mmap error path. > > > > > > > > BUG: memory leak > > > > unreferenced object 0xffff888110b92400 (size 512): > > > > comm "syz.0.5386", pid 20390, jiffies 4298157188 > > > > backtrace: > > > > __kmalloc_cache_noprof+0x509/0x6e0 > > > > resv_map_alloc+0x47/0x3a0 > > > > hugetlb_reserve_pages+0x758/0x1220 > > > > hugetlbfs_file_mmap_prepare+0x492/0x790 > > > > __mmap_region+0x1ae6/0x29f0 > > > > > > > > This is a regression introduced by the recent VMA iterator and mmap region > > > > refactoring, which decoupled mmap preparation from VMA completion. > > > > > > > > In `__mmap_region()`, `call_mmap_prepare()` triggers `hugetlbfs_file_mmap_prepare()`, > > > > which successfully allocates the `resv_map` and registers a `success_hook` > > > > in `desc->action`. > > > > > > > > If `__mmap_new_vma()` subsequently fails (e.g., `vma_iter_prealloc()` > > > > returns -ENOMEM due to failslab), the code jumps to `abort_munmap`. > > > > However, the `desc` structure is completely discarded without invoking > > > > any cleanup. The newly allocated empty VMA is freed, but since > > > > `set_vma_user_defined_fields()` was never reached, `vm_area_free()` > > > > doesn't call `hugetlb_vm_close()`. Thus, the `resv_map` is permanently leaked. > > > > > > > > This RFC proposes adding an `abort_hook` to `struct mmap_action` > > > > so that subsystems can properly clean up resources allocated during the > > > > `mmap_prepare` phase if VMA creation fails. > > > > Yeah sorry, this is a general problem that I addressed already with the > > vm_ops->mapped callback. > > > > I'll come up with a patch to fix this up for hugetlb, thanks for highlighting > > this. > > > > I plan to get rid of the success hook in any case, it's only because hugetlb is > > doing something really... not great with the 'VMA lock' (really hugetlb VMA lock > > I suppose) that we need a VMA pointer at the point. > > > > > > > > > > Any feedback on whether this architectural approach is correct, or how to > > > > properly implement the hugetlb unreserve rollback, would be highly appreciated. > > > > > > Please use ./scripts/get_maintainer.pl to get full mail list for Cc/To since > > > it is not only related to HugeTLB subsystem. It will also consider the author > > > of commit introducing the problem. > > > > Please do do that, I wrote this whole thing, so y'know :) > > > > Especially at the moment I am very very busy with LSF coming up so you need to > > make sure you include me in the mail. > > > > > > > > > > > > > Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn> > > > > --- > > > > fs/hugetlbfs/inode.c | 9 +++++++++ > > > > include/linux/mm_types.h | 2 ++ > > > > mm/vma.c | 4 ++++ > > > > 3 files changed, 15 insertions(+) > > > > > > > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > > > > index 8b05bec08e04..002bb6d9ca23 100644 > > > > --- a/fs/hugetlbfs/inode.c > > > > +++ b/fs/hugetlbfs/inode.c > > > > @@ -102,6 +102,14 @@ static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma) > > > > return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma); > > > > } > > > > > > > > +static void hugetlb_file_mmap_prepare_abort(struct vm_area_desc *desc) > > > > +{ > > > > + /* > > > > + * TODO: Implement the proper rollback for hugetlb_reserve_pages() > > > > + * and drop the resv_map reference held in the desc here. > > > > + */ > > > > +} > > > > + > > > > static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > > > { > > > > struct file *file = desc->file; > > > > @@ -172,6 +180,7 @@ static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > > > if (!ret) { > > > > /* Allocate the VMA lock after we set it up. */ > > > > desc->action.success_hook = hugetlb_file_mmap_prepare_success; > > > > + desc->action.abort_hook = hugetlb_file_mmap_prepare_abort; > > > > /* > > > > * We cannot permit the rmap finding this VMA in the time > > > > * between the VMA being inserted into the VMA tree and the > > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > > > > index a308e2c23b82..9320f6699fa9 100644 > > > > --- a/include/linux/mm_types.h > > > > +++ b/include/linux/mm_types.h > > > > @@ -861,6 +861,8 @@ struct mmap_action { > > > > * it is not valid to clear the error here. > > > > */ > > > > int (*error_hook)(int err); > > > > + > > > > + void (*abort_hook)(struct vm_area_desc *desc); > > > > > > At least for me, it is not good name to distinguish it from error_hook. > > > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > > > MM maintainers as well. > > > > Yeah no we definitely don't want this, I plan to eliminate these hooks > > eventually. > > > > Really intend on adding no others, these were to work around effectively legacy > > issues in mmap callbacks. > > > > > > > > Muchun, > > > Thanks. > > > > > > > > > > > /* > > > > * This should be set in rare instances where the operation required > > > > diff --git a/mm/vma.c b/mm/vma.c > > > > index 377321b48734..d64cea5b4335 100644 > > > > --- a/mm/vma.c > > > > +++ b/mm/vma.c > > > > @@ -2799,6 +2799,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > > > > */ > > > > if (map.file_doesnt_need_get) > > > > fput(map.file); > > > > + > > > > + if (have_mmap_prepare && desc.action.abort_hook) > > > > + desc.action.abort_hook(&desc); > > > > + > > > > vms_abort_munmap_vmas(&map.vms, &map.mas_detach); > > > > return error; > > > > } > > > > -- > > > > 2.34.1 > > > > > > > > > > > Thanks, Lorenzo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path 2026-04-27 15:20 ` Lorenzo Stoakes @ 2026-04-27 16:13 ` 王明煜 0 siblings, 0 replies; 8+ messages in thread From: 王明煜 @ 2026-04-27 16:13 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Muchun Song, Liam.Howlett, akpm, linux-mm, linux-kernel, vbabka, jannh, pfalcato, osalvador, david Hi Lorenzo, No worries at all! Enjoy the LSF/MM summit. I completely understand the pain with hugetlbfs. Since this specific memory leak is deeply tied to the error path and requires fault injection (failslab) under heavy fuzzing to hit, it has been notoriously difficult to extract a clean C reproducer. Because of this, I'd be more than happy to help test any fix or revert you eventually come up with. Whenever you have a prototype patch ready (even if it's just a rough draft), feel free to ping me or send it my way. I can easily plug it into my Syzkaller environment and verify if the resv_map leak is fully resolved under pressure. Good luck with the conference, and looking forward to your updates when you're back! Best regards, Mingyu > -----原始邮件----- > 发件人: "Lorenzo Stoakes" <ljs@kernel.org> > 发送时间:2026-04-27 23:20:13 (星期一) > 收件人: 王明煜 <25181214217@stu.xidian.edu.cn> > 抄送: "Muchun Song" <muchun.song@linux.dev>, Liam.Howlett@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, vbabka@kernel.org, jannh@google.com, pfalcato@suse.de, osalvador@suse.de, david@kernel.org > 主题: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path > > On Mon, Apr 27, 2026 at 10:39:37PM +0800, 王明煜 wrote: > > Hi Lorenzo, > > > > Thanks for the prompt response and for taking over the fix! > > > > I completely understand the architectural direction to eliminate the hooks, which makes perfect sense. I'll watch out for your patch. > > > > Also, my sincere apologies for omitting you from the CC list originally. My university's SMTP server blocked the email due to the long recipient list from `get_maintainer.pl`, forcing me to trim it heavily just to get the report out. I'll make sure to explicitly include the original authors in the future. > > Ugh sorry to hear! > > > > > Good luck with the upcoming LSF/MM! > > Thanks! > > I will come back to this, it's not going to be straightforward, hugetlbfs is > a... pain. > > But thanks again for raising this, there is very much an issue here that I need > to address. > > vm_ops->mapped won't do here, as hugetlbfs (of course) requires data that the > interface (intentionally) doesn't provide. > > I may have to revert this (ugh), because mmap_prepare should be idempotent and > this is just an error on my part converting things a little too one-to-one. > > Anyway, shall return to this! :) > > > > > Best regards, > > Mingyu > > Thanks, Lorenzo > > > > > > > > -----原始邮件----- > > > 发件人: "Lorenzo Stoakes" <ljs@kernel.org> > > > 发送时间:2026-04-27 22:18:34 (星期一) > > > 收件人: "Muchun Song" <muchun.song@linux.dev> > > > 抄送: "Mingyu Wang" <25181214217@stu.xidian.edu.cn>, Liam.Howlett@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, vbabka@kernel.org, jannh@google.com, pfalcato@suse.de, osalvador@suse.de, david@kernel.org > > > 主题: Re: [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path > > > > > > > > > On Mon, Apr 27, 2026 at 03:55:00PM +0800, Muchun Song wrote: > > > > > > > > > On Apr 25, 2026, at 15:07, Mingyu Wang <25181214217@stu.xidian.edu.cn> wrote: > > > > > > > > > > While fuzzing with Syzkaller and fault injection (failslab) enabled, > > > > > I observed a persistent resv_map memory leak in the hugetlb mmap error path. > > > > > > > > > > BUG: memory leak > > > > > unreferenced object 0xffff888110b92400 (size 512): > > > > > comm "syz.0.5386", pid 20390, jiffies 4298157188 > > > > > backtrace: > > > > > __kmalloc_cache_noprof+0x509/0x6e0 > > > > > resv_map_alloc+0x47/0x3a0 > > > > > hugetlb_reserve_pages+0x758/0x1220 > > > > > hugetlbfs_file_mmap_prepare+0x492/0x790 > > > > > __mmap_region+0x1ae6/0x29f0 > > > > > > > > > > This is a regression introduced by the recent VMA iterator and mmap region > > > > > refactoring, which decoupled mmap preparation from VMA completion. > > > > > > > > > > In `__mmap_region()`, `call_mmap_prepare()` triggers `hugetlbfs_file_mmap_prepare()`, > > > > > which successfully allocates the `resv_map` and registers a `success_hook` > > > > > in `desc->action`. > > > > > > > > > > If `__mmap_new_vma()` subsequently fails (e.g., `vma_iter_prealloc()` > > > > > returns -ENOMEM due to failslab), the code jumps to `abort_munmap`. > > > > > However, the `desc` structure is completely discarded without invoking > > > > > any cleanup. The newly allocated empty VMA is freed, but since > > > > > `set_vma_user_defined_fields()` was never reached, `vm_area_free()` > > > > > doesn't call `hugetlb_vm_close()`. Thus, the `resv_map` is permanently leaked. > > > > > > > > > > This RFC proposes adding an `abort_hook` to `struct mmap_action` > > > > > so that subsystems can properly clean up resources allocated during the > > > > > `mmap_prepare` phase if VMA creation fails. > > > > > > Yeah sorry, this is a general problem that I addressed already with the > > > vm_ops->mapped callback. > > > > > > I'll come up with a patch to fix this up for hugetlb, thanks for highlighting > > > this. > > > > > > I plan to get rid of the success hook in any case, it's only because hugetlb is > > > doing something really... not great with the 'VMA lock' (really hugetlb VMA lock > > > I suppose) that we need a VMA pointer at the point. > > > > > > > > > > > > > Any feedback on whether this architectural approach is correct, or how to > > > > > properly implement the hugetlb unreserve rollback, would be highly appreciated. > > > > > > > > Please use ./scripts/get_maintainer.pl to get full mail list for Cc/To since > > > > it is not only related to HugeTLB subsystem. It will also consider the author > > > > of commit introducing the problem. > > > > > > Please do do that, I wrote this whole thing, so y'know :) > > > > > > Especially at the moment I am very very busy with LSF coming up so you need to > > > make sure you include me in the mail. > > > > > > > > > > > > > > > > > Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn> > > > > > --- > > > > > fs/hugetlbfs/inode.c | 9 +++++++++ > > > > > include/linux/mm_types.h | 2 ++ > > > > > mm/vma.c | 4 ++++ > > > > > 3 files changed, 15 insertions(+) > > > > > > > > > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > > > > > index 8b05bec08e04..002bb6d9ca23 100644 > > > > > --- a/fs/hugetlbfs/inode.c > > > > > +++ b/fs/hugetlbfs/inode.c > > > > > @@ -102,6 +102,14 @@ static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma) > > > > > return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma); > > > > > } > > > > > > > > > > +static void hugetlb_file_mmap_prepare_abort(struct vm_area_desc *desc) > > > > > +{ > > > > > + /* > > > > > + * TODO: Implement the proper rollback for hugetlb_reserve_pages() > > > > > + * and drop the resv_map reference held in the desc here. > > > > > + */ > > > > > +} > > > > > + > > > > > static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > > > > { > > > > > struct file *file = desc->file; > > > > > @@ -172,6 +180,7 @@ static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) > > > > > if (!ret) { > > > > > /* Allocate the VMA lock after we set it up. */ > > > > > desc->action.success_hook = hugetlb_file_mmap_prepare_success; > > > > > + desc->action.abort_hook = hugetlb_file_mmap_prepare_abort; > > > > > /* > > > > > * We cannot permit the rmap finding this VMA in the time > > > > > * between the VMA being inserted into the VMA tree and the > > > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > > > > > index a308e2c23b82..9320f6699fa9 100644 > > > > > --- a/include/linux/mm_types.h > > > > > +++ b/include/linux/mm_types.h > > > > > @@ -861,6 +861,8 @@ struct mmap_action { > > > > > * it is not valid to clear the error here. > > > > > */ > > > > > int (*error_hook)(int err); > > > > > + > > > > > + void (*abort_hook)(struct vm_area_desc *desc); > > > > > > > > At least for me, it is not good name to distinguish it from error_hook. > > > > abort_mmap_prepare? I am not sure if it is a good solution, Cc other > > > > MM maintainers as well. > > > > > > Yeah no we definitely don't want this, I plan to eliminate these hooks > > > eventually. > > > > > > Really intend on adding no others, these were to work around effectively legacy > > > issues in mmap callbacks. > > > > > > > > > > > Muchun, > > > > Thanks. > > > > > > > > > > > > > > /* > > > > > * This should be set in rare instances where the operation required > > > > > diff --git a/mm/vma.c b/mm/vma.c > > > > > index 377321b48734..d64cea5b4335 100644 > > > > > --- a/mm/vma.c > > > > > +++ b/mm/vma.c > > > > > @@ -2799,6 +2799,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > > > > > */ > > > > > if (map.file_doesnt_need_get) > > > > > fput(map.file); > > > > > + > > > > > + if (have_mmap_prepare && desc.action.abort_hook) > > > > > + desc.action.abort_hook(&desc); > > > > > + > > > > > vms_abort_munmap_vmas(&map.vms, &map.mas_detach); > > > > > return error; > > > > > } > > > > > -- > > > > > 2.34.1 > > > > > > > > > > > > > > > Thanks, Lorenzo ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-04-27 16:25 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-04-25 7:07 [RFC PATCH] mm/hugetlb: fix resv_map memory leak in __mmap_region error path Mingyu Wang 2026-04-27 7:55 ` Muchun Song 2026-04-27 8:17 ` David Hildenbrand (Arm) 2026-04-27 11:14 ` 王明煜 2026-04-27 14:18 ` Lorenzo Stoakes 2026-04-27 14:39 ` 王明煜 2026-04-27 15:20 ` Lorenzo Stoakes 2026-04-27 16:13 ` 王明煜
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox