* [RFC PATCH v2 0/2] Fix uprobe anon page be overwritten during mremap
@ 2025-05-27 13:23 Pu Lehui
2025-05-27 13:23 ` [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma " Pu Lehui
2025-05-27 13:23 ` [RFC PATCH v2 2/2] mm/mremap: Expose abnormal new_pte during move_ptes Pu Lehui
0 siblings, 2 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 13:23 UTC (permalink / raw)
To: mhiramat, oleg, peterz, akpm, Liam.Howlett, lorenzo.stoakes,
vbabka, jannh, pfalcato
Cc: linux-mm, linux-kernel, pulehui
From: Pu Lehui <pulehui@huawei.com>
patch 1: the mainly fix for uprobe anon page be overwritten issue.
patch 2: WARN_ON_ONCE for new_pte not NULL during move_ptes.
RFC v2:
- skip uprobe_mmap on expanded vma.
- add skip_vma_uprobe field to struct vma_prepare and
vma_merge_struct. (Lorenzo)
- add WARN_ON_ONCE when new_pte is not NULL. (Oleg)
- Corrected some of the comments.
RFC v1:
https://lore.kernel.org/all/20250521092503.3116340-1-pulehui@huaweicloud.com/
Pu Lehui (2):
mm/mremap: Fix uprobe anon page be overwritten when expanding vma
during mremap
mm/mremap: Expose abnormal new_pte during move_ptes
mm/mremap.c | 2 ++
mm/vma.c | 7 ++++++-
mm/vma.h | 7 +++++++
3 files changed, 15 insertions(+), 1 deletion(-)
--
2.34.1
^ permalink raw reply [flat|nested] 21+ messages in thread
* [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:23 [RFC PATCH v2 0/2] Fix uprobe anon page be overwritten during mremap Pu Lehui
@ 2025-05-27 13:23 ` Pu Lehui
2025-05-27 13:34 ` Lorenzo Stoakes
` (2 more replies)
2025-05-27 13:23 ` [RFC PATCH v2 2/2] mm/mremap: Expose abnormal new_pte during move_ptes Pu Lehui
1 sibling, 3 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 13:23 UTC (permalink / raw)
To: mhiramat, oleg, peterz, akpm, Liam.Howlett, lorenzo.stoakes,
vbabka, jannh, pfalcato
Cc: linux-mm, linux-kernel, pulehui
From: Pu Lehui <pulehui@huawei.com>
We encountered a BUG alert triggered by Syzkaller as follows:
BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1
And we can reproduce it with the following steps:
1. register uprobe on file at zero offset
2. mmap the file at zero offset:
addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
3. mremap part of vma1 to new vma2:
addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
4. mremap back to orig addr1:
mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
In the step 3, the vma1 range [addr1, addr1 + 4096] will be remap to new
vma2 with range [addr2, addr2 + 8192], and remap uprobe anon page from
the vma1 to vma2, then unmap the vma1 range [addr1, addr1 + 4096].
In tht step 4, the vma2 range [addr2, addr2 + 4096] will be remap back
to the addr range [addr1, addr1 + 4096]. Since the addr range [addr1 +
4096, addr1 + 8192] still maps the file, it will take
vma_merge_new_range to merge these two addr ranges, and then do
uprobe_mmap in vma_complete. Since the merged vma pgoff is also zero
offset, it will install uprobe anon page to the merged vma. However, the
upcomming move_page_tables step, which use set_pte_at to remap the vma2
uprobe anon page to the merged vma, will over map the old uprobe anon
page in the merged vma, and lead the old uprobe anon page to be orphan.
Since the uprobe anon page will be remapped to the merged vma, we can
remove the unnecessary uprobe_mmap on merged vma, that is, do not
perform uprobe_mmap on expanded vma.
This problem was first find in linux-6.6.y and also exists in the
community syzkaller:
https://lore.kernel.org/all/000000000000ada39605a5e71711@google.com/T/
The complete Syzkaller C reproduction program is as follows:
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <syscall.h>
#include <sys/mman.h>
#include <linux/perf_event.h>
int main(int argc, char *argv[])
{
int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
struct perf_event_attr attr = {
.type = 9,
.uprobe_path = (long) FNAME,
.probe_offset = 0x0,
};
void *addr1, *addr2;
write(fd, "x", 1);
mmap(NULL, 4096, PROT_EXEC, MAP_PRIVATE, fd, 0);
syscall(__NR_perf_event_open, &attr, 0, 0, -1, 0);
addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
return 0;
}
Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
---
mm/vma.c | 7 ++++++-
mm/vma.h | 7 +++++++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/mm/vma.c b/mm/vma.c
index 1c6595f282e5..6445f515c7f2 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -358,7 +358,8 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
if (vp->file) {
i_mmap_unlock_write(vp->mapping);
- uprobe_mmap(vp->vma);
+ if (!vp->skip_vma_uprobe)
+ uprobe_mmap(vp->vma);
if (vp->adj_next)
uprobe_mmap(vp->adj_next);
@@ -737,6 +738,7 @@ static int commit_merge(struct vma_merge_struct *vmg)
if (vma_iter_prealloc(vmg->vmi, vma))
return -ENOMEM;
+ vp.skip_vma_uprobe = vmg->skip_vma_uprobe;
vma_prepare(&vp);
/*
* THP pages may need to do additional splits if we increase
@@ -1151,6 +1153,9 @@ int vma_expand(struct vma_merge_struct *vmg)
if (remove_next)
vmg->__remove_next = true;
+ /* skip uprobe_mmap on expanded vma */
+ vmg->skip_vma_uprobe = true;
+
if (commit_merge(vmg))
goto nomem;
diff --git a/mm/vma.h b/mm/vma.h
index 9a8af9be29a8..56cc0364d239 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -19,6 +19,8 @@ struct vma_prepare {
struct vm_area_struct *insert;
struct vm_area_struct *remove;
struct vm_area_struct *remove2;
+ /* Whether to skip uprobe_mmap on vma */
+ bool skip_vma_uprobe;
};
struct unlink_vma_file_batch {
@@ -120,6 +122,11 @@ struct vma_merge_struct {
*/
bool give_up_on_oom :1;
+ /*
+ * Whether to skip uprobe_mmap on merged vma.
+ */
+ bool skip_vma_uprobe :1;
+
/* Internal flags set during merge process: */
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH v2 2/2] mm/mremap: Expose abnormal new_pte during move_ptes
2025-05-27 13:23 [RFC PATCH v2 0/2] Fix uprobe anon page be overwritten during mremap Pu Lehui
2025-05-27 13:23 ` [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma " Pu Lehui
@ 2025-05-27 13:23 ` Pu Lehui
2025-05-27 14:24 ` Oleg Nesterov
1 sibling, 1 reply; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 13:23 UTC (permalink / raw)
To: mhiramat, oleg, peterz, akpm, Liam.Howlett, lorenzo.stoakes,
vbabka, jannh, pfalcato
Cc: linux-mm, linux-kernel, pulehui
From: Pu Lehui <pulehui@huawei.com>
When executing move_ptes, the new_pte must be NULL, otherwise it will be
overwritten by the old_pte, and cause the abnormal new_pte to be leaked.
In order to make this problem to be more explicit, let's add
WARN_ON_ONCE when new_pte is not NULL.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
---
mm/mremap.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/mremap.c b/mm/mremap.c
index 83e359754961..adb3a111b5c1 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -237,6 +237,8 @@ static int move_ptes(struct pagetable_move_control *pmc,
for (; old_addr < old_end; old_pte++, old_addr += PAGE_SIZE,
new_pte++, new_addr += PAGE_SIZE) {
+ WARN_ON_ONCE(!pte_none(*new_pte))
+
if (pte_none(ptep_get(old_pte)))
continue;
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:23 ` [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma " Pu Lehui
@ 2025-05-27 13:34 ` Lorenzo Stoakes
2025-05-27 15:17 ` Oleg Nesterov
` (2 more replies)
2025-05-27 14:23 ` Oleg Nesterov
2025-05-27 15:30 ` Oleg Nesterov
2 siblings, 3 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-05-27 13:34 UTC (permalink / raw)
To: Pu Lehui
Cc: mhiramat, oleg, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
Hi Pu,
Just as you were sending this I was about to send you my suggestion :) That
is hilarious haha. Maybe some cosmic connection? :P
Anyway I don't know whether your approach is _quite_ correct here, you are
generally disabling the uprobe behaviour on all VMA expansion (that is -
new VMA merge, relocate_vma_down() (unlikely to be relevant ;), it seems
sensible to only do so in the circumstances we know to be problematic.
I really do not want mremap-specific stuff in general merge code, it's the
move page tables bit that makes this broken.
So... :P let me attach my suggestion as a fix-patch below.
TO BE CLEAR - I present this in this form purely to make it easy for you to
grab, PLEASE take this patch and rework it (if it makes sense), I don't
need any attribution other than a Suggested-by, maybe.
They are rather close other than where this flag is set :)
I am by the way assuming that uprobes work by installing a special PTE at
page offset 0 and only in the case where there is something installed here
do we need to worry.
Please correct me if I'm misunderstanding here, I am not an expert on
uprobes.
Thanks! This is a great find overall.
I'd also quite like us to make your repro a self test if possible? Not sure
where we'd put it... could be in tools/testing/selftests/mm/merge.c
actually, new one I created explicitly for VMA merge scenarios.
Cheers, Lorenzo
On Tue, May 27, 2025 at 01:23:50PM +0000, Pu Lehui wrote:
> From: Pu Lehui <pulehui@huawei.com>
>
> We encountered a BUG alert triggered by Syzkaller as follows:
> BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1
>
> And we can reproduce it with the following steps:
> 1. register uprobe on file at zero offset
> 2. mmap the file at zero offset:
> addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
> 3. mremap part of vma1 to new vma2:
> addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
> 4. mremap back to orig addr1:
> mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
>
> In the step 3, the vma1 range [addr1, addr1 + 4096] will be remap to new
> vma2 with range [addr2, addr2 + 8192], and remap uprobe anon page from
> the vma1 to vma2, then unmap the vma1 range [addr1, addr1 + 4096].
> In tht step 4, the vma2 range [addr2, addr2 + 4096] will be remap back
> to the addr range [addr1, addr1 + 4096]. Since the addr range [addr1 +
> 4096, addr1 + 8192] still maps the file, it will take
> vma_merge_new_range to merge these two addr ranges, and then do
> uprobe_mmap in vma_complete. Since the merged vma pgoff is also zero
> offset, it will install uprobe anon page to the merged vma. However, the
> upcomming move_page_tables step, which use set_pte_at to remap the vma2
> uprobe anon page to the merged vma, will over map the old uprobe anon
> page in the merged vma, and lead the old uprobe anon page to be orphan.
>
> Since the uprobe anon page will be remapped to the merged vma, we can
> remove the unnecessary uprobe_mmap on merged vma, that is, do not
> perform uprobe_mmap on expanded vma.
>
> This problem was first find in linux-6.6.y and also exists in the
> community syzkaller:
> https://lore.kernel.org/all/000000000000ada39605a5e71711@google.com/T/
>
> The complete Syzkaller C reproduction program is as follows:
>
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <unistd.h>
> #include <syscall.h>
> #include <sys/mman.h>
> #include <linux/perf_event.h>
>
> int main(int argc, char *argv[])
> {
> int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
> struct perf_event_attr attr = {
> .type = 9,
> .uprobe_path = (long) FNAME,
> .probe_offset = 0x0,
> };
> void *addr1, *addr2;
>
> write(fd, "x", 1);
> mmap(NULL, 4096, PROT_EXEC, MAP_PRIVATE, fd, 0);
>
> syscall(__NR_perf_event_open, &attr, 0, 0, -1, 0);
> addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
> addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
> mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
>
> return 0;
> }
>
> Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
Nit, but we'll want to cc: stable once we agree on a solution too.
> Signed-off-by: Pu Lehui <pulehui@huawei.com>
> ---
> mm/vma.c | 7 ++++++-
> mm/vma.h | 7 +++++++
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vma.c b/mm/vma.c
> index 1c6595f282e5..6445f515c7f2 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -358,7 +358,8 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
>
> if (vp->file) {
> i_mmap_unlock_write(vp->mapping);
> - uprobe_mmap(vp->vma);
> + if (!vp->skip_vma_uprobe)
> + uprobe_mmap(vp->vma);
>
> if (vp->adj_next)
> uprobe_mmap(vp->adj_next);
I think we need to cover off the adj_next case too, strictly. See the
attached patch :P
To be honest the adj_next case won't be relevant, as this is not set for a
new VMA, but it'd be odd to have a 'skip' or 'no' uprobe option and to not
obey it in both cases.
> @@ -737,6 +738,7 @@ static int commit_merge(struct vma_merge_struct *vmg)
> if (vma_iter_prealloc(vmg->vmi, vma))
> return -ENOMEM;
>
> + vp.skip_vma_uprobe = vmg->skip_vma_uprobe;
> vma_prepare(&vp);
> /*
> * THP pages may need to do additional splits if we increase
> @@ -1151,6 +1153,9 @@ int vma_expand(struct vma_merge_struct *vmg)
> if (remove_next)
> vmg->__remove_next = true;
>
> + /* skip uprobe_mmap on expanded vma */
> + vmg->skip_vma_uprobe = true;
> +
Yeah this doesn't feel right, this will make this happen for all
vma_expand() cases (including new VMA merge) and that's surely not correct
for non-mremap cases?
Again see the attached patch :P
> if (commit_merge(vmg))
> goto nomem;
>
> diff --git a/mm/vma.h b/mm/vma.h
> index 9a8af9be29a8..56cc0364d239 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -19,6 +19,8 @@ struct vma_prepare {
> struct vm_area_struct *insert;
> struct vm_area_struct *remove;
> struct vm_area_struct *remove2;
> + /* Whether to skip uprobe_mmap on vma */
> + bool skip_vma_uprobe;
> };
>
> struct unlink_vma_file_batch {
> @@ -120,6 +122,11 @@ struct vma_merge_struct {
> */
> bool give_up_on_oom :1;
>
> + /*
> + * Whether to skip uprobe_mmap on merged vma.
> + */
> + bool skip_vma_uprobe :1;
> +
> /* Internal flags set during merge process: */
>
> /*
> --
> 2.34.1
>
----8<----
From 30e39d0f223ae4ab5669584593071f5f7266beeb Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Tue, 27 May 2025 14:11:26 +0100
Subject: [PATCH] mm: avoid orphaned uprobe
If, on mremap, me move a file-backed VMA mapped at offset 0 that possess a
uprobe and it merges with an adjacent VMA, we will then invoke
uprobe_mmap() once again to install a uprobe into this newly established
VMA.
This is problematic, as when we then move the page tables back into place,
we overwrite the uprobe entry and thus orphan the merge-established uprobe.
Avoid this by threading state to explicitly disallow the establishment of a
new uprobe upon merge under these circumstances.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/vma.c | 21 ++++++++++++++++++---
mm/vma.h | 5 +++++
2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/mm/vma.c b/mm/vma.c
index 1c6595f282e5..cc18d1dcdc57 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -169,6 +169,9 @@ static void init_multi_vma_prep(struct vma_prepare *vp,
vp->file = vma->vm_file;
if (vp->file)
vp->mapping = vma->vm_file->f_mapping;
+
+ if (vmg && vmg->no_uprobe)
+ vp->no_uprobe = true;
}
/*
@@ -358,10 +361,13 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
if (vp->file) {
i_mmap_unlock_write(vp->mapping);
- uprobe_mmap(vp->vma);
- if (vp->adj_next)
- uprobe_mmap(vp->adj_next);
+ if (!vp->no_uprobe) {
+ uprobe_mmap(vp->vma);
+
+ if (vp->adj_next)
+ uprobe_mmap(vp->adj_next);
+ }
}
if (vp->remove) {
@@ -1830,6 +1836,15 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
vmg.middle = NULL; /* New VMA range. */
vmg.pgoff = pgoff;
vmg.next = vma_iter_next_rewind(&vmi, NULL);
+
+ /*
+ * If the VMA we are copying might contain a uprobe PTE, ensure that we
+ * do not establish one upon merge. otherwise, when mremap() moves page
+ * tables into place, we'll orphan the one just created.
+ */
+ if (vma->vm_file && vma->vm_pgoff == 0)
+ vmg.no_uprobe = true;
+
new_vma = vma_merge_new_range(&vmg);
if (new_vma) {
diff --git a/mm/vma.h b/mm/vma.h
index 9a8af9be29a8..4c35c5ab1aa2 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -19,6 +19,8 @@ struct vma_prepare {
struct vm_area_struct *insert;
struct vm_area_struct *remove;
struct vm_area_struct *remove2;
+
+ bool no_uprobe :1;
};
struct unlink_vma_file_batch {
@@ -120,6 +122,9 @@ struct vma_merge_struct {
*/
bool give_up_on_oom :1;
+ /* If set, do not install a uprobe upon merge. */
+ bool no_uprobe :1;
+
/* Internal flags set during merge process: */
/*
--
2.49.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:23 ` [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma " Pu Lehui
2025-05-27 13:34 ` Lorenzo Stoakes
@ 2025-05-27 14:23 ` Oleg Nesterov
2025-05-27 16:32 ` Pu Lehui
2025-05-27 15:30 ` Oleg Nesterov
2 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2025-05-27 14:23 UTC (permalink / raw)
To: Pu Lehui
Cc: mhiramat, peterz, akpm, Liam.Howlett, lorenzo.stoakes, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
Well, I leave this to you / Lorenzo / David, but...
On 05/27, Pu Lehui wrote:
>
> Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
I don't think that commit could cause this problem.
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 2/2] mm/mremap: Expose abnormal new_pte during move_ptes
2025-05-27 13:23 ` [RFC PATCH v2 2/2] mm/mremap: Expose abnormal new_pte during move_ptes Pu Lehui
@ 2025-05-27 14:24 ` Oleg Nesterov
0 siblings, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2025-05-27 14:24 UTC (permalink / raw)
To: Pu Lehui
Cc: mhiramat, peterz, akpm, Liam.Howlett, lorenzo.stoakes, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
On 05/27, Pu Lehui wrote:
>
> @@ -237,6 +237,8 @@ static int move_ptes(struct pagetable_move_control *pmc,
>
> for (; old_addr < old_end; old_pte++, old_addr += PAGE_SIZE,
> new_pte++, new_addr += PAGE_SIZE) {
> + WARN_ON_ONCE(!pte_none(*new_pte))
it seems you forgot to add ";" after WARN_ON_ONCE ;)
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:34 ` Lorenzo Stoakes
@ 2025-05-27 15:17 ` Oleg Nesterov
2025-05-27 15:27 ` Lorenzo Stoakes
2025-05-27 15:29 ` Lorenzo Stoakes
2025-05-27 16:29 ` Pu Lehui
2 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2025-05-27 15:17 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Pu Lehui, mhiramat, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
On 05/27, Lorenzo Stoakes wrote:
>
> I am by the way assuming that uprobes work by installing a special PTE at
> page offset 0 and only in the case where there is something installed here
> do we need to worry.
perhaps I misunderstood you but no...
Basically, we have uprobe_register(struct inode *inode, loff_t offset, ...).
If / when a process mmaps this inode/file and the (new) VMA includes this offset,
we need to call uprobe_mmap() to install the breakpoint at the virtual address
which corresponds to this offset.
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 15:17 ` Oleg Nesterov
@ 2025-05-27 15:27 ` Lorenzo Stoakes
0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-05-27 15:27 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Pu Lehui, mhiramat, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
On Tue, May 27, 2025 at 05:17:15PM +0200, Oleg Nesterov wrote:
> On 05/27, Lorenzo Stoakes wrote:
> >
> > I am by the way assuming that uprobes work by installing a special PTE at
> > page offset 0 and only in the case where there is something installed here
> > do we need to worry.
>
> perhaps I misunderstood you but no...
>
> Basically, we have uprobe_register(struct inode *inode, loff_t offset, ...).
> If / when a process mmaps this inode/file and the (new) VMA includes this offset,
> we need to call uprobe_mmap() to install the breakpoint at the virtual address
> which corresponds to this offset.
OK this was based on the explanation given re: zero offset.
So presumably then this is determined by the perf_event_attr.probe_offset field?
OK well we can drop the vm_pgoff predicate and things should still work because,
after the move:
A B
|-----------------------||--------------------------|
| VMA to be merged with || VMA being moved in place |
|-----------------------||--------------------------|
If A contains the relevant breakpoint(s), then they will stay in place, as
move_page_tables() won't overwrite them.
If B contains them, then the new mechanism that prevents uprobe_mmap() from
installing a new one (orphaning the old) will make everything right.
So we simply take my approach, and drop the vm_pgoff condition and everything
should work fine :)
Obviously let me know if you see any flaw in this.
>
> Oleg.
>
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:34 ` Lorenzo Stoakes
2025-05-27 15:17 ` Oleg Nesterov
@ 2025-05-27 15:29 ` Lorenzo Stoakes
2025-05-27 16:29 ` Pu Lehui
2 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-05-27 15:29 UTC (permalink / raw)
To: Pu Lehui
Cc: mhiramat, oleg, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
See below for corrections coming out of the converastion with Oleg (see [0]).
The approach I suggest should still work fine as far as I can tell with the
mentioned correction.
Thanks!
[0]: https://lore.kernel.org/linux-mm/f2bb8d6a-d131-4c12-871e-0bca5dbab253@lucifer.local/
On Tue, May 27, 2025 at 02:34:41PM +0100, Lorenzo Stoakes wrote:
> Hi Pu,
>
> Just as you were sending this I was about to send you my suggestion :) That
> is hilarious haha. Maybe some cosmic connection? :P
>
> Anyway I don't know whether your approach is _quite_ correct here, you are
> generally disabling the uprobe behaviour on all VMA expansion (that is -
> new VMA merge, relocate_vma_down() (unlikely to be relevant ;), it seems
> sensible to only do so in the circumstances we know to be problematic.
>
> I really do not want mremap-specific stuff in general merge code, it's the
> move page tables bit that makes this broken.
>
> So... :P let me attach my suggestion as a fix-patch below.
>
> TO BE CLEAR - I present this in this form purely to make it easy for you to
> grab, PLEASE take this patch and rework it (if it makes sense), I don't
> need any attribution other than a Suggested-by, maybe.
>
> They are rather close other than where this flag is set :)
>
> I am by the way assuming that uprobes work by installing a special PTE at
> page offset 0 and only in the case where there is something installed here
> do we need to worry.
>
> Please correct me if I'm misunderstanding here, I am not an expert on
> uprobes.
^ This is incorrect, but removing this condition should make everything work
fine.
>
> Thanks! This is a great find overall.
>
> I'd also quite like us to make your repro a self test if possible? Not sure
> where we'd put it... could be in tools/testing/selftests/mm/merge.c
> actually, new one I created explicitly for VMA merge scenarios.
>
> Cheers, Lorenzo
>
> On Tue, May 27, 2025 at 01:23:50PM +0000, Pu Lehui wrote:
> > From: Pu Lehui <pulehui@huawei.com>
> >
> > We encountered a BUG alert triggered by Syzkaller as follows:
> > BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1
> >
> > And we can reproduce it with the following steps:
> > 1. register uprobe on file at zero offset
> > 2. mmap the file at zero offset:
> > addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
> > 3. mremap part of vma1 to new vma2:
> > addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
> > 4. mremap back to orig addr1:
> > mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
> >
> > In the step 3, the vma1 range [addr1, addr1 + 4096] will be remap to new
> > vma2 with range [addr2, addr2 + 8192], and remap uprobe anon page from
> > the vma1 to vma2, then unmap the vma1 range [addr1, addr1 + 4096].
> > In tht step 4, the vma2 range [addr2, addr2 + 4096] will be remap back
> > to the addr range [addr1, addr1 + 4096]. Since the addr range [addr1 +
> > 4096, addr1 + 8192] still maps the file, it will take
> > vma_merge_new_range to merge these two addr ranges, and then do
> > uprobe_mmap in vma_complete. Since the merged vma pgoff is also zero
> > offset, it will install uprobe anon page to the merged vma. However, the
> > upcomming move_page_tables step, which use set_pte_at to remap the vma2
> > uprobe anon page to the merged vma, will over map the old uprobe anon
> > page in the merged vma, and lead the old uprobe anon page to be orphan.
> >
> > Since the uprobe anon page will be remapped to the merged vma, we can
> > remove the unnecessary uprobe_mmap on merged vma, that is, do not
> > perform uprobe_mmap on expanded vma.
> >
> > This problem was first find in linux-6.6.y and also exists in the
> > community syzkaller:
> > https://lore.kernel.org/all/000000000000ada39605a5e71711@google.com/T/
> >
> > The complete Syzkaller C reproduction program is as follows:
> >
> > #define _GNU_SOURCE
> > #include <fcntl.h>
> > #include <unistd.h>
> > #include <syscall.h>
> > #include <sys/mman.h>
> > #include <linux/perf_event.h>
> >
> > int main(int argc, char *argv[])
> > {
> > int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
> > struct perf_event_attr attr = {
> > .type = 9,
> > .uprobe_path = (long) FNAME,
> > .probe_offset = 0x0,
> > };
> > void *addr1, *addr2;
> >
> > write(fd, "x", 1);
> > mmap(NULL, 4096, PROT_EXEC, MAP_PRIVATE, fd, 0);
> >
> > syscall(__NR_perf_event_open, &attr, 0, 0, -1, 0);
> > addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
> > addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
> > mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
> >
> > return 0;
> > }
> >
> > Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
>
> Nit, but we'll want to cc: stable once we agree on a solution too.
>
> > Signed-off-by: Pu Lehui <pulehui@huawei.com>
> > ---
> > mm/vma.c | 7 ++++++-
> > mm/vma.h | 7 +++++++
> > 2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 1c6595f282e5..6445f515c7f2 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -358,7 +358,8 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
> >
> > if (vp->file) {
> > i_mmap_unlock_write(vp->mapping);
> > - uprobe_mmap(vp->vma);
> > + if (!vp->skip_vma_uprobe)
> > + uprobe_mmap(vp->vma);
> >
> > if (vp->adj_next)
> > uprobe_mmap(vp->adj_next);
>
> I think we need to cover off the adj_next case too, strictly. See the
> attached patch :P
>
> To be honest the adj_next case won't be relevant, as this is not set for a
> new VMA, but it'd be odd to have a 'skip' or 'no' uprobe option and to not
> obey it in both cases.
>
> > @@ -737,6 +738,7 @@ static int commit_merge(struct vma_merge_struct *vmg)
> > if (vma_iter_prealloc(vmg->vmi, vma))
> > return -ENOMEM;
> >
> > + vp.skip_vma_uprobe = vmg->skip_vma_uprobe;
> > vma_prepare(&vp);
> > /*
> > * THP pages may need to do additional splits if we increase
> > @@ -1151,6 +1153,9 @@ int vma_expand(struct vma_merge_struct *vmg)
> > if (remove_next)
> > vmg->__remove_next = true;
> >
> > + /* skip uprobe_mmap on expanded vma */
> > + vmg->skip_vma_uprobe = true;
> > +
>
> Yeah this doesn't feel right, this will make this happen for all
> vma_expand() cases (including new VMA merge) and that's surely not correct
> for non-mremap cases?
>
> Again see the attached patch :P
>
> > if (commit_merge(vmg))
> > goto nomem;
> >
> > diff --git a/mm/vma.h b/mm/vma.h
> > index 9a8af9be29a8..56cc0364d239 100644
> > --- a/mm/vma.h
> > +++ b/mm/vma.h
> > @@ -19,6 +19,8 @@ struct vma_prepare {
> > struct vm_area_struct *insert;
> > struct vm_area_struct *remove;
> > struct vm_area_struct *remove2;
> > + /* Whether to skip uprobe_mmap on vma */
> > + bool skip_vma_uprobe;
> > };
> >
> > struct unlink_vma_file_batch {
> > @@ -120,6 +122,11 @@ struct vma_merge_struct {
> > */
> > bool give_up_on_oom :1;
> >
> > + /*
> > + * Whether to skip uprobe_mmap on merged vma.
> > + */
> > + bool skip_vma_uprobe :1;
> > +
> > /* Internal flags set during merge process: */
> >
> > /*
> > --
> > 2.34.1
> >
>
> ----8<----
> From 30e39d0f223ae4ab5669584593071f5f7266beeb Mon Sep 17 00:00:00 2001
> From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Date: Tue, 27 May 2025 14:11:26 +0100
> Subject: [PATCH] mm: avoid orphaned uprobe
>
> If, on mremap, me move a file-backed VMA mapped at offset 0 that possess a
> uprobe and it merges with an adjacent VMA, we will then invoke
> uprobe_mmap() once again to install a uprobe into this newly established
> VMA.
As per the converastion with Oleg, this bit of the commit message is mistaken,
drop the bit about the offset obviously... :)
>
> This is problematic, as when we then move the page tables back into place,
> we overwrite the uprobe entry and thus orphan the merge-established uprobe.
>
> Avoid this by threading state to explicitly disallow the establishment of a
> new uprobe upon merge under these circumstances.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> mm/vma.c | 21 ++++++++++++++++++---
> mm/vma.h | 5 +++++
> 2 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vma.c b/mm/vma.c
> index 1c6595f282e5..cc18d1dcdc57 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -169,6 +169,9 @@ static void init_multi_vma_prep(struct vma_prepare *vp,
> vp->file = vma->vm_file;
> if (vp->file)
> vp->mapping = vma->vm_file->f_mapping;
> +
> + if (vmg && vmg->no_uprobe)
> + vp->no_uprobe = true;
> }
>
> /*
> @@ -358,10 +361,13 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
>
> if (vp->file) {
> i_mmap_unlock_write(vp->mapping);
> - uprobe_mmap(vp->vma);
>
> - if (vp->adj_next)
> - uprobe_mmap(vp->adj_next);
> + if (!vp->no_uprobe) {
> + uprobe_mmap(vp->vma);
> +
> + if (vp->adj_next)
> + uprobe_mmap(vp->adj_next);
> + }
> }
>
> if (vp->remove) {
> @@ -1830,6 +1836,15 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> vmg.middle = NULL; /* New VMA range. */
> vmg.pgoff = pgoff;
> vmg.next = vma_iter_next_rewind(&vmi, NULL);
> +
> + /*
> + * If the VMA we are copying might contain a uprobe PTE, ensure that we
> + * do not establish one upon merge. otherwise, when mremap() moves page
> + * tables into place, we'll orphan the one just created.
> + */
> + if (vma->vm_file && vma->vm_pgoff == 0)
As per the conversation with Oleg, please drop the 'vma->vm_pgoff == 0'
condition here, this is not correct.
> + vmg.no_uprobe = true;
> +
> new_vma = vma_merge_new_range(&vmg);
>
> if (new_vma) {
> diff --git a/mm/vma.h b/mm/vma.h
> index 9a8af9be29a8..4c35c5ab1aa2 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -19,6 +19,8 @@ struct vma_prepare {
> struct vm_area_struct *insert;
> struct vm_area_struct *remove;
> struct vm_area_struct *remove2;
> +
> + bool no_uprobe :1;
> };
>
> struct unlink_vma_file_batch {
> @@ -120,6 +122,9 @@ struct vma_merge_struct {
> */
> bool give_up_on_oom :1;
>
> + /* If set, do not install a uprobe upon merge. */
> + bool no_uprobe :1;
> +
> /* Internal flags set during merge process: */
>
> /*
> --
> 2.49.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:23 ` [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma " Pu Lehui
2025-05-27 13:34 ` Lorenzo Stoakes
2025-05-27 14:23 ` Oleg Nesterov
@ 2025-05-27 15:30 ` Oleg Nesterov
2025-05-27 15:33 ` Lorenzo Stoakes
2025-05-27 16:35 ` Pu Lehui
2 siblings, 2 replies; 21+ messages in thread
From: Oleg Nesterov @ 2025-05-27 15:30 UTC (permalink / raw)
To: Pu Lehui
Cc: mhiramat, peterz, akpm, Liam.Howlett, lorenzo.stoakes, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
Not that this is really important, but the test-case looks broken,
On 05/27, Pu Lehui wrote:
>
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <unistd.h>
> #include <syscall.h>
> #include <sys/mman.h>
> #include <linux/perf_event.h>
>
> int main(int argc, char *argv[])
> {
> int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
FNAME is not defined
> struct perf_event_attr attr = {
> .type = 9,
Cough ;) Yes I too used perf_event_attr.type == 9 when I wrote another
test-case. Because I am lazy and this is what I see in
/sys/bus/event_source/devices/uprobe/type on my machine.
But me should not assume that perf_pmu_register(&perf_uprobe) -> idr_alloc()
will return 9.
> write(fd, "x", 1);
looks unnecessary.
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 15:30 ` Oleg Nesterov
@ 2025-05-27 15:33 ` Lorenzo Stoakes
2025-05-27 15:52 ` Oleg Nesterov
2025-05-27 16:41 ` Pu Lehui
2025-05-27 16:35 ` Pu Lehui
1 sibling, 2 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-05-27 15:33 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Pu Lehui, mhiramat, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
On Tue, May 27, 2025 at 05:30:08PM +0200, Oleg Nesterov wrote:
> Not that this is really important, but the test-case looks broken,
>
> On 05/27, Pu Lehui wrote:
> >
> > #define _GNU_SOURCE
> > #include <fcntl.h>
> > #include <unistd.h>
> > #include <syscall.h>
> > #include <sys/mman.h>
> > #include <linux/perf_event.h>
> >
> > int main(int argc, char *argv[])
> > {
> > int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
>
> FNAME is not defined
>
> > struct perf_event_attr attr = {
> > .type = 9,
>
> Cough ;) Yes I too used perf_event_attr.type == 9 when I wrote another
> test-case. Because I am lazy and this is what I see in
> /sys/bus/event_source/devices/uprobe/type on my machine.
>
> But me should not assume that perf_pmu_register(&perf_uprobe) -> idr_alloc()
> will return 9.
>
> > write(fd, "x", 1);
>
> looks unnecessary.
>
> Oleg.
>
While I agree we should probably try to do this nicely, in defence of Pu I think
this is adapted from the syzkaller horror show :P and that code does tend to
just insert random integers etc.
It would be good to refine this into something more robust if possible and
ideally add as a self-test, however!
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 15:33 ` Lorenzo Stoakes
@ 2025-05-27 15:52 ` Oleg Nesterov
2025-05-27 16:41 ` Pu Lehui
1 sibling, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2025-05-27 15:52 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Pu Lehui, mhiramat, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
On 05/27, Lorenzo Stoakes wrote:
>
> On Tue, May 27, 2025 at 05:30:08PM +0200, Oleg Nesterov wrote:
> >
> > > struct perf_event_attr attr = {
> > > .type = 9,
> >
> > Cough ;) Yes I too used perf_event_attr.type == 9 when I wrote another
> > test-case. Because I am lazy and this is what I see in
> > /sys/bus/event_source/devices/uprobe/type on my machine.
> >
> > But me should not assume that perf_pmu_register(&perf_uprobe) -> idr_alloc()
> > will return 9.
>
> While I agree we should probably try to do this nicely, in defence of Pu I think
> this is adapted from the syzkaller horror show :P and that code does tend to
> just insert random integers etc.
Not really ;)
syzkaller's reproducer reads /sys/bus/event_source/devices/uprobe/type, see
https://lore.kernel.org/all/20250521092503.3116340-1-pulehui@huaweicloud.com/
In any case. Many thanks to Lehui who investigated this problem!
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 13:34 ` Lorenzo Stoakes
2025-05-27 15:17 ` Oleg Nesterov
2025-05-27 15:29 ` Lorenzo Stoakes
@ 2025-05-27 16:29 ` Pu Lehui
2 siblings, 0 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 16:29 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: mhiramat, oleg, peterz, akpm, Liam.Howlett, vbabka, jannh,
pfalcato, linux-mm, linux-kernel, pulehui
On 2025/5/27 21:34, Lorenzo Stoakes wrote:
> Hi Pu,
>
> Just as you were sending this I was about to send you my suggestion :) That
> is hilarious haha. Maybe some cosmic connection? :P
Hi Lorenzo,
haha, so coincidental.😁
>
> Anyway I don't know whether your approach is _quite_ correct here, you are
> generally disabling the uprobe behaviour on all VMA expansion (that is -
> new VMA merge, relocate_vma_down() (unlikely to be relevant ;), it seems
> sensible to only do so in the circumstances we know to be problematic.
agree, we should limit that on copy_vma.
>
> I really do not want mremap-specific stuff in general merge code, it's the
> move page tables bit that makes this broken.
>
> So... :P let me attach my suggestion as a fix-patch below.
>
> TO BE CLEAR - I present this in this form purely to make it easy for you to
> grab, PLEASE take this patch and rework it (if it makes sense), I don't
> need any attribution other than a Suggested-by, maybe.
This is truly appreciate! In fact, this is my first patch related to
memory and uprobe. My original intention was to spark discussion and
attract experts to help handle this matter. I don't place much
importance on the attribution of this patch. Like you, my focus is on
contributing to the resolution of the issue.😁
>
> They are rather close other than where this flag is set :)
>
> I am by the way assuming that uprobes work by installing a special PTE at
> page offset 0 and only in the case where there is something installed here
> do we need to worry.
As Oleg later mentioned, we do not need this restriction, and I just
verified this using the following test case:
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <syscall.h>
#include <sys/mman.h>
#include <linux/perf_event.h>
#define FNAME "./foo"
int main(int argc, char *argv[])
{
int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
struct perf_event_attr attr = {
.type = 9,
.size = 72,
.config1 = (long) FNAME,
.config2 = 8192,
};
void *addr1, *addr2;
ftruncate(fd, 4 * 4096);
mmap(NULL, 4096, PROT_EXEC, MAP_PRIVATE, fd, 4096);
syscall(__NR_perf_event_open, &attr, 0, 0, -1, 0);
addr1 = mmap(NULL, 3 * 4096, PROT_NONE, MAP_PRIVATE, fd, 4096);
addr2 = mremap(addr1, 2 * 4096, 3 * 4096, MREMAP_MAYMOVE);
mremap(addr2 + 4096, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED,
addr1 + 4096);
return 0;
}
But one thing confuses me—if mmap is used with zero_offset, it will not
reproduce. Perhaps I need to verify this tomorrow.
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <syscall.h>
#include <sys/mman.h>
#include <linux/perf_event.h>
#define FNAME "./foo"
int main(int argc, char *argv[])
{
int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
struct perf_event_attr attr = {
.type = 9,
.size = 72,
.config1 = (long) FNAME,
.config2 = 8192,
};
void *addr1, *addr2;
ftruncate(fd, 4 * 4096);
mmap(NULL, 4096, PROT_EXEC, MAP_PRIVATE, fd, 0);
syscall(__NR_perf_event_open, &attr, 0, 0, -1, 0);
addr1 = mmap(NULL, 4 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
addr2 = mremap(addr1, 3 * 4096, 4 * 4096, MREMAP_MAYMOVE);
mremap(addr2 + 2 * 4096, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED,
addr1 + 2 * 4096);
return 0;
}
>
> Please correct me if I'm misunderstanding here, I am not an expert on
> uprobes.
>
> Thanks! This is a great find overall.
>
> I'd also quite like us to make your repro a self test if possible? Not sure
> where we'd put it... could be in tools/testing/selftests/mm/merge.c
> actually, new one I created explicitly for VMA merge scenarios.
I will have a try!
>
> Cheers, Lorenzo
>
> On Tue, May 27, 2025 at 01:23:50PM +0000, Pu Lehui wrote:
>> From: Pu Lehui <pulehui@huawei.com>
>>
>> We encountered a BUG alert triggered by Syzkaller as follows:
>> BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1
>>
>> And we can reproduce it with the following steps:
>> 1. register uprobe on file at zero offset
>> 2. mmap the file at zero offset:
>> addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
>> 3. mremap part of vma1 to new vma2:
>> addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
>> 4. mremap back to orig addr1:
>> mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
>>
>> In the step 3, the vma1 range [addr1, addr1 + 4096] will be remap to new
>> vma2 with range [addr2, addr2 + 8192], and remap uprobe anon page from
>> the vma1 to vma2, then unmap the vma1 range [addr1, addr1 + 4096].
>> In tht step 4, the vma2 range [addr2, addr2 + 4096] will be remap back
>> to the addr range [addr1, addr1 + 4096]. Since the addr range [addr1 +
>> 4096, addr1 + 8192] still maps the file, it will take
>> vma_merge_new_range to merge these two addr ranges, and then do
>> uprobe_mmap in vma_complete. Since the merged vma pgoff is also zero
>> offset, it will install uprobe anon page to the merged vma. However, the
>> upcomming move_page_tables step, which use set_pte_at to remap the vma2
>> uprobe anon page to the merged vma, will over map the old uprobe anon
>> page in the merged vma, and lead the old uprobe anon page to be orphan.
>>
>> Since the uprobe anon page will be remapped to the merged vma, we can
>> remove the unnecessary uprobe_mmap on merged vma, that is, do not
>> perform uprobe_mmap on expanded vma.
>>
>> This problem was first find in linux-6.6.y and also exists in the
>> community syzkaller:
>> https://lore.kernel.org/all/000000000000ada39605a5e71711@google.com/T/
>>
>> The complete Syzkaller C reproduction program is as follows:
>>
>> #define _GNU_SOURCE
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <syscall.h>
>> #include <sys/mman.h>
>> #include <linux/perf_event.h>
>>
>> int main(int argc, char *argv[])
>> {
>> int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
>> struct perf_event_attr attr = {
>> .type = 9,
>> .uprobe_path = (long) FNAME,
>> .probe_offset = 0x0,
>> };
>> void *addr1, *addr2;
>>
>> write(fd, "x", 1);
>> mmap(NULL, 4096, PROT_EXEC, MAP_PRIVATE, fd, 0);
>>
>> syscall(__NR_perf_event_open, &attr, 0, 0, -1, 0);
>> addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
>> addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
>> mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
>>
>> return 0;
>> }
>>
>> Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
>
> Nit, but we'll want to cc: stable once we agree on a solution too.
add `Cc: stable@vger.kernel.org` above Fixes tag?
>
>> Signed-off-by: Pu Lehui <pulehui@huawei.com>
>> ---
>> mm/vma.c | 7 ++++++-
>> mm/vma.h | 7 +++++++
>> 2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/vma.c b/mm/vma.c
>> index 1c6595f282e5..6445f515c7f2 100644
>> --- a/mm/vma.c
>> +++ b/mm/vma.c
>> @@ -358,7 +358,8 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
>>
>> if (vp->file) {
>> i_mmap_unlock_write(vp->mapping);
>> - uprobe_mmap(vp->vma);
>> + if (!vp->skip_vma_uprobe)
>> + uprobe_mmap(vp->vma);
>>
>> if (vp->adj_next)
>> uprobe_mmap(vp->adj_next);
>
> I think we need to cover off the adj_next case too, strictly. See the
> attached patch :P
>
> To be honest the adj_next case won't be relevant, as this is not set for a
> new VMA, but it'd be odd to have a 'skip' or 'no' uprobe option and to not
> obey it in both cases.
alright, I haven't thought of any possible situations yet. If there are
any problems later, we can add it back.
>
>> @@ -737,6 +738,7 @@ static int commit_merge(struct vma_merge_struct *vmg)
>> if (vma_iter_prealloc(vmg->vmi, vma))
>> return -ENOMEM;
>>
>> + vp.skip_vma_uprobe = vmg->skip_vma_uprobe;
>> vma_prepare(&vp);
>> /*
>> * THP pages may need to do additional splits if we increase
>> @@ -1151,6 +1153,9 @@ int vma_expand(struct vma_merge_struct *vmg)
>> if (remove_next)
>> vmg->__remove_next = true;
>>
>> + /* skip uprobe_mmap on expanded vma */
>> + vmg->skip_vma_uprobe = true;
>> +
>
> Yeah this doesn't feel right, this will make this happen for all
> vma_expand() cases (including new VMA merge) and that's surely not correct
> for non-mremap cases?
agree!
>
> Again see the attached patch :P
>
>> if (commit_merge(vmg))
>> goto nomem;
>>
>> diff --git a/mm/vma.h b/mm/vma.h
>> index 9a8af9be29a8..56cc0364d239 100644
>> --- a/mm/vma.h
>> +++ b/mm/vma.h
>> @@ -19,6 +19,8 @@ struct vma_prepare {
>> struct vm_area_struct *insert;
>> struct vm_area_struct *remove;
>> struct vm_area_struct *remove2;
>> + /* Whether to skip uprobe_mmap on vma */
>> + bool skip_vma_uprobe;
>> };
>>
>> struct unlink_vma_file_batch {
>> @@ -120,6 +122,11 @@ struct vma_merge_struct {
>> */
>> bool give_up_on_oom :1;
>>
>> + /*
>> + * Whether to skip uprobe_mmap on merged vma.
>> + */
>> + bool skip_vma_uprobe :1;
>> +
>> /* Internal flags set during merge process: */
>>
>> /*
>> --
>> 2.34.1
>>
>
> ----8<----
>>From 30e39d0f223ae4ab5669584593071f5f7266beeb Mon Sep 17 00:00:00 2001
> From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Date: Tue, 27 May 2025 14:11:26 +0100
> Subject: [PATCH] mm: avoid orphaned uprobe
>
> If, on mremap, me move a file-backed VMA mapped at offset 0 that possess a
> uprobe and it merges with an adjacent VMA, we will then invoke
> uprobe_mmap() once again to install a uprobe into this newly established
> VMA.
>
> This is problematic, as when we then move the page tables back into place,
> we overwrite the uprobe entry and thus orphan the merge-established uprobe.
>
> Avoid this by threading state to explicitly disallow the establishment of a
> new uprobe upon merge under these circumstances.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> mm/vma.c | 21 ++++++++++++++++++---
> mm/vma.h | 5 +++++
> 2 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vma.c b/mm/vma.c
> index 1c6595f282e5..cc18d1dcdc57 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -169,6 +169,9 @@ static void init_multi_vma_prep(struct vma_prepare *vp,
> vp->file = vma->vm_file;
> if (vp->file)
> vp->mapping = vma->vm_file->f_mapping;
> +
> + if (vmg && vmg->no_uprobe)
> + vp->no_uprobe = true;
> }
>
> /*
> @@ -358,10 +361,13 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
>
> if (vp->file) {
> i_mmap_unlock_write(vp->mapping);
> - uprobe_mmap(vp->vma);
>
> - if (vp->adj_next)
> - uprobe_mmap(vp->adj_next);
> + if (!vp->no_uprobe) {
> + uprobe_mmap(vp->vma);
> +
> + if (vp->adj_next)
> + uprobe_mmap(vp->adj_next);
> + }
> }
>
> if (vp->remove) {
> @@ -1830,6 +1836,15 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> vmg.middle = NULL; /* New VMA range. */
> vmg.pgoff = pgoff;
> vmg.next = vma_iter_next_rewind(&vmi, NULL);
> +
> + /*
> + * If the VMA we are copying might contain a uprobe PTE, ensure that we
> + * do not establish one upon merge. otherwise, when mremap() moves page
> + * tables into place, we'll orphan the one just created.
> + */
> + if (vma->vm_file && vma->vm_pgoff == 0)
> + vmg.no_uprobe = true;
> +
> new_vma = vma_merge_new_range(&vmg);
>
> if (new_vma) {
> diff --git a/mm/vma.h b/mm/vma.h
> index 9a8af9be29a8..4c35c5ab1aa2 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -19,6 +19,8 @@ struct vma_prepare {
> struct vm_area_struct *insert;
> struct vm_area_struct *remove;
> struct vm_area_struct *remove2;
> +
> + bool no_uprobe :1;
> };
>
> struct unlink_vma_file_batch {
> @@ -120,6 +122,9 @@ struct vma_merge_struct {
> */
> bool give_up_on_oom :1;
>
> + /* If set, do not install a uprobe upon merge. */
> + bool no_uprobe :1;
> +
> /* Internal flags set during merge process: */
>
> /*
> --
> 2.49.0
nice, thanks Lorenzo!
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 14:23 ` Oleg Nesterov
@ 2025-05-27 16:32 ` Pu Lehui
2025-05-27 16:37 ` Lorenzo Stoakes
2025-05-27 17:20 ` Oleg Nesterov
0 siblings, 2 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 16:32 UTC (permalink / raw)
To: Oleg Nesterov
Cc: mhiramat, peterz, akpm, Liam.Howlett, lorenzo.stoakes, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
On 2025/5/27 22:23, Oleg Nesterov wrote:
> Well, I leave this to you / Lorenzo / David, but...
>
> On 05/27, Pu Lehui wrote:
>>
>> Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
>
> I don't think that commit could cause this problem.
Hi Oleg,
Me too! I was test that before and after commit 78a320542e6c, so call it
the `directly related commit`. In fact, I think the issue was introduced
in the original commit 2b1444983508 ("uprobes, mm, x86: Add the ability
to install and remove uprobes breakpoints") # v3.5-rc1.
>
> Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 15:30 ` Oleg Nesterov
2025-05-27 15:33 ` Lorenzo Stoakes
@ 2025-05-27 16:35 ` Pu Lehui
1 sibling, 0 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 16:35 UTC (permalink / raw)
To: Oleg Nesterov
Cc: mhiramat, peterz, akpm, Liam.Howlett, lorenzo.stoakes, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
On 2025/5/27 23:30, Oleg Nesterov wrote:
> Not that this is really important, but the test-case looks broken,
>
> On 05/27, Pu Lehui wrote:
>>
>> #define _GNU_SOURCE
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <syscall.h>
>> #include <sys/mman.h>
>> #include <linux/perf_event.h>
>>
>> int main(int argc, char *argv[])
>> {
>> int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
>
> FNAME is not defined
>
>> struct perf_event_attr attr = {
>> .type = 9,
>
> Cough ;) Yes I too used perf_event_attr.type == 9 when I wrote another
> test-case. Because I am lazy and this is what I see in
> /sys/bus/event_source/devices/uprobe/type on my machine.
>
> But me should not assume that perf_pmu_register(&perf_uprobe) -> idr_alloc()
> will return 9.
>
>> write(fd, "x", 1);
>
> looks unnecessary.
>
> Oleg.
Oops...Thanks Oleg, I think I should thoroughly verify it. Perhaps I
shouldn't be working on patches so late at night.🥱
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 16:32 ` Pu Lehui
@ 2025-05-27 16:37 ` Lorenzo Stoakes
2025-05-27 16:52 ` Pu Lehui
2025-05-27 17:20 ` Oleg Nesterov
1 sibling, 1 reply; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-05-27 16:37 UTC (permalink / raw)
To: Pu Lehui
Cc: Oleg Nesterov, mhiramat, peterz, akpm, Liam.Howlett, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
On Wed, May 28, 2025 at 12:32:57AM +0800, Pu Lehui wrote:
>
> On 2025/5/27 22:23, Oleg Nesterov wrote:
> > Well, I leave this to you / Lorenzo / David, but...
> >
> > On 05/27, Pu Lehui wrote:
> > >
> > > Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
> >
> > I don't think that commit could cause this problem.
>
> Hi Oleg,
>
> Me too! I was test that before and after commit 78a320542e6c, so call it the
> `directly related commit`. In fact, I think the issue was introduced in the
> original commit 2b1444983508 ("uprobes, mm, x86: Add the ability to install
> and remove uprobes breakpoints") # v3.5-rc1.
Yeah I missed this! I concure that the cited fixes commit is probably not it :P
Your repro is delightfully consistent so could you bisect to be sure?
BTW did you just say v3.5... Good Lord :)
>
> >
> > Oleg.
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 15:33 ` Lorenzo Stoakes
2025-05-27 15:52 ` Oleg Nesterov
@ 2025-05-27 16:41 ` Pu Lehui
1 sibling, 0 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 16:41 UTC (permalink / raw)
To: Lorenzo Stoakes, Oleg Nesterov
Cc: mhiramat, peterz, akpm, Liam.Howlett, vbabka, jannh, pfalcato,
linux-mm, linux-kernel, pulehui
On 2025/5/27 23:33, Lorenzo Stoakes wrote:
> On Tue, May 27, 2025 at 05:30:08PM +0200, Oleg Nesterov wrote:
>> Not that this is really important, but the test-case looks broken,
>>
>> On 05/27, Pu Lehui wrote:
>>>
>>> #define _GNU_SOURCE
>>> #include <fcntl.h>
>>> #include <unistd.h>
>>> #include <syscall.h>
>>> #include <sys/mman.h>
>>> #include <linux/perf_event.h>
>>>
>>> int main(int argc, char *argv[])
>>> {
>>> int fd = open(FNAME, O_RDWR|O_CREAT, 0600);
>>
>> FNAME is not defined
>>
>>> struct perf_event_attr attr = {
>>> .type = 9,
>>
>> Cough ;) Yes I too used perf_event_attr.type == 9 when I wrote another
>> test-case. Because I am lazy and this is what I see in
>> /sys/bus/event_source/devices/uprobe/type on my machine.
>>
>> But me should not assume that perf_pmu_register(&perf_uprobe) -> idr_alloc()
>> will return 9.
>>
>>> write(fd, "x", 1);
>>
>> looks unnecessary.
>>
>> Oleg.
>>
>
> While I agree we should probably try to do this nicely, in defence of Pu I think
> this is adapted from the syzkaller horror show :P and that code does tend to
> just insert random integers etc.
>
> It would be good to refine this into something more robust if possible and
> ideally add as a self-test, however!
Yeah, just trying to make the commit message more compact, but miss a
lot. Will do better next.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 16:37 ` Lorenzo Stoakes
@ 2025-05-27 16:52 ` Pu Lehui
0 siblings, 0 replies; 21+ messages in thread
From: Pu Lehui @ 2025-05-27 16:52 UTC (permalink / raw)
To: Lorenzo Stoakes, Pu Lehui
Cc: Oleg Nesterov, mhiramat, peterz, akpm, Liam.Howlett, vbabka,
jannh, pfalcato, linux-mm, linux-kernel
On 2025/5/28 0:37, Lorenzo Stoakes wrote:
> On Wed, May 28, 2025 at 12:32:57AM +0800, Pu Lehui wrote:
>>
>> On 2025/5/27 22:23, Oleg Nesterov wrote:
>>> Well, I leave this to you / Lorenzo / David, but...
>>>
>>> On 05/27, Pu Lehui wrote:
>>>>
>>>> Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
>>>
>>> I don't think that commit could cause this problem.
>>
>> Hi Oleg,
>>
>> Me too! I was test that before and after commit 78a320542e6c, so call it the
>> `directly related commit`. In fact, I think the issue was introduced in the
>> original commit 2b1444983508 ("uprobes, mm, x86: Add the ability to install
>> and remove uprobes breakpoints") # v3.5-rc1.
>
> Yeah I missed this! I concure that the cited fixes commit is probably not it :P
>
> Your repro is delightfully consistent so could you bisect to be sure?
OK, I will try that tomorrow. PS: It is hard to work repro that for the
so old kernel.😅
>
> BTW did you just say v3.5... Good Lord :)
>
>>
>>>
>>> Oleg.
>>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 16:32 ` Pu Lehui
2025-05-27 16:37 ` Lorenzo Stoakes
@ 2025-05-27 17:20 ` Oleg Nesterov
2025-05-29 15:09 ` Pu Lehui
1 sibling, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2025-05-27 17:20 UTC (permalink / raw)
To: Pu Lehui
Cc: mhiramat, peterz, akpm, Liam.Howlett, lorenzo.stoakes, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
Hi Lehui,
On 05/28, Pu Lehui wrote:
>
> On 2025/5/27 22:23, Oleg Nesterov wrote:
> >Well, I leave this to you / Lorenzo / David, but...
> >
> >On 05/27, Pu Lehui wrote:
> >>
> >>Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
> >
> >I don't think that commit could cause this problem.
>
> Hi Oleg,
>
> Me too! I was test that before and after commit 78a320542e6c, so call it the
> `directly related commit`.
I feel I am totally confused...
but _may be_ you have used the initial reproducer which used PROT_NONE in
void *addr2 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
?
If yes. I _think_ we should have the same problem with or without 78a320542e6c,
just you need to s/PROT_NONE/PROT_EXEC/.
> In fact, I think the issue was introduced in the
> original commit 2b1444983508 ("uprobes, mm, x86: Add the ability to install
> and remove uprobes breakpoints") # v3.5-rc1.
probably yes... Damn I don't know ;)
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-27 17:20 ` Oleg Nesterov
@ 2025-05-29 15:09 ` Pu Lehui
2025-05-29 15:12 ` Lorenzo Stoakes
0 siblings, 1 reply; 21+ messages in thread
From: Pu Lehui @ 2025-05-29 15:09 UTC (permalink / raw)
To: Oleg Nesterov, lorenzo.stoakes
Cc: mhiramat, peterz, akpm, Liam.Howlett, vbabka, jannh, pfalcato,
linux-mm, linux-kernel, pulehui
On 2025/5/28 1:20, Oleg Nesterov wrote:
> Hi Lehui,
>
> On 05/28, Pu Lehui wrote:
>>
>> On 2025/5/27 22:23, Oleg Nesterov wrote:
>>> Well, I leave this to you / Lorenzo / David, but...
>>>
>>> On 05/27, Pu Lehui wrote:
>>>>
>>>> Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
>>>
>>> I don't think that commit could cause this problem.
>>
>> Hi Oleg,
>>
>> Me too! I was test that before and after commit 78a320542e6c, so call it the
>> `directly related commit`.
>
> I feel I am totally confused...
>
> but _may be_ you have used the initial reproducer which used PROT_NONE in
>
> void *addr2 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
>
> ?
>
> If yes. I _think_ we should have the same problem with or without 78a320542e6c,
> just you need to s/PROT_NONE/PROT_EXEC/.
>
>> In fact, I think the issue was introduced in the
>> original commit 2b1444983508 ("uprobes, mm, x86: Add the ability to install
>> and remove uprobes breakpoints") # v3.5-rc1.
>
> probably yes... Damn I don't know ;)
>
> Oleg.
Hi Oleg, Lorenzo,
Upon verification, the issue was first introduced by the commit
2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove
uprobes breakpoints"). Uprobe only became available for user use after
commit f3f096cfedf8 ("tracing: Provide trace events interface for
uprobes"), but at that time, the issue was obscured by another
problem—specifically, the always failure of uprobe_mmap processing for
the newly allocated new_vma during copy_vma. After commit 6dab3cc078e3
("uprobes: Remove copy_vma()->uprobe_mmap()") addressed that, the
original issue was exposed.
Therefore, I believe the Fixes tag should best reference commit
2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove
uprobes breakpoints").
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma during mremap
2025-05-29 15:09 ` Pu Lehui
@ 2025-05-29 15:12 ` Lorenzo Stoakes
0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-05-29 15:12 UTC (permalink / raw)
To: Pu Lehui
Cc: Oleg Nesterov, mhiramat, peterz, akpm, Liam.Howlett, vbabka,
jannh, pfalcato, linux-mm, linux-kernel, pulehui
On Thu, May 29, 2025 at 11:09:47PM +0800, Pu Lehui wrote:
>
>
> On 2025/5/28 1:20, Oleg Nesterov wrote:
> > Hi Lehui,
> >
> > On 05/28, Pu Lehui wrote:
> > >
> > > On 2025/5/27 22:23, Oleg Nesterov wrote:
> > > > Well, I leave this to you / Lorenzo / David, but...
> > > >
> > > > On 05/27, Pu Lehui wrote:
> > > > >
> > > > > Fixes: 78a320542e6c ("uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC")
> > > >
> > > > I don't think that commit could cause this problem.
> > >
> > > Hi Oleg,
> > >
> > > Me too! I was test that before and after commit 78a320542e6c, so call it the
> > > `directly related commit`.
> >
> > I feel I am totally confused...
> >
> > but _may be_ you have used the initial reproducer which used PROT_NONE in
> >
> > void *addr2 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
> >
> > ?
> >
> > If yes. I _think_ we should have the same problem with or without 78a320542e6c,
> > just you need to s/PROT_NONE/PROT_EXEC/.
> >
> > > In fact, I think the issue was introduced in the
> > > original commit 2b1444983508 ("uprobes, mm, x86: Add the ability to install
> > > and remove uprobes breakpoints") # v3.5-rc1.
> >
> > probably yes... Damn I don't know ;)
> >
> > Oleg.
>
> Hi Oleg, Lorenzo,
>
> Upon verification, the issue was first introduced by the commit 2b1444983508
> ("uprobes, mm, x86: Add the ability to install and remove uprobes
> breakpoints"). Uprobe only became available for user use after commit
> f3f096cfedf8 ("tracing: Provide trace events interface for uprobes"), but at
> that time, the issue was obscured by another problem—specifically, the
> always failure of uprobe_mmap processing for the newly allocated new_vma
> during copy_vma. After commit 6dab3cc078e3 ("uprobes: Remove
> copy_vma()->uprobe_mmap()") addressed that, the original issue was exposed.
>
> Therefore, I believe the Fixes tag should best reference commit 2b1444983508
> ("uprobes, mm, x86: Add the ability to install and remove uprobes
> breakpoints").
>
Wow, thanks so much for putting in the legwork to figure that
out. 2012... yikes!
Sorry, that probably was quite painful to bisect to/extract from the
obscuring commit, but useful to have the right fixes commit :>)
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2025-05-29 15:14 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-27 13:23 [RFC PATCH v2 0/2] Fix uprobe anon page be overwritten during mremap Pu Lehui
2025-05-27 13:23 ` [RFC PATCH v2 1/2] mm/mremap: Fix uprobe anon page be overwritten when expanding vma " Pu Lehui
2025-05-27 13:34 ` Lorenzo Stoakes
2025-05-27 15:17 ` Oleg Nesterov
2025-05-27 15:27 ` Lorenzo Stoakes
2025-05-27 15:29 ` Lorenzo Stoakes
2025-05-27 16:29 ` Pu Lehui
2025-05-27 14:23 ` Oleg Nesterov
2025-05-27 16:32 ` Pu Lehui
2025-05-27 16:37 ` Lorenzo Stoakes
2025-05-27 16:52 ` Pu Lehui
2025-05-27 17:20 ` Oleg Nesterov
2025-05-29 15:09 ` Pu Lehui
2025-05-29 15:12 ` Lorenzo Stoakes
2025-05-27 15:30 ` Oleg Nesterov
2025-05-27 15:33 ` Lorenzo Stoakes
2025-05-27 15:52 ` Oleg Nesterov
2025-05-27 16:41 ` Pu Lehui
2025-05-27 16:35 ` Pu Lehui
2025-05-27 13:23 ` [RFC PATCH v2 2/2] mm/mremap: Expose abnormal new_pte during move_ptes Pu Lehui
2025-05-27 14:24 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).