* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-05-30 17:54 ` Peter Zijlstra
@ 2012-05-30 18:03 ` Peter Zijlstra
2012-05-30 18:04 ` Srikar Dronamraju
2012-05-31 18:53 ` Oleg Nesterov
2 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2012-05-30 18:03 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Oleg Nesterov, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On Wed, 2012-05-30 at 19:54 +0200, Peter Zijlstra wrote:
> Something like so?
---
kernel/events/uprobes.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 985be4d..16f986d 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -45,6 +45,19 @@ static DEFINE_SPINLOCK(uprobes_treelock); /* serialize rbtree access */
#define UPROBES_HASH_SZ 13
+/*
+ * We need separate register/unregister and mmap/munmap lock hashes because of
+ * mmap_sem nesting.
+ *
+ * {,un}egister_uprobe() need to install probes on (potentially) all processes
+ * and thus need to acquire multiple mmap_sems (consequtively, not
+ * concurrently), whereas uprobe_m{,un}map() are called while holding mmap_sem
+ * for the particular process doing the mmap.
+ *
+ * This all means that register_uprobe() can race with uprobe_mmap() and we
+ * can try and install a probe where one is already installed.
+ */
+
/* serialize (un)register */
static struct mutex uprobes_mutex[UPROBES_HASH_SZ];
@@ -356,6 +369,9 @@ int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned
{
int result;
+ /*
+ * See the comment near uprobes_hash().
+ */
result = is_swbp_at_addr(mm, vaddr);
if (result == 1)
return -EEXIST;
@@ -870,6 +886,10 @@ static int register_for_each_vma(struct uprobe *uprobe, bool is_register)
up_read(&mm->mmap_sem);
mmput(mm);
if (is_register) {
+ /*
+ * We can race against uprobe_mmap() see the comment
+ * near uprobe_hash().
+ */
if (ret && ret == -EEXIST)
ret = 0;
if (ret)
@@ -1080,7 +1100,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
ret = install_breakpoint(uprobe, vma->vm_mm, vma, vaddr);
- /* Ignore double add: */
+ /*
+ * We can race against register_uprobe(), see the
+ * comment near uprobe_hash().
+ */
if (ret == -EEXIST) {
ret = 0;
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-05-30 17:54 ` Peter Zijlstra
2012-05-30 18:03 ` Peter Zijlstra
@ 2012-05-30 18:04 ` Srikar Dronamraju
2012-05-30 18:21 ` Peter Zijlstra
2012-05-31 18:53 ` Oleg Nesterov
2 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2012-05-30 18:04 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Oleg Nesterov, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
>
> So its the -EEXIST from set_swbp() that I was thinking about. I think I
> was also wrong on the reason that that can happen. register's vma
> iteration is very careful not to have the same vma twice, but it can
> race against mmap() because of the uprobe_hash() vs uprobe_mmap_hash()
> madness, right?
right.
>
> Something like so?
>
> ---
> kernel/events/uprobes.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 985be4d..b4e749e 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -45,6 +45,19 @@ static DEFINE_SPINLOCK(uprobes_treelock); /* serialize rbtree access */
>
> #define UPROBES_HASH_SZ 13
>
> +/*
> + * We need separate register/unregister and mmap/munmap lock hashes because of
> + * mmap_sem nesting.
> + *
> + * {,un}egister_uprobe() needs to install probes on (potentially) all processes
> + * and thus need to acquire multiple mmap_sems (consecutively, not
> + * concurrently), whereas uprobe_m{,un}map() is called while holding mmap_sem
> + * for the particular process doing the mmap.
> + *
> + * This all means that register_uprobe() can race with uprobe_mmap() and we
> + * can try and install a probe where one is already installed.
> + */
Nit: {,un}egister_uprobe should have been uprobe_{,un}register at
couple of places.
> +
> /* serialize (un)register */
> static struct mutex uprobes_mutex[UPROBES_HASH_SZ];
>
> @@ -356,6 +369,9 @@ int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned
> {
> int result;
>
> + /*
> + * See the comment near uprobes_hash().
> + */
> result = is_swbp_at_addr(mm, vaddr);
> if (result == 1)
> return -EEXIST;
>
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-05-30 18:04 ` Srikar Dronamraju
@ 2012-05-30 18:21 ` Peter Zijlstra
0 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2012-05-30 18:21 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Oleg Nesterov, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On Wed, 2012-05-30 at 23:34 +0530, Srikar Dronamraju wrote:
> Nit: {,un}egister_uprobe should have been uprobe_{,un}register at
> couple of places.
Ah indeed.. very silly.. hopefully last version.
---
Subject: uprobe: Document uprobe_register() vs uprobe_mmap() race
Because the mind is treacherous and makes us forget we need to write
stuff down.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/events/uprobes.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 985be4d..fd6fb30 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -45,6 +45,23 @@ static DEFINE_SPINLOCK(uprobes_treelock); /* serialize rbtree access */
#define UPROBES_HASH_SZ 13
+/*
+ * We need separate register/unregister and mmap/munmap lock hashes because of
+ * mmap_sem nesting.
+ *
+ * uprobe_register() needs to install probes on (potentially) all processes
+ * and thus needs to acquire multiple mmap_sems (consequtively, not
+ * concurrently), whereas uprobe_mmap() is called while holding mmap_sem
+ * for the particular process doing the mmap.
+ *
+ * uprobe_register()->register_for_each_vma() needs to drop/acquire mmap_sem
+ * because of lock order against i_mmap_mutex. This means there's a hole in the
+ * register vma iteration where a mmap() can happen.
+ *
+ * Thus uprobe_register() can race with uprobe_mmap() and we can try and
+ * install a probe where one is already installed.
+ */
+
/* serialize (un)register */
static struct mutex uprobes_mutex[UPROBES_HASH_SZ];
@@ -356,6 +373,9 @@ int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned
{
int result;
+ /*
+ * See the comment near uprobes_hash().
+ */
result = is_swbp_at_addr(mm, vaddr);
if (result == 1)
return -EEXIST;
@@ -870,6 +890,10 @@ static int register_for_each_vma(struct uprobe *uprobe, bool is_register)
up_read(&mm->mmap_sem);
mmput(mm);
if (is_register) {
+ /*
+ * We can race against uprobe_mmap() see the comment
+ * near uprobe_hash().
+ */
if (ret && ret == -EEXIST)
ret = 0;
if (ret)
@@ -1080,7 +1104,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
ret = install_breakpoint(uprobe, vma->vm_mm, vma, vaddr);
- /* Ignore double add: */
+ /*
+ * We can race against uprobe_register(), see the
+ * comment near uprobe_hash().
+ */
if (ret == -EEXIST) {
ret = 0;
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-05-30 17:54 ` Peter Zijlstra
2012-05-30 18:03 ` Peter Zijlstra
2012-05-30 18:04 ` Srikar Dronamraju
@ 2012-05-31 18:53 ` Oleg Nesterov
2012-06-01 15:53 ` Oleg Nesterov
2012-06-01 16:47 ` Srikar Dronamraju
2 siblings, 2 replies; 24+ messages in thread
From: Oleg Nesterov @ 2012-05-31 18:53 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Srikar Dronamraju, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On 05/30, Peter Zijlstra wrote:
>
> register's vma
> iteration is very careful not to have the same vma twice,
Hmm. I am wondering if it is careful enough...
Just in case, I think your patch is great. But it seems to me
there is another problem.
__find_next_vma_info() checks tmpvi->mm == vma->vm_mm to detect the
already visited mm/vma. However, afaics this can be false positive?
The caller, register_for_each_vma(), does mmput() and after that
this memory can be freed and re-used as another mm_struct.
I'll recheck this, but perhaps we need something like below?
Oleg.
--- x/kernel/events/uprobes.c
+++ x/kernel/events/uprobes.c
@@ -851,7 +851,6 @@ static int register_for_each_vma(struct
list_del(&vi->probe_list);
kfree(vi);
up_write(&mm->mmap_sem);
- mmput(mm);
continue;
}
vaddr = vma_address(vma, uprobe->offset);
@@ -860,7 +859,6 @@ static int register_for_each_vma(struct
list_del(&vi->probe_list);
kfree(vi);
up_write(&mm->mmap_sem);
- mmput(mm);
continue;
}
@@ -870,7 +868,6 @@ static int register_for_each_vma(struct
remove_breakpoint(uprobe, mm, vi->vaddr);
up_write(&mm->mmap_sem);
- mmput(mm);
if (is_register) {
if (ret && ret == -EEXIST)
ret = 0;
@@ -881,6 +878,7 @@ static int register_for_each_vma(struct
list_for_each_entry_safe(vi, tmpvi, &try_list, probe_list) {
list_del(&vi->probe_list);
+ mmput(vi->mm)
kfree(vi);
}
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-05-31 18:53 ` Oleg Nesterov
@ 2012-06-01 15:53 ` Oleg Nesterov
2012-06-01 16:33 ` Srikar Dronamraju
2012-06-01 16:47 ` Srikar Dronamraju
1 sibling, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2012-06-01 15:53 UTC (permalink / raw)
To: Peter Zijlstra, Srikar Dronamraju
Cc: Ingo Molnar, Ananth N Mavinakayanahalli, Anton Arapov,
Linus Torvalds, Masami Hiramatsu, linux-kernel
On 05/31, Oleg Nesterov wrote:
>
> __find_next_vma_info() checks tmpvi->mm == vma->vm_mm to detect the
> already visited mm/vma. However, afaics this can be false positive?
Yes, but I guess this is harmless, we can rely on uprobe_mmap.
But. Doesn't this mean we can greatly simplify register_for_each_vma()
and make it O(n) ?
Unless I missed something, we can simply create the list of
mm/vaddr structures under ->i_mmap_mutex (vma_prio_tree_foreach), then
register_for_each_vma() can process the list and that is all.
If another mapping comes after we drop i_mmap_mutex, uprobe_mmap()
should be called and it should install the bp.
Srikar, Peter, what do you think?
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-06-01 15:53 ` Oleg Nesterov
@ 2012-06-01 16:33 ` Srikar Dronamraju
2012-06-01 17:20 ` Oleg Nesterov
0 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2012-06-01 16:33 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Peter Zijlstra, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
* Oleg Nesterov <oleg@redhat.com> [2012-06-01 17:53:12]:
> On 05/31, Oleg Nesterov wrote:
> >
> > __find_next_vma_info() checks tmpvi->mm == vma->vm_mm to detect the
> > already visited mm/vma. However, afaics this can be false positive?
>
> Yes, but I guess this is harmless, we can rely on uprobe_mmap.
>
>
>
> But. Doesn't this mean we can greatly simplify register_for_each_vma()
> and make it O(n) ?
>
> Unless I missed something, we can simply create the list of
> mm/vaddr structures under ->i_mmap_mutex (vma_prio_tree_foreach), then
> register_for_each_vma() can process the list and that is all.
If I remember correctly, we cannot allocate the list elements under
i_mmap_mutex. We dont know how many list elements to allocate.
This is what Peter had to say : https://lkml.org/lkml/2011/6/27/72
"Because we try to take i_mmap_mutex during reclaim, trying to unmap
pages. So suppose we do an allocation while holding i_mmap_mutex, find
there's no free memory, try and unmap a page in order to free it, and
we're stuck."
>
> If another mapping comes after we drop i_mmap_mutex, uprobe_mmap()
> should be called and it should install the bp.
>
--
Thanks and Regards
Srikar
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-06-01 16:33 ` Srikar Dronamraju
@ 2012-06-01 17:20 ` Oleg Nesterov
2012-06-01 18:31 ` Oleg Nesterov
0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2012-06-01 17:20 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Peter Zijlstra, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On 06/01, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2012-06-01 17:53:12]:
>
> > But. Doesn't this mean we can greatly simplify register_for_each_vma()
> > and make it O(n) ?
> >
> > Unless I missed something, we can simply create the list of
> > mm/vaddr structures under ->i_mmap_mutex (vma_prio_tree_foreach), then
> > register_for_each_vma() can process the list and that is all.
>
>
> If I remember correctly, we cannot allocate the list elements under
> i_mmap_mutex. We dont know how many list elements to allocate.
>
> This is what Peter had to say : https://lkml.org/lkml/2011/6/27/72
>
> "Because we try to take i_mmap_mutex during reclaim, trying to unmap
> pages. So suppose we do an allocation while holding i_mmap_mutex, find
> there's no free memory, try and unmap a page in order to free it, and
> we're stuck."
Yes, try_to_unmap.
But. What do you think about the pseudo-code below? Only to illustrate
the approach, the code is not complete.
In the "likely" case we do vma_prio_tree_foreach() twice, this is
better than the current quadratic behaviour.
Oleg.
struct map_info {
struct map_info *next;
struct mm_struct *mm;
loff_t vaddr;
};
static struct map_info *
build_map_info_list(struct address_space *mapping, loff_t offset,
bool is_register)
{
struct map_info *prev = NULL;
struct map_info *curr;
int more;
again:
more = 0;
curr = NULL;
vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
struct map_info *info = prev;
if (!valid_vma(vma, is_register))
continue;
if (!info) {
more++;
continue;
}
if (!atomic_inc_not_zero(&vma->vm_mm->mm_users))
contunue;
prev = info->next;
info->mm = vma->vm_mm;
info->vaddr = vma_address(vma, offset);
info->next = curr;
curr = info;
}
if (!more) {
while (prev) {
map_info *tmp = prev;
prev = prev->next;
kfree(tmp);
}
return curr;
}
prev = curr;
while (curr) {
mmput(curr->mm);
curr = curr->next;
}
while (more--) {
struct map_info *info = kmalloc(...);
if (!info)
return ERR_PTR(-ENOMEM); // MEMORY LEAK
info->next = prev;
prev = info;
}
goto again;
}
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-06-01 17:20 ` Oleg Nesterov
@ 2012-06-01 18:31 ` Oleg Nesterov
2012-06-02 18:21 ` Oleg Nesterov
0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2012-06-01 18:31 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Peter Zijlstra, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On 06/01, Oleg Nesterov wrote:
>
> But. What do you think about the pseudo-code below? Only to illustrate
> the approach, the code is not complete.
>
> In the "likely" case we do vma_prio_tree_foreach() twice, this is
> better than the current quadratic behaviour.
See the "full" version. Untested of course, most probably has bugs,
but hopefully close enough.
What do you think?
Oleg.
struct map_info {
struct map_info *next;
struct mm_struct *mm;
loff_t vaddr;
};
static inline struct map_info *free_map_info(struct map_info *info)
{
struct map_info *next = info->next;
kfree(info);
return next;
}
static struct map_info *
build_map_info(struct address_space *mapping, loff_t offset, bool is_register)
{
unsigned long pgoff = offset >> PAGE_SHIFT;
struct prio_tree_iter iter;
struct vm_area_struct *vma;
struct map_info *curr = NULL;
struct map_info *prev = NULL;
struct map_info *info;
int more = 0;
again:
mutex_lock(&mapping->i_mmap_mutex);
vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
if (!valid_vma(vma, is_register))
continue;
if (!prev) {
more++;
continue;
}
if (!atomic_inc_not_zero(&vma->vm_mm->mm_users))
continue;
info = prev;
prev = prev->next;
info->next = curr;
curr = info;
info->mm = vma->vm_mm;
info->vaddr = vma_address(vma, offset);
}
mutex_unlock(&mapping->i_mmap_mutex);
if (!more)
goto out;
prev = curr;
while (curr) {
mmput(curr->mm);
curr = curr->next;
}
while (more--) {
info = kmalloc(sizeof(struct map_info), GFP_KERNEL);
if (!info) {
curr = ERR_PTR(-ENOMEM);
goto out;
}
info->next = prev;
prev = info;
}
goto again;
out:
while (prev)
prev = free_map_info(prev);
return curr;
}
static int register_for_each_vma(struct uprobe *uprobe, bool is_register)
{
struct map_info *info;
int err = 0;
info = build_map_info(uprobe->inode->i_mapping,
uprobe->offset, is_register);
if (IS_ERR(info))
return PTR_ERR(info);
while (info) {
struct mm_struct *mm = info->mm;
struct vm_area_struct *vma;
loff_t vaddr;
if (err)
goto free;
down_write(&mm->mmap_sem);
vma = find_vma(mm, (unsigned long)info->vaddr);
if (!vma || !valid_vma(vma, is_register))
goto unlock;
vaddr = vma_address(vma, uprobe->offset);
if (vma->vm_file->f_mapping->host != uprobe->inode ||
vaddr != info->vaddr)
goto unlock;
if (is_register) {
err = install_breakpoint(uprobe, mm, vma, info->vaddr);
if (err == -EEXIST)
err = 0;
} else {
remove_breakpoint(uprobe, mm, info->vaddr);
}
unlock:
up_write(&mm->mmap_sem);
free:
mmput(mm);
info = free_map_info(info);
}
return err;
}
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-06-01 18:31 ` Oleg Nesterov
@ 2012-06-02 18:21 ` Oleg Nesterov
0 siblings, 0 replies; 24+ messages in thread
From: Oleg Nesterov @ 2012-06-02 18:21 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Peter Zijlstra, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On 06/01, Oleg Nesterov wrote:
>
> while (more--) {
> info = kmalloc(sizeof(struct map_info), GFP_KERNEL);
> if (!info) {
> curr = ERR_PTR(-ENOMEM);
> goto out;
> }
> info->next = prev;
> prev = info;
> }
This should be "do ... while (--more)", otherwise it seems to work
and really helps.
I did the simple test under qemu. Currently uprobe_register() hangs
"forever" if the probed addr has 10000 mappings, I'have stopped qemu
after several minutes.
With the new code uprobe_register() needs 0.403s to complete.
We can also change build_map_info() to try GFP_NOWAIT | GFP_NOMEMALLOC
under i_mmap_mutex first, not sure this is really needed.
I'll write the changelog and send the patch...
Do you see any problems with this approach?
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-05-31 18:53 ` Oleg Nesterov
2012-06-01 15:53 ` Oleg Nesterov
@ 2012-06-01 16:47 ` Srikar Dronamraju
2012-06-01 18:38 ` Oleg Nesterov
1 sibling, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2012-06-01 16:47 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Peter Zijlstra, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
* Oleg Nesterov <oleg@redhat.com> [2012-05-31 20:53:39]:
> On 05/30, Peter Zijlstra wrote:
> >
> > register's vma
> > iteration is very careful not to have the same vma twice,
>
> Hmm. I am wondering if it is careful enough...
>
> Just in case, I think your patch is great. But it seems to me
> there is another problem.
>
> __find_next_vma_info() checks tmpvi->mm == vma->vm_mm to detect the
> already visited mm/vma. However, afaics this can be false positive?
>
> The caller, register_for_each_vma(), does mmput() and after that
> this memory can be freed and re-used as another mm_struct.
>
Before we do a mmput(), we take the mmap_sem for that mm. So this mm
cannot be freed until we complete insertion/removal. If the mm gets
reused after the insertion/removal, and maps the inode, then we are
doing the right thing by parsing it.
Or are you hinting at some other problem?
> I'll recheck this, but perhaps we need something like below?
>
> Oleg.
>
> --- x/kernel/events/uprobes.c
> +++ x/kernel/events/uprobes.c
> @@ -851,7 +851,6 @@ static int register_for_each_vma(struct
> list_del(&vi->probe_list);
> kfree(vi);
> up_write(&mm->mmap_sem);
> - mmput(mm);
If we want to do this, then we have to even move the list_del and kfree
along with mmput. Otherwise we may leak mm's. I see no advantages
except for decrease in code. I might be missing something trivial.
On the other side, moving the lines below would mean keeping extra
elements in the list that have to be traversed again. i.e if we
determine in this pass that elements of the list can be removed, then
why keep them till the next pass.
> continue;
> }
> vaddr = vma_address(vma, uprobe->offset);
> @@ -860,7 +859,6 @@ static int register_for_each_vma(struct
> list_del(&vi->probe_list);
> kfree(vi);
> up_write(&mm->mmap_sem);
> - mmput(mm);
> continue;
> }
>
> @@ -870,7 +868,6 @@ static int register_for_each_vma(struct
> remove_breakpoint(uprobe, mm, vi->vaddr);
>
> up_write(&mm->mmap_sem);
> - mmput(mm);
> if (is_register) {
> if (ret && ret == -EEXIST)
> ret = 0;
> @@ -881,6 +878,7 @@ static int register_for_each_vma(struct
>
> list_for_each_entry_safe(vi, tmpvi, &try_list, probe_list) {
> list_del(&vi->probe_list);
> + mmput(vi->mm)
> kfree(vi);
> }
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T
2012-06-01 16:47 ` Srikar Dronamraju
@ 2012-06-01 18:38 ` Oleg Nesterov
0 siblings, 0 replies; 24+ messages in thread
From: Oleg Nesterov @ 2012-06-01 18:38 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Peter Zijlstra, Ingo Molnar, Ananth N Mavinakayanahalli,
Anton Arapov, Linus Torvalds, Masami Hiramatsu, linux-kernel
On 06/01, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2012-05-31 20:53:39]:
>
> > __find_next_vma_info() checks tmpvi->mm == vma->vm_mm to detect the
> > already visited mm/vma. However, afaics this can be false positive?
> >
> > The caller, register_for_each_vma(), does mmput() and after that
> > this memory can be freed and re-used as another mm_struct.
> >
>
> Before we do a mmput(), we take the mmap_sem for that mm. So this mm
> cannot be freed until we complete insertion/removal. If the mm gets
> reused after the insertion/removal, and maps the inode,
Yes, see the previous email from me, we can rely on uprobe_mmap().
> Or are you hinting at some other problem?
Perhaps this deserves a comment. I mean, to explain that yes,
tmpvi->mm == vma->vm_mm can be wrong but this is fine.
However, I hope we simply can kill this code. See build_map_info()
I sent.
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread