* [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()
@ 2014-05-05 15:13 Oleg Nesterov
2014-05-19 12:41 ` Michal Hocko
2014-05-21 10:05 ` Srikar Dronamraju
0 siblings, 2 replies; 4+ messages in thread
From: Oleg Nesterov @ 2014-05-05 15:13 UTC (permalink / raw)
To: Ingo Molnar, Srikar Dronamraju
Cc: Ananth N Mavinakayanahalli, David Long, Hugh Dickins,
Michal Hocko, linux-kernel
Hugh says:
The one I noticed was that it forgets all about memcg (because
it was copied from KSM, and there the replacement page has already
been charged to a memcg). See how mm/memory.c do_anonymous_page()
does a mem_cgroup_charge_anon().
Hopefully not a big problem, uprobes is a system-wide thing and only
root can insert the probes. But I agree, should be fixed anyway.
Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
the error handling (and avoid the new "uncharge" label) the patch also
moves anon_vma_prepare() up before we alloc/charge the new page.
While at it fix the comment about ->mmap_sem, it is held for write.
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/events/uprobes.c | 23 +++++++++++------------
1 files changed, 11 insertions(+), 12 deletions(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 7716c40..a13251e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -279,18 +279,13 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
* supported by that architecture then we need to modify is_trap_at_addr and
* uprobe_write_opcode accordingly. This would never be a problem for archs
* that have fixed length instructions.
- */
-
-/*
+ *
* uprobe_write_opcode - write the opcode at a given virtual address.
* @mm: the probed process address space.
* @vaddr: the virtual address to store the opcode.
* @opcode: opcode to be written at @vaddr.
*
- * Called with mm->mmap_sem held (for read and with a reference to
- * mm).
- *
- * For mm @mm, write the opcode at @vaddr.
+ * Called with mm->mmap_sem held for write.
* Return 0 (success) or a negative errno.
*/
int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
@@ -310,21 +305,25 @@ retry:
if (ret <= 0)
goto put_old;
+ ret = anon_vma_prepare(vma);
+ if (ret)
+ goto put_old;
+
ret = -ENOMEM;
new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vaddr);
if (!new_page)
goto put_old;
- __SetPageUptodate(new_page);
+ if (mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))
+ goto put_new;
+ __SetPageUptodate(new_page);
copy_highpage(new_page, old_page);
copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
- ret = anon_vma_prepare(vma);
- if (ret)
- goto put_new;
-
ret = __replace_page(vma, vaddr, old_page, new_page);
+ if (ret)
+ mem_cgroup_uncharge_page(new_page);
put_new:
page_cache_release(new_page);
--
1.5.5.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()
2014-05-05 15:13 [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode() Oleg Nesterov
@ 2014-05-19 12:41 ` Michal Hocko
2014-05-19 13:50 ` Oleg Nesterov
2014-05-21 10:05 ` Srikar Dronamraju
1 sibling, 1 reply; 4+ messages in thread
From: Michal Hocko @ 2014-05-19 12:41 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Ingo Molnar, Srikar Dronamraju, Ananth N Mavinakayanahalli,
David Long, Hugh Dickins, linux-kernel
On Mon 05-05-14 17:13:20, Oleg Nesterov wrote:
Ups, this really slipped through...
Sorry about that.
> Hugh says:
>
> The one I noticed was that it forgets all about memcg (because
> it was copied from KSM, and there the replacement page has already
> been charged to a memcg). See how mm/memory.c do_anonymous_page()
> does a mem_cgroup_charge_anon().
>
> Hopefully not a big problem, uprobes is a system-wide thing and only
> root can insert the probes. But I agree, should be fixed anyway.
I am not familiar with uprobes but AFAIU this can be a non-trivial of
(now) uncharged memory. The fact that only root is allowed to install
probes doesn't change the fact that this is still unaccounted memory and
so setups which are tuned to not trigger OOM killer would be broken.
> Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
> the error handling (and avoid the new "uncharge" label) the patch also
> moves anon_vma_prepare() up before we alloc/charge the new page.
>
> While at it fix the comment about ->mmap_sem, it is held for write.
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
> ---
> kernel/events/uprobes.c | 23 +++++++++++------------
> 1 files changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 7716c40..a13251e 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -279,18 +279,13 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
> * supported by that architecture then we need to modify is_trap_at_addr and
> * uprobe_write_opcode accordingly. This would never be a problem for archs
> * that have fixed length instructions.
> - */
> -
> -/*
> + *
> * uprobe_write_opcode - write the opcode at a given virtual address.
> * @mm: the probed process address space.
> * @vaddr: the virtual address to store the opcode.
> * @opcode: opcode to be written at @vaddr.
> *
> - * Called with mm->mmap_sem held (for read and with a reference to
> - * mm).
> - *
> - * For mm @mm, write the opcode at @vaddr.
> + * Called with mm->mmap_sem held for write.
> * Return 0 (success) or a negative errno.
> */
> int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
> @@ -310,21 +305,25 @@ retry:
> if (ret <= 0)
> goto put_old;
>
> + ret = anon_vma_prepare(vma);
> + if (ret)
> + goto put_old;
> +
> ret = -ENOMEM;
> new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vaddr);
> if (!new_page)
> goto put_old;
>
> - __SetPageUptodate(new_page);
> + if (mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))
> + goto put_new;
>
> + __SetPageUptodate(new_page);
> copy_highpage(new_page, old_page);
> copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
>
> - ret = anon_vma_prepare(vma);
> - if (ret)
> - goto put_new;
> -
> ret = __replace_page(vma, vaddr, old_page, new_page);
> + if (ret)
> + mem_cgroup_uncharge_page(new_page);
>
> put_new:
> page_cache_release(new_page);
> --
> 1.5.5.1
>
>
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()
2014-05-19 12:41 ` Michal Hocko
@ 2014-05-19 13:50 ` Oleg Nesterov
0 siblings, 0 replies; 4+ messages in thread
From: Oleg Nesterov @ 2014-05-19 13:50 UTC (permalink / raw)
To: Michal Hocko
Cc: Ingo Molnar, Srikar Dronamraju, Ananth N Mavinakayanahalli,
David Long, Hugh Dickins, linux-kernel
On 05/19, Michal Hocko wrote:
>
> On Mon 05-05-14 17:13:20, Oleg Nesterov wrote:
>
> Ups, this really slipped through...
> Sorry about that.
>
> > Hugh says:
> >
> > The one I noticed was that it forgets all about memcg (because
> > it was copied from KSM, and there the replacement page has already
> > been charged to a memcg). See how mm/memory.c do_anonymous_page()
> > does a mem_cgroup_charge_anon().
> >
> > Hopefully not a big problem, uprobes is a system-wide thing and only
> > root can insert the probes. But I agree, should be fixed anyway.
>
> I am not familiar with uprobes but AFAIU this can be a non-trivial of
> (now) uncharged memory. The fact that only root is allowed to install
> probes doesn't change the fact that this is still unaccounted memory and
> so setups which are tuned to not trigger OOM killer would be broken.
Sorry for confusion. I only tried to say that a user can't exploit this
problem. And "system-wide" means that installing a uprobe has other
negative impacts on the whole system. IOW, I do not think that, say, this
fix is -stable material.
> > Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
> > the error handling (and avoid the new "uncharge" label) the patch also
> > moves anon_vma_prepare() up before we alloc/charge the new page.
> >
> > While at it fix the comment about ->mmap_sem, it is held for write.
>
> > Suggested-by: Hugh Dickins <hughd@google.com>
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> Acked-by: Michal Hocko <mhocko@suse.cz>
Thanks!
Oleg.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()
2014-05-05 15:13 [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode() Oleg Nesterov
2014-05-19 12:41 ` Michal Hocko
@ 2014-05-21 10:05 ` Srikar Dronamraju
1 sibling, 0 replies; 4+ messages in thread
From: Srikar Dronamraju @ 2014-05-21 10:05 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Ingo Molnar, Ananth N Mavinakayanahalli, David Long, Hugh Dickins,
Michal Hocko, linux-kernel
* Oleg Nesterov <oleg@redhat.com> [2014-05-05 17:13:20]:
> Hugh says:
>
> The one I noticed was that it forgets all about memcg (because
> it was copied from KSM, and there the replacement page has already
> been charged to a memcg). See how mm/memory.c do_anonymous_page()
> does a mem_cgroup_charge_anon().
>
> Hopefully not a big problem, uprobes is a system-wide thing and only
> root can insert the probes. But I agree, should be fixed anyway.
>
> Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
> the error handling (and avoid the new "uncharge" label) the patch also
> moves anon_vma_prepare() up before we alloc/charge the new page.
>
> While at it fix the comment about ->mmap_sem, it is held for write.
>
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-05-21 10:05 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-05 15:13 [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode() Oleg Nesterov
2014-05-19 12:41 ` Michal Hocko
2014-05-19 13:50 ` Oleg Nesterov
2014-05-21 10:05 ` Srikar Dronamraju
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox