* [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
@ 2012-06-08 9:32 Ananth N Mavinakayanahalli
2012-06-08 9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Ananth N Mavinakayanahalli @ 2012-06-08 9:32 UTC (permalink / raw)
To: linuxppc-dev, lkml
Cc: michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz,
Srikar Dronamraju, Jim Keniston, oleg
From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
On RISC architectures like powerpc, instructions are fixed size.
Instruction analysis on such platforms is just a matter of (insn % 4).
Pass the vaddr at which the uprobe is to be inserted so that
arch_uprobe_analyze_insn() can flag misaligned registration requests.
Changes in V2:
Pass (unsigned long)addr instead of (loff_t)vaddr to
arch_uprobe_analyze_insn(). We need the loff_t vaddr to take care of
the offset of inode:offset pair for large file sizes on 32bit.
Signed-off-by: Ananth N Mavinakaynahalli <ananth@in.ibm.com>
---
arch/x86/include/asm/uprobes.h | 2 +-
arch/x86/kernel/uprobes.c | 3 ++-
kernel/events/uprobes.c | 2 +-
3 files changed, 4 insertions(+), 3 deletions(-)
Index: linux-3.5-rc1/arch/x86/include/asm/uprobes.h
===================================================================
--- linux-3.5-rc1.orig/arch/x86/include/asm/uprobes.h
+++ linux-3.5-rc1/arch/x86/include/asm/uprobes.h
@@ -48,7 +48,7 @@ struct arch_uprobe_task {
#endif
};
-extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm);
+extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
Index: linux-3.5-rc1/arch/x86/kernel/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/arch/x86/kernel/uprobes.c
+++ linux-3.5-rc1/arch/x86/kernel/uprobes.c
@@ -409,9 +409,10 @@ static int validate_insn_bits(struct arc
* arch_uprobe_analyze_insn - instruction analysis including validity and fixups.
* @mm: the probed address space.
* @arch_uprobe: the probepoint information.
+ * @addr: virtual address at which to install the probepoint
* Return 0 on success or a -ve number on error.
*/
-int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm)
+int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr)
{
int ret;
struct insn insn;
Index: linux-3.5-rc1/kernel/events/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/kernel/events/uprobes.c
+++ linux-3.5-rc1/kernel/events/uprobes.c
@@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe
if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
return -EEXIST;
- ret = arch_uprobe_analyze_insn(&uprobe->arch, mm);
+ ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr);
if (ret)
return ret;
^ permalink raw reply [flat|nested] 16+ messages in thread* [PATCH v2 2/2] [POWERPC] uprobes: powerpc port 2012-06-08 9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli @ 2012-06-08 9:34 ` Ananth N Mavinakayanahalli 2012-06-08 14:58 ` [tip:perf/core] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() tip-bot for Ananth N Mavinakayanahalli 2012-06-11 16:12 ` [PATCH v2 1/2] " Oleg Nesterov 2 siblings, 0 replies; 16+ messages in thread From: Ananth N Mavinakayanahalli @ 2012-06-08 9:34 UTC (permalink / raw) To: linuxppc-dev, lkml Cc: michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Srikar Dronamraju, Jim Keniston, oleg From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> This is the port of uprobes to powerpc. Usage is similar to x86. One TODO in this port compared to x86 is the uprobe abort_xol() logic. x86 depends on the thread_struct.trap_nr (absent in powerpc) to determine if a signal was caused when the uprobed instruction was single-stepped/ emulated, in which case, we reset the instruction pointer to the probed address and retry the probe again. Tested on POWER6; I don't see anything here that should stop it from working on a ppc32; since I don't have access to a ppc32 machine, it would be good if somoene could verify that part. [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc Added new event: probe_libc:malloc (on 0xb4860) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 [root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20 [ perf record: Woken up 22 times to write data ] [ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ] [root@xxxx ~]# ./bin/perf report --stdio # ======== # captured on: Mon Jun 4 05:26:31 2012 # hostname : xxxx.ibm.com # os release : 3.4.0-uprobe # perf version : 3.4.0 # arch : ppc64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : POWER6 (raw), altivec supported # cpuid : 62,769 # total memory : 7310528 kB # cmdline : /root/bin/perf record -e probe_libc:malloc -aR sleep 20 # event : name = probe_libc:malloc, type = 2, config = 0x124, config1 = 0x0, con # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # ======== # # Samples: 83K of event 'probe_libc:malloc' # Event count (approx.): 83484 # # Overhead Command Shared Object Symbol # ........ ............ ............. .......... # 69.05% tar libc-2.12.so [.] malloc 28.57% rm libc-2.12.so [.] malloc 1.32% avahi-daemon libc-2.12.so [.] malloc 0.58% bash libc-2.12.so [.] malloc 0.28% sshd libc-2.12.so [.] malloc 0.08% irqbalance libc-2.12.so [.] malloc 0.05% bzip2 libc-2.12.so [.] malloc 0.04% sleep libc-2.12.so [.] malloc 0.03% multipathd libc-2.12.so [.] malloc 0.01% sendmail libc-2.12.so [.] malloc 0.01% automount libc-2.12.so [.] malloc V2: a. arch_uprobe_analyze_insn() now gets unsigned long addr. b. Verified that mtmsr[d] and rfi[d] are handled correctly by emulate_step() (no changes to this patch). Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> --- arch/powerpc/Kconfig | 3 arch/powerpc/include/asm/thread_info.h | 4 arch/powerpc/include/asm/uprobes.h | 49 +++++++++ arch/powerpc/kernel/Makefile | 1 arch/powerpc/kernel/signal.c | 6 + arch/powerpc/kernel/uprobes.c | 164 +++++++++++++++++++++++++++++++++ 6 files changed, 226 insertions(+), 1 deletion(-) Index: linux-3.5-rc1/arch/powerpc/include/asm/thread_info.h =================================================================== --- linux-3.5-rc1.orig/arch/powerpc/include/asm/thread_info.h +++ linux-3.5-rc1/arch/powerpc/include/asm/thread_info.h @@ -96,6 +96,7 @@ static inline struct thread_info *curren #define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */ #define TIF_NOERROR 12 /* Force successful syscall return */ #define TIF_NOTIFY_RESUME 13 /* callback before returning to user */ +#define TIF_UPROBE 14 /* breakpointed or single-stepping */ #define TIF_SYSCALL_TRACEPOINT 15 /* syscall tracepoint instrumentation */ /* as above, but as bit values */ @@ -112,12 +113,13 @@ static inline struct thread_info *curren #define _TIF_RESTOREALL (1<<TIF_RESTOREALL) #define _TIF_NOERROR (1<<TIF_NOERROR) #define _TIF_NOTIFY_RESUME (1<<TIF_NOTIFY_RESUME) +#define _TIF_UPROBE (1<<TIF_UPROBE) #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT) #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT) #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ - _TIF_NOTIFY_RESUME) + _TIF_NOTIFY_RESUME | _TIF_UPROBE) #define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR) /* Bits in local_flags */ Index: linux-3.5-rc1/arch/powerpc/include/asm/uprobes.h =================================================================== --- /dev/null +++ linux-3.5-rc1/arch/powerpc/include/asm/uprobes.h @@ -0,0 +1,49 @@ +#ifndef _ASM_UPROBES_H +#define _ASM_UPROBES_H +/* + * User-space Probes (UProbes) for powerpc + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2007-2012 + * + * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com> + */ + +#include <linux/notifier.h> + +typedef unsigned int uprobe_opcode_t; + +#define MAX_UINSN_BYTES 4 +#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES) + +#define UPROBE_SWBP_INSN 0x7fe00008 +#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */ + +struct arch_uprobe { + u8 insn[MAX_UINSN_BYTES]; +}; + +struct arch_uprobe_task { +}; + +extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr); +extern unsigned long uprobe_get_swbp_addr(struct pt_regs *regs); +extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs); +extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs); +extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk); +extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data); +extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs); +#endif /* _ASM_UPROBES_H */ Index: linux-3.5-rc1/arch/powerpc/kernel/Makefile =================================================================== --- linux-3.5-rc1.orig/arch/powerpc/kernel/Makefile +++ linux-3.5-rc1/arch/powerpc/kernel/Makefile @@ -96,6 +96,7 @@ obj-$(CONFIG_MODULES) += ppc_ksyms.o obj-$(CONFIG_BOOTX_TEXT) += btext.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_KPROBES) += kprobes.o +obj-$(CONFIG_UPROBES) += uprobes.o obj-$(CONFIG_PPC_UDBG_16550) += legacy_serial.o udbg_16550.o obj-$(CONFIG_STACKTRACE) += stacktrace.o obj-$(CONFIG_SWIOTLB) += dma-swiotlb.o Index: linux-3.5-rc1/arch/powerpc/kernel/signal.c =================================================================== --- linux-3.5-rc1.orig/arch/powerpc/kernel/signal.c +++ linux-3.5-rc1/arch/powerpc/kernel/signal.c @@ -11,6 +11,7 @@ #include <linux/tracehook.h> #include <linux/signal.h> +#include <linux/uprobes.h> #include <linux/key.h> #include <asm/hw_breakpoint.h> #include <asm/uaccess.h> @@ -157,6 +158,11 @@ static int do_signal(struct pt_regs *reg void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) { + if (thread_info_flags & _TIF_UPROBE) { + clear_thread_flag(TIF_UPROBE); + uprobe_notify_resume(regs); + } + if (thread_info_flags & _TIF_SIGPENDING) do_signal(regs); Index: linux-3.5-rc1/arch/powerpc/kernel/uprobes.c =================================================================== --- /dev/null +++ linux-3.5-rc1/arch/powerpc/kernel/uprobes.c @@ -0,0 +1,164 @@ +/* + * User-space Probes (UProbes) for powerpc + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2007-2012 + * + * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com> + */ +#include <linux/kernel.h> +#include <linux/sched.h> +#include <linux/ptrace.h> +#include <linux/uprobes.h> +#include <linux/uaccess.h> + +#include <linux/kdebug.h> +#include <asm/sstep.h> + +/** + * arch_uprobe_analyze_insn + * @mm: the probed address space. + * @arch_uprobe: the probepoint information. + * @addr: vaddr to probe. + * Return 0 on success or a -ve number on error. + */ +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) +{ + if (addr & 0x03) + return -EINVAL; + return 0; +} + +/* + * arch_uprobe_pre_xol - prepare to execute out of line. + * @auprobe: the probepoint information. + * @regs: reflects the saved user state of current task. + */ +int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + /* FIXME: We don't support abort_xol on powerpc for now */ + regs->nip = current->utask->xol_vaddr; + return 0; +} + +/** + * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs + * @regs: Reflects the saved state of the task after it has hit a breakpoint + * instruction. + * Return the address of the breakpoint instruction. + */ +unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) +{ + return instruction_pointer(regs); +} + +/* + * If xol insn itself traps and generates a signal (SIGILL/SIGSEGV/etc), + * then detect the case where a singlestepped instruction jumps back to its + * own address. It is assumed that anything like do_page_fault/do_trap/etc + * sets thread.trap_nr != -1. + * + * FIXME: powerpc however doesn't have thread.trap_nr yet. + * + * arch_uprobe_pre_xol/arch_uprobe_post_xol save/restore thread.trap_nr, + * arch_uprobe_xol_was_trapped() simply checks that ->trap_nr is not equal to + * UPROBE_TRAP_NR == -1 set by arch_uprobe_pre_xol(). + */ +bool arch_uprobe_xol_was_trapped(struct task_struct *t) +{ + /* FIXME: We don't support abort_xol on powerpc for now */ + return false; +} + +/* + * Called after single-stepping. To avoid the SMP problems that can + * occur when we temporarily put back the original opcode to + * single-step, we single-stepped a copy of the instruction. + * + * This function prepares to resume execution after the single-step. + */ +int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + /* FIXME: We don't support abort_xol on powerpc for now */ + + /* + * On powerpc, except for loads and stores, most instructions + * including ones that alter code flow (branches, calls, returns) + * are emulated in the kernel. We get here only if the emulation + * support doesn't exist and have to fix-up the next instruction + * to be executed. + */ + regs->nip = current->utask->vaddr + MAX_UINSN_BYTES; + return 0; +} + +/* callback routine for handling exceptions. */ +int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data) +{ + struct die_args *args = data; + struct pt_regs *regs = args->regs; + int ret = NOTIFY_DONE; + + /* We are only interested in userspace traps */ + if (regs && !user_mode(regs)) + return NOTIFY_DONE; + + switch (val) { + case DIE_BPT: + if (uprobe_pre_sstep_notifier(regs)) + ret = NOTIFY_STOP; + break; + case DIE_SSTEP: + if (uprobe_post_sstep_notifier(regs)) + ret = NOTIFY_STOP; + default: + break; + } + return ret; +} + +/* + * This function gets called when XOL instruction either gets trapped or + * the thread has a fatal signal, so reset the instruction pointer to its + * probed address. + */ +void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + /* FIXME: We don't support abort_xol on powerpc for now */ + return; +} + +/* + * See if the instruction can be emulated. + * Returns true if instruction was emulated, false otherwise. + */ +bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + int ret; + unsigned int insn; + + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); + + /* + * emulate_step() returns 1 if the insn was successfully emulated. + * For all other cases, we need to single-step in hardware. + */ + ret = emulate_step(regs, insn); + if (ret > 0) + return true; + + return false; +} Index: linux-3.5-rc1/arch/powerpc/Kconfig =================================================================== --- linux-3.5-rc1.orig/arch/powerpc/Kconfig +++ linux-3.5-rc1/arch/powerpc/Kconfig @@ -237,6 +237,9 @@ config PPC_OF_PLATFORM_PCI config ARCH_SUPPORTS_DEBUG_PAGEALLOC def_bool y +config ARCH_SUPPORTS_UPROBES + def_bool y + config PPC_ADV_DEBUG_REGS bool depends on 40x || BOOKE ^ permalink raw reply [flat|nested] 16+ messages in thread
* [tip:perf/core] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-08 9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli 2012-06-08 9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli @ 2012-06-08 14:58 ` tip-bot for Ananth N Mavinakayanahalli 2012-06-11 16:12 ` [PATCH v2 1/2] " Oleg Nesterov 2 siblings, 0 replies; 16+ messages in thread From: tip-bot for Ananth N Mavinakayanahalli @ 2012-06-08 14:58 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, paulus, hpa, mingo, jkenisto, ananth, srikar, tglx Commit-ID: 7eb9ba5ed312ec6ed9d22259c5da1acb7cf4bd29 Gitweb: http://git.kernel.org/tip/7eb9ba5ed312ec6ed9d22259c5da1acb7cf4bd29 Author: Ananth N Mavinakayanahalli <ananth@in.ibm.com> AuthorDate: Fri, 8 Jun 2012 15:02:57 +0530 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Fri, 8 Jun 2012 12:22:27 +0200 uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() On RISC architectures like powerpc, instructions are fixed size. Instruction analysis on such platforms is just a matter of (insn % 4). Pass the vaddr at which the uprobe is to be inserted so that arch_uprobe_analyze_insn() can flag misaligned registration requests. Signed-off-by: Ananth N Mavinakaynahalli <ananth@in.ibm.com> Cc: michael@ellerman.id.au Cc: antonb@thinktux.localdomain Cc: Paul Mackerras <paulus@samba.org> Cc: benh@kernel.crashing.org Cc: peterz@infradead.org Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: oleg@redhat.com Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20120608093257.GG13409@in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/include/asm/uprobes.h | 2 +- arch/x86/kernel/uprobes.c | 3 ++- kernel/events/uprobes.c | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h index 1e9bed1..f3971bb 100644 --- a/arch/x86/include/asm/uprobes.h +++ b/arch/x86/include/asm/uprobes.h @@ -48,7 +48,7 @@ struct arch_uprobe_task { #endif }; -extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm); +extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr); extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs); extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs); extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk); diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index dc4e910..36fd420 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -409,9 +409,10 @@ static int validate_insn_bits(struct arch_uprobe *auprobe, struct mm_struct *mm, * arch_uprobe_analyze_insn - instruction analysis including validity and fixups. * @mm: the probed address space. * @arch_uprobe: the probepoint information. + * @addr: virtual address at which to install the probepoint * Return 0 on success or a -ve number on error. */ -int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm) +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) { int ret; struct insn insn; diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 8c5e043..b52376d 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -706,7 +706,7 @@ install_breakpoint(struct uprobe *uprobe, struct mm_struct *mm, if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn)) return -EEXIST; - ret = arch_uprobe_analyze_insn(&uprobe->arch, mm); + ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr); if (ret) return ret; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-08 9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli 2012-06-08 9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli 2012-06-08 14:58 ` [tip:perf/core] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() tip-bot for Ananth N Mavinakayanahalli @ 2012-06-11 16:12 ` Oleg Nesterov 2012-06-11 19:09 ` Q: a_ops->readpage() && struct file Oleg Nesterov 2012-06-12 16:54 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju 2 siblings, 2 replies; 16+ messages in thread From: Oleg Nesterov @ 2012-06-11 16:12 UTC (permalink / raw) To: Ananth N Mavinakayanahalli Cc: linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Srikar Dronamraju, Jim Keniston Ananth, Srikar, I think the patch is correct and I am sorry for nit-picking, this is really minor. But, On 06/08, Ananth N Mavinakayanahalli wrote: > > Changes in V2: > Pass (unsigned long)addr Well, IMHO, this is confusing. First of all, why do we have this "addr" or even "vaddr"? It should not exists. We pass it to copy_insn(), but for what?? copy_insn() should simply use uprobe->offset, the virtual address for this particular mapping does not matter at all. I am going to send the cleanup. Note also that we should move this !UPROBE_COPY_INSN from install_breakpoint() to somewhere near alloc_uprobe(). This code is called only once, it looks a bit strange to use the "random" mm (the first mm vma_prio_tree_foreach() finds) and its mapping to verify the insn. In fact this is simply not correct and should be fixed, note that on x86 arch_uprobe_analyze_insn() checks mm->context.ia32_compat. IOW, Perhaps uprobe->offset makes more sense? > --- linux-3.5-rc1.orig/kernel/events/uprobes.c > +++ linux-3.5-rc1/kernel/events/uprobes.c > @@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe > if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn)) > return -EEXIST; > > - ret = arch_uprobe_analyze_insn(&uprobe->arch, mm); > + ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr); Just fyi, this conflicts with "[PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T" I sent, but the conflict is trivial. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Q: a_ops->readpage() && struct file 2012-06-11 16:12 ` [PATCH v2 1/2] " Oleg Nesterov @ 2012-06-11 19:09 ` Oleg Nesterov 2012-06-13 9:58 ` Peter Zijlstra 2012-06-12 16:54 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju 1 sibling, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2012-06-11 19:09 UTC (permalink / raw) To: Ananth N Mavinakayanahalli, Linus Torvalds, Al Viro Cc: linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Srikar Dronamraju, Jim Keniston On 06/11, Oleg Nesterov wrote: > > Note also that we should move this !UPROBE_COPY_INSN from > install_breakpoint() to somewhere near alloc_uprobe(). The main problem is, uprobe_register() doesn't have struct file for read_mapping_page(). Stupid question. I'm afraid the answer is "no" but I'll ask anyway. Is it safe to pass filp == NULL to mapping->readpage()? In fact I do not understand why it needs "struct file*" and I do not see any example of actual usage. Yes, I didn't try to grep very much and I understand that the filesystem can do something special. Say it can use file->private_data... However. There is read_cache_page_gfp() which does use a_ops->readpage(filp => NULL), and the comment says nothing about the riskiness. If not, is there any other way uprobe_register(struct inode *inode, loff_t offset) can read the page at this offset? And btw, read_mapping_page() accepts "void *data". Why? it uses filler == a_ops->readpage, it shouldn't accept anything but file pointer? Please help, I know nothing about vfs. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Q: a_ops->readpage() && struct file 2012-06-11 19:09 ` Q: a_ops->readpage() && struct file Oleg Nesterov @ 2012-06-13 9:58 ` Peter Zijlstra 2012-06-13 19:19 ` Oleg Nesterov 0 siblings, 1 reply; 16+ messages in thread From: Peter Zijlstra @ 2012-06-13 9:58 UTC (permalink / raw) To: Oleg Nesterov Cc: Ananth N Mavinakayanahalli, Linus Torvalds, Al Viro, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, Srikar Dronamraju, Jim Keniston On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote: > Stupid question. I'm afraid the answer is "no" but I'll ask anyway. > Is it safe to pass filp == NULL to mapping->readpage()? In fact > I do not understand why it needs "struct file*" and I do not see > any example of actual usage. Looking at afs_readpage it looks like its OK to pass in NULL. Same for nfs_readpage. They use the file, if provided, to avoid some lookups, but seem to deal with not having it. This answer by example is of course not authorative nor complete. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Q: a_ops->readpage() && struct file 2012-06-13 9:58 ` Peter Zijlstra @ 2012-06-13 19:19 ` Oleg Nesterov 0 siblings, 0 replies; 16+ messages in thread From: Oleg Nesterov @ 2012-06-13 19:19 UTC (permalink / raw) To: Peter Zijlstra Cc: Ananth N Mavinakayanahalli, Linus Torvalds, Al Viro, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, Srikar Dronamraju, Jim Keniston On 06/13, Peter Zijlstra wrote: > > On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote: > > Stupid question. I'm afraid the answer is "no" but I'll ask anyway. > > Is it safe to pass filp == NULL to mapping->readpage()? In fact > > I do not understand why it needs "struct file*" and I do not see > > any example of actual usage. > > Looking at afs_readpage it looks like its OK to pass in NULL. Same for > nfs_readpage. They use the file, if provided, to avoid some lookups, but > seem to deal with not having it. Yes, and reiser4 does the same. > This answer by example is of course not authorative nor complete. Yes... perhaps we should simply change this code to use NULL and collect the bug-reports ;) Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-11 16:12 ` [PATCH v2 1/2] " Oleg Nesterov 2012-06-11 19:09 ` Q: a_ops->readpage() && struct file Oleg Nesterov @ 2012-06-12 16:54 ` Srikar Dronamraju 2012-06-12 17:43 ` Oleg Nesterov 1 sibling, 1 reply; 16+ messages in thread From: Srikar Dronamraju @ 2012-06-12 16:54 UTC (permalink / raw) To: Oleg Nesterov Cc: Ananth N Mavinakayanahalli, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Jim Keniston > > Well, IMHO, this is confusing. > > First of all, why do we have this "addr" or even "vaddr"? It should > not exists. We pass it to copy_insn(), but for what?? copy_insn() > should simply use uprobe->offset, the virtual address for this > particular mapping does not matter at all. I am going to send > the cleanup. > Yes, we can use uprobe->offset instead of vaddr/addr. > Note also that we should move this !UPROBE_COPY_INSN from > install_breakpoint() to somewhere near alloc_uprobe(). This code > is called only once, it looks a bit strange to use the "random" mm > (the first mm vma_prio_tree_foreach() finds) and its mapping to > verify the insn. In fact this is simply not correct and should be > fixed, note that on x86 arch_uprobe_analyze_insn() checks The reason we "delay" the copy_insn to the first insert is because we have to get access to mm. For archs like x86, we want to know if the executable is 32 bit or not (since we have a different valid set of instructions for 32 bit and 64 bit). So in effect, if we get access to struct file corresponding to the inode and if the inode corresponds to 32 bit executable file or 64 bit executable file during register, then we can move it around alloc_uprobe(). ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-12 16:54 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju @ 2012-06-12 17:43 ` Oleg Nesterov 2012-06-13 19:15 ` Oleg Nesterov 0 siblings, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2012-06-12 17:43 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ananth N Mavinakayanahalli, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Jim Keniston On 06/12, Srikar Dronamraju wrote: > > > > Note also that we should move this !UPROBE_COPY_INSN from > > install_breakpoint() to somewhere near alloc_uprobe(). This code > > is called only once, it looks a bit strange to use the "random" mm > > (the first mm vma_prio_tree_foreach() finds) and its mapping to > > verify the insn. In fact this is simply not correct and should be > > fixed, note that on x86 arch_uprobe_analyze_insn() checks > > The reason we "delay" the copy_insn to the first insert is because > we have to get access to mm. For archs like x86, we want to know if the > executable is 32 bit or not Yes. And this is wrong afaics. Once again. This !UPROBE_COPY_INSN code is called only once, and it uses the "random" mm. After that install_breakpoint() just calls set_swbp(another_mm) while the insn can be invalid because another_mm->ia32_compat != mm->ia32_compat. > So in effect, if we get access to > struct file corresponding to the inode and if the inode corresponds to > 32 bit executable file or 64 bit executable file during register, then > we can move it around alloc_uprobe(). I don't think this can work. I have another simple fix in mind, I'll write another email later. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-12 17:43 ` Oleg Nesterov @ 2012-06-13 19:15 ` Oleg Nesterov 2012-06-14 11:45 ` Srikar Dronamraju 0 siblings, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2012-06-13 19:15 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ananth N Mavinakayanahalli, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Jim Keniston On 06/12, Oleg Nesterov wrote: > > On 06/12, Srikar Dronamraju wrote: > > > > > > Note also that we should move this !UPROBE_COPY_INSN from > > > install_breakpoint() to somewhere near alloc_uprobe(). This code > > > is called only once, it looks a bit strange to use the "random" mm > > > (the first mm vma_prio_tree_foreach() finds) and its mapping to > > > verify the insn. In fact this is simply not correct and should be > > > fixed, note that on x86 arch_uprobe_analyze_insn() checks > > > > The reason we "delay" the copy_insn to the first insert is because > > we have to get access to mm. For archs like x86, we want to know if the > > executable is 32 bit or not > > Yes. And this is wrong afaics. > > Once again. This !UPROBE_COPY_INSN code is called only once, and it > uses the "random" mm. After that install_breakpoint() just calls > set_swbp(another_mm) while the insn can be invalid because > another_mm->ia32_compat != mm->ia32_compat. > > > So in effect, if we get access to > > struct file corresponding to the inode and if the inode corresponds to > > 32 bit executable file or 64 bit executable file during register, then > > we can move it around alloc_uprobe(). > > I don't think this can work. I have another simple fix in mind, I'll > write another email later. For example. Suppose there is some instruction in /lib64/libc.so which is valid for 64-bit, but not for 32-bit. Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC). Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register() fails even if there are other 64-bit applications which could be traced. Or. uprobe_register() succeeds because it finds a 64-bit mm first, and then that 32-bit application actually executes the invalid insn. We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block. Or, perhaps, validate_insn_bits() should call both validate_insn_32bits() and validate_insn_64bits(), and set the UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint() should do the additinal check before set_swbp() and verify that .ia32_compat matches UPROBE_VALID_IF_*. What do you think? Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-13 19:15 ` Oleg Nesterov @ 2012-06-14 11:45 ` Srikar Dronamraju 2012-06-14 18:19 ` Oleg Nesterov 0 siblings, 1 reply; 16+ messages in thread From: Srikar Dronamraju @ 2012-06-14 11:45 UTC (permalink / raw) To: Oleg Nesterov Cc: Ananth N Mavinakayanahalli, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Jim Keniston * Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]: > On 06/12, Oleg Nesterov wrote: > > > > On 06/12, Srikar Dronamraju wrote: > > > > > > > > Note also that we should move this !UPROBE_COPY_INSN from > > > > install_breakpoint() to somewhere near alloc_uprobe(). This code > > > > is called only once, it looks a bit strange to use the "random" mm > > > > (the first mm vma_prio_tree_foreach() finds) and its mapping to > > > > verify the insn. In fact this is simply not correct and should be > > > > fixed, note that on x86 arch_uprobe_analyze_insn() checks > > > > > > The reason we "delay" the copy_insn to the first insert is because > > > we have to get access to mm. For archs like x86, we want to know if the > > > executable is 32 bit or not > > > > Yes. And this is wrong afaics. > > > > Once again. This !UPROBE_COPY_INSN code is called only once, and it > > uses the "random" mm. After that install_breakpoint() just calls > > set_swbp(another_mm) while the insn can be invalid because > > another_mm->ia32_compat != mm->ia32_compat. > > > > > So in effect, if we get access to > > > struct file corresponding to the inode and if the inode corresponds to > > > 32 bit executable file or 64 bit executable file during register, then > > > we can move it around alloc_uprobe(). > > > > I don't think this can work. I have another simple fix in mind, I'll > > write another email later. > > For example. Suppose there is some instruction in /lib64/libc.so which > is valid for 64-bit, but not for 32-bit. > > Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC). > How correct is it to have a 32 bit binary link to a 64 bit binary/library? what if the 64 bit binary/executable were to make a jump to a 64 bit address? > Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register() > fails even if there are other 64-bit applications which could be traced. > > Or. uprobe_register() succeeds because it finds a 64-bit mm first, and > then that 32-bit application actually executes the invalid insn. > > We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block. > > Or, perhaps, validate_insn_bits() should call both > validate_insn_32bits() and validate_insn_64bits(), and set the > UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint() > should do the additinal check before set_swbp() and verify that > .ia32_compat matches UPROBE_VALID_IF_*. > > What do you think? > Lets say we do find a 32 bit app and 64 bit app using the same library and the underlying instruction is valid for tracing in 64 bit and not 32 bit. So when we are registering, and failed to insert a breakpoint for the 32 bit app, should we just bail out or should we return a failure? I would probably prefer to read the underlying file something similar to what exec does and based on the magic decipher if we should verify for 32 bit instructions or 64 bit instructions. I didnt find a good way to read the first few bytes just by looking at the inode. So one option is to read the underlying file at the first insertion (i.e similar to what we do now .. but instead of depending on ia32_compat of a random mm, depend on the exec magic) Better option would be read the underlying file at the register time and return an error even without attempting an insert if the instruction wasnt valid. But I am still ignorant on how to do this because we need a struct file to do this. -- Thanks and Regards Srikar ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-14 11:45 ` Srikar Dronamraju @ 2012-06-14 18:19 ` Oleg Nesterov 2012-06-15 12:33 ` Srikar Dronamraju 0 siblings, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2012-06-14 18:19 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ananth N Mavinakayanahalli, linuxppc-dev, lkml, michael, antonb, Paul Mackerras, benh, Ingo Molnar, peterz, Jim Keniston On 06/14, Srikar Dronamraju wrote: > > * Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]: > > > For example. Suppose there is some instruction in /lib64/libc.so which > > is valid for 64-bit, but not for 32-bit. > > > > Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC). > > > > How correct is it to have a 32 bit binary link to a 64 bit binary/library? No, I didn't mean this. I guess you misunderstood my point, see below. > > Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register() > > fails even if there are other 64-bit applications which could be traced. > > > > Or. uprobe_register() succeeds because it finds a 64-bit mm first, and > > then that 32-bit application actually executes the invalid insn. > > > > We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block. > > > > Or, perhaps, validate_insn_bits() should call both > > validate_insn_32bits() and validate_insn_64bits(), and set the > > UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint() > > should do the additinal check before set_swbp() and verify that > > .ia32_compat matches UPROBE_VALID_IF_*. > > > > > What do you think? > > > > Lets say we do find a 32 bit app and 64 bit app using the same library > and the underlying instruction is valid for tracing in 64 bit and not 32 > bit. So when we are registering, and failed to insert a breakpoint for > the 32 bit app, should we just bail out or should we return a failure? I do not really know, I tend to think we should not fail. But this is another story... Look. Suppose that a 32-bit app starts after uprobe_register() succeeds. In this case we have no option, uprobe_mmap()->install_breakpoint() should "silently" fail. Currently it doesn't, this is one of the reasons why I think the validation logic is wrong. And. if install_breakpoint() can fail later anyway (in this case), then I think uprobe_register() should not fail. But probably this needs more discussion. > I would probably prefer to read the underlying file something similar to > what exec does and based on the magic decipher if we should verify for > 32 bit instructions or 64 bit instructions. But this can't protect from the malicious user who does mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse uprobes even if that 32-bit app never tries to actually execute that 64-bit-code. That is why I think we need the additional (and arch-dependant) check before every set_swbp(), but arch_uprobe_analyze_insn/etc should not depend on task/mm/vaddr/whatever. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-14 18:19 ` Oleg Nesterov @ 2012-06-15 12:33 ` Srikar Dronamraju 2012-06-16 18:05 ` Oleg Nesterov 0 siblings, 1 reply; 16+ messages in thread From: Srikar Dronamraju @ 2012-06-15 12:33 UTC (permalink / raw) To: Oleg Nesterov Cc: Ananth N Mavinakayanahalli, lkml, Ingo Molnar, peterz, Jim Keniston > > > > Lets say we do find a 32 bit app and 64 bit app using the same library > > and the underlying instruction is valid for tracing in 64 bit and not 32 > > bit. So when we are registering, and failed to insert a breakpoint for > > the 32 bit app, should we just bail out or should we return a failure? > > I do not really know, I tend to think we should not fail. But this is > another story... > > Look. Suppose that a 32-bit app starts after uprobe_register() succeeds. > In this case we have no option, uprobe_mmap()->install_breakpoint() > should "silently" fail. Currently it doesn't, this is one of the reasons > why I think the validation logic is wrong. > > And. if install_breakpoint() can fail later anyway (in this case), then > I think uprobe_register() should not fail. > > But probably this needs more discussion. > > > > I would probably prefer to read the underlying file something similar to > > what exec does and based on the magic decipher if we should verify for > > 32 bit instructions or 64 bit instructions. > > But this can't protect from the malicious user who does > mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse > uprobes even if that 32-bit app never tries to actually execute that > 64-bit-code. > So if we read just after we allocate uprobe struct but before probe insertion, then we dont need to check this for each process. So if the library was 64 bit mapped in 32 bit process and has a valid instruction for 64 bit, then we just check for valid 64 bit instructions and allow insertion of the breakpoint even in the 32 bit process. So in this case, the behaviour of such processes would be similar with or without breakpoints. > That is why I think we need the additional (and arch-dependant) check > before every set_swbp(), but arch_uprobe_analyze_insn/etc should not > depend on task/mm/vaddr/whatever. > Here is a very crude implementation of the same. Also this depends on read_mapping_page taking NULL as an valid argument for file. As a side-effect we can do away with UPROBE_COPY_INSN which was set and read at just one place. 1. Move the copy_insn to just after alloc_uprobe. 2. Assume that copy_insn can work without struct file 3. Read the elfhdr for the file. 4. Pass the elfhdr to the arch specific analyze insn 5. Move the analyze instruction to before the actual probe insertion. diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 94c46af..41effbc 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -32,6 +32,7 @@ #include <linux/swap.h> /* try_to_free_swap */ #include <linux/ptrace.h> /* user_enable_single_step */ #include <linux/kdebug.h> /* notifier mechanism */ +#include <linux/elf.h> #include <linux/uprobes.h> @@ -578,16 +579,14 @@ static bool consumer_del(struct uprobe *uprobe, struct uprobe_consumer *uc) } static int -__copy_insn(struct address_space *mapping, struct file *filp, char *insn, - unsigned long nbytes, loff_t offset) +__copy_insn(struct address_space *mapping, char *insn, unsigned long nbytes, + loff_t offset) { struct page *page; void *vaddr; unsigned long off; pgoff_t idx; - if (!filp) - return -EINVAL; if (!mapping->a_ops->readpage) return -EIO; @@ -598,7 +597,7 @@ __copy_insn(struct address_space *mapping, struct file *filp, char *insn, * Ensure that the page that has the original instruction is * populated and in page-cache. */ - page = read_mapping_page(mapping, idx, filp); + page = read_mapping_page(mapping, idx, NULL); if (IS_ERR(page)) return PTR_ERR(page); @@ -610,7 +609,7 @@ __copy_insn(struct address_space *mapping, struct file *filp, char *insn, return 0; } -static int copy_insn(struct uprobe *uprobe, struct file *filp) +static int copy_insn(struct uprobe *uprobe) { struct address_space *mapping; unsigned long nbytes; @@ -627,13 +626,13 @@ static int copy_insn(struct uprobe *uprobe, struct file *filp) /* Instruction at the page-boundary; copy bytes in second page */ if (nbytes < bytes) { - int err = __copy_insn(mapping, filp, uprobe->arch.insn + nbytes, + int err = __copy_insn(mapping, uprobe->arch.insn + nbytes, bytes - nbytes, uprobe->offset + nbytes); if (err) return err; bytes = nbytes; } - return __copy_insn(mapping, filp, uprobe->arch.insn, bytes, uprobe->offset); + return __copy_insn(mapping, uprobe->arch.insn, bytes, uprobe->offset); } /* @@ -675,25 +674,6 @@ install_breakpoint(struct uprobe *uprobe, struct mm_struct *mm, if (!uprobe->consumers) return -EEXIST; - if (!(uprobe->flags & UPROBE_COPY_INSN)) { - ret = copy_insn(uprobe, vma->vm_file); - if (ret) - return ret; - - if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn)) - return -ENOTSUPP; - - ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, vaddr); - if (ret) - return ret; - - /* write_opcode() assumes we don't cross page boundary */ - BUG_ON((uprobe->offset & ~PAGE_MASK) + - UPROBE_SWBP_INSN_SIZE > PAGE_SIZE); - - uprobe->flags |= UPROBE_COPY_INSN; - } - /* * Ideally, should be updating the probe count after the breakpoint * has been successfully inserted. However a thread could hit the @@ -897,6 +877,7 @@ static void __uprobe_unregister(struct uprobe *uprobe) */ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) { + struct elfhdr elf_ex; struct uprobe *uprobe; int ret; @@ -906,10 +887,29 @@ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer * if (offset > i_size_read(inode)) return -EINVAL; - ret = 0; + if ((offset & ~PAGE_MASK) + UPROBE_SWBP_INSN_SIZE > PAGE_SIZE) + return -EINVAL; + mutex_lock(uprobes_hash(inode)); uprobe = alloc_uprobe(inode, offset); + ret = copy_insn(uprobe); + if (ret) + goto out; + + if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn)) { + ret = -ENOTSUPP; + goto out; + } + + ret = __copy_insn(uprobe->inode->i_mapping, &elf_ex, sizeof(elf_ex), 0); + if (ret) + goto out; + + ret = arch_uprobe_analyze_insn(&uprobe->arch, uprobe->offset, &elf_ex); + if (ret) + goto out; + if (uprobe && !consumer_add(uprobe, uc)) { ret = __uprobe_register(uprobe); if (ret) { @@ -920,6 +920,7 @@ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer * } } +out: mutex_unlock(uprobes_hash(inode)); put_uprobe(uprobe); ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-15 12:33 ` Srikar Dronamraju @ 2012-06-16 18:05 ` Oleg Nesterov 2012-06-18 12:06 ` Srikar Dronamraju 0 siblings, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2012-06-16 18:05 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ananth N Mavinakayanahalli, lkml, Ingo Molnar, peterz, Jim Keniston Srikar, To clarify: I am not arguing, I am asking because I know nothing about asm/opcodes/etc. On 06/15, Srikar Dronamraju wrote: > > > But this can't protect from the malicious user who does > > mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse > > uprobes even if that 32-bit app never tries to actually execute that > > 64-bit-code. > > > > So if we read just after we allocate uprobe struct but before > probe insertion, then we dont need to check this for each process. > > So if the library was 64 bit mapped in 32 bit process and has a valid > instruction for 64 bit, then we just check for valid 64 bit instructions > and allow insertion of the breakpoint even in the 32 bit process. And what happens if this insn is not valid from validate_insn_32bits() pov and that 32-bit application tries to execute it? Or vice versa, see below. > Here is a very crude implementation of the same. > Also this depends on read_mapping_page taking NULL as an valid argument > for file. As a side-effect we can do away with UPROBE_COPY_INSN which > was set and read at just one place. > > 1. Move the copy_insn to just after alloc_uprobe. > 2. Assume that copy_insn can work without struct file > ... > 5. Move the analyze instruction to before the actual probe insertion. OK, this is what I think we should do anyway (at least try to do). > 3. Read the elfhdr for the file. > 4. Pass the elfhdr to the arch specific analyze insn This assumes that everything is elf. Why? An application is free to create a file in any format and do mmap(PROT_EXEC). But OK, probably we can restrict uprobe_register() to work only with elf files which do not mix 32/64 bits. My concern is, are you sure an evil user can't confuse uprobes and do something bad? Just to explain what I mean. For example, we certainly do not want to allow to probe the "syscall" insn, at least with the current implementation. So I assume that validate_insn_64bits("syscall") must fail. Are you sure that validate_insn_32bits("syscall") will fail too? Of course, I am not asking about "syscall" in particular. In general, suppose that, say, validate_insn_64bits() returns true. Are you sure this insn can't do something different and harmful if it is executed by __USER32_CS task? Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-16 18:05 ` Oleg Nesterov @ 2012-06-18 12:06 ` Srikar Dronamraju 2012-06-20 17:15 ` Oleg Nesterov 0 siblings, 1 reply; 16+ messages in thread From: Srikar Dronamraju @ 2012-06-18 12:06 UTC (permalink / raw) To: Oleg Nesterov Cc: Ananth N Mavinakayanahalli, lkml, Ingo Molnar, peterz, Jim Keniston, H. Peter Anvin, torvalds * Oleg Nesterov <oleg@redhat.com> [2012-06-16 20:05:55]: > Srikar, > > To clarify: I am not arguing, I am asking because I know nothing about > asm/opcodes/etc. Certainly not, I think you are talking about a valid concern. > > On 06/15, Srikar Dronamraju wrote: > > > > > But this can't protect from the malicious user who does > > > mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse > > > uprobes even if that 32-bit app never tries to actually execute that > > > 64-bit-code. > > > > > > > So if we read just after we allocate uprobe struct but before > > probe insertion, then we dont need to check this for each process. > > > > So if the library was 64 bit mapped in 32 bit process and has a valid > > instruction for 64 bit, then we just check for valid 64 bit instructions > > and allow insertion of the breakpoint even in the 32 bit process. > > And what happens if this insn is not valid from validate_insn_32bits() > pov and that 32-bit application tries to execute it? Or vice versa, > see below. > > > Here is a very crude implementation of the same. > > Also this depends on read_mapping_page taking NULL as an valid argument > > for file. As a side-effect we can do away with UPROBE_COPY_INSN which > > was set and read at just one place. > > > > 1. Move the copy_insn to just after alloc_uprobe. > > 2. Assume that copy_insn can work without struct file > > ... > > 5. Move the analyze instruction to before the actual probe insertion. > > OK, this is what I think we should do anyway (at least try to do). > > > 3. Read the elfhdr for the file. > > 4. Pass the elfhdr to the arch specific analyze insn > > This assumes that everything is elf. Why? An application is free to > create a file in any format and do mmap(PROT_EXEC). > > But OK, probably we can restrict uprobe_register() to work only with > elf files which do not mix 32/64 bits. > We should probably be able to fix file type part, As I said it was just a thought. > > My concern is, are you sure an evil user can't confuse uprobes and > do something bad? > > Just to explain what I mean. For example, we certainly do not want > to allow to probe the "syscall" insn, at least with the current > implementation. So I assume that validate_insn_64bits("syscall") > must fail. > > Are you sure that validate_insn_32bits("syscall") will fail too? > > Of course, I am not asking about "syscall" in particular. In general, > suppose that, say, validate_insn_64bits() returns true. Are you sure > this insn can't do something different and harmful if it is executed > by __USER32_CS task? > validate_insn_64bits can return fail for two cases. 1. Few opcodes that uprobes refuses to place probes. 2. opcodes that are invalid from a 64 perspective. validate_insn_32bits() can return fail for similar reasons. The first set is a common set between validate_insn_64bits / validate_insn_32bits. This includes the syscall, lock prefix, etc. Coming to the second set, there can be an instruction that is valid for 64 bit and not valid for 32 bit. If the instruction is valid for 32 bit set but invalid instruction for 64 bit, and is part of a 32 bit executable file but was mapped by a 64 bit process. We would allow it to be probed since we only check for 32 bit set. [Assuming it runs till a breakpoint hit;] I assume singlestep should generate a SIGILL signal since its not a valid 64 bit instruction. However this behaviour is on par with the behaviour if the probe was not hit too. i.e Once this invalid instruction was executed, It would have generated SIGILL. The same should hold true for a 32 bit invalid instruction in a 64 bit executable mapped into 32 bit process. Please do let me know if my understanding is incorrect or if there are loopholes Again, this is all dependent on the ability to detect the executable type from the inode. If there is any case we cant detect the executable type, I think we should just skip this discussion and go with your suggestion. -- Thanks and Regards Srikar ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-18 12:06 ` Srikar Dronamraju @ 2012-06-20 17:15 ` Oleg Nesterov 0 siblings, 0 replies; 16+ messages in thread From: Oleg Nesterov @ 2012-06-20 17:15 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ananth N Mavinakayanahalli, lkml, Ingo Molnar, peterz, Jim Keniston, H. Peter Anvin, torvalds On 06/18, Srikar Dronamraju wrote: > > > My concern is, are you sure an evil user can't confuse uprobes and > > do something bad? > > > > Just to explain what I mean. For example, we certainly do not want > > to allow to probe the "syscall" insn, at least with the current > > implementation. So I assume that validate_insn_64bits("syscall") > > must fail. > > > > Are you sure that validate_insn_32bits("syscall") will fail too? > > > > Of course, I am not asking about "syscall" in particular. In general, > > suppose that, say, validate_insn_64bits() returns true. Are you sure > > this insn can't do something different and harmful if it is executed > > by __USER32_CS task? > > > > validate_insn_64bits can return fail for two cases. > 1. Few opcodes that uprobes refuses to place probes. > 2. opcodes that are invalid from a 64 perspective. > > validate_insn_32bits() can return fail for similar reasons. > > The first set is a common set between validate_insn_64bits / > validate_insn_32bits. This includes the syscall, lock prefix, etc. > > Coming to the second set, there can be an instruction that is valid for > 64 bit and not valid for 32 bit. > > If the instruction is valid for 32 bit set but invalid instruction for > 64 bit, and is part of a 32 bit executable file but was mapped by a 64 > bit process. We would allow it to be probed since we only check for 32 > bit set. [Assuming it runs till a breakpoint hit;] I assume singlestep > should generate a SIGILL signal since its not a valid 64 bit > instruction. However this behaviour is on par with the behaviour if the > probe was not hit too. i.e Once this invalid instruction was executed, > It would have generated SIGILL. The same should hold true for a 32 bit > invalid instruction in a 64 bit executable mapped into 32 bit process. SIGILL (invalid insn) is fine, I was worried about the possibility to allow to execute the valid (from CPU pov) but "dangerous" (from uprobes pov) insn. > Please do let me know if my understanding is incorrect or if there are > loopholes Well, you understand this indefinitely better than me ;) If you do not see any hole then everything should be fine, I think. > Again, this is all dependent on the ability to detect the executable > type from the inode. Yes... I am still not sure about this. But again, I am not arguing, just I do not know. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2012-06-20 17:18 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-06-08 9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli 2012-06-08 9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli 2012-06-08 14:58 ` [tip:perf/core] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() tip-bot for Ananth N Mavinakayanahalli 2012-06-11 16:12 ` [PATCH v2 1/2] " Oleg Nesterov 2012-06-11 19:09 ` Q: a_ops->readpage() && struct file Oleg Nesterov 2012-06-13 9:58 ` Peter Zijlstra 2012-06-13 19:19 ` Oleg Nesterov 2012-06-12 16:54 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju 2012-06-12 17:43 ` Oleg Nesterov 2012-06-13 19:15 ` Oleg Nesterov 2012-06-14 11:45 ` Srikar Dronamraju 2012-06-14 18:19 ` Oleg Nesterov 2012-06-15 12:33 ` Srikar Dronamraju 2012-06-16 18:05 ` Oleg Nesterov 2012-06-18 12:06 ` Srikar Dronamraju 2012-06-20 17:15 ` Oleg Nesterov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).