linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
@ 2012-06-08  9:32 Ananth N Mavinakayanahalli
  2012-06-08  9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli
  2012-06-11 16:12 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Oleg Nesterov
  0 siblings, 2 replies; 11+ messages in thread
From: Ananth N Mavinakayanahalli @ 2012-06-08  9:32 UTC (permalink / raw)
  To: linuxppc-dev, lkml
  Cc: Srikar Dronamraju, peterz, antonb, oleg, michael, Jim Keniston,
	Paul Mackerras, Ingo Molnar

From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

On RISC architectures like powerpc, instructions are fixed size.
Instruction analysis on such platforms is just a matter of (insn % 4).
Pass the vaddr at which the uprobe is to be inserted so that
arch_uprobe_analyze_insn() can flag misaligned registration requests.

Changes in V2:
Pass (unsigned long)addr instead of (loff_t)vaddr to
arch_uprobe_analyze_insn(). We need the loff_t vaddr to take care of
the offset of inode:offset pair for large file sizes on 32bit.

Signed-off-by: Ananth N Mavinakaynahalli <ananth@in.ibm.com>
---
 arch/x86/include/asm/uprobes.h |    2 +-
 arch/x86/kernel/uprobes.c      |    3 ++-
 kernel/events/uprobes.c        |    2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

Index: linux-3.5-rc1/arch/x86/include/asm/uprobes.h
===================================================================
--- linux-3.5-rc1.orig/arch/x86/include/asm/uprobes.h
+++ linux-3.5-rc1/arch/x86/include/asm/uprobes.h
@@ -48,7 +48,7 @@ struct arch_uprobe_task {
 #endif
 };
 
-extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm);
+extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
 extern int  arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
Index: linux-3.5-rc1/arch/x86/kernel/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/arch/x86/kernel/uprobes.c
+++ linux-3.5-rc1/arch/x86/kernel/uprobes.c
@@ -409,9 +409,10 @@ static int validate_insn_bits(struct arc
  * arch_uprobe_analyze_insn - instruction analysis including validity and fixups.
  * @mm: the probed address space.
  * @arch_uprobe: the probepoint information.
+ * @addr: virtual address at which to install the probepoint
  * Return 0 on success or a -ve number on error.
  */
-int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm)
+int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr)
 {
 	int ret;
 	struct insn insn;
Index: linux-3.5-rc1/kernel/events/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/kernel/events/uprobes.c
+++ linux-3.5-rc1/kernel/events/uprobes.c
@@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe
 		if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
 			return -EEXIST;
 
-		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm);
+		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr);
 		if (ret)
 			return ret;
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 2/2] [POWERPC] uprobes: powerpc port
  2012-06-08  9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli
@ 2012-06-08  9:34 ` Ananth N Mavinakayanahalli
  2012-06-11 16:12 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Oleg Nesterov
  1 sibling, 0 replies; 11+ messages in thread
From: Ananth N Mavinakayanahalli @ 2012-06-08  9:34 UTC (permalink / raw)
  To: linuxppc-dev, lkml
  Cc: Srikar Dronamraju, peterz, antonb, oleg, michael, Jim Keniston,
	Paul Mackerras, Ingo Molnar

From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

This is the port of uprobes to powerpc. Usage is similar to x86.

One TODO in this port compared to x86 is the uprobe abort_xol() logic.
x86 depends on the thread_struct.trap_nr (absent in powerpc) to determine
if a signal was caused when the uprobed instruction was single-stepped/
emulated, in which case, we reset the instruction pointer to the probed
address and retry the probe again.

Tested on POWER6; I don't see anything here that should stop it from
working on a ppc32; since I don't have access to a ppc32 machine, it
would be good if somoene could verify that part.

[root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
Added new event:
  probe_libc:malloc    (on 0xb4860)

You can now use it in all perf tools, such as:

	perf record -e probe_libc:malloc -aR sleep 1

[root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
[ perf record: Woken up 22 times to write data ]
[ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
[root@xxxx ~]# ./bin/perf report --stdio
# ========
# captured on: Mon Jun  4 05:26:31 2012
# hostname : xxxx.ibm.com
# os release : 3.4.0-uprobe
# perf version : 3.4.0
# arch : ppc64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : POWER6 (raw), altivec supported
# cpuid : 62,769
# total memory : 7310528 kB
# cmdline : /root/bin/perf record -e probe_libc:malloc -aR sleep 20 
# event : name = probe_libc:malloc, type = 2, config = 0x124, config1 = 0x0, con
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
# Samples: 83K of event 'probe_libc:malloc'
# Event count (approx.): 83484
#
# Overhead       Command  Shared Object      Symbol
# ........  ............  .............  ..........
#
    69.05%           tar  libc-2.12.so   [.] malloc
    28.57%            rm  libc-2.12.so   [.] malloc
     1.32%  avahi-daemon  libc-2.12.so   [.] malloc
     0.58%          bash  libc-2.12.so   [.] malloc
     0.28%          sshd  libc-2.12.so   [.] malloc
     0.08%    irqbalance  libc-2.12.so   [.] malloc
     0.05%         bzip2  libc-2.12.so   [.] malloc
     0.04%         sleep  libc-2.12.so   [.] malloc
     0.03%    multipathd  libc-2.12.so   [.] malloc
     0.01%      sendmail  libc-2.12.so   [.] malloc
     0.01%     automount  libc-2.12.so   [.] malloc

V2:
a. arch_uprobe_analyze_insn() now gets unsigned long addr.
b. Verified that mtmsr[d] and rfi[d] are handled correctly by
   emulate_step() (no changes to this patch).

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
---
 arch/powerpc/Kconfig                   |    3 
 arch/powerpc/include/asm/thread_info.h |    4 
 arch/powerpc/include/asm/uprobes.h     |   49 +++++++++
 arch/powerpc/kernel/Makefile           |    1 
 arch/powerpc/kernel/signal.c           |    6 +
 arch/powerpc/kernel/uprobes.c          |  164 +++++++++++++++++++++++++++++++++
 6 files changed, 226 insertions(+), 1 deletion(-)

Index: linux-3.5-rc1/arch/powerpc/include/asm/thread_info.h
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/include/asm/thread_info.h
+++ linux-3.5-rc1/arch/powerpc/include/asm/thread_info.h
@@ -96,6 +96,7 @@ static inline struct thread_info *curren
 #define TIF_RESTOREALL		11	/* Restore all regs (implies NOERROR) */
 #define TIF_NOERROR		12	/* Force successful syscall return */
 #define TIF_NOTIFY_RESUME	13	/* callback before returning to user */
+#define TIF_UPROBE		14	/* breakpointed or single-stepping */
 #define TIF_SYSCALL_TRACEPOINT	15	/* syscall tracepoint instrumentation */
 
 /* as above, but as bit values */
@@ -112,12 +113,13 @@ static inline struct thread_info *curren
 #define _TIF_RESTOREALL		(1<<TIF_RESTOREALL)
 #define _TIF_NOERROR		(1<<TIF_NOERROR)
 #define _TIF_NOTIFY_RESUME	(1<<TIF_NOTIFY_RESUME)
+#define _TIF_UPROBE		(1<<TIF_UPROBE)
 #define _TIF_SYSCALL_TRACEPOINT	(1<<TIF_SYSCALL_TRACEPOINT)
 #define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
-				 _TIF_NOTIFY_RESUME)
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
 #define _TIF_PERSYSCALL_MASK	(_TIF_RESTOREALL|_TIF_NOERROR)
 
 /* Bits in local_flags */
Index: linux-3.5-rc1/arch/powerpc/include/asm/uprobes.h
===================================================================
--- /dev/null
+++ linux-3.5-rc1/arch/powerpc/include/asm/uprobes.h
@@ -0,0 +1,49 @@
+#ifndef _ASM_UPROBES_H
+#define _ASM_UPROBES_H
+/*
+ * User-space Probes (UProbes) for powerpc
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2007-2012
+ *
+ * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com>
+ */
+
+#include <linux/notifier.h>
+
+typedef unsigned int uprobe_opcode_t;
+
+#define MAX_UINSN_BYTES			4
+#define UPROBE_XOL_SLOT_BYTES		(MAX_UINSN_BYTES)
+
+#define UPROBE_SWBP_INSN		0x7fe00008
+#define UPROBE_SWBP_INSN_SIZE		4 /* swbp insn size in bytes */
+
+struct arch_uprobe {
+	u8	insn[MAX_UINSN_BYTES];
+};
+
+struct arch_uprobe_task {
+};
+
+extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
+extern unsigned long uprobe_get_swbp_addr(struct pt_regs *regs);
+extern int  arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
+extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
+extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+#endif	/* _ASM_UPROBES_H */
Index: linux-3.5-rc1/arch/powerpc/kernel/Makefile
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/kernel/Makefile
+++ linux-3.5-rc1/arch/powerpc/kernel/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_MODULES)		+= ppc_ksyms.o
 obj-$(CONFIG_BOOTX_TEXT)	+= btext.o
 obj-$(CONFIG_SMP)		+= smp.o
 obj-$(CONFIG_KPROBES)		+= kprobes.o
+obj-$(CONFIG_UPROBES)		+= uprobes.o
 obj-$(CONFIG_PPC_UDBG_16550)	+= legacy_serial.o udbg_16550.o
 obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
 obj-$(CONFIG_SWIOTLB)		+= dma-swiotlb.o
Index: linux-3.5-rc1/arch/powerpc/kernel/signal.c
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/kernel/signal.c
+++ linux-3.5-rc1/arch/powerpc/kernel/signal.c
@@ -11,6 +11,7 @@
 
 #include <linux/tracehook.h>
 #include <linux/signal.h>
+#include <linux/uprobes.h>
 #include <linux/key.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/uaccess.h>
@@ -157,6 +158,11 @@ static int do_signal(struct pt_regs *reg
 
 void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 {
+	if (thread_info_flags & _TIF_UPROBE) {
+		clear_thread_flag(TIF_UPROBE);
+		uprobe_notify_resume(regs);
+	}
+
 	if (thread_info_flags & _TIF_SIGPENDING)
 		do_signal(regs);
 
Index: linux-3.5-rc1/arch/powerpc/kernel/uprobes.c
===================================================================
--- /dev/null
+++ linux-3.5-rc1/arch/powerpc/kernel/uprobes.c
@@ -0,0 +1,164 @@
+/*
+ * User-space Probes (UProbes) for powerpc
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2007-2012
+ *
+ * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com>
+ */
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/ptrace.h>
+#include <linux/uprobes.h>
+#include <linux/uaccess.h>
+
+#include <linux/kdebug.h>
+#include <asm/sstep.h>
+
+/**
+ * arch_uprobe_analyze_insn
+ * @mm: the probed address space.
+ * @arch_uprobe: the probepoint information.
+ * @addr: vaddr to probe.
+ * Return 0 on success or a -ve number on error.
+ */
+int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr)
+{
+	if (addr & 0x03)
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * arch_uprobe_pre_xol - prepare to execute out of line.
+ * @auprobe: the probepoint information.
+ * @regs: reflects the saved user state of current task.
+ */
+int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+	/* FIXME: We don't support abort_xol on powerpc for now */
+	regs->nip = current->utask->xol_vaddr;
+	return 0;
+}
+
+/**
+ * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs
+ * @regs: Reflects the saved state of the task after it has hit a breakpoint
+ * instruction.
+ * Return the address of the breakpoint instruction.
+ */
+unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
+{
+	return instruction_pointer(regs);
+}
+
+/*
+ * If xol insn itself traps and generates a signal (SIGILL/SIGSEGV/etc),
+ * then detect the case where a singlestepped instruction jumps back to its
+ * own address. It is assumed that anything like do_page_fault/do_trap/etc
+ * sets thread.trap_nr != -1.
+ *
+ * FIXME: powerpc however doesn't have thread.trap_nr yet.
+ *
+ * arch_uprobe_pre_xol/arch_uprobe_post_xol save/restore thread.trap_nr,
+ * arch_uprobe_xol_was_trapped() simply checks that ->trap_nr is not equal to
+ * UPROBE_TRAP_NR == -1 set by arch_uprobe_pre_xol().
+ */
+bool arch_uprobe_xol_was_trapped(struct task_struct *t)
+{
+	/* FIXME: We don't support abort_xol on powerpc for now */
+	return false;
+}
+
+/*
+ * Called after single-stepping. To avoid the SMP problems that can
+ * occur when we temporarily put back the original opcode to
+ * single-step, we single-stepped a copy of the instruction.
+ *
+ * This function prepares to resume execution after the single-step.
+ */
+int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+	/* FIXME: We don't support abort_xol on powerpc for now */
+
+	/*
+	 * On powerpc, except for loads and stores, most instructions
+	 * including ones that alter code flow (branches, calls, returns)
+	 * are emulated in the kernel. We get here only if the emulation
+	 * support doesn't exist and have to fix-up the next instruction
+	 * to be executed.
+	 */
+	regs->nip = current->utask->vaddr + MAX_UINSN_BYTES;
+	return 0;
+}
+
+/* callback routine for handling exceptions. */
+int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data)
+{
+	struct die_args *args = data;
+	struct pt_regs *regs = args->regs;
+	int ret = NOTIFY_DONE;
+
+	/* We are only interested in userspace traps */
+	if (regs && !user_mode(regs))
+		return NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_BPT:
+		if (uprobe_pre_sstep_notifier(regs))
+			ret = NOTIFY_STOP;
+		break;
+	case DIE_SSTEP:
+		if (uprobe_post_sstep_notifier(regs))
+			ret = NOTIFY_STOP;
+	default:
+		break;
+	}
+	return ret;
+}
+
+/*
+ * This function gets called when XOL instruction either gets trapped or
+ * the thread has a fatal signal, so reset the instruction pointer to its
+ * probed address.
+ */
+void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+	/* FIXME: We don't support abort_xol on powerpc for now */
+	return;
+}
+
+/*
+ * See if the instruction can be emulated.
+ * Returns true if instruction was emulated, false otherwise.
+ */
+bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+	int ret;
+	unsigned int insn;
+
+	memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES);
+
+	/*
+	 * emulate_step() returns 1 if the insn was successfully emulated.
+	 * For all other cases, we need to single-step in hardware.
+	 */
+	ret = emulate_step(regs, insn);
+	if (ret > 0)
+		return true;
+
+	return false;
+}
Index: linux-3.5-rc1/arch/powerpc/Kconfig
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/Kconfig
+++ linux-3.5-rc1/arch/powerpc/Kconfig
@@ -237,6 +237,9 @@ config PPC_OF_PLATFORM_PCI
 config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	def_bool y
 
+config ARCH_SUPPORTS_UPROBES
+	def_bool y
+
 config PPC_ADV_DEBUG_REGS
 	bool
 	depends on 40x || BOOKE

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
  2012-06-08  9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli
  2012-06-08  9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli
@ 2012-06-11 16:12 ` Oleg Nesterov
  2012-06-11 19:09   ` Q: a_ops->readpage() && struct file Oleg Nesterov
  2012-06-12 16:54   ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju
  1 sibling, 2 replies; 11+ messages in thread
From: Oleg Nesterov @ 2012-06-11 16:12 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli
  Cc: Srikar Dronamraju, peterz, antonb, lkml, michael, Jim Keniston,
	Paul Mackerras, Ingo Molnar, linuxppc-dev

Ananth, Srikar,

I think the patch is correct and I am sorry for nit-picking,
this is really minor.

But,

On 06/08, Ananth N Mavinakayanahalli wrote:
>
> Changes in V2:
> Pass (unsigned long)addr

Well, IMHO, this is confusing.

First of all, why do we have this "addr" or even "vaddr"? It should
not exists. We pass it to copy_insn(), but for what?? copy_insn()
should simply use uprobe->offset, the virtual address for this
particular mapping does not matter at all. I am going to send
the cleanup.

Note also that we should move this !UPROBE_COPY_INSN from
install_breakpoint() to somewhere near alloc_uprobe(). This code
is called only once, it looks a bit strange to use the "random" mm
(the first mm vma_prio_tree_foreach() finds) and its mapping to
verify the insn. In fact this is simply not correct and should be
fixed, note that on x86 arch_uprobe_analyze_insn() checks
mm->context.ia32_compat.

IOW, Perhaps uprobe->offset makes more sense?

> --- linux-3.5-rc1.orig/kernel/events/uprobes.c
> +++ linux-3.5-rc1/kernel/events/uprobes.c
> @@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe
>  		if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
>  			return -EEXIST;
>
> -		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm);
> +		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr);

Just fyi, this conflicts with
"[PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T"
I sent, but the conflict is trivial.

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Q: a_ops->readpage() && struct file
  2012-06-11 16:12 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Oleg Nesterov
@ 2012-06-11 19:09   ` Oleg Nesterov
  2012-06-13  9:58     ` Peter Zijlstra
  2012-06-12 16:54   ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju
  1 sibling, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2012-06-11 19:09 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli, Linus Torvalds, Al Viro
  Cc: Srikar Dronamraju, peterz, antonb, lkml, michael, Jim Keniston,
	Paul Mackerras, Ingo Molnar, linuxppc-dev

On 06/11, Oleg Nesterov wrote:
>
> Note also that we should move this !UPROBE_COPY_INSN from
> install_breakpoint() to somewhere near alloc_uprobe().

The main problem is, uprobe_register() doesn't have struct file
for read_mapping_page().

Stupid question. I'm afraid the answer is "no" but I'll ask anyway.
Is it safe to pass filp == NULL to mapping->readpage()? In fact
I do not understand why it needs "struct file*" and I do not see
any example of actual usage. Yes, I didn't try to grep very much
and I understand that the filesystem can do something special.
Say it can use file->private_data...

However. There is read_cache_page_gfp() which does use
a_ops->readpage(filp => NULL), and the comment says nothing about
the riskiness.


If not, is there any other way
uprobe_register(struct inode *inode, loff_t offset) can read the
page at this offset?



And btw, read_mapping_page() accepts "void *data". Why? it uses
filler == a_ops->readpage, it shouldn't accept anything but file
pointer?


Please help, I know nothing about vfs.

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
  2012-06-11 16:12 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Oleg Nesterov
  2012-06-11 19:09   ` Q: a_ops->readpage() && struct file Oleg Nesterov
@ 2012-06-12 16:54   ` Srikar Dronamraju
  2012-06-12 17:43     ` Oleg Nesterov
  1 sibling, 1 reply; 11+ messages in thread
From: Srikar Dronamraju @ 2012-06-12 16:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: peterz, antonb, lkml, Jim Keniston, Paul Mackerras, Ingo Molnar,
	linuxppc-dev

> 
> Well, IMHO, this is confusing.
> 
> First of all, why do we have this "addr" or even "vaddr"? It should
> not exists. We pass it to copy_insn(), but for what?? copy_insn()
> should simply use uprobe->offset, the virtual address for this
> particular mapping does not matter at all. I am going to send
> the cleanup.
> 

Yes, we can use uprobe->offset instead of vaddr/addr.

> Note also that we should move this !UPROBE_COPY_INSN from
> install_breakpoint() to somewhere near alloc_uprobe(). This code
> is called only once, it looks a bit strange to use the "random" mm
> (the first mm vma_prio_tree_foreach() finds) and its mapping to
> verify the insn. In fact this is simply not correct and should be
> fixed, note that on x86 arch_uprobe_analyze_insn() checks

The reason we "delay" the copy_insn to the first insert is because 
we have to get access to mm. For archs like x86, we want to know if the
executable is 32 bit or not (since we have a different valid set of
instructions for 32 bit and 64 bit). So in effect, if we get access to
struct file corresponding to the inode and if the inode corresponds to
32 bit executable file or 64 bit executable file during register, then
we can move it around alloc_uprobe().

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
  2012-06-12 16:54   ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju
@ 2012-06-12 17:43     ` Oleg Nesterov
  2012-06-13 19:15       ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2012-06-12 17:43 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: peterz, antonb, lkml, Jim Keniston, Paul Mackerras, Ingo Molnar,
	linuxppc-dev

On 06/12, Srikar Dronamraju wrote:
> >
> > Note also that we should move this !UPROBE_COPY_INSN from
> > install_breakpoint() to somewhere near alloc_uprobe(). This code
> > is called only once, it looks a bit strange to use the "random" mm
> > (the first mm vma_prio_tree_foreach() finds) and its mapping to
> > verify the insn. In fact this is simply not correct and should be
> > fixed, note that on x86 arch_uprobe_analyze_insn() checks
>
> The reason we "delay" the copy_insn to the first insert is because
> we have to get access to mm. For archs like x86, we want to know if the
> executable is 32 bit or not

Yes. And this is wrong afaics.

Once again. This !UPROBE_COPY_INSN code is called only once, and it
uses the "random" mm. After that install_breakpoint() just calls
set_swbp(another_mm) while the insn can be invalid because
another_mm->ia32_compat != mm->ia32_compat.

> So in effect, if we get access to
> struct file corresponding to the inode and if the inode corresponds to
> 32 bit executable file or 64 bit executable file during register, then
> we can move it around alloc_uprobe().

I don't think this can work. I have another simple fix in mind, I'll
write another email later.

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Q: a_ops->readpage() && struct file
  2012-06-11 19:09   ` Q: a_ops->readpage() && struct file Oleg Nesterov
@ 2012-06-13  9:58     ` Peter Zijlstra
  2012-06-13 19:19       ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2012-06-13  9:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, lkml, Jim Keniston, Paul Mackerras, Al Viro,
	antonb, Ingo Molnar, linuxppc-dev, Srikar Dronamraju

On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote:
> Stupid question. I'm afraid the answer is "no" but I'll ask anyway.
> Is it safe to pass filp =3D=3D NULL to mapping->readpage()? In fact
> I do not understand why it needs "struct file*" and I do not see
> any example of actual usage.=20

Looking at afs_readpage it looks like its OK to pass in NULL. Same for
nfs_readpage. They use the file, if provided, to avoid some lookups, but
seem to deal with not having it.

This answer by example is of course not authorative nor complete.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
  2012-06-12 17:43     ` Oleg Nesterov
@ 2012-06-13 19:15       ` Oleg Nesterov
  2012-06-14 11:45         ` Srikar Dronamraju
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2012-06-13 19:15 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: peterz, antonb, lkml, Jim Keniston, Paul Mackerras, Ingo Molnar,
	linuxppc-dev

On 06/12, Oleg Nesterov wrote:
>
> On 06/12, Srikar Dronamraju wrote:
> > >
> > > Note also that we should move this !UPROBE_COPY_INSN from
> > > install_breakpoint() to somewhere near alloc_uprobe(). This code
> > > is called only once, it looks a bit strange to use the "random" mm
> > > (the first mm vma_prio_tree_foreach() finds) and its mapping to
> > > verify the insn. In fact this is simply not correct and should be
> > > fixed, note that on x86 arch_uprobe_analyze_insn() checks
> >
> > The reason we "delay" the copy_insn to the first insert is because
> > we have to get access to mm. For archs like x86, we want to know if the
> > executable is 32 bit or not
>
> Yes. And this is wrong afaics.
>
> Once again. This !UPROBE_COPY_INSN code is called only once, and it
> uses the "random" mm. After that install_breakpoint() just calls
> set_swbp(another_mm) while the insn can be invalid because
> another_mm->ia32_compat != mm->ia32_compat.
>
> > So in effect, if we get access to
> > struct file corresponding to the inode and if the inode corresponds to
> > 32 bit executable file or 64 bit executable file during register, then
> > we can move it around alloc_uprobe().
>
> I don't think this can work. I have another simple fix in mind, I'll
> write another email later.

For example. Suppose there is some instruction in /lib64/libc.so which
is valid for 64-bit, but not for 32-bit.

Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC).

Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register()
fails even if there are other 64-bit applications which could be traced.

Or. uprobe_register() succeeds because it finds a 64-bit mm first, and
then that 32-bit application actually executes the invalid insn.

We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block.

Or, perhaps, validate_insn_bits() should call both
validate_insn_32bits() and validate_insn_64bits(), and set the
UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint()
should do the additinal check before set_swbp() and verify that
.ia32_compat matches UPROBE_VALID_IF_*.

What do you think?

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Q: a_ops->readpage() && struct file
  2012-06-13  9:58     ` Peter Zijlstra
@ 2012-06-13 19:19       ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2012-06-13 19:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, lkml, Jim Keniston, Paul Mackerras, Al Viro,
	antonb, Ingo Molnar, linuxppc-dev, Srikar Dronamraju

On 06/13, Peter Zijlstra wrote:
>
> On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote:
> > Stupid question. I'm afraid the answer is "no" but I'll ask anyway.
> > Is it safe to pass filp == NULL to mapping->readpage()? In fact
> > I do not understand why it needs "struct file*" and I do not see
> > any example of actual usage.
>
> Looking at afs_readpage it looks like its OK to pass in NULL. Same for
> nfs_readpage. They use the file, if provided, to avoid some lookups, but
> seem to deal with not having it.

Yes, and reiser4 does the same.

> This answer by example is of course not authorative nor complete.

Yes... perhaps we should simply change this code to use NULL and
collect the bug-reports ;)

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
  2012-06-13 19:15       ` Oleg Nesterov
@ 2012-06-14 11:45         ` Srikar Dronamraju
  2012-06-14 18:19           ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Srikar Dronamraju @ 2012-06-14 11:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: peterz, antonb, lkml, Jim Keniston, Paul Mackerras, Ingo Molnar,
	linuxppc-dev

* Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]:

> On 06/12, Oleg Nesterov wrote:
> >
> > On 06/12, Srikar Dronamraju wrote:
> > > >
> > > > Note also that we should move this !UPROBE_COPY_INSN from
> > > > install_breakpoint() to somewhere near alloc_uprobe(). This code
> > > > is called only once, it looks a bit strange to use the "random" mm
> > > > (the first mm vma_prio_tree_foreach() finds) and its mapping to
> > > > verify the insn. In fact this is simply not correct and should be
> > > > fixed, note that on x86 arch_uprobe_analyze_insn() checks
> > >
> > > The reason we "delay" the copy_insn to the first insert is because
> > > we have to get access to mm. For archs like x86, we want to know if the
> > > executable is 32 bit or not
> >
> > Yes. And this is wrong afaics.
> >
> > Once again. This !UPROBE_COPY_INSN code is called only once, and it
> > uses the "random" mm. After that install_breakpoint() just calls
> > set_swbp(another_mm) while the insn can be invalid because
> > another_mm->ia32_compat != mm->ia32_compat.
> >
> > > So in effect, if we get access to
> > > struct file corresponding to the inode and if the inode corresponds to
> > > 32 bit executable file or 64 bit executable file during register, then
> > > we can move it around alloc_uprobe().
> >
> > I don't think this can work. I have another simple fix in mind, I'll
> > write another email later.
> 
> For example. Suppose there is some instruction in /lib64/libc.so which
> is valid for 64-bit, but not for 32-bit.
> 
> Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC).
> 

How correct is it to have a 32 bit binary link to a 64 bit binary/library?
what if the 64 bit binary/executable were to make a jump to a 64 bit
address? 


> Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register()
> fails even if there are other 64-bit applications which could be traced.
> 
> Or. uprobe_register() succeeds because it finds a 64-bit mm first, and
> then that 32-bit application actually executes the invalid insn.
> 
> We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block.
> 
> Or, perhaps, validate_insn_bits() should call both
> validate_insn_32bits() and validate_insn_64bits(), and set the
> UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint()
> should do the additinal check before set_swbp() and verify that
> .ia32_compat matches UPROBE_VALID_IF_*.
> 

> What do you think?
> 

Lets say we do find a 32 bit app and 64 bit app using the same library
and the underlying instruction is valid for tracing in 64 bit and not 32
bit. So when we are registering, and failed to insert a breakpoint  for
the 32 bit app, should we just bail out or should we return a failure?

I would probably prefer to read the underlying file something similar to
what exec does and based on the magic decipher if we should verify for
32 bit instructions or 64 bit instructions.

I didnt find a good way to read the first few bytes just by looking at
the inode. So one option is to read the underlying file at the first
insertion (i.e similar to what we do now .. but instead of depending on
ia32_compat of a random mm, depend on the exec magic)

Better option would be read the underlying file at the register time and
return an error even without attempting an insert if the instruction
wasnt valid. But I am still ignorant on how to do this because we need a
struct file to do this.

-- 
Thanks and Regards
Srikar

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()
  2012-06-14 11:45         ` Srikar Dronamraju
@ 2012-06-14 18:19           ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2012-06-14 18:19 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: peterz, antonb, lkml, Jim Keniston, Paul Mackerras, Ingo Molnar,
	linuxppc-dev

On 06/14, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]:
>
> > For example. Suppose there is some instruction in /lib64/libc.so which
> > is valid for 64-bit, but not for 32-bit.
> >
> > Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC).
> >
>
> How correct is it to have a 32 bit binary link to a 64 bit binary/library?

No, I didn't mean this. I guess you misunderstood my point, see below.

> > Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register()
> > fails even if there are other 64-bit applications which could be traced.
> >
> > Or. uprobe_register() succeeds because it finds a 64-bit mm first, and
> > then that 32-bit application actually executes the invalid insn.
> >
> > We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block.
> >
> > Or, perhaps, validate_insn_bits() should call both
> > validate_insn_32bits() and validate_insn_64bits(), and set the
> > UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint()
> > should do the additinal check before set_swbp() and verify that
> > .ia32_compat matches UPROBE_VALID_IF_*.
> >
>
> > What do you think?
> >
>
> Lets say we do find a 32 bit app and 64 bit app using the same library
> and the underlying instruction is valid for tracing in 64 bit and not 32
> bit. So when we are registering, and failed to insert a breakpoint  for
> the 32 bit app, should we just bail out or should we return a failure?

I do not really know, I tend to think we should not fail. But this is
another story...

Look. Suppose that a 32-bit app starts after uprobe_register() succeeds.
In this case we have no option, uprobe_mmap()->install_breakpoint()
should "silently" fail. Currently it doesn't, this is one of the reasons
why I think the validation logic is wrong.

And. if install_breakpoint() can fail later anyway (in this case), then
I think uprobe_register() should not fail.

But probably this needs more discussion.


> I would probably prefer to read the underlying file something similar to
> what exec does and based on the magic decipher if we should verify for
> 32 bit instructions or 64 bit instructions.

But this can't protect from the malicious user who does
mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse
uprobes even if that 32-bit app never tries to actually execute that
64-bit-code.

That is why I think we need the additional (and arch-dependant) check
before every set_swbp(), but arch_uprobe_analyze_insn/etc should not
depend on task/mm/vaddr/whatever.

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-06-14 18:21 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-08  9:32 [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Ananth N Mavinakayanahalli
2012-06-08  9:34 ` [PATCH v2 2/2] [POWERPC] uprobes: powerpc port Ananth N Mavinakayanahalli
2012-06-11 16:12 ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Oleg Nesterov
2012-06-11 19:09   ` Q: a_ops->readpage() && struct file Oleg Nesterov
2012-06-13  9:58     ` Peter Zijlstra
2012-06-13 19:19       ` Oleg Nesterov
2012-06-12 16:54   ` [PATCH v2 1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() Srikar Dronamraju
2012-06-12 17:43     ` Oleg Nesterov
2012-06-13 19:15       ` Oleg Nesterov
2012-06-14 11:45         ` Srikar Dronamraju
2012-06-14 18:19           ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).