public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC] ia64 function return probes
@ 2005-06-01 21:39 Rusty Lynch
  2005-06-03  2:17 ` Keith Owens
  2005-06-03 18:01 ` Lynch, Rusty
  0 siblings, 2 replies; 4+ messages in thread
From: Rusty Lynch @ 2005-06-01 21:39 UTC (permalink / raw)
  To: linux-ia64
  Cc: Anil S Keshavamurthy, linux-kernel, Rusty Lynch, systemtap,
	Hien Nguyen

The following is an implementation of the ia64-specific parts of the
function return probes functionality that is part of kprobes.  Some
assumptions about how architectures work inside kernel/kprobes.c forced
me to do some odd things in this implementation.

For those who have not followed the function return probe discussions,
the original idea is that, in order to have a handler function called
when a target function returns:

* At system initialization time, kernel/kprobes.c installs a kprobe
  on a function called kretprobe_trampoline() that is implemented in 
  the architecture specific code.  More on this later.

* When a return probe is registered using register_kretprobe(), 
  kernel/kprobes.c will install a kprobe on the first instruction of the 
  targeted function with the pre handler set to arch_prepare_kretprobe()
  which is implemented in the architecture specific code.

* arch_prepare_kretprobe() will prepare a kretprobe instance that stores:
  - nodes for hanging this instance in an empty or free list
  - a pointer to the return probe
  - the original return address
  - a pointer to the stack address

  With all this stowed away, arch_prepare_kretprobe() then sets the return
  address for the targeted function to a special trampoline function called
  kretprobe_trampoline().  

* The kprobe completes as normal, with control passing back to the target
  function that executes as normal, and eventually returns to our trampoline
  function.

* Since a kprobe was installed on kretprobe_trampoline() during system 
  initialization, control passes back to kprobes via the
  architecture-specific function trampoline_probe_handler(), which
  looks up the instance in an hlist maintained by kernel/kprobes.c and
  then calls the handler function.

* When trampoline_probe_handler() is done, the kprobes infrastructure
  single steps the original instruction (in this case just a nop) and
  then calls trampoline_post_handler().  trampoline_post_handler() then
  looks up the instance again, puts the instance back on the free list,
  and then makes a long jump back to the original return address.
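For a feel of the bookkeeping these steps describe, here is a small
userspace model (all names are made-up stand-ins, not the real
kernel/kprobes.c API): an instance moves from a free list to an active
list when the return address is stolen, and back again once the
trampoline handler runs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the return probe life cycle.  TRAMPOLINE_ADDR stands in
 * for the address of kretprobe_trampoline(); everything else is a
 * simplified stand-in for kernel/kprobes.c internals. */
#define TRAMPOLINE_ADDR 0xdeadbeefUL
#define POOL_SIZE 4

struct kret_instance {
	struct kret_instance *next;	/* free-list or active-list link */
	unsigned long ret_addr;		/* original return address */
	void *task;			/* task that armed this instance */
};

static struct kret_instance pool[POOL_SIZE];
static struct kret_instance *free_list, *active_list;

static void init_pool(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		pool[i].next = free_list;
		free_list = &pool[i];
	}
}

/* Models arch_prepare_kretprobe(): stash the original return address
 * and divert the return to the trampoline. */
static void prepare(void *task, unsigned long *ret_slot)
{
	struct kret_instance *ri = free_list;

	if (!ri)
		return;			/* real code bumps rp->nmissed */
	free_list = ri->next;
	ri->task = task;
	ri->ret_addr = *ret_slot;
	ri->next = active_list;
	active_list = ri;
	*ret_slot = TRAMPOLINE_ADDR;
}

/* Models trampoline_probe_handler(): find the first instance for this
 * task, recycle it, and hand back the original return address. */
static unsigned long trampoline_hit(void *task)
{
	struct kret_instance **pp = &active_list, *ri;
	unsigned long orig;

	for (ri = active_list; ri; pp = &ri->next, ri = ri->next)
		if (ri->task == task)
			break;
	assert(ri);			/* mirrors BUG_ON(!ri) */
	*pp = ri->next;			/* unlink from the active list */
	orig = ri->ret_addr;
	ri->next = free_list;		/* back on the free list */
	free_list = ri;
	return orig;
}
```

A caller of this model would do prepare(tsk, &b0) when the probed
function is entered, and jump to the value trampoline_hit(tsk) returns
when the trampoline fires.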


Ok, that was the idea.  For ia64, complexities came up with respect to:

* the assumption that kernel/kprobes.c is working with a 
  stack-based architecture

* the assumption that changing the return address to kretprobe_trampoline()
  will always result in the first instruction of kretprobe_trampoline()
  being executed.

The following patch works around these problems by:

* Providing an empty kretprobe_trampoline(), but we don't really 
  use it as our trampoline function.  Instead we provide:

	/*
	 * void ia64_kretprobe_trampoline(void):
	 *
	 * When a return probe is set on a given function, its return
	 * address (which really just points to the bundle) is set for
	 * this single bundle function.
	 *
	 * We don't know which slot of the bundle will be set, so we set
	 * a break using a special immediate value to gain control in
	 * each case so the registered return probe can be called and then
	 * restore cr.iip back to the real address
	 * (i.e. the original return address)
	 */
GLOBAL_ENTRY(ia64_kretprobe_trampoline)
{ .mii
	break.m __IA64_BREAK_RPROBE
	break.i __IA64_BREAK_RPROBE
	break.i __IA64_BREAK_RPROBE
}
END(ia64_kretprobe_trampoline)

  ... and then handle the break interrupts using this new reserved immediate
  value by just directly calling trampoline_probe_handler().

  Also, we do everything in trampoline_probe_handler() so there is no need
  to single step a nop instruction.

* The instances are stored in an hlist hashed off the task pointer, but it 
  is possible for more than one task to hash to the same slot.  We use the 
  return probe instance's stack_addr field to point to the pt_regs 
  structure.  With this information, plus the assumption that the first 
  entry in the list for a given task will always be the correct instance 
  for recursive functions, we have all we need.
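The collision point can be seen with a toy hash (the shift and table
size here are made-up illustrations, not the kernel/kprobes.c
constants): two distinct task pointers can land in the same hlist slot,
which is why the lookup still has to compare the task stored with each
instance.

```c
#include <assert.h>

/* Toy model of hashing instances by task pointer.  TABLE_BITS and the
 * shift are illustrative only. */
#define TABLE_BITS 6
#define TABLE_SIZE (1UL << TABLE_BITS)

static unsigned long hash_task(const void *tsk)
{
	/* task structures are well aligned, so drop the low bits first */
	return (((unsigned long)tsk) >> 4) & (TABLE_SIZE - 1);
}
```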

    --rusty

Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>

 arch/ia64/kernel/Makefile     |    2 
 arch/ia64/kernel/kprobes.c    |  155 +++++++++++++++++++++++++++++++++++++++++-
 arch/ia64/kernel/kretprobes.S |   44 +++++++++++
 arch/ia64/kernel/process.c    |   16 ++++
 arch/ia64/kernel/traps.c      |    1 
 include/asm-ia64/break.h      |    2 
 include/asm-ia64/kprobes.h    |    3 
 7 files changed, 221 insertions(+), 2 deletions(-)

Index: linux-2.6.12-rc5/arch/ia64/kernel/Makefile
===================================================================
--- linux-2.6.12-rc5.orig/arch/ia64/kernel/Makefile
+++ linux-2.6.12-rc5/arch/ia64/kernel/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_SMP)		+= smp.o smpboot.o do
 obj-$(CONFIG_PERFMON)		+= perfmon_default_smpl.o
 obj-$(CONFIG_IA64_CYCLONE)	+= cyclone.o
 obj-$(CONFIG_IA64_MCA_RECOVERY)	+= mca_recovery.o
-obj-$(CONFIG_KPROBES)		+= kprobes.o jprobes.o
+obj-$(CONFIG_KPROBES)		+= kprobes.o jprobes.o kretprobes.o
 mca_recovery-y			+= mca_drv.o mca_drv_asm.o
 
 # The gate DSO image is built using a special linker script.
Index: linux-2.6.12-rc5/arch/ia64/kernel/kprobes.c
===================================================================
--- linux-2.6.12-rc5.orig/arch/ia64/kernel/kprobes.c
+++ linux-2.6.12-rc5/arch/ia64/kernel/kprobes.c
@@ -1,4 +1,4 @@
-/*
+ /*
  *  Kernel Probes (KProbes)
  *  arch/ia64/kernel/kprobes.c
  *
@@ -36,6 +36,7 @@
 #include <asm/kdebug.h>
 
 extern void jprobe_inst_return(void);
+extern void ia64_kretprobe_trampoline(void);
 
 /* kprobe_status settings */
 #define KPROBE_HIT_ACTIVE	0x00000001
@@ -98,6 +99,132 @@ static inline void set_current_kprobe(st
 	current_kprobe = p;
 }
 
+/*
+ * At this point the target function has been tricked into
+ * returning into our trampoline.  Look up the associated instance
+ * and then:
+ *    - call the handler function
+ *    - cleanup by marking the instance as unused
+ *    - long jump back to the original return address
+ */
+int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
+{
+	struct task_struct *tsk;
+	struct kretprobe_instance *ri = NULL;
+	struct hlist_head *head;
+	struct hlist_node *node;
+
+	tsk = arch_get_kprobe_task(regs);
+	head = kretprobe_inst_table_head(tsk);
+
+	/*
+	 * The first instance associated with the task is the instance
+	 * we need for this function return
+	 */
+	hlist_for_each_entry(ri, node, head, hlist)
+		/* we are using ri->stack_addr as a pt_regs pointer */
+		if (arch_get_kprobe_task(ri->stack_addr) == tsk)
+			break;
+
+	BUG_ON(!ri);
+
+	if (ri->rp && ri->rp->handler)
+		ri->rp->handler(ri, regs);
+
+	regs->cr_iip = (unsigned long)ri->ret_addr;
+	recycle_rp_inst(ri);
+
+	unlock_kprobes();
+	preempt_enable_no_resched();
+
+	/*
+	 * By returning a non-zero value, we are telling
+	 * pre_kprobes_handler() that we have handled unlocking
+	 * and re-enabling preemption.
+	 */
+	return 1;
+}
+
+/*
+ * The other architectures only call the return probe handler from the
+ * trampoline_probe_handler(), and then perform the cleanup and long
+ * jump from this function that gets called after single stepping the
+ * original nop instruction.
+ *
+ * We handle all of this from the trampoline_probe_handler() and do not
+ * need the extra overhead of single stepping the nop instruction.  This
+ * function is just here to make kernel/kprobes.c happy.
+ */
+void trampoline_post_handler(struct kprobe *p, struct pt_regs *regs,
+			     unsigned long flags)
+{
+}
+
+struct task_struct *arch_get_kprobe_task(void *ptr)
+{
+
+	return (struct task_struct *)(((struct pt_regs *)ptr)->r13);
+}
+
+void arch_kprobe_flush_task(struct task_struct *tk)
+{
+	struct kretprobe_instance *ri;
+	struct hlist_head *head;
+	struct hlist_node *node, *tmp;
+
+	head = kretprobe_inst_table_head(tk);
+
+	/*
+	 * The task is dead so cleanup any remaining instances
+	 */
+	hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
+		/*
+		 * The other arches adjust the return address back
+		 * for each return probe instance, but that seems like
+		 * nonsense to me.  The only reason this function
+		 * is called is because the task has died before it
+		 * had a chance to finish some return probes.
+		 *
+		 * What's the point in setting the return address of a dead
+		 * task, and is it even safe to be mucking around with
+		 * the task memory at this point?
+		 */
+
+		/* we are using ri->stack_addr as a pt_regs pointer */
+	if (arch_get_kprobe_task(ri->stack_addr) == tk)
+			recycle_rp_inst(ri);
+	}
+}
+
+void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs)
+{
+	struct kretprobe_instance *ri;
+
+	if ((ri = get_free_rp_inst(rp)) != NULL) {
+		ri->rp = rp;
+
+		/*
+		 * The stack address doesn't help us much for ia64,
+		 * so we overload this as a task pointer
+		 */
+		ri->stack_addr = regs;
+
+		/*
+		 * TODO: Properly handle the special cases where b6 or b7
+		 * is used instead of b0 for the return address
+		 */
+		ri->ret_addr = (unsigned long *)regs->b0;
+		regs->b0 = ((struct fnptr *)(ia64_kretprobe_trampoline))->ip;
+
+		/*
+		 * How is this instance list protected?
+		 */
+		add_rp_inst(ri);
+	} else {
+		rp->nmissed++;
+	}
+}
+
 int arch_prepare_kprobe(struct kprobe *p)
 {
 	unsigned long addr = (unsigned long) p->addr;
@@ -310,6 +437,11 @@ static int pre_kprobes_handler(struct di
 
 	preempt_disable();
 
+	if (args->err == __IA64_BREAK_RPROBE) {
+		trampoline_probe_handler(0, regs);
+		return 1;
+	}
+
 	/* Handle recursion cases */
 	if (kprobe_running()) {
 		p = get_kprobe(addr);
@@ -464,3 +596,24 @@ int longjmp_break_handler(struct kprobe 
 	*regs = jprobe_saved_regs;
 	return 1;
 }
+
+/*
+ * kernel/kprobes.c assumes that this is the function where we redirect
+ * functions that have a return probe installed.  The idea is that
+ * kernel/kprobes.c then installs a kprobe on the first instruction of this
+ * function to gain control after the targeted function returns.
+ *
+ * For ia64 the return address is just the bundle address and the correct
+ * slot to execute inside that bundle is restored via the psr.ri when the
+ * return instruction is executed.  Therefore we need our trampoline to
+ * have a break on each slot of the first bundle, not just the first
+ * instruction (i.e. the first slot of the first bundle.)
+ *
+ * We do this without the help of kernel/kprobes.c by writing a function
+ * in kretprobes.S that is just a bundle of break instructions using a
+ * reserved immediate value for this purpose.  When we catch that specific
+ * break instruction, we call trampoline_probe_handler() directly.
+ */
+void kretprobe_trampoline(void)
+{
+}
Index: linux-2.6.12-rc5/arch/ia64/kernel/kretprobes.S
===================================================================
--- /dev/null
+++ linux-2.6.12-rc5/arch/ia64/kernel/kretprobes.S
@@ -0,0 +1,44 @@
+/*
+ * return probe specific operations
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Intel Corporation, 2005
+ *
+ * 2005-May     Rusty Lynch <rusty.lynch@intel.com>
+ */
+#include <asm/break.h>
+#include <asm/asmmacro.h>
+
+	/*
+	 * void ia64_kretprobe_trampoline(void):
+	 *
+	 * When a return probe is set on a given function, its return
+	 * address (which really just points to the bundle) is set for
+	 * this single bundle function.
+	 *
+	 * We don't know which slot of the bundle will be set, so we set
+	 * a break using a special immediate value to gain control in
+	 * each case so the registered return probe can be called and then
+	 * restore cr.iip back to the real address
+	 * (i.e. the original return address)
+	 */
+GLOBAL_ENTRY(ia64_kretprobe_trampoline)
+{ .mii
+	break.m __IA64_BREAK_RPROBE
+	break.i __IA64_BREAK_RPROBE
+	break.i __IA64_BREAK_RPROBE
+}
+END(ia64_kretprobe_trampoline)
Index: linux-2.6.12-rc5/include/asm-ia64/kprobes.h
===================================================================
--- linux-2.6.12-rc5.orig/include/asm-ia64/kprobes.h
+++ linux-2.6.12-rc5/include/asm-ia64/kprobes.h
@@ -45,6 +45,9 @@ typedef struct _bundle {
 } __attribute__((__aligned__(16)))  bundle_t;
 
 #define JPROBE_ENTRY(pentry)	(kprobe_opcode_t *)pentry
+#define ARCH_SUPPORTS_KRETPROBES
+
+void kretprobe_trampoline(void);
 
 #define SLOT0_OPCODE_SHIFT	(37)
 #define SLOT1_p1_OPCODE_SHIFT	(37 - (64-46))
Index: linux-2.6.12-rc5/include/asm-ia64/break.h
===================================================================
--- linux-2.6.12-rc5.orig/include/asm-ia64/break.h
+++ linux-2.6.12-rc5/include/asm-ia64/break.h
@@ -13,8 +13,10 @@
  */
 #define __IA64_BREAK_KDB		0x80100
 #define __IA64_BREAK_KPROBE		0x80200
+#define __IA64_BREAK_RPROBE             0x80201
 #define __IA64_BREAK_JPROBE		0x80300
 
+
 /*
  * OS-specific break numbers:
  */
Index: linux-2.6.12-rc5/arch/ia64/kernel/traps.c
===================================================================
--- linux-2.6.12-rc5.orig/arch/ia64/kernel/traps.c
+++ linux-2.6.12-rc5/arch/ia64/kernel/traps.c
@@ -190,6 +190,7 @@ ia64_bad_break (unsigned long break_num,
 		break;
 
 	      case 0x80200:
+	      case 0x80201:
 	      case 0x80300:
 		if (notify_die(DIE_BREAK, "kprobe", regs, break_num, TRAP_BRKPT, SIGTRAP)
			       	== NOTIFY_STOP) {
Index: linux-2.6.12-rc5/arch/ia64/kernel/process.c
===================================================================
--- linux-2.6.12-rc5.orig/arch/ia64/kernel/process.c
+++ linux-2.6.12-rc5/arch/ia64/kernel/process.c
@@ -27,6 +27,7 @@
 #include <linux/efi.h>
 #include <linux/interrupt.h>
 #include <linux/delay.h>
+#include <linux/kprobes.h>
 
 #include <asm/cpu.h>
 #include <asm/delay.h>
@@ -707,6 +708,13 @@ kernel_thread_helper (int (*fn)(void *),
 void
 flush_thread (void)
 {
+	/*
+	 * Remove function-return probe instances associated with this task
+	 * and put them back on the free list. Do not insert an exit probe for
+	 * this function, it will be disabled by kprobe_flush_task if you do.
+	 */
+	kprobe_flush_task(current);
+
 	/* drop floating-point and debug-register state if it exists: */
 	current->thread.flags &= ~(IA64_THREAD_FPH_VALID | IA64_THREAD_DBG_VALID);
 	ia64_drop_fpu(current);
@@ -721,6 +729,14 @@ flush_thread (void)
 void
 exit_thread (void)
 {
+
+	/*
+	 * Remove function-return probe instances associated with this task
+	 * and put them back on the free list. Do not insert an exit probe for
+	 * this function, it will be disabled by kprobe_flush_task if you do.
+	 */
+	kprobe_flush_task(current);
+
 	ia64_drop_fpu(current);
 #ifdef CONFIG_PERFMON
        /* if needed, stop monitoring and flush state to perfmon context */

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] ia64 function return probes
  2005-06-01 21:39 [RFC] ia64 function return probes Rusty Lynch
@ 2005-06-03  2:17 ` Keith Owens
  2005-06-03  4:56   ` David Mosberger
  2005-06-03 18:01 ` Lynch, Rusty
  1 sibling, 1 reply; 4+ messages in thread
From: Keith Owens @ 2005-06-03  2:17 UTC (permalink / raw)
  To: Rusty Lynch
  Cc: linux-ia64, Anil S Keshavamurthy, linux-kernel, systemtap,
	Hien Nguyen

On Wed, 1 Jun 2005 14:39:22 -0700, 
Rusty Lynch <rusty.lynch@intel.com> wrote:
>The following is an implementation of the ia64 specific parts for implementing
>the function return probes functionality that is a part of kprobes.  There 
>were some assumptions about how the architectures work inside kernel/kprobes.c
>that force me to do some odd things in this implementation.
>[snip]
>Ok, that was the idea.  For ia64 complexities came up with respect to:
>
>* the assumption that kernel/kprobes.c is working with a 
>  stacked based architecture
>
>* the assumption that changing the return address to kretprobe_trampoline()
>  will always result in the first instruction of kretprobe_trampoline
>  being executed.

Normal br.call/return will always execute the first slot of the return
address.  br.call bn sets bn to IP+16, i.e. the start of the next
bundle, with no slot data.  br.return bn returns to the first slot in
the bundle defined by bn.  Intel IA64 arch vol3, 24531904.pdf page
3:21.

What made you think that you needed to handle return to a non-zero slot
number?  The only instruction that can return to a non-zero slot number
is rfi and, by definition, code that is entered for an interrupt does
not have a return address in any branch register.
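In other words (a toy model of the point above, not real kernel code):
the link address written by br.call is always the start of the next
16-byte bundle, so it carries no slot number and a br.ret through it
always resumes at slot 0.

```c
#include <assert.h>

/* Toy model: br.call writes IP+16 -- the start of the bundle after the
 * call -- into the branch register.  Bundle addresses are 16-byte
 * aligned, so the low four bits (where a slot number would have to
 * live) are always zero. */
static unsigned long model_br_call_link(unsigned long call_ip)
{
	return (call_ip & ~0xfUL) + 16;
}
```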

>The following patch works around these problems by:
>
>* Providing an empty kretprobe_trampoline(), but we don't really 
>  use it as our trampoline function.  Instead we provide:
>
>	/*
>	 * void ia64_kretprobe_trampoline(void):
>	 *
>	 * When a return probe is set on a given function, its return
>	 * address (which really just points to the bundle) is set for
>	 * this single bundle function.
>	 *
>	 * We don't know which slot of the bundle will be set, so we set
>	 * a break using a special immediate value to gain control in
>	 * each case so the registered return probe can be called and then
>	 * restore the cr->iip  back to the real address
>	 * (i.e. the original return address)
>	 */
>GLOBAL_ENTRY(ia64_kretprobe_trampoline)
>{ .mii
>	break.m __IA64_BREAK_RPROBE
>	break.i __IA64_BREAK_RPROBE
>	break.i __IA64_BREAK_RPROBE
>}
>END(ia64_kretprobe_trampoline)
>
>  ... and then handle the break interrupts using this new reserved immediate
>  value by just directly calling trampoline_probe_handler().

Implementing a return probe by changing the return address will prevent
us from getting any backtrace past the return probe.  If the function
being probed, or any of its callee functions, gets an error then the
backtrace will terminate at ia64_kretprobe_trampoline, the unwinder may
even loop.  This makes it impossible to find out what called the
function.

One option is to hack return probes into the unwinder as yet another
special case - not nice.  Another option is this :-

* On entry to the function, arch_prepare_kretprobe() saves the current
  value of b0 in [sp+8] (architecture defined scratch area on stack),
  and saves the current value of ar.pfs in [sp].

* arch_prepare_kretprobe() bumps sp by 16 bytes, to account for the
  saved b0 and ar.pfs.

* arch_prepare_kretprobe() sets b0 to ia64_kretprobe_trampoline.

* arch_prepare_kretprobe() sets the cfm field in ar.pfs to 0.  Of
  course this is the ar.pfs that was saved in pt_regs when the function
  being hooked was entered.

* The function being hooked is entered.  When the function returns, it
  does so to ia64_kretprobe_trampoline with cfm = 0, i.e.
  ia64_kretprobe_trampoline has a zero sized register frame.

* ia64_kretprobe_trampoline looks like this.  Compiled but not tested.

GLOBAL_ENTRY(ia64_kretprobe_trampoline)
	.prologue
	.label_state 1
	.fframe 16
	.savesp b0, 8
	.savesp ar.pfs, 0

	break.m __IA64_BREAK_RPROBE
	ld8 r3=[sp]		// original ar.pfs
	add r2=8, sp		// original b0 was saved in sp+8
	;;
	ld8 r2=[r2]		// original b0

	.body
	mov ar.pfs=r3		// restore original ar.pfs
	;;
	mov b0=r2		// restore original b0
	add sp=-16, sp		// arch_prepare_kretprobe bumped stack by 16 bytes

	.copy_state 1
	br.ret.sptk.many rp
END(ia64_kretprobe_trampoline)

If I got my unwind directives right, that will make the backtrace look
like this

  ...
  hooked function
  ia64_kretprobe_trampoline
  whatever called the hooked function
  ...

It also means that the handler for __IA64_BREAK_RPROBE only has to log
the fact that the function returned then resume at the instruction
after __IA64_BREAK_RPROBE.  ia64_kretprobe_trampoline then returns to
the original caller.
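A sketch of what that handler has to do (a userspace model with made-up
types, not the kernel's pt_regs): since the break sits in slot 0 of the
trampoline bundle, resuming at the next instruction means advancing the
slot number, not the bundle address.

```c
#include <assert.h>

/* Toy model of resuming after a break: iip is the bundle address and
 * ri the slot within it (0..2), mirroring cr.iip and psr.ri but in a
 * made-up struct. */
struct model_regs {
	unsigned long iip;
	unsigned int ri;
};

static void skip_break_slot(struct model_regs *regs)
{
	if (regs->ri == 2) {		/* last slot: move to next bundle */
		regs->ri = 0;
		regs->iip += 16;
	} else {
		regs->ri++;
	}
}
```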

>Index: linux-2.6.12-rc5/arch/ia64/kernel/kprobes.c
>===================================================================
>--- linux-2.6.12-rc5.orig/arch/ia64/kernel/kprobes.c
>+++ linux-2.6.12-rc5/arch/ia64/kernel/kprobes.c
>@@ -1,4 +1,4 @@
>-/*
>+ /*

Interesting addition of white space there :)


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] ia64 function return probes
  2005-06-03  2:17 ` Keith Owens
@ 2005-06-03  4:56   ` David Mosberger
  0 siblings, 0 replies; 4+ messages in thread
From: David Mosberger @ 2005-06-03  4:56 UTC (permalink / raw)
  To: Keith Owens
  Cc: Rusty Lynch, linux-ia64, Anil S Keshavamurthy, linux-kernel,
	systemtap, Hien Nguyen

>>>>> On Fri, 03 Jun 2005 12:17:29 +1000, Keith Owens <kaos@ocs.com.au> said:

  Keith> * arch_prepare_kretprobe() bumps sp by 16 bytes, to account
  Keith> for the saved b0 and ar.pfs.

What if function arguments are being passed on the stack (e.g., more
than 8 scalar arguments)?

	--david

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [RFC] ia64 function return probes
  2005-06-01 21:39 [RFC] ia64 function return probes Rusty Lynch
  2005-06-03  2:17 ` Keith Owens
@ 2005-06-03 18:01 ` Lynch, Rusty
  1 sibling, 0 replies; 4+ messages in thread
From: Lynch, Rusty @ 2005-06-03 18:01 UTC (permalink / raw)
  To: Keith Owens
  Cc: linux-ia64, Keshavamurthy, Anil S, linux-kernel, systemtap,
	Hien Nguyen

From: Keith Owens [mailto:kaos@ocs.com.au]
>Rusty Lynch <rusty.lynch@intel.com> wrote:
>>The following is an implementation of the ia64 specific parts for
>>implementing
>>the function return probes functionality that is a part of kprobes.
>>There
>>were some assumptions about how the architectures work inside
>>kernel/kprobes.c
>>that force me to do some odd things in this implementation.
>>[snip]
>>Ok, that was the idea.  For ia64 complexities came up with respect to:
>>
>>* the assumption that kernel/kprobes.c is working with a
>>  stacked based architecture
>>
>>* the assumption that changing the return address to
>>kretprobe_trampoline()
>>  will always result in the first instruction of
>>kretprobe_trampoline
>>  being executed.
>
>Normal br.call/return will always execute the first slot of the return
>address.  br.call bn sets bn to IP+16, i.e. the start of the next
>bundle, with no slot data.  br.return bn returns to the first slot in
>the bundle defined by bn.  Intel IA64 arch vol3, 24531904.pdf page
>3:21.
>
>What made you think that you needed to handle return to a non-zero slot
>number?  The only instruction that can return to a non-zero slot number
>is rfi and, by definition, code that is entered for an interrupt does
>not have a return address in any branch register.

This was just a misunderstanding on my part.  Looks like I need to
re-read (again) the section talking about the branch unit, but this
will simplify the implementation.

>
>>The following patch works around these problems by:
>>
>>* Providing an empty kretprobe_trampoline(), but we don't really
>>  use it as our trampoline function.  Instead we provide:
>>
>>	/*
>>	 * void ia64_kretprobe_trampoline(void):
>>	 *
>>	 * When a return probe is set on a given function, its return
>>	 * address (which really just points to the bundle) is set for
>>	 * this single bundle function.
>>	 *
>>	 * We don't know which slot of the bundle will be set, so we set
>>	 * a break using a special immediate value to gain control in
>>	 * each case so the registered return probe can be called and
>>then
>>	 * restore the cr->iip  back to the real address
>>	 * (i.e. the original return address)
>>	 */
>>GLOBAL_ENTRY(ia64_kretprobe_trampoline)
>>{ .mii
>>	break.m __IA64_BREAK_RPROBE
>>	break.i __IA64_BREAK_RPROBE
>>	break.i __IA64_BREAK_RPROBE
>>}
>>END(ia64_kretprobe_trampoline)
>>
>>  ... and then handle the break interrupts using this new reserved
>>immediate
>>  value by just directly calling trampoline_probe_handler().
>
>Implementing a return probe by changing the return address will prevent
>us from getting any backtrace past the return probe.  If the function
>being probed, or any of its callee functions, gets an error then the
>backtrace will terminate at ia64_kretprobe_trampoline, the unwinder may
>even loop.  This makes it impossible to find out what called the
>function.
>
>One option is to hack return probes into the unwinder as yet another a
>special case - not nice.  Another option is this :-
>
>* On entry to the function, arch_prepare_kretprobe() saves the current
>  value of b0 in [sp+8] (architecture defined scratch area on stack),
>  and saves the current value of ar.pfs in [sp].
>
>* arch_prepare_kretprobe() bumps sp by 16 bytes, to account for the
>  saved b0 and ar.pfs.
>
>* arch_prepare_kretprobe() sets b0 to ia64_kretprobe_trampoline.
>
>* arch_prepare_kretprobe() sets the cfm field in ar.pfs to 0.  Of
>  course this is the ar.pfs that was saved in pt_regs when the function
>  being hooked was entered.
>
>* The function being hooked is entered.  When the function returns, it
>  does so to ia64_kretprobe_trampoline with cfm = 0, i.e.
>  ia64_kretprobe_trampoline has a zero sized register frame.
>
>* ia64_kretprobe_trampoline looks like this.  Compiled but not tested.
>
>GLOBAL_ENTRY(ia64_kretprobe_trampoline)
>	.prologue
>	.label_state 1
>	.fframe 16
>	.savesp b0, 8
>	.savesp ar.pfs, 0
>
>	break.m __IA64_BREAK_RPROBE
>	ld8 r3=[sp]		// original ar.pfs
>	add r2=8, sp		// original b0 was saved in sp+8
>	;;
>	ld8 r2=[r2]		// original b0
>
>	.body
>	mov ar.pfs=r3		// restore original ar.pfs
>	;;
>	mov b0=r2		// restore original b0
>	add sp=-16, sp		// arch_prepare_kretprobe bumped stack by 16 bytes
>
>	.copy_state 1
>	br.ret.sptk.many rp
>END(ia64_kretprobe_trampoline)
>
>If I got my unwind directives right, that will make the backtrace look
>like this
>
>  ...
>  hooked function
>  ia64_kretprobe_trampoline
>  whatever called the hooked function
>  ...
>
>It also means that the handler for __IA64_BREAK_RPROBE only has to log
>the fact that the function returned then resume at the instruction
>after __IA64_BREAK_RPROBE.  ia64_kretprobe_trampoline then returns to
>the original caller.

Since we are returning twice before getting back to the original parent
(i.e. the target function returns and then ia64_kretprobe_trampoline
returns), wouldn't this cause the parent to get its parent's bsp?

>
>>Index: linux-2.6.12-rc5/arch/ia64/kernel/kprobes.c
>>===================================================================
>>--- linux-2.6.12-rc5.orig/arch/ia64/kernel/kprobes.c
>>+++ linux-2.6.12-rc5/arch/ia64/kernel/kprobes.c
>>@@ -1,4 +1,4 @@
>>-/*
>>+ /*
>
>Interesting addition of white space there :)

Hey, I work in a cube, extra space is always a nice thing :->

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-06-03 18:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-01 21:39 [RFC] ia64 function return probes Rusty Lynch
2005-06-03  2:17 ` Keith Owens
2005-06-03  4:56   ` David Mosberger
2005-06-03 18:01 ` Lynch, Rusty

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox