From: Paul Mackerras <paulus@samba.org>
To: Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
benh@kernel.crashing.org
Cc: linuxppc-dev@ozlabs.org, linux-kernel@vger.kernel.org, anton@samba.org
Subject: [PATCH v2] powerpc/perf_events: Implement perf_arch_fetch_caller_regs
Date: Thu, 18 Mar 2010 16:05:13 +1100 [thread overview]
Message-ID: <20100318050513.GA6575@drongo> (raw)
This implements a powerpc version of perf_arch_fetch_caller_regs.
It's implemented in assembly because that way we can be sure there
isn't a stack frame for perf_arch_fetch_caller_regs. If it was in
C, gcc might or might not create a stack frame for it, which would
affect the number of levels we have to skip.
With this, we see results from perf record -e lock:lock_acquire like
this:
# Samples: 24878
#
# Overhead Command Shared Object Symbol
# ........ .............. ................. ......
#
14.99% perf [kernel.kallsyms] [k] ._raw_spin_lock
|
--- ._raw_spin_lock
|
|--25.00%-- .alloc_fd
| (nil)
| |
| |--50.00%-- .anon_inode_getfd
| | .sys_perf_event_open
| | syscall_exit
| | syscall
| | create_counter
| | __cmd_record
| | run_builtin
| | main
| | 0xfd2e704
| | 0xfd2e8c0
| | (nil)
... etc.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/include/asm/asm-compat.h | 2 ++
arch/powerpc/kernel/misc.S | 28 ++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h
index c1b475a..a9b91ed 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -28,6 +28,7 @@
#define PPC_LLARX(t, a, b, eh) PPC_LDARX(t, a, b, eh)
#define PPC_STLCX stringify_in_c(stdcx.)
#define PPC_CNTLZL stringify_in_c(cntlzd)
+#define PPC_LR_STKOFF 16
/* Move to CR, single-entry optimized version. Only available
* on POWER4 and later.
@@ -51,6 +52,7 @@
#define PPC_STLCX stringify_in_c(stwcx.)
#define PPC_CNTLZL stringify_in_c(cntlzw)
#define PPC_MTOCRF stringify_in_c(mtcrf)
+#define PPC_LR_STKOFF 4
#endif
diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
index 2d29752..b485a87 100644
--- a/arch/powerpc/kernel/misc.S
+++ b/arch/powerpc/kernel/misc.S
@@ -127,3 +127,31 @@ _GLOBAL(__setup_cpu_power7)
_GLOBAL(__restore_cpu_power7)
/* place holder */
blr
+
+#ifdef CONFIG_EVENT_TRACING
+/*
+ * Get a minimal set of registers for our caller's nth caller.
+ * r3 = regs pointer, r5 = n.
+ *
+ * We only get R1 (stack pointer), NIP (next instruction pointer)
+ * and LR (link register). These are all we can get in the
+ * general case without doing complicated stack unwinding, but
+ * fortunately they are enough to do a stack backtrace, which
+ * is all we need them for.
+ */
+_GLOBAL(perf_arch_fetch_caller_regs)
+ mr r6,r1
+ cmpwi r5,0
+ mflr r4
+ ble 2f
+ mtctr r5
+1: PPC_LL r6,0(r6)
+ bdnz 1b
+ PPC_LL r4,PPC_LR_STKOFF(r6)
+2: PPC_LL r7,0(r6)
+ PPC_LL r7,PPC_LR_STKOFF(r7)
+ PPC_STL r6,GPR1-STACK_FRAME_OVERHEAD(r3)
+ PPC_STL r4,_NIP-STACK_FRAME_OVERHEAD(r3)
+ PPC_STL r7,_LINK-STACK_FRAME_OVERHEAD(r3)
+ blr
+#endif /* CONFIG_EVENT_TRACING */
WARNING: multiple messages have this Message-ID (diff)
From: Paul Mackerras <paulus@samba.org>
To: Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
benh@kernel.crashing.org
Cc: linux-kernel@vger.kernel.org, anton@samba.org, linuxppc-dev@ozlabs.org
Subject: [PATCH v2] powerpc/perf_events: Implement perf_arch_fetch_caller_regs
Date: Thu, 18 Mar 2010 16:05:13 +1100 [thread overview]
Message-ID: <20100318050513.GA6575@drongo> (raw)
This implements a powerpc version of perf_arch_fetch_caller_regs.
It's implemented in assembly because that way we can be sure there
isn't a stack frame for perf_arch_fetch_caller_regs. If it was in
C, gcc might or might not create a stack frame for it, which would
affect the number of levels we have to skip.
With this, we see results from perf record -e lock:lock_acquire like
this:
# Samples: 24878
#
# Overhead Command Shared Object Symbol
# ........ .............. ................. ......
#
14.99% perf [kernel.kallsyms] [k] ._raw_spin_lock
|
--- ._raw_spin_lock
|
|--25.00%-- .alloc_fd
| (nil)
| |
| |--50.00%-- .anon_inode_getfd
| | .sys_perf_event_open
| | syscall_exit
| | syscall
| | create_counter
| | __cmd_record
| | run_builtin
| | main
| | 0xfd2e704
| | 0xfd2e8c0
| | (nil)
... etc.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/include/asm/asm-compat.h | 2 ++
arch/powerpc/kernel/misc.S | 28 ++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h
index c1b475a..a9b91ed 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -28,6 +28,7 @@
#define PPC_LLARX(t, a, b, eh) PPC_LDARX(t, a, b, eh)
#define PPC_STLCX stringify_in_c(stdcx.)
#define PPC_CNTLZL stringify_in_c(cntlzd)
+#define PPC_LR_STKOFF 16
/* Move to CR, single-entry optimized version. Only available
* on POWER4 and later.
@@ -51,6 +52,7 @@
#define PPC_STLCX stringify_in_c(stwcx.)
#define PPC_CNTLZL stringify_in_c(cntlzw)
#define PPC_MTOCRF stringify_in_c(mtcrf)
+#define PPC_LR_STKOFF 4
#endif
diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
index 2d29752..b485a87 100644
--- a/arch/powerpc/kernel/misc.S
+++ b/arch/powerpc/kernel/misc.S
@@ -127,3 +127,31 @@ _GLOBAL(__setup_cpu_power7)
_GLOBAL(__restore_cpu_power7)
/* place holder */
blr
+
+#ifdef CONFIG_EVENT_TRACING
+/*
+ * Get a minimal set of registers for our caller's nth caller.
+ * r3 = regs pointer, r5 = n.
+ *
+ * We only get R1 (stack pointer), NIP (next instruction pointer)
+ * and LR (link register). These are all we can get in the
+ * general case without doing complicated stack unwinding, but
+ * fortunately they are enough to do a stack backtrace, which
+ * is all we need them for.
+ */
+_GLOBAL(perf_arch_fetch_caller_regs)
+ mr r6,r1
+ cmpwi r5,0
+ mflr r4
+ ble 2f
+ mtctr r5
+1: PPC_LL r6,0(r6)
+ bdnz 1b
+ PPC_LL r4,PPC_LR_STKOFF(r6)
+2: PPC_LL r7,0(r6)
+ PPC_LL r7,PPC_LR_STKOFF(r7)
+ PPC_STL r6,GPR1-STACK_FRAME_OVERHEAD(r3)
+ PPC_STL r4,_NIP-STACK_FRAME_OVERHEAD(r3)
+ PPC_STL r7,_LINK-STACK_FRAME_OVERHEAD(r3)
+ blr
+#endif /* CONFIG_EVENT_TRACING */
next reply other threads:[~2010-03-18 5:05 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-18 5:05 Paul Mackerras [this message]
2010-03-18 5:05 ` [PATCH v2] powerpc/perf_events: Implement perf_arch_fetch_caller_regs Paul Mackerras
2010-03-18 5:51 ` [tip:perf/urgent] powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regs tip-bot for Paul Mackerras
2010-03-18 19:30 ` [PATCH v2] powerpc/perf_events: Implement perf_arch_fetch_caller_regs Frederic Weisbecker
2010-03-18 19:30 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100318050513.GA6575@drongo \
--to=paulus@samba.org \
--cc=anton@samba.org \
--cc=benh@kernel.crashing.org \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.