* [PATCH v4 0/2] arm & arm64: perf: Fix callchain parse error with kernel tracepoint events
@ 2015-05-08  5:43 ` Hou Pengyang
From: Hou Pengyang @ 2015-05-08  5:43 UTC
  To: will.deacon, a.p.zijlstra, paulus, acme, mingo
  Cc: catalin.marinas, wannan0, linux-kernel, linux-arm-kernel

For arm & arm64, when tracing with tracepoint events, the IP and cpsr are
set to 0, preventing the perf code from parsing the callchain and resolving
the symbols correctly.

These two patches fix this by implementing perf_arch_fetch_caller_regs for
arm and arm64, which fills in the register state needed for callchain
unwinding and symbol resolution.

v3->v4:
 - fix compile errors

v2->v3:
 - split the original patch into two, one for arm and the other for arm64;
 - change '|=' to '=' when setting cpsr.

Hou Pengyang (2):
  arm: perf: Fix callchain parse error with kernel tracepoint events
  arm64: perf: Fix callchain parse error with kernel tracepoint events

 arch/arm/include/asm/perf_event.h   | 7 +++++++
 arch/arm64/include/asm/perf_event.h | 7 +++++++
 2 files changed, 14 insertions(+)

-- 
1.8.3.4

* [PATCH v4 1/2] arm: perf: Fix callchain parse error with kernel tracepoint events
  2015-05-08  5:43 ` Hou Pengyang
@ 2015-05-08  5:43 ` Hou Pengyang
From: Hou Pengyang @ 2015-05-08  5:43 UTC
  To: will.deacon, a.p.zijlstra, paulus, acme, mingo
  Cc: catalin.marinas, wannan0, linux-kernel, linux-arm-kernel

For ARM, when tracing with tracepoint events, the IP and cpsr are set to 0,
preventing the perf code from parsing the callchain and resolving the
symbols correctly.

 ./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.006 MB perf.data ]
 ./perf report -f
 Samples: 5  of event 'sched:sched_switch', Event count (approx.): 5
 Children      Self  Command  Shared Object     Symbol
 100.00%    100.00%  ls       [unknown]         [.] 00000000

The fix is to implement perf_arch_fetch_caller_regs for ARM, which fills
several necessary registers used for callchain unwinding, including pc, sp,
fp and cpsr.

With this patch, the callchain can be parsed correctly as follows:

      .....
-  100.00%   100.00%  ls       [kernel.kallsyms]  [k] __sched_text_start
   + __sched_text_start
+   20.00%     0.00%  ls       libc-2.18.so       [.] _dl_addr
+   20.00%     0.00%  ls       libc-2.18.so       [.] write
      .....

Jean Pihet found this on ARM and came up with a patch:
http://thread.gmane.org/gmane.linux.kernel/1734283/focus=1734280

This patch rewrites Jean's patch in C.

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
---
 arch/arm/include/asm/perf_event.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
index d9cf138..4f9dec4 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -19,4 +19,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)	perf_misc_flags(regs)
 #endif
 
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+	(regs)->ARM_pc = (__ip); \
+	(regs)->ARM_fp = (unsigned long) __builtin_frame_address(0); \
+	(regs)->ARM_sp = current_stack_pointer; \
+	(regs)->ARM_cpsr = SVC_MODE; \
+}
+
 #endif /* __ARM_PERF_EVENT_H__ */
-- 
1.8.3.4

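Editorial sketch, not part of the patch: one plausible reason the broken
report tags the sample as user space ("[.]") while the fixed one shows "[k]".
It assumes the user/kernel decision is taken from the low mode bits of cpsr,
in the way the mainline arch/arm user_mode() check does. The struct and the
demo_* helpers are hypothetical stand-ins; only the SVC mode encoding (0x13)
is taken from the ARM architecture.

```c
/* Hypothetical stand-in for the single pt_regs field that matters here. */
struct demo_regs {
    unsigned long cpsr;
};

/* ARM supervisor (SVC) mode encoding, the value the patch stores in cpsr. */
#define DEMO_SVC_MODE 0x13UL

/* Assumed to mirror the kernel check: low four mode bits zero means user. */
static int demo_user_mode(const struct demo_regs *regs)
{
    return (regs->cpsr & 0xfUL) == 0;
}

/* Marker in the style perf report prints: "[.]" for user, "[k]" for kernel. */
static const char *demo_marker(const struct demo_regs *regs)
{
    return demo_user_mode(regs) ? "[.]" : "[k]";
}
```

Under that assumption, a zeroed cpsr is classified as user mode, which would
match the "[.] 00000000" line above, and SVC_MODE is classified as kernel.
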
* [PATCH v4 2/2] arm64: perf: Fix callchain parse error with kernel tracepoint events
  2015-05-08  5:43 ` Hou Pengyang
@ 2015-05-08  5:43 ` Hou Pengyang
From: Hou Pengyang @ 2015-05-08  5:43 UTC
  To: will.deacon, a.p.zijlstra, paulus, acme, mingo
  Cc: catalin.marinas, wannan0, linux-kernel, linux-arm-kernel

For ARM64, when tracing with tracepoint events, the IP and pstate are set
to 0, preventing the perf code from parsing the callchain and resolving the
symbols correctly.

 ./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.146 MB perf.data ]
 ./perf report -f
 Samples: 194  of event 'sched:sched_switch', Event count (approx.): 194
 Children      Self  Command  Shared Object     Symbol
 100.00%    100.00%  ls       [unknown]         [.] 0000000000000000

The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
several necessary registers used for callchain unwinding, including pc, sp,
fp and spsr.

With this patch, the callchain can be parsed correctly as follows:

     ......
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] vfs_symlink
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] follow_down
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_get
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] do_execveat_common.isra.33
-    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_send_policy_notify
     pfkey_send_policy_notify
     pfkey_get
     v9fs_vfs_rename
     page_follow_link_light
     link_path_walk
     el0_svc_naked
    .......

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
---
 arch/arm64/include/asm/perf_event.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index d26d1d5..6471773 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -24,4 +24,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)	perf_misc_flags(regs)
 #endif
 
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+	(regs)->pc = (__ip); \
+	(regs)->regs[AARCH64_INSN_REG_FP] = (unsigned long) __builtin_frame_address(0); \
+	(regs)->sp = current_stack_pointer; \
+	(regs)->pstate = PSR_MODE_EL1h; \
+}
+
 #endif
-- 
1.8.3.4

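Editorial sketch, not part of the patch: why a frame-pointer based unwinder
needs pc and fp to be filled in. It assumes the AArch64 frame record layout
(saved caller fp followed by saved lr) and walks a chain of such records.
The demo_* names are hypothetical, and the kernel's real unwinder applies
additional validity checks that are left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified frame record: saved caller frame pointer, then saved lr. */
struct demo_frame_record {
    const struct demo_frame_record *prev_fp;
    uint64_t lr;
};

/*
 * Collect a callchain into out[]: the sampled pc first, then one return
 * address per frame record reachable from fp. A NULL fp ends the chain.
 * Returns the number of entries written (at most max).
 */
static size_t demo_walk(uint64_t pc, const struct demo_frame_record *fp,
                        uint64_t *out, size_t max)
{
    size_t n = 0;

    if (max == 0)
        return 0;

    out[n++] = pc;
    while (fp != NULL && n < max) {
        out[n++] = fp->lr;
        fp = fp->prev_fp;
    }
    return n;
}
```

With pc and fp both zero, as in the unpatched case, the walk yields a single
entry of 0 and nothing to resolve; with valid values it yields one entry per
frame.
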
* Re: [PATCH v4 2/2] arm64: perf: Fix callchain parse error with kernel tracepoint events
  2015-05-08  5:43 ` Hou Pengyang
@ 2015-05-08 15:37 ` Will Deacon
From: Will Deacon @ 2015-05-08 15:37 UTC
  To: Hou Pengyang
  Cc: a.p.zijlstra@chello.nl, paulus@samba.org, acme@kernel.org,
      mingo@redhat.com, Catalin Marinas, wannan0@huawei.com,
      linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org

On Fri, May 08, 2015 at 06:43:04AM +0100, Hou Pengyang wrote:
> For ARM64, when tracing with tracepoint events, the IP and pstate are set
> to 0, preventing the perf code parsing the callchain and resolving the
> symbols correctly.
>
>  ./perf record -e sched:sched_switch -g --call-graph dwarf ls
> [ perf record: Captured and wrote 0.146 MB perf.data ]
>  ./perf report -f
>  Samples: 194  of event 'sched:sched_switch', Event count (approx.): 194
>  Children      Self  Command  Shared Object     Symbol
>  100.00%    100.00%  ls       [unknown]         [.] 0000000000000000
>
> The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
> several necessary registers used for callchain unwinding, including pc,sp,
> fp and spsr .
>
> With this patch, callchain can be parsed correctly as follows:
>
>      ......
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] vfs_symlink
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] follow_down
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_get
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] do_execveat_common.isra.33
> -    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_send_policy_notify
>      pfkey_send_policy_notify
>      pfkey_get
>      v9fs_vfs_rename
>      page_follow_link_light
>      link_path_walk
>      el0_svc_naked
>     .......
>
> Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
> ---
>  arch/arm64/include/asm/perf_event.h | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index d26d1d5..6471773 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -24,4 +24,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  #define perf_misc_flags(regs)	perf_misc_flags(regs)
>  #endif
>
> +#define perf_arch_fetch_caller_regs(regs, __ip) { \
> +	(regs)->pc = (__ip); \
> +	(regs)->regs[AARCH64_INSN_REG_FP] = (unsigned long) __builtin_frame_address(0); \

Just a minor thing, but I'd rather we explicitly used '29' as the index
here. The AARCH64_INSN_REG_FP is really for the instruction generation
code used by BPF and I think it's better to be explicit about the register
number here.

Anyway, I've queued your arch/arm/ patch and Catalin can take this one for
4.2 once you've made the small change above and added my Ack.

Thanks,

Will

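Editorial sketch, not part of the thread: the review comment above changes
only how the index is spelled, not which slot is written, because x29 is the
AArch64 frame pointer. The enum below is a hypothetical mirror of an
instruction-encoding register enum and the demo_* helpers are made up for
illustration; it assumes only that the FP alias has the value 29.

```c
#include <stdint.h>

/* Hypothetical mirror of an instruction-encoding register enum. */
enum demo_insn_reg {
    DEMO_INSN_REG_29 = 29,
    DEMO_INSN_REG_FP = DEMO_INSN_REG_29, /* frame pointer alias */
    DEMO_INSN_REG_30 = 30,
    DEMO_INSN_REG_LR = DEMO_INSN_REG_30  /* link register alias */
};

/* Hypothetical stand-in for the general-purpose register array. */
struct demo_gprs {
    uint64_t regs[31];
};

/* Compile-time check that the alias and the literal name the same slot. */
_Static_assert(DEMO_INSN_REG_FP == 29, "x29 is the frame pointer");

/* Store fp using the explicit register number, as the review asks. */
static void demo_set_fp(struct demo_gprs *g, uint64_t fp)
{
    g->regs[29] = fp;
}

/* Read the same slot back through the enum alias. */
static uint64_t demo_get_fp_via_enum(const struct demo_gprs *g)
{
    return g->regs[DEMO_INSN_REG_FP];
}
```

The literal keeps the register-save code independent of a header meant for
instruction generation, which is the point made in the review.
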
* [PATCH v5] arm64: perf: Fix callchain parse error with kernel tracepoint events
  2015-05-08 15:37 ` Will Deacon
@ 2015-05-10 11:07 ` Hou Pengyang
From: Hou Pengyang @ 2015-05-10 11:07 UTC
  To: will.deacon, catalin.marinas
  Cc: a.p.zijlstra, paulus, acme, mingo, wangnan0, linux-arm-kernel,
      linux-kernel

For ARM64, when tracing with tracepoint events, the IP and pstate are set
to 0, preventing the perf code from parsing the callchain and resolving the
symbols correctly.

 ./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.146 MB perf.data ]
 ./perf report -f
 Samples: 194  of event 'sched:sched_switch', Event count (approx.): 194
 Children      Self  Command  Shared Object     Symbol
 100.00%    100.00%  ls       [unknown]         [.] 0000000000000000

The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
several necessary registers used for callchain unwinding, including pc, sp,
fp and spsr.

With this patch, the callchain can be parsed correctly as follows:

     ......
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] vfs_symlink
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] follow_down
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_get
+    2.63%     0.00%  ls       [kernel.kallsyms]  [k] do_execveat_common.isra.33
-    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_send_policy_notify
     pfkey_send_policy_notify
     pfkey_get
     v9fs_vfs_rename
     page_follow_link_light
     link_path_walk
     el0_svc_naked
    .......

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Acked-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/perf_event.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index d26d1d5..6471773 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -24,4 +24,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)	perf_misc_flags(regs)
 #endif
 
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+	(regs)->pc = (__ip); \
+	(regs)->regs[29] = (unsigned long) __builtin_frame_address(0); \
+	(regs)->sp = current_stack_pointer; \
+	(regs)->pstate = PSR_MODE_EL1h; \
+}
+
 #endif
-- 
1.8.5.2

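Editorial sketch, not part of the patch: a userspace approximation of the
macro above, for readers who want to try the pattern outside the kernel.
Everything named demo_* is hypothetical. current_stack_pointer does not exist
in userspace, so the address of a local variable stands in for sp, and the
macro body is wrapped in do/while(0), unlike the patch, so that it behaves as
a single statement. __builtin_frame_address is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the arm64 struct pt_regs. */
struct demo_pt_regs {
    uint64_t regs[31];
    uint64_t sp;
    uint64_t pc;
    uint64_t pstate;
};

/* AArch64 mode encoding for EL1 using SP_EL1 (EL1h). */
#define DEMO_PSR_MODE_EL1h 0x5ULL

/*
 * Fill the four fields the patch sets. The sp value is only an
 * approximation: the address of a local in the current frame.
 */
#define demo_fetch_caller_regs(r, ip) do {                           \
    volatile char demo_sp_probe_;                                    \
    (r)->pc = (uint64_t)(ip);                                        \
    (r)->regs[29] = (uint64_t)(uintptr_t)__builtin_frame_address(0); \
    (r)->sp = (uint64_t)(uintptr_t)&demo_sp_probe_;                  \
    (r)->pstate = DEMO_PSR_MODE_EL1h;                                \
} while (0)

/* Capture a register snapshot for a caller-supplied instruction pointer. */
static struct demo_pt_regs demo_capture(uint64_t ip)
{
    struct demo_pt_regs r = {0};

    demo_fetch_caller_regs(&r, ip);
    return r;
}
```

Calling demo_capture() with any address returns a snapshot whose pc is that
address, whose fp and sp slots are non-zero, and whose pstate marks EL1h.
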
* Re: [PATCH v5] arm64: perf: Fix callchain parse error with kernel tracepoint events
  2015-05-10 11:07 ` Hou Pengyang
@ 2015-05-19 16:52 ` Catalin Marinas
From: Catalin Marinas @ 2015-05-19 16:52 UTC
  To: Hou Pengyang
  Cc: will.deacon, wangnan0, a.p.zijlstra, linux-kernel, acme, mingo,
      paulus, linux-arm-kernel

On Sun, May 10, 2015 at 11:07:40AM +0000, Hou Pengyang wrote:
> For ARM64, when tracing with tracepoint events, the IP and pstate are set
> to 0, preventing the perf code parsing the callchain and resolving the
> symbols correctly.
>
>  ./perf record -e sched:sched_switch -g --call-graph dwarf ls
> [ perf record: Captured and wrote 0.146 MB perf.data ]
>  ./perf report -f
>  Samples: 194  of event 'sched:sched_switch', Event count (approx.): 194
>  Children      Self  Command  Shared Object     Symbol
>  100.00%    100.00%  ls       [unknown]         [.] 0000000000000000
>
> The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
> several necessary registers used for callchain unwinding, including pc,sp,
> fp and spsr .
>
> With this patch, callchain can be parsed correctly as follows:
>
>      ......
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] vfs_symlink
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] follow_down
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_get
> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] do_execveat_common.isra.33
> -    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_send_policy_notify
>      pfkey_send_policy_notify
>      pfkey_get
>      v9fs_vfs_rename
>      page_follow_link_light
>      link_path_walk
>      el0_svc_naked
>     .......
>
> Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
> Acked-by: Will Deacon <will.deacon@arm.com>

Queued for 4.2. Thanks.

-- 
Catalin

* Re: [PATCH v5] arm64: perf: Fix callchain parse error with kernel tracepoint events
  2015-05-19 16:52 ` Catalin Marinas
@ 2015-05-20  6:53 ` Jean Pihet
From: Jean Pihet @ 2015-05-20  6:53 UTC
  To: Catalin Marinas, Will Deacon
  Cc: Hou Pengyang, wangnan0, Peter Zijlstra, LKML, acme, mingo,
      Paul Mackerras, linux-arm

Hi Catalin, Will,

On Tue, May 19, 2015 at 6:52 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Sun, May 10, 2015 at 11:07:40AM +0000, Hou Pengyang wrote:
>> For ARM64, when tracing with tracepoint events, the IP and pstate are set
>> to 0, preventing the perf code parsing the callchain and resolving the
>> symbols correctly.
>>
>>  ./perf record -e sched:sched_switch -g --call-graph dwarf ls
>> [ perf record: Captured and wrote 0.146 MB perf.data ]
>>  ./perf report -f
>>  Samples: 194  of event 'sched:sched_switch', Event count (approx.): 194
>>  Children      Self  Command  Shared Object     Symbol
>>  100.00%    100.00%  ls       [unknown]         [.] 0000000000000000
>>
>> The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
>> several necessary registers used for callchain unwinding, including pc,sp,
>> fp and spsr .
>>
>> With this patch, callchain can be parsed correctly as follows:
>>
>>      ......
>> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] vfs_symlink
>> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] follow_down
>> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_get
>> +    2.63%     0.00%  ls       [kernel.kallsyms]  [k] do_execveat_common.isra.33
>> -    2.63%     0.00%  ls       [kernel.kallsyms]  [k] pfkey_send_policy_notify
>>      pfkey_send_policy_notify
>>      pfkey_get
>>      v9fs_vfs_rename
>>      page_follow_link_light
>>      link_path_walk
>>      el0_svc_naked
>>     .......
>>
>> Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
>> Acked-by: Will Deacon <will.deacon@arm.com>
>
> Queued for 4.2. Thanks.

Nice to see this one going out, finally.

Cheers,
Jean

>
> --
> Catalin
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Thread overview:
2015-05-08  5:43 [PATCH v4 0/2] arm & arm64: perf: Fix callchain parse error with kernel tracepoint events Hou Pengyang
2015-05-08  5:43 ` [PATCH v4 1/2] arm: " Hou Pengyang
2015-05-08  5:43 ` [PATCH v4 2/2] arm64: " Hou Pengyang
2015-05-08 15:37   ` Will Deacon
2015-05-10 11:07     ` [PATCH v5] " Hou Pengyang
2015-05-19 16:52       ` Catalin Marinas
2015-05-20  6:53         ` Jean Pihet