From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Clark Williams <williams@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Alexey Budankov <alexey.budankov@linux.intel.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Jin Yao <yao.jin@linux.intel.com>, Jiri Olsa <jolsa@redhat.com>,
Kan Liang <kan.liang@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 10/17] perf report: Prefer DWARF callstacks to LBR ones when captured both
Date: Tue, 20 Aug 2019 16:27:26 -0300 [thread overview]
Message-ID: <20190820192733.19180-11-acme@kernel.org> (raw)
In-Reply-To: <20190820192733.19180-1-acme@kernel.org>
From: Alexey Budankov <alexey.budankov@linux.intel.com>
Display DWARF based callchains when the perf.data file contains raw thread
stack data as LBR callstack data.
Commiter testing:
This changes the output from the branch stack based one, i.e. without
this patch, for the same file as in the previous csets:
# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles'
# Event count (approx.): 13
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ........................... ......................................... ..................
#
7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827
7.69% ls ld-2.29.so [k] _start [k] _dl_start -
7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790
7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278
7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581
7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228
7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55
7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67
7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112
7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334
7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383
7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45
7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116
#
# (Tip: For memory address profiling, try: perf mem record / perf mem report)
#
To the one that shows call chains:
# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 10 of event 'cycles'
# Event count (approx.): 3204047
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... .................. .........................................
#
55.01% 0.00% ls [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
|
---entry_SYSCALL_64_after_hwframe
do_syscall_64
|
--16.01%--__x64_sys_execve
__do_execve_file.isra.0
search_binary_handler
load_elf_binary
elf_map
vm_mmap_pgoff
do_mmap
mmap_region
perf_event_mmap
perf_iterate_sb
perf_iterate_ctx
perf_event_mmap_output
perf_output_copy
memcpy_erms
55.01% 39.00% ls [kernel.vmlinux] [k] do_syscall_64
|
|--39.00%--0xffffffffffffffff
| _dl_map_object
| open_verify.constprop.0
| __lseek64 (inlined)
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
|
--16.01%--do_syscall_64
__x64_sys_execve
__do_execve_file.isra.0
search_binary_handler
load_elf_binary
elf_map
vm_mmap_pgoff
do_mmap
mmap_region
perf_event_mmap
perf_iterate_sb
perf_iterate_ctx
perf_event_mmap_output
perf_output_copy
memcpy_erms
42.95% 42.95% ls libpthread-2.29.so [.] __pthread_initialize_minimal_internal
|
---_init
__pthread_initialize_minimal_internal
42.95% 0.00% ls libpthread-2.29.so [.] _init
|
---_init
__pthread_initialize_minimal_internal
<SNIP>
#
# (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
#
#
The branch stack view be explicitely selected using:
# perf report -h branch-stack
Usage: perf report [<options>]
-b, --branch-stack use branch records for per branch histogram filling
#
I.e. after this patch:
# perf report -b --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles'
# Event count (approx.): 13
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ........................... ......................................... ..................
#
7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827
7.69% ls ld-2.29.so [k] _start [k] _dl_start -
7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790
7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278
7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581
7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228
7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55
7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67
7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112
7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334
7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383
7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45
7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116
#
# (Tip: Show current config key-value pairs: perf config --list)
#
#
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-report.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 5e003d02821e..79dfb1139f94 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1281,6 +1281,8 @@ int cmd_report(int argc, const char **argv)
has_br_stack = perf_header__has_feat(&session->header,
HEADER_BRANCH_STACK);
+ if (perf_evlist__combined_sample_type(session->evlist) & PERF_SAMPLE_STACK_USER)
+ has_br_stack = false;
setup_forced_leader(&report, session->evlist);
--
2.21.0
next prev parent reply other threads:[~2019-08-20 19:27 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-20 19:27 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 01/17] tools headers: Add limits.h to access __WORDSIZE Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 02/17] perf tools: tools/include should come before tools/uapi/include Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 03/17] tools headers: Grab copy of linux/const.h, needed by linux/bits.h Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 04/17] tools headers: Synchronize linux/bits.h with the kernel sources Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 05/17] tools arch x86: Sync asm/cpufeatures.h with the with the kernel Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 06/17] perf evsel: Add comment for 'idx' member in 'struct perf_sample_id Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 07/17] tools lib traceevent: Fix "robust" test of do_generate_dynamic_list_file Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 08/17] perf record: Enable LBR callstack capture jointly with thread stack Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 09/17] perf report: Dump LBR callstack data by -D " Arnaldo Carvalho de Melo
2019-08-20 19:27 ` Arnaldo Carvalho de Melo [this message]
2019-08-20 19:27 ` [PATCH 11/17] perf cs-etm: Support sample flags 'insn' and 'insnlen' Arnaldo Carvalho de Melo
2019-08-20 19:27 ` Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 12/17] perf ui: Make 'exit_msg' optional in ui__question_window() Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 13/17] perf ui: Introduce non-interactive ui__info_window() function Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 14/17] perf ui browser: Allow specifying message to show when no samples are available to display Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 15/17] perf top: Show info message while collecting samples Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 16/17] tools headers: Fixup bitsperlong per arch includes Arnaldo Carvalho de Melo
2019-08-20 19:27 ` [PATCH 17/17] libperf: Fix arch include paths Arnaldo Carvalho de Melo
2019-08-20 19:39 ` [GIT PULL] perf/core improvements and fixes Ingo Molnar
2019-08-20 19:44 ` Arnaldo Carvalho de Melo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190820192733.19180-11-acme@kernel.org \
--to=acme@kernel.org \
--cc=acme@redhat.com \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=alexey.budankov@linux.intel.com \
--cc=jolsa@kernel.org \
--cc=jolsa@redhat.com \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=williams@redhat.com \
--cc=yao.jin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.