From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Clark Williams <williams@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Adrian Hunter <adrian.hunter@intel.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 48/53] perf thread-stack: Represent jmps to the start of a different symbol
Date: Wed, 6 Feb 2019 15:48:58 -0300
Message-ID: <20190206184903.24054-49-acme@kernel.org>
In-Reply-To: <20190206184903.24054-1-acme@kernel.org>
From: Adrian Hunter <adrian.hunter@intel.com>
The compiler might optimize a call/ret combination by making it a jmp.
However, the thread-stack does not presently cater for that, so such
control flow is not visible in the call graph. Make it visible by
recording on the stack a branch to the start of a different symbol.
Note that this means that when a ret pops the stack, all jmps must be
popped off first.
Example:
$ cat jmp-to-fn.c
__attribute__((noinline)) int bar(void)
{
	return -1;
}

__attribute__((noinline)) int foo(void)
{
	return bar() + 1;
}

int main()
{
	return foo();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o jmp-to-fn jmp-to-fn.c
$ objdump -d jmp-to-fn
<SNIP>
0000000000001040 <main>:
1040: 31 c0 xor %eax,%eax
1042: e9 09 01 00 00 jmpq 1150 <foo>
<SNIP>
0000000000001140 <bar>:
1140: b8 ff ff ff ff mov $0xffffffff,%eax
1145: c3 retq
<SNIP>
0000000000001150 <foo>:
1150: 31 c0 xor %eax,%eax
1152: e8 e9 ff ff ff callq 1140 <bar>
1157: 83 c0 01 add $0x1,%eax
115a: c3 retq
<SNIP>
$ perf record -o jmp-to-fn.perf.data -e intel_pt/cyc/u ./jmp-to-fn
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB jmp-to-fn.perf.data ]
$ perf script -i jmp-to-fn.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py jmp-to-fn.db branches calls
2019-01-08 13:24:58.783069 Creating database...
2019-01-08 13:24:58.794650 Writing records...
2019-01-08 13:24:59.008050 Adding indexes
2019-01-08 13:24:59.015802 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py jmp-to-fn.db
Before:
    main
      -> bar
After:
    main
      -> foo
        -> bar
Committer testing:
Install the python2-pyside package, then select these menu options
on the GUI:
"Reports"
"Context sensitive callgraphs"
Then keep expanding the symbols to get the full picture. Doing this on
fedora:29 with gcc version 8.2.1 20181215 (Red Hat 8.2.1-6) (GCC) gives:
    jmp-to-fn
      PID:TID
        _start (ld-2.28.so)
          __libc_start_main
            main
              foo
                bar
This verifies that the patch indeed fixes the problem.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
.../scripts/python/export-to-postgresql.py | 2 +-
tools/perf/scripts/python/export-to-sqlite.py | 2 +-
tools/perf/util/thread-stack.c | 30 +++++++++++++++++--
tools/perf/util/thread-stack.h | 3 ++
4 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/tools/perf/scripts/python/export-to-postgresql.py b/tools/perf/scripts/python/export-to-postgresql.py
index 0564dd7377f2..30130213da7e 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -478,7 +478,7 @@ if perf_db_export_calls:
'branch_count,'
'call_id,'
'return_id,'
- 'CASE WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' ELSE \'\' END AS flags,'
+ 'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE flags END AS flags,'
'parent_call_path_id'
' FROM calls INNER JOIN call_paths ON call_paths.id = call_path_id')
diff --git a/tools/perf/scripts/python/export-to-sqlite.py b/tools/perf/scripts/python/export-to-sqlite.py
index 245caf2643ed..ed237f2ed03f 100644
--- a/tools/perf/scripts/python/export-to-sqlite.py
+++ b/tools/perf/scripts/python/export-to-sqlite.py
@@ -320,7 +320,7 @@ if perf_db_export_calls:
'branch_count,'
'call_id,'
'return_id,'
- 'CASE WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' ELSE \'\' END AS flags,'
+ 'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE flags END AS flags,'
'parent_call_path_id'
' FROM calls INNER JOIN call_paths ON call_paths.id = call_path_id')
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 7f8eff018c16..f52c0f90915d 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -38,6 +38,7 @@
* @cp: call path
* @no_call: a 'call' was not seen
* @trace_end: a 'call' but trace ended
+ * @non_call: a branch but not a 'call' to the start of a different symbol
*/
struct thread_stack_entry {
u64 ret_addr;
@@ -47,6 +48,7 @@ struct thread_stack_entry {
struct call_path *cp;
bool no_call;
bool trace_end;
+ bool non_call;
};
/**
@@ -268,6 +270,8 @@ static int thread_stack__call_return(struct thread *thread,
cr.flags |= CALL_RETURN_NO_CALL;
if (no_return)
cr.flags |= CALL_RETURN_NO_RETURN;
+ if (tse->non_call)
+ cr.flags |= CALL_RETURN_NON_CALL;
return crp->process(&cr, crp->data);
}
@@ -510,6 +514,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
tse->cp = cp;
tse->no_call = no_call;
tse->trace_end = trace_end;
+ tse->non_call = false;
return 0;
}
@@ -531,14 +536,16 @@ static int thread_stack__pop_cp(struct thread *thread, struct thread_stack *ts,
timestamp, ref, false);
}
- if (ts->stack[ts->cnt - 1].ret_addr == ret_addr) {
+ if (ts->stack[ts->cnt - 1].ret_addr == ret_addr &&
+ !ts->stack[ts->cnt - 1].non_call) {
return thread_stack__call_return(thread, ts, --ts->cnt,
timestamp, ref, false);
} else {
size_t i = ts->cnt - 1;
while (i--) {
- if (ts->stack[i].ret_addr != ret_addr)
+ if (ts->stack[i].ret_addr != ret_addr ||
+ ts->stack[i].non_call)
continue;
i += 1;
while (ts->cnt > i) {
@@ -757,6 +764,25 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
err = thread_stack__trace_begin(thread, ts, sample->time, ref);
} else if (sample->flags & PERF_IP_FLAG_TRACE_END) {
err = thread_stack__trace_end(ts, sample, ref);
+ } else if (sample->flags & PERF_IP_FLAG_BRANCH &&
+ from_al->sym != to_al->sym && to_al->sym &&
+ to_al->addr == to_al->sym->start) {
+ struct call_path_root *cpr = ts->crp->cpr;
+ struct call_path *cp;
+
+ /*
+ * The compiler might optimize a call/ret combination by making
+ * it a jmp. Make that visible by recording on the stack a
+ * branch to the start of a different symbol. Note, that means
+ * when a ret pops the stack, all jmps must be popped off first.
+ */
+ cp = call_path__findnew(cpr, ts->stack[ts->cnt - 1].cp,
+ to_al->sym, sample->addr,
+ ts->kernel_start);
+ err = thread_stack__push_cp(ts, 0, sample->time, ref, cp, false,
+ false);
+ if (!err)
+ ts->stack[ts->cnt - 1].non_call = true;
}
return err;
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index 1f626f4a1c40..b7c04e19ad41 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -35,10 +35,13 @@ struct call_path;
*
* CALL_RETURN_NO_CALL: 'return' but no matching 'call'
* CALL_RETURN_NO_RETURN: 'call' but no matching 'return'
+ * CALL_RETURN_NON_CALL: a branch but not a 'call' to the start of a different
+ * symbol
*/
enum {
CALL_RETURN_NO_CALL = 1 << 0,
CALL_RETURN_NO_RETURN = 1 << 1,
+ CALL_RETURN_NON_CALL = 1 << 2,
};
/**
--
2.20.1