* [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Hi
Here is V2 of some Intel PT patches to improve the data displayed when using
address filters.
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. That happens when using address filters, for example:
$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]
Before:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: call 4015b9 main+0x29 => 0 [unknown]
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: call 4015c8 main+0x38 => 0 [unknown]
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: call 4015d7 main+0x47 => 0 [unknown]
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: call 4015e1 main+0x51 => 0 [unknown]
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: call 4015eb main+0x5b => 0 [unknown]
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: call 401612 main+0x82 => 0 [unknown]
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: call 401847 main+0x2b7 => 0 [unknown]
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: call 4019bf main+0x42f => 0 [unknown]
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: call 4019f5 main+0x465 => 0 [unknown]
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: call 401832 main+0x2a2 => 0 [unknown]
After:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0
Changes in V2:
Improve commit messages
Adrian Hunter (6):
perf script: Enhance sample flags for trace begin / end
perf db-export: Add trace begin / end branch type variants
perf tools: Improve thread_stack__event() for trace begin / end
perf tools: Improve thread_stack__process() for trace begin / end
perf intel-pt: Add decoder flags for trace begin / end
perf intel-pt: Implement decoder flags for trace begin / end
tools/perf/builtin-script.c | 36 +++++++++++----
tools/perf/util/db-export.c | 22 ++++++++++
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 34 ++++++++++-----
.../perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 +
tools/perf/util/intel-pt.c | 5 +++
tools/perf/util/thread-stack.c | 51 +++++++++++++++++-----
6 files changed, 118 insertions(+), 32 deletions(-)
Regards
Adrian
* [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Allow for different combinations of sample flags with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, prepare 'perf script' to display
sample flags with more combinations that include trace begin / end. In
those cases display 'tr strt' and 'tr end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
tools/perf/builtin-script.c | 36 +++++++++++++++++++++++++++---------
1 file changed, 27 insertions(+), 9 deletions(-)
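As a reading aid, the lookup order that the diff below introduces can be sketched in isolation. This is a minimal sketch, not the perf source: the EX_* constants, their bit values, the reduced name table and ex_describe_flags() are hypothetical stand-ins, and the column padding that perf_sample__fprintf_flags() applies is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative stand-ins for the PERF_IP_FLAG_* bits; the values are assumptions. */
enum {
	EX_FLAG_BRANCH      = 1 << 0,
	EX_FLAG_CALL        = 1 << 1,
	EX_FLAG_RETURN      = 1 << 2,
	EX_FLAG_TRACE_BEGIN = 1 << 8,
	EX_FLAG_TRACE_END   = 1 << 9,
	EX_FLAG_IN_TX       = 1 << 10,
};

/* Reduced, hypothetical name table; only what the example needs. */
static const struct {
	unsigned int flags;
	const char *name;
} ex_sample_flags[] = {
	{ EX_FLAG_BRANCH | EX_FLAG_CALL,        "call" },
	{ EX_FLAG_BRANCH | EX_FLAG_RETURN,      "return" },
	{ EX_FLAG_BRANCH | EX_FLAG_TRACE_BEGIN, "tr strt" },
	{ EX_FLAG_BRANCH | EX_FLAG_TRACE_END,   "tr end" },
	{ 0, NULL }
};

/* Exact-match lookup; the NULL name terminates the table. */
static const char *ex_flags_to_name(unsigned int flags)
{
	int i;

	for (i = 0; ex_sample_flags[i].name; i++) {
		if (ex_sample_flags[i].flags == flags)
			return ex_sample_flags[i].name;
	}
	return NULL;
}

/*
 * Format a description of 'flags' into buf. Returns the snprintf() length,
 * or -1 when no name matches (the patch prints flag characters in that case).
 */
static int ex_describe_flags(unsigned int flags, char *buf, size_t size)
{
	const char *name;

	assert(buf && size);
	flags &= ~(unsigned int)EX_FLAG_IN_TX;

	/* 1. Exact match, e.g. a plain "call" or a plain "tr strt" */
	name = ex_flags_to_name(flags);
	if (name)
		return snprintf(buf, size, "%s", name);

	/* 2. Retry without the trace begin bit and prefix the result */
	if (flags & EX_FLAG_TRACE_BEGIN) {
		name = ex_flags_to_name(flags & ~(unsigned int)EX_FLAG_TRACE_BEGIN);
		if (name)
			return snprintf(buf, size, "tr strt %s", name);
	}

	/* 3. Same again for the trace end bit */
	if (flags & EX_FLAG_TRACE_END) {
		name = ex_flags_to_name(flags & ~(unsigned int)EX_FLAG_TRACE_END);
		if (name)
			return snprintf(buf, size, "tr end %s", name);
	}

	return -1;
}
```

With this reduced table, a call that also carries the trace end bit formats as "tr end call", which is the combined form this patch prepares 'perf script' to print.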
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6176bae177c2..4982380ba96d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1255,6 +1255,18 @@ static struct {
{0, NULL}
};
+static const char *sample_flags_to_name(u32 flags)
+{
+ int i;
+
+ for (i = 0; sample_flags[i].name ; i++) {
+ if (sample_flags[i].flags == flags)
+ return sample_flags[i].name;
+ }
+
+ return NULL;
+}
+
static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
{
const char *chars = PERF_IP_FLAG_CHARS;
@@ -1264,11 +1276,20 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
char str[33];
int i, pos = 0;
- for (i = 0; sample_flags[i].name ; i++) {
- if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
- name = sample_flags[i].name;
- break;
- }
+ name = sample_flags_to_name(flags & ~PERF_IP_FLAG_IN_TX);
+ if (name)
+ return fprintf(fp, " %-15s%4s ", name, in_tx ? "(x)" : "");
+
+ if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_BEGIN));
+ if (name)
+ return fprintf(fp, " tr strt %-7s%4s ", name, in_tx ? "(x)" : "");
+ }
+
+ if (flags & PERF_IP_FLAG_TRACE_END) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_END));
+ if (name)
+ return fprintf(fp, " tr end %-7s%4s ", name, in_tx ? "(x)" : "");
}
for (i = 0; i < n; i++, flags >>= 1) {
@@ -1281,10 +1302,7 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
}
str[pos] = 0;
- if (name)
- return fprintf(fp, " %-7s%4s ", name, in_tx ? "(x)" : "");
-
- return fprintf(fp, " %-11s ", str);
+ return fprintf(fp, " %-19s ", str);
}
struct printer_data {
--
2.17.1
* [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Add branch types to cover different combinations with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, prepare the database export to
export branch types with more combinations that include trace begin / end.
In those cases extend the descriptions to include 'trace begin' and
'trace end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
tools/perf/util/db-export.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
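The naming scheme for the extra branch types can be tried out on its own. The following is only a sketch under assumptions: the EX_* bit values and both helper names are hypothetical, and the real code hands each name to db_export__branch_type() instead of returning it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-ins for the PERF_IP_FLAG_* bits; the values are assumptions. */
enum {
	EX_FLAG_BRANCH      = 1 << 0,
	EX_FLAG_CALL        = 1 << 1,
	EX_FLAG_TRACE_BEGIN = 1 << 8,
	EX_FLAG_TRACE_END   = 1 << 9,
};

/*
 * Mirrors the skip condition in the diff: no variants for the bare branch
 * type, nor for types that already contain a trace begin / end bit.
 */
static bool ex_wants_variants(unsigned int type)
{
	if (type == EX_FLAG_BRANCH)
		return false;
	if (type & (EX_FLAG_TRACE_BEGIN | EX_FLAG_TRACE_END))
		return false;
	return true;
}

/* Build the description for one variant; returns the snprintf() length. */
static int ex_variant_name(char *buf, size_t size, const char *name, bool is_end)
{
	assert(buf && name);
	if (is_end)
		return snprintf(buf, size, "%s / trace end", name);
	return snprintf(buf, size, "trace begin / %s", name);
}
```

For a base type named "call" this gives "trace begin / call" and "call / trace end", so the begin marker reads as coming before the branch and the end marker as coming after it.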
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 7123746edcf4..69fbb0a72d0c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -463,6 +463,28 @@ int db_export__branch_types(struct db_export *dbe)
if (err)
break;
}
+
+ /* Add trace begin / end variants */
+ for (i = 0; branch_types[i].name ; i++) {
+ const char *name = branch_types[i].name;
+ u32 type = branch_types[i].branch_type;
+ char buf[64];
+
+ if (type == PERF_IP_FLAG_BRANCH ||
+ (type & (PERF_IP_FLAG_TRACE_BEGIN | PERF_IP_FLAG_TRACE_END)))
+ continue;
+
+ snprintf(buf, sizeof(buf), "trace begin / %s", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_BEGIN, buf);
+ if (err)
+ break;
+
+ snprintf(buf, sizeof(buf), "%s / trace end", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_END, buf);
+ if (err)
+ break;
+ }
+
return err;
}
--
2.17.1
* [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
thread_stack__event() is used to create call stacks, by keeping track of
calls and returns. Improve the handling of trace begin / end to allow for a
trace that ends in a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, enhance the thread stack so that
it does not expect to see the 'return' for a 'call' that ends the trace.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
tools/perf/util/thread-stack.c | 35 +++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
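The stack handling described above can be exercised with a small standalone model. This is a sketch under assumptions, not the perf implementation: the ex_* names are hypothetical and the fixed-capacity array is a simplification made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_STACK_MAX 64	/* fixed capacity is a simplification of this sketch */

struct ex_entry {
	uint64_t ret_addr;
	bool trace_end;		/* pushed by a call that ended the trace */
};

struct ex_stack {
	struct ex_entry stack[EX_STACK_MAX];
	size_t cnt;
};

static int ex_push(struct ex_stack *ts, uint64_t ret_addr, bool trace_end)
{
	assert(ts);
	if (ts->cnt >= EX_STACK_MAX)
		return -1;
	ts->stack[ts->cnt].trace_end = trace_end;
	ts->stack[ts->cnt++].ret_addr = ret_addr;
	return 0;
}

/* Pop down to the entry whose return address matches, if there is one */
static void ex_pop(struct ex_stack *ts, uint64_t ret_addr)
{
	size_t i;

	for (i = ts->cnt; i; ) {
		if (ts->stack[--i].ret_addr == ret_addr) {
			ts->cnt = i;
			return;
		}
	}
}

/* Drop entries at the top of the stack that were pushed at trace end */
static void ex_pop_trace_end(struct ex_stack *ts)
{
	size_t i;

	for (i = ts->cnt; i; ) {
		if (ts->stack[--i].trace_end)
			ts->cnt = i;
		else
			return;
	}
}

/* Trace begin: first try the resume address, then discard trace-end calls */
static void ex_trace_begin(struct ex_stack *ts, uint64_t to_ip)
{
	ex_pop(ts, to_ip);
	ex_pop_trace_end(ts);
}
```

In the uname example, the call at main+0x29 would push return address 0x4015be with trace_end set. When tracing resumes at 0x4015be the entry is popped by address. If tracing resumed somewhere else, the entry would still be dropped because of its trace_end mark.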
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index dd17d6a38d3a..cea28b9074c1 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -36,6 +36,7 @@
* @branch_count: the branch count when the entry was created
* @cp: call path
* @no_call: a 'call' was not seen
+ * @trace_end: a 'call' but trace ended
*/
struct thread_stack_entry {
u64 ret_addr;
@@ -44,6 +45,7 @@ struct thread_stack_entry {
u64 branch_count;
struct call_path *cp;
bool no_call;
+ bool trace_end;
};
/**
@@ -112,7 +114,8 @@ static struct thread_stack *thread_stack__new(struct thread *thread,
return ts;
}
-static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
+static int thread_stack__push(struct thread_stack *ts, u64 ret_addr,
+ bool trace_end)
{
int err = 0;
@@ -124,6 +127,7 @@ static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
}
}
+ ts->stack[ts->cnt].trace_end = trace_end;
ts->stack[ts->cnt++].ret_addr = ret_addr;
return err;
@@ -150,6 +154,18 @@ static void thread_stack__pop(struct thread_stack *ts, u64 ret_addr)
}
}
+static void thread_stack__pop_trace_end(struct thread_stack *ts)
+{
+ size_t i;
+
+ for (i = ts->cnt; i; ) {
+ if (ts->stack[--i].trace_end)
+ ts->cnt = i;
+ else
+ return;
+ }
+}
+
static bool thread_stack__in_kernel(struct thread_stack *ts)
{
if (!ts->cnt)
@@ -254,10 +270,19 @@ int thread_stack__event(struct thread *thread, u32 flags, u64 from_ip,
ret_addr = from_ip + insn_len;
if (ret_addr == to_ip)
return 0; /* Zero-length calls are excluded */
- return thread_stack__push(thread->ts, ret_addr);
- } else if (flags & PERF_IP_FLAG_RETURN) {
- if (!from_ip)
- return 0;
+ return thread_stack__push(thread->ts, ret_addr,
+ flags & PERF_IP_FLAG_TRACE_END);
+ } else if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ /*
+ * If the caller did not change the trace number (which would
+ * have flushed the stack) then try to make sense of the stack.
+ * Possibly, tracing began after returning to the current
+ * address, so try to pop that. Also, do not expect a call made
+ * when the trace ended, to return, so pop that.
+ */
+ thread_stack__pop(thread->ts, to_ip);
+ thread_stack__pop_trace_end(thread->ts);
+ } else if ((flags & PERF_IP_FLAG_RETURN) && from_ip) {
thread_stack__pop(thread->ts, to_ip);
}
--
2.17.1
* [PATCH V2 4/6] perf tools: Improve thread_stack__process() for trace begin / end
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
thread_stack__process() is used to create call paths for database export.
Improve the handling of trace begin / end to allow for a trace that ends in
a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, enhance the thread stack so that
it identifies the trace end by the flag instead of by ip == 0.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
tools/perf/util/thread-stack.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
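Why the ip == 0 test stops being sufficient can be shown with a reduced model. This sketch looks at a single branch record rather than at thread stack entries and call paths. The struct, the EX_* bit values and both helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the PERF_IP_FLAG_* bits; the values are assumptions. */
enum {
	EX_FLAG_BRANCH    = 1 << 0,
	EX_FLAG_CALL      = 1 << 1,
	EX_FLAG_TRACE_END = 1 << 9,
};

/* Hypothetical, reduced branch record: source, target and flags */
struct ex_branch {
	uint64_t ip;
	uint64_t addr;
	unsigned int flags;
};

/* Old convention: a branch "to zero" marks the end of the trace */
static bool ex_is_trace_end_by_addr(const struct ex_branch *b)
{
	assert(b);
	return b->addr == 0;
}

/* New convention: the flag marks the end, so addr can be a real target */
static bool ex_is_trace_end_by_flag(const struct ex_branch *b)
{
	assert(b);
	return b->flags & EX_FLAG_TRACE_END;
}
```

Once the decoder reports the real call target (for example 0x4015b9 => 0x401ef0 with the trace end bit set), the address test no longer fires and only the flag test identifies the end of the trace.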
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index cea28b9074c1..45a97d15c6c8 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -357,7 +357,7 @@ void call_return_processor__free(struct call_return_processor *crp)
static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
u64 timestamp, u64 ref, struct call_path *cp,
- bool no_call)
+ bool no_call, bool trace_end)
{
struct thread_stack_entry *tse;
int err;
@@ -375,6 +375,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
tse->branch_count = ts->branch_count;
tse->cp = cp;
tse->no_call = no_call;
+ tse->trace_end = trace_end;
return 0;
}
@@ -448,7 +449,7 @@ static int thread_stack__bottom(struct thread *thread, struct thread_stack *ts,
return -ENOMEM;
return thread_stack__push_cp(thread->ts, ip, sample->time, ref, cp,
- true);
+ true, false);
}
static int thread_stack__no_call_return(struct thread *thread,
@@ -480,7 +481,7 @@ static int thread_stack__no_call_return(struct thread *thread,
if (!cp)
return -ENOMEM;
return thread_stack__push_cp(ts, 0, sample->time, ref,
- cp, true);
+ cp, true, false);
}
} else if (thread_stack__in_kernel(ts) && sample->ip < ks) {
/* Return to userspace, so pop all kernel addresses */
@@ -505,7 +506,7 @@ static int thread_stack__no_call_return(struct thread *thread,
return -ENOMEM;
err = thread_stack__push_cp(ts, sample->addr, sample->time, ref, cp,
- true);
+ true, false);
if (err)
return err;
@@ -525,7 +526,7 @@ static int thread_stack__trace_begin(struct thread *thread,
/* Pop trace end */
tse = &ts->stack[ts->cnt - 1];
- if (tse->cp->sym == NULL && tse->cp->ip == 0) {
+ if (tse->trace_end) {
err = thread_stack__call_return(thread, ts, --ts->cnt,
timestamp, ref, false);
if (err)
@@ -554,7 +555,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
ret_addr = sample->ip + sample->insn_len;
return thread_stack__push_cp(ts, ret_addr, sample->time, ref, cp,
- false);
+ false, true);
}
int thread_stack__process(struct thread *thread, struct comm *comm,
@@ -604,6 +605,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
ts->last_time = sample->time;
if (sample->flags & PERF_IP_FLAG_CALL) {
+ bool trace_end = sample->flags & PERF_IP_FLAG_TRACE_END;
struct call_path_root *cpr = ts->crp->cpr;
struct call_path *cp;
u64 ret_addr;
@@ -621,7 +623,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
if (!cp)
return -ENOMEM;
err = thread_stack__push_cp(ts, ret_addr, sample->time, ref,
- cp, false);
+ cp, false, trace_end);
} else if (sample->flags & PERF_IP_FLAG_RETURN) {
if (!sample->ip || !sample->addr)
return 0;
--
2.17.1
* [PATCH V2 5/6] perf intel-pt: Add decoder flags for trace begin / end
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. To prepare for remedying that, add Intel PT decoder flags for trace
begin / end and map them to the existing sample flags.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 ++
tools/perf/util/intel-pt.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 51c18d67f4ca..ed088d4726ba 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -37,6 +37,8 @@ enum intel_pt_sample_type {
INTEL_PT_EX_STOP = 1 << 6,
INTEL_PT_PWR_EXIT = 1 << 7,
INTEL_PT_CBR_CHG = 1 << 8,
+ INTEL_PT_TRACE_BEGIN = 1 << 9,
+ INTEL_PT_TRACE_END = 1 << 10,
};
enum intel_pt_period_type {
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..48c1d415c6b0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -908,6 +908,11 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
ptq->insn_len = ptq->state->insn_len;
memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
}
+
+ if (ptq->state->type & INTEL_PT_TRACE_BEGIN)
+ ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN;
+ if (ptq->state->type & INTEL_PT_TRACE_END)
+ ptq->flags |= PERF_IP_FLAG_TRACE_END;
}
static int intel_pt_setup_queue(struct intel_pt *pt,
--
2.17.1
* [PATCH V2 6/6] perf intel-pt: Implement decoder flags for trace begin / end
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Have the Intel PT decoder implement the new Intel PT decoder flags for
trace begin / end.
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. That happens when using address filters, for example:
$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]
Before:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: call 4015b9 main+0x29 => 0 [unknown]
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: call 4015c8 main+0x38 => 0 [unknown]
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: call 4015d7 main+0x47 => 0 [unknown]
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: call 4015e1 main+0x51 => 0 [unknown]
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: call 4015eb main+0x5b => 0 [unknown]
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: call 401612 main+0x82 => 0 [unknown]
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: call 401847 main+0x2b7 => 0 [unknown]
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: call 4019bf main+0x42f => 0 [unknown]
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: call 4019f5 main+0x465 => 0 [unknown]
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: call 401832 main+0x2a2 => 0 [unknown]
After:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
.../util/intel-pt-decoder/intel-pt-decoder.c | 34 +++++++++++++------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d404bed7003a..58f6a9ceb590 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1165,7 +1165,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pge = false;
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
- decoder->state.to_ip = 0;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
if (err == INTEL_PT_RETURN)
@@ -1179,9 +1179,13 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0)
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
+ decoder->state.to_ip = decoder->last_ip;
decoder->ip = decoder->last_ip;
+ }
+ decoder->state.type |= INTEL_PT_TRACE_END;
} else {
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
@@ -1208,7 +1212,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->ip = to_ip;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
+ decoder->state.to_ip = to_ip;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
@@ -1640,14 +1645,15 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
case INTEL_PT_TIP_PGD:
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0) {
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
intel_pt_set_ip(decoder);
- intel_pt_log("Omitting PGD ip " x64_fmt "\n",
- decoder->ip);
+ decoder->state.to_ip = decoder->ip;
}
decoder->pge = false;
decoder->continuous_period = false;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
case INTEL_PT_TIP_PGE:
@@ -1661,6 +1667,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
intel_pt_set_ip(decoder);
decoder->state.to_ip = decoder->ip;
}
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
case INTEL_PT_TIP:
@@ -1739,6 +1746,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
intel_pt_set_ip(decoder);
decoder->state.from_ip = 0;
decoder->state.to_ip = decoder->ip;
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
}
@@ -2077,9 +2085,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
if (intel_pt_have_ip(decoder))
intel_pt_set_ip(decoder);
- if (decoder->ip)
- return 0;
- break;
+ if (!decoder->ip)
+ break;
+ if (decoder->packet.type == INTEL_PT_TIP_PGE)
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+ if (decoder->packet.type == INTEL_PT_TIP_PGD)
+ decoder->state.type |= INTEL_PT_TRACE_END;
+ return 0;
case INTEL_PT_FUP:
if (intel_pt_have_ip(decoder))
--
2.17.1
* Re: [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
From: Arnaldo Carvalho de Melo @ 2018-09-20 13:41 UTC
To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
On Thu, Sep 20, 2018 at 04:00:42PM +0300, Adrian Hunter wrote:
> Here is V2 of some Intel PT patches to improve the data displayed when using
> address filters.
<SNIP>
> Changes in V2:
>
> Improve commit messages
Thanks a lot, helps a lot,
- Arnaldo
* Re: [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
From: Arnaldo Carvalho de Melo @ 2018-09-20 14:13 UTC
To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
On Thu, Sep 20, 2018 at 04:00:42PM +0300, Adrian Hunter wrote:
> Hi
>
> Here is V2 of some Intel PT patches to improve the data displayed when using
> address filters.
>
> Previously, the decoder would indicate begin / end by a branch from / to
> zero. That hides useful information, in particular when a trace ends with a
> call. That happens when using address filters, for example:
>
> $ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
> Linux
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.031 MB perf.data ]
Thanks, applied.
- Arnaldo
* [tip:perf/core] perf script: Enhance sample flags for trace begin / end
From: tip-bot for Adrian Hunter @ 2018-09-26 8:53 UTC
To: linux-tip-commits
Cc: acme, jolsa, linux-kernel, tglx, ak, adrian.hunter, hpa, mingo
Commit-ID: 62cb1b8868a70c932b15959a98594df537df2ffc
Gitweb: https://git.kernel.org/tip/62cb1b8868a70c932b15959a98594df537df2ffc
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:43 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 11:09:55 -0300
perf script: Enhance sample flags for trace begin / end
Allow for different combinations of sample flags with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, prepare 'perf script' to
display sample flags with more combinations that include trace begin /
end. In those cases display 'tr strt' and 'tr end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 36 +++++++++++++++++++++++++++---------
1 file changed, 27 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7732346bd9dd..4da5e32b9e03 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1262,6 +1262,18 @@ static struct {
{0, NULL}
};
+static const char *sample_flags_to_name(u32 flags)
+{
+ int i;
+
+ for (i = 0; sample_flags[i].name ; i++) {
+ if (sample_flags[i].flags == flags)
+ return sample_flags[i].name;
+ }
+
+ return NULL;
+}
+
static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
{
const char *chars = PERF_IP_FLAG_CHARS;
@@ -1271,11 +1283,20 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
char str[33];
int i, pos = 0;
- for (i = 0; sample_flags[i].name ; i++) {
- if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
- name = sample_flags[i].name;
- break;
- }
+ name = sample_flags_to_name(flags & ~PERF_IP_FLAG_IN_TX);
+ if (name)
+ return fprintf(fp, " %-15s%4s ", name, in_tx ? "(x)" : "");
+
+ if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_BEGIN));
+ if (name)
+ return fprintf(fp, " tr strt %-7s%4s ", name, in_tx ? "(x)" : "");
+ }
+
+ if (flags & PERF_IP_FLAG_TRACE_END) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_END));
+ if (name)
+ return fprintf(fp, " tr end %-7s%4s ", name, in_tx ? "(x)" : "");
}
for (i = 0; i < n; i++, flags >>= 1) {
@@ -1288,10 +1309,7 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
}
str[pos] = 0;
- if (name)
- return fprintf(fp, " %-7s%4s ", name, in_tx ? "(x)" : "");
-
- return fprintf(fp, " %-11s ", str);
+ return fprintf(fp, " %-19s ", str);
}
struct printer_data {
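For readers following the lookup order outside the perf tree, here is a minimal standalone sketch of the fallback this patch introduces: try an exact match first, then retry with the trace begin bit masked off, then with the trace end bit masked off. The flag values, the `flag_names` table and `describe_flags()` are made up for illustration and are not the perf definitions; only the order of the lookups mirrors the diff.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative flag values, NOT copied from the perf headers. */
#define FLAG_BRANCH      (1u << 0)
#define FLAG_CALL        (1u << 1)
#define FLAG_RETURN      (1u << 2)
#define FLAG_TRACE_BEGIN (1u << 3)
#define FLAG_TRACE_END   (1u << 4)
#define FLAG_IN_TX       (1u << 5)

/* Hypothetical name table, terminated by a NULL name. */
static const struct {
	uint32_t flags;
	const char *name;
} flag_names[] = {
	{ FLAG_BRANCH | FLAG_CALL,        "call" },
	{ FLAG_BRANCH | FLAG_RETURN,      "return" },
	{ FLAG_BRANCH | FLAG_TRACE_BEGIN, "tr strt" },
	{ FLAG_BRANCH | FLAG_TRACE_END,   "tr end" },
	{ 0, NULL }
};

/* Exact-match lookup, like sample_flags_to_name() in the diff. */
static const char *flags_to_name(uint32_t flags)
{
	size_t i;

	for (i = 0; flag_names[i].name; i++) {
		if (flag_names[i].flags == flags)
			return flag_names[i].name;
	}
	return NULL;
}

/* Writes a description of flags into buf; empty string if nothing matches. */
int describe_flags(uint32_t flags, char *buf, size_t len)
{
	const char *name;

	flags &= ~FLAG_IN_TX;	/* the in-transaction bit is reported separately */

	name = flags_to_name(flags);
	if (name)
		return snprintf(buf, len, "%s", name);

	if (flags & FLAG_TRACE_BEGIN) {
		name = flags_to_name(flags & ~FLAG_TRACE_BEGIN);
		if (name)
			return snprintf(buf, len, "tr strt %s", name);
	}

	if (flags & FLAG_TRACE_END) {
		name = flags_to_name(flags & ~FLAG_TRACE_END);
		if (name)
			return snprintf(buf, len, "tr end %s", name);
	}

	/* The real function falls back to one character per flag bit here. */
	return snprintf(buf, len, "%s", "");
}
```

With this table, a call that also carries the trace end bit is described as "tr end call", which is the shape of the 'After' output shown in patch 6/6.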
* [tip:perf/core] perf db-export: Add trace begin / end branch type variants
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
@ 2018-09-26 8:54 ` tip-bot for Adrian Hunter
From: tip-bot for Adrian Hunter @ 2018-09-26 8:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, acme, tglx, hpa, linux-kernel, adrian.hunter, mingo, ak
Commit-ID: ff645daf30cafb6fa74bee9a73733700bac2aff7
Gitweb: https://git.kernel.org/tip/ff645daf30cafb6fa74bee9a73733700bac2aff7
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:44 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 11:10:25 -0300
perf db-export: Add trace begin / end branch type variants
Add branch types to cover different combinations with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, prepare the database
export to export branch types with more combinations that include trace
begin / end. In those cases extend the descriptions to include 'trace
begin' and 'trace end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/db-export.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 7123746edcf4..69fbb0a72d0c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -463,6 +463,28 @@ int db_export__branch_types(struct db_export *dbe)
if (err)
break;
}
+
+ /* Add trace begin / end variants */
+ for (i = 0; branch_types[i].name ; i++) {
+ const char *name = branch_types[i].name;
+ u32 type = branch_types[i].branch_type;
+ char buf[64];
+
+ if (type == PERF_IP_FLAG_BRANCH ||
+ (type & (PERF_IP_FLAG_TRACE_BEGIN | PERF_IP_FLAG_TRACE_END)))
+ continue;
+
+ snprintf(buf, sizeof(buf), "trace begin / %s", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_BEGIN, buf);
+ if (err)
+ break;
+
+ snprintf(buf, sizeof(buf), "%s / trace end", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_END, buf);
+ if (err)
+ break;
+ }
+
return err;
}
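As a rough illustration of the loop added above, the following standalone sketch derives the two variants for each base branch type and skips plain branches and types that already carry a trace begin / end bit. The flag values, `struct branch_type`, `struct variant` and `build_variants()` are hypothetical stand-ins; the real code calls db_export__branch_type() for each variant instead of filling an array.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative flag values, NOT copied from the perf headers. */
#define FLAG_BRANCH      (1u << 0)
#define FLAG_CALL        (1u << 1)
#define FLAG_TRACE_BEGIN (1u << 2)
#define FLAG_TRACE_END   (1u << 3)

struct branch_type {
	uint32_t type;
	const char *name;	/* NULL terminates the table */
};

struct variant {
	uint32_t type;
	char name[64];
};

/* Fills out[] with begin/end variants; returns how many were written. */
size_t build_variants(const struct branch_type *types,
		      struct variant *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; types[i].name; i++) {
		uint32_t type = types[i].type;

		/* Same skip condition as the diff. */
		if (type == FLAG_BRANCH ||
		    (type & (FLAG_TRACE_BEGIN | FLAG_TRACE_END)))
			continue;

		if (n + 2 > max)
			break;

		out[n].type = type | FLAG_TRACE_BEGIN;
		snprintf(out[n].name, sizeof(out[n].name),
			 "trace begin / %s", types[i].name);
		n++;

		out[n].type = type | FLAG_TRACE_END;
		snprintf(out[n].name, sizeof(out[n].name),
			 "%s / trace end", types[i].name);
		n++;
	}
	return n;
}
```

Note the asymmetry in the descriptions: "trace begin" is a prefix and "trace end" is a suffix, matching the order in which the two events occur relative to the branch.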
* [tip:perf/core] perf tools: Improve thread_stack__event() for trace begin / end
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
@ 2018-09-26 8:54 ` tip-bot for Adrian Hunter
From: tip-bot for Adrian Hunter @ 2018-09-26 8:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, adrian.hunter, acme, tglx, ak, linux-kernel, hpa, mingo
Commit-ID: 4d60e5e36aa6f11b4d9eadc5d2b94128f24870c7
Gitweb: https://git.kernel.org/tip/4d60e5e36aa6f11b4d9eadc5d2b94128f24870c7
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:45 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:16:17 -0300
perf tools: Improve thread_stack__event() for trace begin / end
thread_stack__event() is used to create call stacks, by keeping track of
calls and returns. Improve the handling of trace begin / end to allow
for a trace that ends in a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, enhance the thread stack
so that it does not expect to see the 'return' for a 'call' that ends
the trace.
Committer notes:
Added this:
return thread_stack__push(thread->ts, ret_addr,
- flags && PERF_IP_FLAG_TRACE_END);
+ flags & PERF_IP_FLAG_TRACE_END);
To fix problem spotted by:
debian:9: clang version 3.8.1-24 (tags/RELEASE_381/final)
debian:experimental: clang version 6.0.1-6 (tags/RELEASE_601/final)
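To make the difference between the two operators concrete, here is a small standalone sketch. The flag values and helper names are illustrative only. The logical form is true for any non-zero flags word, so every call would have been pushed with trace_end set; the bitwise form tests the one bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values, NOT copied from the perf headers. */
#define FLAG_CALL      (1u << 1)
#define FLAG_TRACE_END (1u << 9)

/* Logical AND: true whenever both operands are non-zero. */
bool has_flag_logical(uint32_t flags, uint32_t mask)
{
	return flags && mask;
}

/* Bitwise AND: true only when the masked bit is actually set. */
bool has_flag_bitwise(uint32_t flags, uint32_t mask)
{
	return flags & mask;
}
```

For a plain call (only the call bit set), has_flag_logical() reports true while has_flag_bitwise() reports false, which is why the committer changed the operator.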
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/thread-stack.c | 35 ++++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index dd17d6a38d3a..e3f7dfecafa9 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -36,6 +36,7 @@
* @branch_count: the branch count when the entry was created
* @cp: call path
* @no_call: a 'call' was not seen
+ * @trace_end: a 'call' but trace ended
*/
struct thread_stack_entry {
u64 ret_addr;
@@ -44,6 +45,7 @@ struct thread_stack_entry {
u64 branch_count;
struct call_path *cp;
bool no_call;
+ bool trace_end;
};
/**
@@ -112,7 +114,8 @@ static struct thread_stack *thread_stack__new(struct thread *thread,
return ts;
}
-static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
+static int thread_stack__push(struct thread_stack *ts, u64 ret_addr,
+ bool trace_end)
{
int err = 0;
@@ -124,6 +127,7 @@ static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
}
}
+ ts->stack[ts->cnt].trace_end = trace_end;
ts->stack[ts->cnt++].ret_addr = ret_addr;
return err;
@@ -150,6 +154,18 @@ static void thread_stack__pop(struct thread_stack *ts, u64 ret_addr)
}
}
+static void thread_stack__pop_trace_end(struct thread_stack *ts)
+{
+ size_t i;
+
+ for (i = ts->cnt; i; ) {
+ if (ts->stack[--i].trace_end)
+ ts->cnt = i;
+ else
+ return;
+ }
+}
+
static bool thread_stack__in_kernel(struct thread_stack *ts)
{
if (!ts->cnt)
@@ -254,10 +270,19 @@ int thread_stack__event(struct thread *thread, u32 flags, u64 from_ip,
ret_addr = from_ip + insn_len;
if (ret_addr == to_ip)
return 0; /* Zero-length calls are excluded */
- return thread_stack__push(thread->ts, ret_addr);
- } else if (flags & PERF_IP_FLAG_RETURN) {
- if (!from_ip)
- return 0;
+ return thread_stack__push(thread->ts, ret_addr,
+ flags & PERF_IP_FLAG_TRACE_END);
+ } else if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ /*
+ * If the caller did not change the trace number (which would
+ * have flushed the stack) then try to make sense of the stack.
+ * Possibly, tracing began after returning to the current
+ * address, so try to pop that. Also, do not expect a call made
+ * when the trace ended, to return, so pop that.
+ */
+ thread_stack__pop(thread->ts, to_ip);
+ thread_stack__pop_trace_end(thread->ts);
+ } else if ((flags & PERF_IP_FLAG_RETURN) && from_ip) {
thread_stack__pop(thread->ts, to_ip);
}
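The effect of the new trace_end marker can be tried in isolation with the sketch below. It is a simplified model, assuming a fixed-size array instead of the dynamically grown stack in thread-stack.c; `struct stack`, `stack_push()` and `stack_pop_trace_end()` are hypothetical names, and only the pop loop follows the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_MAX 16	/* assumption: the real stack grows on demand */

struct entry {
	uint64_t ret_addr;
	bool trace_end;	/* pushed by a call that also ended the trace */
};

struct stack {
	struct entry e[STACK_MAX];
	size_t cnt;
};

int stack_push(struct stack *s, uint64_t ret_addr, bool trace_end)
{
	if (s->cnt == STACK_MAX)
		return -1;
	s->e[s->cnt].trace_end = trace_end;
	s->e[s->cnt++].ret_addr = ret_addr;
	return 0;
}

/*
 * Pop the run of trace_end entries at the top of the stack and stop at
 * the first entry that is not marked. Entries below it are left alone.
 */
void stack_pop_trace_end(struct stack *s)
{
	size_t i;

	for (i = s->cnt; i; ) {
		if (s->e[--i].trace_end)
			s->cnt = i;
		else
			return;
	}
}
```

Because the loop stops at the first unmarked entry, a trace_end entry buried under an ordinary call is not removed; only calls whose return was never going to be traced are dropped when tracing resumes.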
* [tip:perf/core] perf tools: Improve thread_stack__process() for trace begin / end
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
@ 2018-09-26 8:55 ` tip-bot for Adrian Hunter
From: tip-bot for Adrian Hunter @ 2018-09-26 8:55 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, hpa, jolsa, tglx, ak, linux-kernel, adrian.hunter, acme
Commit-ID: 2dcde4e152a3e319cc7e76c7c6b8548a3c72310d
Gitweb: https://git.kernel.org/tip/2dcde4e152a3e319cc7e76c7c6b8548a3c72310d
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:46 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:50 -0300
perf tools: Improve thread_stack__process() for trace begin / end
thread_stack__process() is used to create call paths for database
export. Improve the handling of trace begin / end to allow for a trace
that ends in a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, enhance the thread stack
so that it identifies the trace end by the flag instead of by ip == 0.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/thread-stack.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index e3f7dfecafa9..c091635bf7dc 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -357,7 +357,7 @@ void call_return_processor__free(struct call_return_processor *crp)
static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
u64 timestamp, u64 ref, struct call_path *cp,
- bool no_call)
+ bool no_call, bool trace_end)
{
struct thread_stack_entry *tse;
int err;
@@ -375,6 +375,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
tse->branch_count = ts->branch_count;
tse->cp = cp;
tse->no_call = no_call;
+ tse->trace_end = trace_end;
return 0;
}
@@ -448,7 +449,7 @@ static int thread_stack__bottom(struct thread *thread, struct thread_stack *ts,
return -ENOMEM;
return thread_stack__push_cp(thread->ts, ip, sample->time, ref, cp,
- true);
+ true, false);
}
static int thread_stack__no_call_return(struct thread *thread,
@@ -480,7 +481,7 @@ static int thread_stack__no_call_return(struct thread *thread,
if (!cp)
return -ENOMEM;
return thread_stack__push_cp(ts, 0, sample->time, ref,
- cp, true);
+ cp, true, false);
}
} else if (thread_stack__in_kernel(ts) && sample->ip < ks) {
/* Return to userspace, so pop all kernel addresses */
@@ -505,7 +506,7 @@ static int thread_stack__no_call_return(struct thread *thread,
return -ENOMEM;
err = thread_stack__push_cp(ts, sample->addr, sample->time, ref, cp,
- true);
+ true, false);
if (err)
return err;
@@ -525,7 +526,7 @@ static int thread_stack__trace_begin(struct thread *thread,
/* Pop trace end */
tse = &ts->stack[ts->cnt - 1];
- if (tse->cp->sym == NULL && tse->cp->ip == 0) {
+ if (tse->trace_end) {
err = thread_stack__call_return(thread, ts, --ts->cnt,
timestamp, ref, false);
if (err)
@@ -554,7 +555,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
ret_addr = sample->ip + sample->insn_len;
return thread_stack__push_cp(ts, ret_addr, sample->time, ref, cp,
- false);
+ false, true);
}
int thread_stack__process(struct thread *thread, struct comm *comm,
@@ -604,6 +605,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
ts->last_time = sample->time;
if (sample->flags & PERF_IP_FLAG_CALL) {
+ bool trace_end = sample->flags & PERF_IP_FLAG_TRACE_END;
struct call_path_root *cpr = ts->crp->cpr;
struct call_path *cp;
u64 ret_addr;
@@ -621,7 +623,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
if (!cp)
return -ENOMEM;
err = thread_stack__push_cp(ts, ret_addr, sample->time, ref,
- cp, false);
+ cp, false, trace_end);
} else if (sample->flags & PERF_IP_FLAG_RETURN) {
if (!sample->ip || !sample->addr)
return 0;
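A small sketch of why the check had to change: once the decoder reports the real call target, an entry created at trace end no longer has ip == 0 and no symbol, so the old test stops matching. The struct and both predicates below are hypothetical simplifications of the call path fields used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a stack entry plus its call path. */
struct call_entry {
	uint64_t target_ip;	/* address the call branched to */
	bool has_sym;		/* a symbol was resolved for target_ip */
	bool trace_end;		/* explicit marker added by this patch */
};

/* Old heuristic: a trace end looked like a call to address zero. */
bool is_trace_end_by_ip(const struct call_entry *e)
{
	return !e->has_sym && e->target_ip == 0;
}

/* New check: rely on the flag recorded when the entry was pushed. */
bool is_trace_end_by_flag(const struct call_entry *e)
{
	return e->trace_end;
}
```

For an entry such as the call from main+0x29 to set_program_name at 401ef0 in the 'After' output of patch 6/6, the address-based test returns false while the flag-based test still identifies the trace end.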
* [tip:perf/core] perf intel-pt: Add decoder flags for trace begin / end
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
@ 2018-09-26 8:56 ` tip-bot for Adrian Hunter
From: tip-bot for Adrian Hunter @ 2018-09-26 8:56 UTC (permalink / raw)
To: linux-tip-commits
Cc: acme, hpa, jolsa, tglx, ak, mingo, adrian.hunter, linux-kernel
Commit-ID: c6b5da093a8ba740b71dd0052f3846016986fd21
Gitweb: https://git.kernel.org/tip/c6b5da093a8ba740b71dd0052f3846016986fd21
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:47 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:51 -0300
perf intel-pt: Add decoder flags for trace begin / end
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. To prepare for remedying that, add Intel PT decoder flags
for trace begin / end and map them to the existing sample flags.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 ++
tools/perf/util/intel-pt.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 51c18d67f4ca..ed088d4726ba 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -37,6 +37,8 @@ enum intel_pt_sample_type {
INTEL_PT_EX_STOP = 1 << 6,
INTEL_PT_PWR_EXIT = 1 << 7,
INTEL_PT_CBR_CHG = 1 << 8,
+ INTEL_PT_TRACE_BEGIN = 1 << 9,
+ INTEL_PT_TRACE_END = 1 << 10,
};
enum intel_pt_period_type {
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..48c1d415c6b0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -908,6 +908,11 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
ptq->insn_len = ptq->state->insn_len;
memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
}
+
+ if (ptq->state->type & INTEL_PT_TRACE_BEGIN)
+ ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN;
+ if (ptq->state->type & INTEL_PT_TRACE_END)
+ ptq->flags |= PERF_IP_FLAG_TRACE_END;
}
static int intel_pt_setup_queue(struct intel_pt *pt,
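The mapping added to intel_pt_sample_flags() amounts to translating two decoder bits into two sample bits. A minimal sketch, assuming made-up sample flag values (the decoder bit positions 9 and 10 are taken from the enum in the diff; `map_trace_flags()` is a hypothetical helper):

```c
#include <stdint.h>

/* Decoder-side bits, positions as in the intel-pt-decoder.h hunk. */
#define PT_TRACE_BEGIN (1u << 9)
#define PT_TRACE_END   (1u << 10)

/* Sample-side bits: illustrative values, NOT the perf definitions. */
#define SAMPLE_FLAG_TRACE_BEGIN (1u << 0)
#define SAMPLE_FLAG_TRACE_END   (1u << 1)

/* Returns flags with the trace begin / end bits OR-ed in as needed. */
uint32_t map_trace_flags(uint32_t decoder_type, uint32_t flags)
{
	if (decoder_type & PT_TRACE_BEGIN)
		flags |= SAMPLE_FLAG_TRACE_BEGIN;
	if (decoder_type & PT_TRACE_END)
		flags |= SAMPLE_FLAG_TRACE_END;
	return flags;
}
```

The bits are OR-ed into whatever flags were already derived from the instruction, so a call that ends the trace keeps its call flag and gains the trace end flag.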
* [tip:perf/core] perf intel-pt: Implement decoder flags for trace begin / end
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
@ 2018-09-26 8:56 ` tip-bot for Adrian Hunter
From: tip-bot for Adrian Hunter @ 2018-09-26 8:56 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, ak, acme, hpa, mingo, adrian.hunter, linux-kernel, tglx
Commit-ID: bea6385789b8b5e1e3228a281978ca6c4a8c70a0
Gitweb: https://git.kernel.org/tip/bea6385789b8b5e1e3228a281978ca6c4a8c70a0
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:48 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:52 -0300
perf intel-pt: Implement decoder flags for trace begin / end
Have the Intel PT decoder implement the new Intel PT decoder flags for
trace begin / end.
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. That happens when using address filters, for example:
$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]
Before:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: call 4015b9 main+0x29 => 0 [unknown]
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: call 4015c8 main+0x38 => 0 [unknown]
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: call 4015d7 main+0x47 => 0 [unknown]
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: call 4015e1 main+0x51 => 0 [unknown]
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: call 4015eb main+0x5b => 0 [unknown]
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: call 401612 main+0x82 => 0 [unknown]
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: call 401847 main+0x2b7 => 0 [unknown]
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: call 4019bf main+0x42f => 0 [unknown]
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: call 4019f5 main+0x465 => 0 [unknown]
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: call 401832 main+0x2a2 => 0 [unknown]
After:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 34 +++++++++++++++-------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d404bed7003a..58f6a9ceb590 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1165,7 +1165,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pge = false;
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
- decoder->state.to_ip = 0;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
if (err == INTEL_PT_RETURN)
@@ -1179,9 +1179,13 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0)
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
+ decoder->state.to_ip = decoder->last_ip;
decoder->ip = decoder->last_ip;
+ }
+ decoder->state.type |= INTEL_PT_TRACE_END;
} else {
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
@@ -1208,7 +1212,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->ip = to_ip;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
+ decoder->state.to_ip = to_ip;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
@@ -1640,14 +1645,15 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
case INTEL_PT_TIP_PGD:
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0) {
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
intel_pt_set_ip(decoder);
- intel_pt_log("Omitting PGD ip " x64_fmt "\n",
- decoder->ip);
+ decoder->state.to_ip = decoder->ip;
}
decoder->pge = false;
decoder->continuous_period = false;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
case INTEL_PT_TIP_PGE:
@@ -1661,6 +1667,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
intel_pt_set_ip(decoder);
decoder->state.to_ip = decoder->ip;
}
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
case INTEL_PT_TIP:
@@ -1739,6 +1746,7 @@ next:
intel_pt_set_ip(decoder);
decoder->state.from_ip = 0;
decoder->state.to_ip = decoder->ip;
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
}
@@ -2077,9 +2085,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
if (intel_pt_have_ip(decoder))
intel_pt_set_ip(decoder);
- if (decoder->ip)
- return 0;
- break;
+ if (!decoder->ip)
+ break;
+ if (decoder->packet.type == INTEL_PT_TIP_PGE)
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+ if (decoder->packet.type == INTEL_PT_TIP_PGD)
+ decoder->state.type |= INTEL_PT_TRACE_END;
+ return 0;
case INTEL_PT_FUP:
if (intel_pt_have_ip(decoder))
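The recurring pattern in the hunks above is: keep the real branch target when the packet supplies one, fall back to zero only when it does not (the `packet.count == 0` case in the diff), and in both cases mark the state with the trace end bit. A minimal sketch of that pattern, where `struct pt_state` and `report_trace_end()` are hypothetical and the bit position follows patch 5/6:

```c
#include <stdbool.h>
#include <stdint.h>

#define PT_TRACE_END (1u << 10)	/* bit position as in patch 5/6 */

/* Simplified stand-in for the decoder's reported state. */
struct pt_state {
	uint64_t from_ip;
	uint64_t to_ip;
	uint32_t type;
};

/*
 * Report a branch that ends the trace. have_target models whether the
 * packet carried an IP; without one the destination stays unknown (0).
 */
void report_trace_end(struct pt_state *st, uint64_t cur_ip,
		      bool have_target, uint64_t target_ip)
{
	st->from_ip = cur_ip;
	st->to_ip = have_target ? target_ip : 0;
	st->type |= PT_TRACE_END;
}
```

Consumers therefore have to test the flag rather than to_ip == 0 to detect the end of a trace, which is what patches 3/6 and 4/6 prepare for.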
Thread overview: 15+ messages
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
2018-09-26 8:53 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
2018-09-26 8:54 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
2018-09-26 8:54 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
2018-09-26 8:55 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
2018-09-26 8:56 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
2018-09-26 8:56 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
2018-09-20 14:13 ` Arnaldo Carvalho de Melo