linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/3] perf script: Add callindent option
@ 2016-06-23 13:40 Adrian Hunter
  2016-06-23 13:40 ` [PATCH V2 1/3] perf script: Print sample flags more nicely Adrian Hunter
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Adrian Hunter @ 2016-06-23 13:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel, Andi Kleen

Hi

Andi Kleen sent a couple of patches to add a callindent option to
perf script. Andi was agreeable to this alternative implementation.

While there are some differences in the resulting output, the main
difference is:

 1. Tell the decoder to feed branches to the thread stack, which has the
 advantage that it happens before branch filtering and so can be used
 with different itrace options (e.g. it still works when only showing
 calls, even though the thread stack needs to see calls and returns). Also
 it does not conflict with using the thread stack to get callchains.

Changes in V2:

	Remove the prefix string
	Add a patch to display the sample flags more nicely


Adrian Hunter (3):
      perf script: Print sample flags more nicely
      perf auxtrace: Add option to feed branches to the thread stack
      perf script: Add callindent option

 tools/perf/Documentation/perf-script.txt |  11 +++-
 tools/perf/builtin-script.c              | 103 ++++++++++++++++++++++++++++++-
 tools/perf/util/auxtrace.h               |   2 +
 tools/perf/util/intel-bts.c              |  22 +++++--
 tools/perf/util/intel-pt.c               |   5 +-
 tools/perf/util/thread-stack.c           |   7 +++
 tools/perf/util/thread-stack.h           |   1 +
 7 files changed, 142 insertions(+), 9 deletions(-)


Regards
Adrian

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH V2 1/3] perf script: Print sample flags more nicely
  2016-06-23 13:40 [PATCH V2 0/3] perf script: Add callindent option Adrian Hunter
@ 2016-06-23 13:40 ` Adrian Hunter
  2016-06-26 10:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2016-06-23 13:40 ` [PATCH V2 2/3] perf auxtrace: Add option to feed branches to the thread stack Adrian Hunter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Adrian Hunter @ 2016-06-23 13:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel, Andi Kleen

The flags field is synthesized and may have a value when Instruction
Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
call, return, conditional, system, asynchronous, interrupt,
transaction abort, trace begin, trace end, and in transaction,
respectively.

Change the display so that known combinations of flags are printed more
nicely e.g.
"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
"tr end" for "bE".

However the "x" flag will be display separately in those cases
e.g. "jcc     (x)" for a condition branch within a transaction.

Example:

    perf record -e intel_pt//u ls
    perf script --ns -F comm,cpu,pid,tid,time,ip,addr,sym,dso,symoff,flags
    ...
    ls  3689/3689  [001]  2062.020965237:   jcc          7f06a958847a _dl_sysdep_start+0xfa (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9588450 _dl_sysdep_start+0xd0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965237:   jmp          7f06a9588461 _dl_sysdep_start+0xe1 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95885a0 _dl_sysdep_start+0x220 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965237:   jmp          7f06a95885a4 _dl_sysdep_start+0x224 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9588470 _dl_sysdep_start+0xf0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965904:   call         7f06a95884c3 _dl_sysdep_start+0x143 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9589140 brk+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965904:   syscall      7f06a958914a brk+0xa (/lib/x86_64-linux-gnu/ld-2.19.so) =>                0 [unknown] ([unknown])
    ls  3689/3689  [001]  2062.020966237:   tr strt                 0 [unknown] ([unknown]) =>     7f06a958914c brk+0xc (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   return       7f06a9589165 brk+0x25 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95884c8 _dl_sysdep_start+0x148 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   jcc          7f06a95884d7 _dl_sysdep_start+0x157 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   call         7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a958ac50 strlen+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   jcc          7f06a958ac6e strlen+0x1e (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a958ac60 strlen+0x10 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-script.txt |  7 ++++++-
 tools/perf/builtin-script.c              | 35 +++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 4f34379ebd77..a46030d8962d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -170,7 +170,12 @@ OPTIONS
 	Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
 	call, return, conditional, system, asynchronous, interrupt,
 	transaction abort, trace begin, trace end, and in transaction,
-	respectively.
+	respectively. Known combinations of flags are printed more nicely e.g.
+	"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
+	"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
+	"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
+	"tr end" for "bE". However the "x" flag will be display separately in those
+	cases e.g. "jcc     (x)" for a condition branch within a transaction.
 
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 46011235af5d..5a7633274d0a 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -606,13 +606,42 @@ static void print_sample_bts(struct perf_sample *sample,
 	printf("\n");
 }
 
+static struct {
+	u32 flags;
+	const char *name;
+} sample_flags[] = {
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL, "call"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN, "return"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CONDITIONAL, "jcc"},
+	{PERF_IP_FLAG_BRANCH, "jmp"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_INTERRUPT, "int"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | PERF_IP_FLAG_INTERRUPT, "iret"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_SYSCALLRET, "syscall"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | PERF_IP_FLAG_SYSCALLRET, "sysret"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_ASYNC, "async"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_ASYNC |	PERF_IP_FLAG_INTERRUPT, "hw int"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "tx abrt"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "tr strt"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"},
+	{0, NULL}
+};
+
 static void print_sample_flags(u32 flags)
 {
 	const char *chars = PERF_IP_FLAG_CHARS;
 	const int n = strlen(PERF_IP_FLAG_CHARS);
+	bool in_tx = flags & PERF_IP_FLAG_IN_TX;
+	const char *name = NULL;
 	char str[33];
 	int i, pos = 0;
 
+	for (i = 0; sample_flags[i].name ; i++) {
+		if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
+			name = sample_flags[i].name;
+			break;
+		}
+	}
+
 	for (i = 0; i < n; i++, flags >>= 1) {
 		if (flags & 1)
 			str[pos++] = chars[i];
@@ -622,7 +651,11 @@ static void print_sample_flags(u32 flags)
 			str[pos++] = '?';
 	}
 	str[pos] = 0;
-	printf("  %-4s ", str);
+
+	if (name)
+		printf("  %-7s%4s ", name, in_tx ? "(x)" : "");
+	else
+		printf("  %-11s ", str);
 }
 
 struct printer_data {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V2 2/3] perf auxtrace: Add option to feed branches to the thread stack
  2016-06-23 13:40 [PATCH V2 0/3] perf script: Add callindent option Adrian Hunter
  2016-06-23 13:40 ` [PATCH V2 1/3] perf script: Print sample flags more nicely Adrian Hunter
@ 2016-06-23 13:40 ` Adrian Hunter
  2016-06-26 10:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2016-06-23 13:40 ` [PATCH V2 3/3] perf script: Add callindent option Adrian Hunter
  2016-06-23 16:18 ` [PATCH V2 0/3] " Andi Kleen
  3 siblings, 1 reply; 11+ messages in thread
From: Adrian Hunter @ 2016-06-23 13:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel, Andi Kleen

In preparation for using the thread stack to print an indent representing
the stack depth in perf script, add an option to tell decoders to feed
branches to the thread stack. Add support for that option to Intel PT and
Intel BTS.

The advantage of using the decoder to feed the thread stack is that it
happens before branch filtering and so can be used with different itrace
options (e.g. it still works when only showing calls, even though the
thread stack needs to see calls and returns). Also it does not conflict
with using the thread stack to get callchains.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.h  |  2 ++
 tools/perf/util/intel-bts.c | 22 +++++++++++++++++-----
 tools/perf/util/intel-pt.c  |  5 ++++-
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 767989e0e312..ac5f0d7167e6 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -63,6 +63,7 @@ enum itrace_period_type {
  * @calls: limit branch samples to calls (can be combined with @returns)
  * @returns: limit branch samples to returns (can be combined with @calls)
  * @callchain: add callchain to 'instructions' events
+ * @thread_stack: feed branches to the thread_stack
  * @last_branch: add branch context to 'instruction' events
  * @callchain_sz: maximum callchain size
  * @last_branch_sz: branch context size
@@ -82,6 +83,7 @@ struct itrace_synth_opts {
 	bool			calls;
 	bool			returns;
 	bool			callchain;
+	bool			thread_stack;
 	bool			last_branch;
 	unsigned int		callchain_sz;
 	unsigned int		last_branch_sz;
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 9df996085563..6d16b5c059b9 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -422,7 +422,8 @@ static int intel_bts_get_branch_type(struct intel_bts_queue *btsq,
 }
 
 static int intel_bts_process_buffer(struct intel_bts_queue *btsq,
-				    struct auxtrace_buffer *buffer)
+				    struct auxtrace_buffer *buffer,
+				    struct thread *thread)
 {
 	struct branch *branch;
 	size_t sz, bsz = sizeof(struct branch);
@@ -444,6 +445,12 @@ static int intel_bts_process_buffer(struct intel_bts_queue *btsq,
 		if (!branch->from && !branch->to)
 			continue;
 		intel_bts_get_branch_type(btsq, branch);
+		if (btsq->bts->synth_opts.thread_stack)
+			thread_stack__event(thread, btsq->sample_flags,
+					    le64_to_cpu(branch->from),
+					    le64_to_cpu(branch->to),
+					    btsq->intel_pt_insn.length,
+					    buffer->buffer_nr + 1);
 		if (filter && !(filter & btsq->sample_flags))
 			continue;
 		err = intel_bts_synth_branch_sample(btsq, branch);
@@ -507,12 +514,13 @@ static int intel_bts_process_queue(struct intel_bts_queue *btsq, u64 *timestamp)
 		goto out_put;
 	}
 
-	if (!btsq->bts->synth_opts.callchain && thread &&
+	if (!btsq->bts->synth_opts.callchain &&
+	    !btsq->bts->synth_opts.thread_stack && thread &&
 	    (!old_buffer || btsq->bts->sampling_mode ||
 	     (btsq->bts->snapshot_mode && !buffer->consecutive)))
 		thread_stack__set_trace_nr(thread, buffer->buffer_nr + 1);
 
-	err = intel_bts_process_buffer(btsq, buffer);
+	err = intel_bts_process_buffer(btsq, buffer, thread);
 
 	auxtrace_buffer__drop_data(buffer);
 
@@ -905,10 +913,14 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
 	if (dump_trace)
 		return 0;
 
-	if (session->itrace_synth_opts && session->itrace_synth_opts->set)
+	if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
 		bts->synth_opts = *session->itrace_synth_opts;
-	else
+	} else {
 		itrace_synth_opts__set_default(&bts->synth_opts);
+		if (session->itrace_synth_opts)
+			bts->synth_opts.thread_stack =
+				session->itrace_synth_opts->thread_stack;
+	}
 
 	if (bts->synth_opts.calls)
 		bts->branches_filter |= PERF_IP_FLAG_CALL | PERF_IP_FLAG_ASYNC |
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 137196990012..b2f7e5474f3c 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1233,7 +1233,7 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 	if (!(state->type & INTEL_PT_BRANCH))
 		return 0;
 
-	if (pt->synth_opts.callchain)
+	if (pt->synth_opts.callchain || pt->synth_opts.thread_stack)
 		thread_stack__event(ptq->thread, ptq->flags, state->from_ip,
 				    state->to_ip, ptq->insn_len,
 				    state->trace_nr);
@@ -2136,6 +2136,9 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 			pt->synth_opts.branches = false;
 			pt->synth_opts.callchain = true;
 		}
+		if (session->itrace_synth_opts)
+			pt->synth_opts.thread_stack =
+				session->itrace_synth_opts->thread_stack;
 	}
 
 	if (pt->synth_opts.log)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V2 3/3] perf script: Add callindent option
  2016-06-23 13:40 [PATCH V2 0/3] perf script: Add callindent option Adrian Hunter
  2016-06-23 13:40 ` [PATCH V2 1/3] perf script: Print sample flags more nicely Adrian Hunter
  2016-06-23 13:40 ` [PATCH V2 2/3] perf auxtrace: Add option to feed branches to the thread stack Adrian Hunter
@ 2016-06-23 13:40 ` Adrian Hunter
  2016-06-23 14:09   ` Arnaldo Carvalho de Melo
  2016-06-26 11:00   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2016-06-23 16:18 ` [PATCH V2 0/3] " Andi Kleen
  3 siblings, 2 replies; 11+ messages in thread
From: Adrian Hunter @ 2016-06-23 13:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel, Andi Kleen

Based on patches from Andi Kleen.

When printing PT instruction traces with perf script it is rather useful to
see some indentation for the call tree. This patch adds a new callindent
field to perf script that prints spaces for the function call stack depth.

We already have code to track the function call stack for PT, that we can
reuse with minor modifications.

The resulting output is not quite as nice as ftrace yet, but a lot better
than what was there before.

Note there are some corner cases when the thread stack gets code confused
and prints incorrect indentation. Even with that it is fairly useful.

When displaying kernel code traces it is recommended to run as root, as
otherwise perf doesn't understand the kernel addresses properly, and may
not reset the call stack correctly on kernel boundaries.

Example output:

	sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1
	sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre | less
	...
         swapper     0 [000]  5830.389116586:   call        irq_exit                                                     ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit
         swapper     0 [000]  5830.389116586:   call            idle_cpu                                                 ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu
         swapper     0 [000]  5830.389116586:   return          idle_cpu                                                 ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit
         swapper     0 [000]  5830.389116586:   call            tick_nohz_irq_exit                                       ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit
         swapper     0 [000]  5830.389116919:   call                __tick_nohz_idle_enter                               ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter
         swapper     0 [000]  5830.389116919:   call                    ktime_get                                        ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get
         swapper     0 [000]  5830.389116919:   call                        read_tsc                                     ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc
         swapper     0 [000]  5830.389116919:   return                      read_tsc                                     ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get
         swapper     0 [000]  5830.389116919:   return                  ktime_get                                        ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter
         swapper     0 [000]  5830.389116919:   call                    sched_clock_idle_sleep_event                     ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event
         swapper     0 [000]  5830.389116919:   call                        sched_clock_cpu                              ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu
         swapper     0 [000]  5830.389116919:   call                            sched_clock                              ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock
         swapper     0 [000]  5830.389116919:   call                                native_sched_clock                   ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock
         swapper     0 [000]  5830.389116919:   return                              native_sched_clock                   ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock
         swapper     0 [000]  5830.389116919:   return                          sched_clock                              ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu
         swapper     0 [000]  5830.389116919:   return                      sched_clock_cpu                              ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 sched_clock_idle_sleep_event
         swapper     0 [000]  5830.389116919:   return                  sched_clock_idle_sleep_event                     ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter
	...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-script.txt |  4 ++
 tools/perf/builtin-script.c              | 68 +++++++++++++++++++++++++++++++-
 tools/perf/util/thread-stack.c           |  7 ++++
 tools/perf/util/thread-stack.h           |  1 +
 4 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index a46030d8962d..1f6c70594f0f 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -177,6 +177,10 @@ OPTIONS
 	"tr end" for "bE". However the "x" flag will be display separately in those
 	cases e.g. "jcc     (x)" for a condition branch within a transaction.
 
+	The callindent field is synthesized and may have a value when
+	Instruction Trace decoding. For calls and returns, it will display the
+	name of the symbol indented with spaces to reflect the stack depth.
+
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 5a7633274d0a..c5648fd86229 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -21,6 +21,7 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/stat.h"
+#include "util/thread-stack.h"
 #include <linux/bitmap.h>
 #include <linux/stringify.h>
 #include "asm/bug.h"
@@ -63,6 +64,7 @@ enum perf_output_field {
 	PERF_OUTPUT_DATA_SRC	    = 1U << 17,
 	PERF_OUTPUT_WEIGHT	    = 1U << 18,
 	PERF_OUTPUT_BPF_OUTPUT	    = 1U << 19,
+	PERF_OUTPUT_CALLINDENT	    = 1U << 20,
 };
 
 struct output_option {
@@ -89,6 +91,7 @@ struct output_option {
 	{.str = "data_src", .field = PERF_OUTPUT_DATA_SRC},
 	{.str = "weight",   .field = PERF_OUTPUT_WEIGHT},
 	{.str = "bpf-output",   .field = PERF_OUTPUT_BPF_OUTPUT},
+	{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
 };
 
 /* default set to maintain compatibility with current format */
@@ -562,6 +565,62 @@ static void print_sample_addr(struct perf_sample *sample,
 	}
 }
 
+static void print_sample_callindent(struct perf_sample *sample,
+				    struct perf_evsel *evsel,
+				    struct thread *thread,
+				    struct addr_location *al)
+{
+	struct perf_event_attr *attr = &evsel->attr;
+	size_t depth = thread_stack__depth(thread);
+	struct addr_location addr_al;
+	const char *name = NULL;
+	static int spacing;
+	int len = 0;
+	u64 ip = 0;
+
+	/*
+	 * The 'return' has already been popped off the stack so the depth has
+	 * to be adjusted to match the 'call'.
+	 */
+	if (thread->ts && sample->flags & PERF_IP_FLAG_RETURN)
+		depth += 1;
+
+	if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
+		if (sample_addr_correlates_sym(attr)) {
+			thread__resolve(thread, &addr_al, sample);
+			if (addr_al.sym)
+				name = addr_al.sym->name;
+			else
+				ip = sample->addr;
+		} else {
+			ip = sample->addr;
+		}
+	} else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
+		if (al->sym)
+			name = al->sym->name;
+		else
+			ip = sample->ip;
+	}
+
+	if (name)
+		len = printf("%*s%s", (int)depth * 4, "", name);
+	else if (ip)
+		len = printf("%*s%16" PRIx64, (int)depth * 4, "", ip);
+
+	if (len < 0)
+		return;
+
+	/*
+	 * Try to keep the output length from changing frequently so that the
+	 * output lines up more nicely.
+	 */
+	if (len > spacing || (len && len < spacing - 52))
+		spacing = round_up(len + 4, 32);
+
+	if (len < spacing)
+		printf("%*s", spacing - len, "");
+}
+
 static void print_sample_bts(struct perf_sample *sample,
 			     struct perf_evsel *evsel,
 			     struct thread *thread,
@@ -570,6 +629,9 @@ static void print_sample_bts(struct perf_sample *sample,
 	struct perf_event_attr *attr = &evsel->attr;
 	bool print_srcline_last = false;
 
+	if (PRINT_FIELD(CALLINDENT))
+		print_sample_callindent(sample, evsel, thread, al);
+
 	/* print branch_from information */
 	if (PRINT_FIELD(IP)) {
 		unsigned int print_opts = output[attr->type].print_ip_opts;
@@ -2053,7 +2115,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,iregs,brstack,brstacksym,flags", parse_output_fields),
+		     "addr,symoff,period,iregs,brstack,brstacksym,flags,"
+		     "callindent", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
@@ -2292,6 +2355,9 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 	script.session = session;
 	script__setup_sample_type(&script);
 
+	if (output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT)
+		itrace_synth_opts.thread_stack = true;
+
 	session->itrace_synth_opts = &itrace_synth_opts;
 
 	if (cpu_list) {
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 825086aa9a08..d3301529f6a7 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -616,3 +616,10 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 
 	return err;
 }
+
+size_t thread_stack__depth(struct thread *thread)
+{
+	if (!thread->ts)
+		return 0;
+	return thread->ts->cnt;
+}
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index ad44c7944b8e..b7e41c4ebfdd 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -87,6 +87,7 @@ void thread_stack__sample(struct thread *thread, struct ip_callchain *chain,
 			  size_t sz, u64 ip);
 int thread_stack__flush(struct thread *thread);
 void thread_stack__free(struct thread *thread);
+size_t thread_stack__depth(struct thread *thread);
 
 struct call_return_processor *
 call_return_processor__new(int (*process)(struct call_return *cr, void *data),
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH V2 3/3] perf script: Add callindent option
  2016-06-23 13:40 ` [PATCH V2 3/3] perf script: Add callindent option Adrian Hunter
@ 2016-06-23 14:09   ` Arnaldo Carvalho de Melo
  2016-06-23 16:20     ` Andi Kleen
  2016-06-26 11:00   ` [tip:perf/core] " tip-bot for Adrian Hunter
  1 sibling, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-06-23 14:09 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, linux-kernel, Andi Kleen

Em Thu, Jun 23, 2016 at 04:40:58PM +0300, Adrian Hunter escreveu:
> Based on patches from Andi Kleen.
> 
> When printing PT instruction traces with perf script it is rather useful to
> see some indentation for the call tree. This patch adds a new callindent
> field to perf script that prints spaces for the function call stack depth.
> 
> We already have code to track the function call stack for PT, that we can
> reuse with minor modifications.
> 
> The resulting output is not quite as nice as ftrace yet, but a lot better
> than what was there before.

This is shaping up really nicely, well done! Hope Andi can send his ack
for this series if he sees no problems :-)
 
> Note there are some corner cases when the thread stack gets code confused
> and prints incorrect indentation. Even with that it is fairly useful.
> 
> When displaying kernel code traces it is recommended to run as root, as
> otherwise perf doesn't understand the kernel addresses properly, and may
> not reset the call stack correctly on kernel boundaries.
> 
> Example output:

Here I think we should try to make the output more compact, so perhaps
we should have some way to ask 'perf script' to show just timestamp
deltas from the previous entry, for instance, i.e. replace that 'time'
with a new 'delta' -F entry.

- Arnaldo

> 	sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1
> 	sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre | less
> 	...
>          swapper     0 [000]  5830.389116586:   call        irq_exit                                                     ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit
>          swapper     0 [000]  5830.389116586:   call            idle_cpu                                                 ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu
>          swapper     0 [000]  5830.389116586:   return          idle_cpu                                                 ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit
>          swapper     0 [000]  5830.389116586:   call            tick_nohz_irq_exit                                       ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit
>          swapper     0 [000]  5830.389116919:   call                __tick_nohz_idle_enter                               ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter
>          swapper     0 [000]  5830.389116919:   call                    ktime_get                                        ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get
>          swapper     0 [000]  5830.389116919:   call                        read_tsc                                     ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc
>          swapper     0 [000]  5830.389116919:   return                      read_tsc                                     ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get
>          swapper     0 [000]  5830.389116919:   return                  ktime_get                                        ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter
>          swapper     0 [000]  5830.389116919:   call                    sched_clock_idle_sleep_event                     ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event
>          swapper     0 [000]  5830.389116919:   call                        sched_clock_cpu                              ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu
>          swapper     0 [000]  5830.389116919:   call                            sched_clock                              ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock
>          swapper     0 [000]  5830.389116919:   call                                native_sched_clock                   ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock
>          swapper     0 [000]  5830.389116919:   return                              native_sched_clock                   ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock
>          swapper     0 [000]  5830.389116919:   return                          sched_clock                              ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu
>          swapper     0 [000]  5830.389116919:   return                      sched_clock_cpu                              ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 sched_clock_idle_sleep_event
>          swapper     0 [000]  5830.389116919:   return                  sched_clock_idle_sleep_event                     ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter
> 	...
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/Documentation/perf-script.txt |  4 ++
>  tools/perf/builtin-script.c              | 68 +++++++++++++++++++++++++++++++-
>  tools/perf/util/thread-stack.c           |  7 ++++
>  tools/perf/util/thread-stack.h           |  1 +
>  4 files changed, 79 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
> index a46030d8962d..1f6c70594f0f 100644
> --- a/tools/perf/Documentation/perf-script.txt
> +++ b/tools/perf/Documentation/perf-script.txt
> @@ -177,6 +177,10 @@ OPTIONS
>  	"tr end" for "bE". However the "x" flag will be display separately in those
>  	cases e.g. "jcc     (x)" for a condition branch within a transaction.
>  
> +	The callindent field is synthesized and may have a value when
> +	Instruction Trace decoding. For calls and returns, it will display the
> +	name of the symbol indented with spaces to reflect the stack depth.
> +
>  	Finally, a user may not set fields to none for all event types.
>  	i.e., -F "" is not allowed.
>  
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 5a7633274d0a..c5648fd86229 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -21,6 +21,7 @@
>  #include "util/cpumap.h"
>  #include "util/thread_map.h"
>  #include "util/stat.h"
> +#include "util/thread-stack.h"
>  #include <linux/bitmap.h>
>  #include <linux/stringify.h>
>  #include "asm/bug.h"
> @@ -63,6 +64,7 @@ enum perf_output_field {
>  	PERF_OUTPUT_DATA_SRC	    = 1U << 17,
>  	PERF_OUTPUT_WEIGHT	    = 1U << 18,
>  	PERF_OUTPUT_BPF_OUTPUT	    = 1U << 19,
> +	PERF_OUTPUT_CALLINDENT	    = 1U << 20,
>  };
>  
>  struct output_option {
> @@ -89,6 +91,7 @@ struct output_option {
>  	{.str = "data_src", .field = PERF_OUTPUT_DATA_SRC},
>  	{.str = "weight",   .field = PERF_OUTPUT_WEIGHT},
>  	{.str = "bpf-output",   .field = PERF_OUTPUT_BPF_OUTPUT},
> +	{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
>  };
>  
>  /* default set to maintain compatibility with current format */
> @@ -562,6 +565,62 @@ static void print_sample_addr(struct perf_sample *sample,
>  	}
>  }
>  
> +static void print_sample_callindent(struct perf_sample *sample,
> +				    struct perf_evsel *evsel,
> +				    struct thread *thread,
> +				    struct addr_location *al)
> +{
> +	struct perf_event_attr *attr = &evsel->attr;
> +	size_t depth = thread_stack__depth(thread);
> +	struct addr_location addr_al;
> +	const char *name = NULL;
> +	static int spacing;
> +	int len = 0;
> +	u64 ip = 0;
> +
> +	/*
> +	 * The 'return' has already been popped off the stack so the depth has
> +	 * to be adjusted to match the 'call'.
> +	 */
> +	if (thread->ts && sample->flags & PERF_IP_FLAG_RETURN)
> +		depth += 1;
> +
> +	if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
> +		if (sample_addr_correlates_sym(attr)) {
> +			thread__resolve(thread, &addr_al, sample);
> +			if (addr_al.sym)
> +				name = addr_al.sym->name;
> +			else
> +				ip = sample->addr;
> +		} else {
> +			ip = sample->addr;
> +		}
> +	} else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
> +		if (al->sym)
> +			name = al->sym->name;
> +		else
> +			ip = sample->ip;
> +	}
> +
> +	if (name)
> +		len = printf("%*s%s", (int)depth * 4, "", name);
> +	else if (ip)
> +		len = printf("%*s%16" PRIx64, (int)depth * 4, "", ip);
> +
> +	if (len < 0)
> +		return;
> +
> +	/*
> +	 * Try to keep the output length from changing frequently so that the
> +	 * output lines up more nicely.
> +	 */
> +	if (len > spacing || (len && len < spacing - 52))
> +		spacing = round_up(len + 4, 32);
> +
> +	if (len < spacing)
> +		printf("%*s", spacing - len, "");
> +}
> +
>  static void print_sample_bts(struct perf_sample *sample,
>  			     struct perf_evsel *evsel,
>  			     struct thread *thread,
> @@ -570,6 +629,9 @@ static void print_sample_bts(struct perf_sample *sample,
>  	struct perf_event_attr *attr = &evsel->attr;
>  	bool print_srcline_last = false;
>  
> +	if (PRINT_FIELD(CALLINDENT))
> +		print_sample_callindent(sample, evsel, thread, al);
> +
>  	/* print branch_from information */
>  	if (PRINT_FIELD(IP)) {
>  		unsigned int print_opts = output[attr->type].print_ip_opts;
> @@ -2053,7 +2115,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
>  		     "comma separated output fields prepend with 'type:'. "
>  		     "Valid types: hw,sw,trace,raw. "
>  		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
> -		     "addr,symoff,period,iregs,brstack,brstacksym,flags", parse_output_fields),
> +		     "addr,symoff,period,iregs,brstack,brstacksym,flags,"
> +		     "callindent", parse_output_fields),
>  	OPT_BOOLEAN('a', "all-cpus", &system_wide,
>  		    "system-wide collection from all CPUs"),
>  	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
> @@ -2292,6 +2355,9 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
>  	script.session = session;
>  	script__setup_sample_type(&script);
>  
> +	if (output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT)
> +		itrace_synth_opts.thread_stack = true;
> +
>  	session->itrace_synth_opts = &itrace_synth_opts;
>  
>  	if (cpu_list) {
> diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
> index 825086aa9a08..d3301529f6a7 100644
> --- a/tools/perf/util/thread-stack.c
> +++ b/tools/perf/util/thread-stack.c
> @@ -616,3 +616,10 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
>  
>  	return err;
>  }
> +
> +size_t thread_stack__depth(struct thread *thread)
> +{
> +	if (!thread->ts)
> +		return 0;
> +	return thread->ts->cnt;
> +}
> diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
> index ad44c7944b8e..b7e41c4ebfdd 100644
> --- a/tools/perf/util/thread-stack.h
> +++ b/tools/perf/util/thread-stack.h
> @@ -87,6 +87,7 @@ void thread_stack__sample(struct thread *thread, struct ip_callchain *chain,
>  			  size_t sz, u64 ip);
>  int thread_stack__flush(struct thread *thread);
>  void thread_stack__free(struct thread *thread);
> +size_t thread_stack__depth(struct thread *thread);
>  
>  struct call_return_processor *
>  call_return_processor__new(int (*process)(struct call_return *cr, void *data),
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V2 0/3] perf script: Add callindent option
  2016-06-23 13:40 [PATCH V2 0/3] perf script: Add callindent option Adrian Hunter
                   ` (2 preceding siblings ...)
  2016-06-23 13:40 ` [PATCH V2 3/3] perf script: Add callindent option Adrian Hunter
@ 2016-06-23 16:18 ` Andi Kleen
  3 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2016-06-23 16:18 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On Thu, Jun 23, 2016 at 04:40:55PM +0300, Adrian Hunter wrote:
> Hi
> 
> Andi Kleen sent a couple of patches to add a callindent option to
> perf script. Andi was agreeable to this alternative implementation.
> 
> While there are some differences in the resulting output, the main
> difference is:
> 
>  1. Tell the decoder to feed branches to the thread stack, which has the
>  advantage that it happens before branch filtering and so can be used
>  with different itrace options (e.g. it still works when only showing
>  calls, even though the thread stack needs to see calls and returns). Also
>  it does not conflict with using the thread stack to get callchains.

Looks all good to me. 

Acked-by: Andi Kleen <ak@linux.intel.com>

Thanks,
-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V2 3/3] perf script: Add callindent option
  2016-06-23 14:09   ` Arnaldo Carvalho de Melo
@ 2016-06-23 16:20     ` Andi Kleen
  2016-06-23 16:46       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2016-06-23 16:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Adrian Hunter, Jiri Olsa, linux-kernel

> Here I think we should try to make the output more compact, so perhaps
> we should have some way to ask 'perf script' to show just timestamp
> deltas from the previous entry, for instance, i.e. replace that 'time'
> with a new 'delta' -F entry.

Agreed, also have a way to hide the numerical addresses, and some summary
options to make it easier to use.

However this can be all done as followups later, no need to hold up this basic
feature series for it.

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V2 3/3] perf script: Add callindent option
  2016-06-23 16:20     ` Andi Kleen
@ 2016-06-23 16:46       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-06-23 16:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Adrian Hunter, Jiri Olsa, linux-kernel

Em Thu, Jun 23, 2016 at 09:20:32AM -0700, Andi Kleen escreveu:
> > Here I think we should try to make the output more compact, so perhaps
> > we should have some way to ask 'perf script' to show just timestamp
> > deltas from the previous entry, for instance, i.e. replace that 'time'
> > with a new 'delta' -F entry.
> 
> Agreed, also have a way to hide the numerical addresses, and some summary
> options to make it easier to use.
> 
> However this can be all done as followups later, no need to hold up this basic
> feature series for it.

Yeah, I was just waiting for you to ack it, which you did, thanks, will
process it later today and ship it to Ingo, unless somebody hollers.

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tip:perf/core] perf script: Print sample flags more nicely
  2016-06-23 13:40 ` [PATCH V2 1/3] perf script: Print sample flags more nicely Adrian Hunter
@ 2016-06-26 10:59   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Adrian Hunter @ 2016-06-26 10:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, tglx, acme, hpa, ak, linux-kernel, mingo, jolsa

Commit-ID:  055cd33d93ac310ce68e8b1ebcf88f829426adf5
Gitweb:     http://git.kernel.org/tip/055cd33d93ac310ce68e8b1ebcf88f829426adf5
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 23 Jun 2016 16:40:56 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 23 Jun 2016 16:36:59 -0300

perf script: Print sample flags more nicely

The flags field is synthesized and may have a value when Instruction
Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
call, return, conditional, system, asynchronous, interrupt, transaction
abort, trace begin, trace end, and in transaction, respectively.

Change the display so that known combinations of flags are printed more
nicely e.g.: "call" for "bc", "return" for "br", "jcc" for "bo", "jmp"
for "b", "int" for "bci", "iret" for "bri", "syscall" for "bcs",
"sysret" for "brs", "async" for "by", "hw int" for "bcyi", "tx abrt" for
"bA", "tr strt" for "bB", "tr end" for "bE".

However the "x" flag will be displayed separately in those cases e.g.
"jcc (x)" for a condition branch within a transaction.

Example:

    perf record -e intel_pt//u ls
    perf script --ns -F comm,cpu,pid,tid,time,ip,addr,sym,dso,symoff,flags
    ...
    ls  3689/3689  [001]  2062.020965237:   jcc          7f06a958847a _dl_sysdep_start+0xfa (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9588450 _dl_sysdep_start+0xd0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965237:   jmp          7f06a9588461 _dl_sysdep_start+0xe1 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95885a0 _dl_sysdep_start+0x220 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965237:   jmp          7f06a95885a4 _dl_sysdep_start+0x224 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9588470 _dl_sysdep_start+0xf0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965904:   call         7f06a95884c3 _dl_sysdep_start+0x143 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9589140 brk+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965904:   syscall      7f06a958914a brk+0xa (/lib/x86_64-linux-gnu/ld-2.19.so) =>                0 [unknown] ([unknown])
    ls  3689/3689  [001]  2062.020966237:   tr strt                 0 [unknown] ([unknown]) =>     7f06a958914c brk+0xc (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   return       7f06a9589165 brk+0x25 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95884c8 _dl_sysdep_start+0x148 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   jcc          7f06a95884d7 _dl_sysdep_start+0x157 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   call         7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a958ac50 strlen+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   jcc          7f06a958ac6e strlen+0x1e (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a958ac60 strlen+0x10 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466689258-28493-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  7 ++++++-
 tools/perf/builtin-script.c              | 35 +++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 4f34379..a46030d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -170,7 +170,12 @@ OPTIONS
 	Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
 	call, return, conditional, system, asynchronous, interrupt,
 	transaction abort, trace begin, trace end, and in transaction,
-	respectively.
+	respectively. Known combinations of flags are printed more nicely e.g.
+	"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
+	"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
+	"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
+	"tr end" for "bE". However the "x" flag will be display separately in those
+	cases e.g. "jcc     (x)" for a condition branch within a transaction.
 
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 0e18e06..56b2fb8 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -606,13 +606,42 @@ static void print_sample_bts(struct perf_sample *sample,
 	printf("\n");
 }
 
+static struct {
+	u32 flags;
+	const char *name;
+} sample_flags[] = {
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL, "call"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN, "return"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CONDITIONAL, "jcc"},
+	{PERF_IP_FLAG_BRANCH, "jmp"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_INTERRUPT, "int"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | PERF_IP_FLAG_INTERRUPT, "iret"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_SYSCALLRET, "syscall"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | PERF_IP_FLAG_SYSCALLRET, "sysret"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_ASYNC, "async"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_ASYNC |	PERF_IP_FLAG_INTERRUPT, "hw int"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "tx abrt"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "tr strt"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"},
+	{0, NULL}
+};
+
 static void print_sample_flags(u32 flags)
 {
 	const char *chars = PERF_IP_FLAG_CHARS;
 	const int n = strlen(PERF_IP_FLAG_CHARS);
+	bool in_tx = flags & PERF_IP_FLAG_IN_TX;
+	const char *name = NULL;
 	char str[33];
 	int i, pos = 0;
 
+	for (i = 0; sample_flags[i].name ; i++) {
+		if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
+			name = sample_flags[i].name;
+			break;
+		}
+	}
+
 	for (i = 0; i < n; i++, flags >>= 1) {
 		if (flags & 1)
 			str[pos++] = chars[i];
@@ -622,7 +651,11 @@ static void print_sample_flags(u32 flags)
 			str[pos++] = '?';
 	}
 	str[pos] = 0;
-	printf("  %-4s ", str);
+
+	if (name)
+		printf("  %-7s%4s ", name, in_tx ? "(x)" : "");
+	else
+		printf("  %-11s ", str);
 }
 
 struct printer_data {

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:perf/core] perf auxtrace: Add option to feed branches to the thread stack
  2016-06-23 13:40 ` [PATCH V2 2/3] perf auxtrace: Add option to feed branches to the thread stack Adrian Hunter
@ 2016-06-26 10:59   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Adrian Hunter @ 2016-06-26 10:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, mingo, adrian.hunter, tglx, ak, hpa, linux-kernel, acme

Commit-ID:  50f736372d5ea0ce97ede698f957c9b141aa569e
Gitweb:     http://git.kernel.org/tip/50f736372d5ea0ce97ede698f957c9b141aa569e
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 23 Jun 2016 16:40:57 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 23 Jun 2016 17:02:59 -0300

perf auxtrace: Add option to feed branches to the thread stack

In preparation for using the thread stack to print an indent
representing the stack depth in perf script, add an option to tell
decoders to feed branches to the thread stack. Add support for that
option to Intel PT and Intel BTS.

The advantage of using the decoder to feed the thread stack is that it
happens before branch filtering and so can be used with different itrace
options (e.g. it still works when only showing calls, even though the
thread stack needs to see calls and returns). Also it does not conflict
with using the thread stack to get callchains.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466689258-28493-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.h  |  2 ++
 tools/perf/util/intel-bts.c | 22 +++++++++++++++++-----
 tools/perf/util/intel-pt.c  |  5 ++++-
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 767989e..ac5f0d7 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -63,6 +63,7 @@ enum itrace_period_type {
  * @calls: limit branch samples to calls (can be combined with @returns)
  * @returns: limit branch samples to returns (can be combined with @calls)
  * @callchain: add callchain to 'instructions' events
+ * @thread_stack: feed branches to the thread_stack
  * @last_branch: add branch context to 'instruction' events
  * @callchain_sz: maximum callchain size
  * @last_branch_sz: branch context size
@@ -82,6 +83,7 @@ struct itrace_synth_opts {
 	bool			calls;
 	bool			returns;
 	bool			callchain;
+	bool			thread_stack;
 	bool			last_branch;
 	unsigned int		callchain_sz;
 	unsigned int		last_branch_sz;
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index ecec73f..749e6f2 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -422,7 +422,8 @@ static int intel_bts_get_branch_type(struct intel_bts_queue *btsq,
 }
 
 static int intel_bts_process_buffer(struct intel_bts_queue *btsq,
-				    struct auxtrace_buffer *buffer)
+				    struct auxtrace_buffer *buffer,
+				    struct thread *thread)
 {
 	struct branch *branch;
 	size_t sz, bsz = sizeof(struct branch);
@@ -444,6 +445,12 @@ static int intel_bts_process_buffer(struct intel_bts_queue *btsq,
 		if (!branch->from && !branch->to)
 			continue;
 		intel_bts_get_branch_type(btsq, branch);
+		if (btsq->bts->synth_opts.thread_stack)
+			thread_stack__event(thread, btsq->sample_flags,
+					    le64_to_cpu(branch->from),
+					    le64_to_cpu(branch->to),
+					    btsq->intel_pt_insn.length,
+					    buffer->buffer_nr + 1);
 		if (filter && !(filter & btsq->sample_flags))
 			continue;
 		err = intel_bts_synth_branch_sample(btsq, branch);
@@ -507,12 +514,13 @@ static int intel_bts_process_queue(struct intel_bts_queue *btsq, u64 *timestamp)
 		goto out_put;
 	}
 
-	if (!btsq->bts->synth_opts.callchain && thread &&
+	if (!btsq->bts->synth_opts.callchain &&
+	    !btsq->bts->synth_opts.thread_stack && thread &&
 	    (!old_buffer || btsq->bts->sampling_mode ||
 	     (btsq->bts->snapshot_mode && !buffer->consecutive)))
 		thread_stack__set_trace_nr(thread, buffer->buffer_nr + 1);
 
-	err = intel_bts_process_buffer(btsq, buffer);
+	err = intel_bts_process_buffer(btsq, buffer, thread);
 
 	auxtrace_buffer__drop_data(buffer);
 
@@ -905,10 +913,14 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
 	if (dump_trace)
 		return 0;
 
-	if (session->itrace_synth_opts && session->itrace_synth_opts->set)
+	if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
 		bts->synth_opts = *session->itrace_synth_opts;
-	else
+	} else {
 		itrace_synth_opts__set_default(&bts->synth_opts);
+		if (session->itrace_synth_opts)
+			bts->synth_opts.thread_stack =
+				session->itrace_synth_opts->thread_stack;
+	}
 
 	if (bts->synth_opts.calls)
 		bts->branches_filter |= PERF_IP_FLAG_CALL | PERF_IP_FLAG_ASYNC |
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index dc243b1..551ff6f 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1234,7 +1234,7 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 	if (!(state->type & INTEL_PT_BRANCH))
 		return 0;
 
-	if (pt->synth_opts.callchain)
+	if (pt->synth_opts.callchain || pt->synth_opts.thread_stack)
 		thread_stack__event(ptq->thread, ptq->flags, state->from_ip,
 				    state->to_ip, ptq->insn_len,
 				    state->trace_nr);
@@ -2137,6 +2137,9 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 			pt->synth_opts.branches = false;
 			pt->synth_opts.callchain = true;
 		}
+		if (session->itrace_synth_opts)
+			pt->synth_opts.thread_stack =
+				session->itrace_synth_opts->thread_stack;
 	}
 
 	if (pt->synth_opts.log)

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:perf/core] perf script: Add callindent option
  2016-06-23 13:40 ` [PATCH V2 3/3] perf script: Add callindent option Adrian Hunter
  2016-06-23 14:09   ` Arnaldo Carvalho de Melo
@ 2016-06-26 11:00   ` tip-bot for Adrian Hunter
  1 sibling, 0 replies; 11+ messages in thread
From: tip-bot for Adrian Hunter @ 2016-06-26 11:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, linux-kernel, adrian.hunter, ak, jolsa, acme, mingo, hpa

Commit-ID:  e216708d982a1c262f411fee2fcac2bd9ec93a32
Gitweb:     http://git.kernel.org/tip/e216708d982a1c262f411fee2fcac2bd9ec93a32
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 23 Jun 2016 16:40:58 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 23 Jun 2016 17:04:26 -0300

perf script: Add callindent option

Based on patches from Andi Kleen.

When printing PT instruction traces with perf script it is rather useful
to see some indentation for the call tree. This patch adds a new
callindent field to perf script that prints spaces for the function call
stack depth.

We already have code to track the function call stack for PT, that we
can reuse with minor modifications.

The resulting output is not quite as nice as ftrace yet, but a lot
better than what was there before.

Note there are some corner cases when the thread stack gets code
confused and prints incorrect indentation. Even with that it is fairly
useful.

When displaying kernel code traces it is recommended to run as root, as
otherwise perf doesn't understand the kernel addresses properly, and may
not reset the call stack correctly on kernel boundaries.

Example output:

	sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1
	sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre | less
	...
         swapper     0 [000]  5830.389116586:   call        irq_exit                                                     ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit
         swapper     0 [000]  5830.389116586:   call            idle_cpu                                                 ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu
         swapper     0 [000]  5830.389116586:   return          idle_cpu                                                 ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit
         swapper     0 [000]  5830.389116586:   call            tick_nohz_irq_exit                                       ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit
         swapper     0 [000]  5830.389116919:   call                __tick_nohz_idle_enter                               ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter
         swapper     0 [000]  5830.389116919:   call                    ktime_get                                        ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get
         swapper     0 [000]  5830.389116919:   call                        read_tsc                                     ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc
         swapper     0 [000]  5830.389116919:   return                      read_tsc                                     ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get
         swapper     0 [000]  5830.389116919:   return                  ktime_get                                        ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter
         swapper     0 [000]  5830.389116919:   call                    sched_clock_idle_sleep_event                     ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event
         swapper     0 [000]  5830.389116919:   call                        sched_clock_cpu                              ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu
         swapper     0 [000]  5830.389116919:   call                            sched_clock                              ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock
         swapper     0 [000]  5830.389116919:   call                                native_sched_clock                   ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock
         swapper     0 [000]  5830.389116919:   return                              native_sched_clock                   ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock
         swapper     0 [000]  5830.389116919:   return                          sched_clock                              ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu
         swapper     0 [000]  5830.389116919:   return                      sched_clock_cpu                              ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 sched_clock_idle_sleep_event
         swapper     0 [000]  5830.389116919:   return                  sched_clock_idle_sleep_event                     ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter
	...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466689258-28493-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  4 ++
 tools/perf/builtin-script.c              | 68 +++++++++++++++++++++++++++++++-
 tools/perf/util/thread-stack.c           |  7 ++++
 tools/perf/util/thread-stack.h           |  1 +
 4 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index a46030d..1f6c705 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -177,6 +177,10 @@ OPTIONS
 	"tr end" for "bE". However the "x" flag will be display separately in those
 	cases e.g. "jcc     (x)" for a condition branch within a transaction.
 
+	The callindent field is synthesized and may have a value when
+	Instruction Trace decoding. For calls and returns, it will display the
+	name of the symbol indented with spaces to reflect the stack depth.
+
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 56b2fb8..971ff91 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -21,6 +21,7 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/stat.h"
+#include "util/thread-stack.h"
 #include <linux/bitmap.h>
 #include <linux/stringify.h>
 #include "asm/bug.h"
@@ -63,6 +64,7 @@ enum perf_output_field {
 	PERF_OUTPUT_DATA_SRC	    = 1U << 17,
 	PERF_OUTPUT_WEIGHT	    = 1U << 18,
 	PERF_OUTPUT_BPF_OUTPUT	    = 1U << 19,
+	PERF_OUTPUT_CALLINDENT	    = 1U << 20,
 };
 
 struct output_option {
@@ -89,6 +91,7 @@ struct output_option {
 	{.str = "data_src", .field = PERF_OUTPUT_DATA_SRC},
 	{.str = "weight",   .field = PERF_OUTPUT_WEIGHT},
 	{.str = "bpf-output",   .field = PERF_OUTPUT_BPF_OUTPUT},
+	{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
 };
 
 /* default set to maintain compatibility with current format */
@@ -562,6 +565,62 @@ static void print_sample_addr(struct perf_sample *sample,
 	}
 }
 
+static void print_sample_callindent(struct perf_sample *sample,
+				    struct perf_evsel *evsel,
+				    struct thread *thread,
+				    struct addr_location *al)
+{
+	struct perf_event_attr *attr = &evsel->attr;
+	size_t depth = thread_stack__depth(thread);
+	struct addr_location addr_al;
+	const char *name = NULL;
+	static int spacing;
+	int len = 0;
+	u64 ip = 0;
+
+	/*
+	 * The 'return' has already been popped off the stack so the depth has
+	 * to be adjusted to match the 'call'.
+	 */
+	if (thread->ts && sample->flags & PERF_IP_FLAG_RETURN)
+		depth += 1;
+
+	if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
+		if (sample_addr_correlates_sym(attr)) {
+			thread__resolve(thread, &addr_al, sample);
+			if (addr_al.sym)
+				name = addr_al.sym->name;
+			else
+				ip = sample->addr;
+		} else {
+			ip = sample->addr;
+		}
+	} else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
+		if (al->sym)
+			name = al->sym->name;
+		else
+			ip = sample->ip;
+	}
+
+	if (name)
+		len = printf("%*s%s", (int)depth * 4, "", name);
+	else if (ip)
+		len = printf("%*s%16" PRIx64, (int)depth * 4, "", ip);
+
+	if (len < 0)
+		return;
+
+	/*
+	 * Try to keep the output length from changing frequently so that the
+	 * output lines up more nicely.
+	 */
+	if (len > spacing || (len && len < spacing - 52))
+		spacing = round_up(len + 4, 32);
+
+	if (len < spacing)
+		printf("%*s", spacing - len, "");
+}
+
 static void print_sample_bts(struct perf_sample *sample,
 			     struct perf_evsel *evsel,
 			     struct thread *thread,
@@ -570,6 +629,9 @@ static void print_sample_bts(struct perf_sample *sample,
 	struct perf_event_attr *attr = &evsel->attr;
 	bool print_srcline_last = false;
 
+	if (PRINT_FIELD(CALLINDENT))
+		print_sample_callindent(sample, evsel, thread, al);
+
 	/* print branch_from information */
 	if (PRINT_FIELD(IP)) {
 		unsigned int print_opts = output[attr->type].print_ip_opts;
@@ -2053,7 +2115,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,iregs,brstack,brstacksym,flags", parse_output_fields),
+		     "addr,symoff,period,iregs,brstack,brstacksym,flags,"
+		     "callindent", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
@@ -2292,6 +2355,9 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 	script.session = session;
 	script__setup_sample_type(&script);
 
+	if (output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT)
+		itrace_synth_opts.thread_stack = true;
+
 	session->itrace_synth_opts = &itrace_synth_opts;
 
 	if (cpu_list) {
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 825086a..d330152 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -616,3 +616,10 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 
 	return err;
 }
+
+size_t thread_stack__depth(struct thread *thread)
+{
+	if (!thread->ts)
+		return 0;
+	return thread->ts->cnt;
+}
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index ad44c79..b7e41c4 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -87,6 +87,7 @@ void thread_stack__sample(struct thread *thread, struct ip_callchain *chain,
 			  size_t sz, u64 ip);
 int thread_stack__flush(struct thread *thread);
 void thread_stack__free(struct thread *thread);
+size_t thread_stack__depth(struct thread *thread);
 
 struct call_return_processor *
 call_return_processor__new(int (*process)(struct call_return *cr, void *data),

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-06-26 11:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-23 13:40 [PATCH V2 0/3] perf script: Add callindent option Adrian Hunter
2016-06-23 13:40 ` [PATCH V2 1/3] perf script: Print sample flags more nicely Adrian Hunter
2016-06-26 10:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
2016-06-23 13:40 ` [PATCH V2 2/3] perf auxtrace: Add option to feed branches to the thread stack Adrian Hunter
2016-06-26 10:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
2016-06-23 13:40 ` [PATCH V2 3/3] perf script: Add callindent option Adrian Hunter
2016-06-23 14:09   ` Arnaldo Carvalho de Melo
2016-06-23 16:20     ` Andi Kleen
2016-06-23 16:46       ` Arnaldo Carvalho de Melo
2016-06-26 11:00   ` [tip:perf/core] " tip-bot for Adrian Hunter
2016-06-23 16:18 ` [PATCH V2 0/3] " Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).