linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] perf: improve script and record for iregs and brstack
@ 2015-08-31 16:41 Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 1/4] perf script: enable printing of interrupted machine state Stephane Eranian
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Stephane Eranian @ 2015-08-31 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, ak, jolsa, namhyung, kan.liang, dsahern,
	adrian.hunter

This short series of patches improves perf record and perf script support
for interrupted machine state and branch stack.

This makes it easier to postprocess the data. It also reduces the data volume
and the overhead of capturing interrupted machine state registers.
For some analyses, only a subset of the registers is useful.

Changes:
   - Make --intr-regs accept register names to limit volume of data collected
   - Make perf script print interrupted machine register values with -F iregs
   - Make perf script print branch stack content with -F brstack

   $ perf record --intr-regs=\?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 
   ...

   $ perf record --intr-regs=ax,bx,cx,dx,si ....

   $ perf script -F ip,iregs
   40afc2   AX:0x6c5770    BX:0x1e    CX:0x5f4d80a    DX:0x101010101010101    SI:0x1

   $ perf script -F ip,brstack
   5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ...

   $ perf script -F ip,brstacksym
   4011e0 noploop+0x0/noploop+0x0/P/-/-/0

In V2, perf script adds the brstacksym option to print the branch stack using symbolic names
for the from and to addresses. The brstack option prints only raw addresses. For each entry,
the branch stack flags are decoded, including the latency for Skylake systems.
The patch is rebased to tip.git 4.2.0-rc7.

Stephane Eranian (4):
  perf script: enable printing of interrupted machine state
  perf/x86: add list of register names
  perf record: add ability to name registers to record
  perf script: enable printing of branch stack

 tools/perf/Documentation/perf-record.txt |   6 +-
 tools/perf/Documentation/perf-script.txt |  14 +++-
 tools/perf/arch/x86/util/Build           |   1 +
 tools/perf/arch/x86/util/perf_regs.c     |  31 +++++++++
 tools/perf/builtin-record.c              |   7 +-
 tools/perf/builtin-script.c              | 111 ++++++++++++++++++++++++++++++-
 tools/perf/perf.h                        |   2 +-
 tools/perf/util/Build                    |   1 +
 tools/perf/util/evsel.c                  |   2 +-
 tools/perf/util/parse-regs-options.c     |  71 ++++++++++++++++++++
 tools/perf/util/parse-regs-options.h     |   5 ++
 tools/perf/util/perf_regs.h              |   7 ++
 12 files changed, 250 insertions(+), 8 deletions(-)
 create mode 100644 tools/perf/arch/x86/util/perf_regs.c
 create mode 100644 tools/perf/util/parse-regs-options.c
 create mode 100644 tools/perf/util/parse-regs-options.h

-- 
1.9.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/4] perf script: enable printing of interrupted machine state
  2015-08-31 16:41 [PATCH v2 0/4] perf: improve script and record for iregs and brstack Stephane Eranian
@ 2015-08-31 16:41 ` Stephane Eranian
  2015-08-31 20:51   ` Arnaldo Carvalho de Melo
  2015-09-01  8:31   ` [tip:perf/urgent] perf script: Enable " tip-bot for Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 2/4] perf/x86: add list of register names Stephane Eranian
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 13+ messages in thread
From: Stephane Eranian @ 2015-08-31 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, ak, jolsa, namhyung, kan.liang, dsahern,
	adrian.hunter

This patch adds the output of the interrupted machine state (iregs)
to perf script. It presents the registers as NAME:VALUE pairs so the
output is easy to parse during post-processing.

To capture the interrupted machine state:
   $ perf record -I ....

To display iregs, use the -F option:

   $ perf script -F ip,iregs
   40afc2   AX:0x6c5770    BX:0x1e    CX:0x5f4d80a    DX:0x101010101010101    SI:0x1
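
As a rough illustration of the post-processing this format is meant to
allow, the sketch below splits one such output line into name/value
pairs. It is not part of the patch: struct ireg_pair and parse_iregs()
are hypothetical names, and tokens without a colon (such as the leading
ip) are simply skipped.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one decoded NAME:VALUE token. */
struct ireg_pair {
	char name[8];
	uint64_t value;
};

/*
 * Parse one "perf script -F ip,iregs" line.
 * Whitespace-separated tokens of the form NAME:0xVALUE are stored in out[];
 * tokens without ':' (e.g. the ip) are ignored.
 * Returns the number of pairs stored, or -1 on a malformed token.
 */
static int parse_iregs(const char *line, struct ireg_pair *out, int max)
{
	const char *p = line;
	int n = 0;

	while (n < max) {
		const char *colon;
		size_t len, name_len;
		char *end;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of this token */
		if (len == 0)
			break;			/* end of line */

		colon = memchr(p, ':', len);
		if (colon) {
			name_len = (size_t)(colon - p);
			if (name_len == 0 || name_len >= sizeof(out[n].name))
				return -1;
			memcpy(out[n].name, p, name_len);
			out[n].name[name_len] = '\0';

			/* base 16 accepts the optional "0x" prefix */
			out[n].value = strtoull(colon + 1, &end, 16);
			if (end == colon + 1 || end != p + len)
				return -1;
			n++;
		}
		p += len;
	}
	return n;
}
```

Calling parse_iregs() on the sample line above with room for eight
entries would be expected to yield five pairs, AX through SI.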

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c              | 31 ++++++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 614b2c7..dc3ec78 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,7 +116,7 @@ OPTIONS
 --fields::
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
-	srcline, period, flags.
+	srcline, period, iregs, flags.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4430340..eb51325 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -6,6 +6,7 @@
 #include "util/exec_cmd.h"
 #include "util/header.h"
 #include "util/parse-options.h"
+#include "util/perf_regs.h"
 #include "util/session.h"
 #include "util/tool.h"
 #include "util/symbol.h"
@@ -46,6 +47,7 @@ enum perf_output_field {
 	PERF_OUTPUT_SYMOFFSET       = 1U << 11,
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
 	PERF_OUTPUT_PERIOD          = 1U << 13,
+	PERF_OUTPUT_IREGS	    = 1U << 14,
 };
 
 struct output_option {
@@ -66,6 +68,7 @@ struct output_option {
 	{.str = "symoff", .field = PERF_OUTPUT_SYMOFFSET},
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
 	{.str = "period", .field = PERF_OUTPUT_PERIOD},
+	{.str = "iregs", .field = PERF_OUTPUT_IREGS},
 };
 
 /* default set to maintain compatibility with current format */
@@ -255,6 +258,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_PERIOD))
 		return -EINVAL;
 
+	if (PRINT_FIELD(IREGS) &&
+		perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS",
+					PERF_OUTPUT_IREGS))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -352,6 +360,24 @@ static int perf_session__check_output_opt(struct perf_session *session)
 	return 0;
 }
 
+static void print_sample_iregs(union perf_event *event __maybe_unused,
+			  struct perf_sample *sample,
+			  struct thread *thread __maybe_unused,
+			  struct perf_event_attr *attr)
+{
+	struct regs_dump *regs = &sample->intr_regs;
+	uint64_t mask = attr->sample_regs_intr;
+	unsigned i = 0, r;
+
+	if (!regs)
+		return;
+
+	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
+		u64 val = regs->regs[i++];
+		printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+	}
+}
+
 static void print_sample_start(struct perf_sample *sample,
 			       struct thread *thread,
 			       struct perf_evsel *evsel)
@@ -525,6 +551,9 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 				     PERF_MAX_STACK_DEPTH);
 	}
 
+	if (PRINT_FIELD(IREGS))
+		print_sample_iregs(event, sample, thread, attr);
+
 	printf("\n");
 }
 
@@ -1643,7 +1672,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,flags", parse_output_fields),
+		     "addr,symoff,period,iregs,flags", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/4] perf/x86: add list of register names
  2015-08-31 16:41 [PATCH v2 0/4] perf: improve script and record for iregs and brstack Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 1/4] perf script: enable printing of interrupted machine state Stephane Eranian
@ 2015-08-31 16:41 ` Stephane Eranian
  2015-09-01  8:31   ` [tip:perf/urgent] perf/x86: Add " tip-bot for Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 3/4] perf record: add ability to name registers to record Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 4/4] perf script: enable printing of branch stack Stephane Eranian
  3 siblings, 1 reply; 13+ messages in thread
From: Stephane Eranian @ 2015-08-31 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, ak, jolsa, namhyung, kan.liang, dsahern,
	adrian.hunter

This patch adds a way to locate a register identifier (PERF_X86_REG_*)
based on its name, e.g., AX.

This will be used by a subsequent patch to improve the flexibility of
perf record.
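
For readers who want to see the idea in isolation, here is a minimal
standalone sketch of a NULL-terminated name table and a lookup that
turns a comma-separated list into a bitmask. It only mirrors the
approach; the names reg_table, reg_mask_lookup() and parse_reg_list()
and the bit positions are assumptions made for this example, not the
definitions added by the series.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

struct reg_name {
	const char *name;
	uint64_t mask;
};

/* Illustrative subset; the bit positions are assumptions for this sketch. */
static const struct reg_name reg_table[] = {
	{ "AX", 1ULL << 0 },
	{ "BX", 1ULL << 1 },
	{ "CX", 1ULL << 2 },
	{ "DX", 1ULL << 3 },
	{ "SI", 1ULL << 4 },
	{ "DI", 1ULL << 5 },
	{ NULL, 0 }			/* sentinel ends the table */
};

/* Case-insensitive match of a length-delimited token against a name. */
static int name_matches(const char *tok, size_t len, const char *name)
{
	size_t i;

	if (strlen(name) != len)
		return 0;
	for (i = 0; i < len; i++)
		if (toupper((unsigned char)tok[i]) !=
		    toupper((unsigned char)name[i]))
			return 0;
	return 1;
}

/* Return the mask for one register name, or 0 if it is unknown. */
static uint64_t reg_mask_lookup(const char *tok, size_t len)
{
	const struct reg_name *r;

	for (r = reg_table; r->name; r++)
		if (name_matches(tok, len, r->name))
			return r->mask;
	return 0;
}

/* Parse "ax,si" style lists; returns 0 on success, -1 on an unknown name. */
static int parse_reg_list(const char *list, uint64_t *mask)
{
	const char *p = list;
	uint64_t m = 0;

	for (;;) {
		size_t len = strcspn(p, ",");
		uint64_t bit = reg_mask_lookup(p, len);

		if (!bit)
			return -1;
		m |= bit;
		if (p[len] == '\0')
			break;
		p += len + 1;		/* step over the comma */
	}
	*mask = m;
	return 0;
}
```

The sentinel entry keeps the table walk independent of the table size,
so architectures can provide tables of different lengths.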

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/arch/x86/util/Build       |  1 +
 tools/perf/arch/x86/util/perf_regs.c | 31 +++++++++++++++++++++++++++++++
 tools/perf/util/perf_regs.h          |  7 +++++++
 3 files changed, 39 insertions(+)
 create mode 100644 tools/perf/arch/x86/util/perf_regs.c

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 2c55e1b..ff63649 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -2,6 +2,7 @@ libperf-y += header.o
 libperf-y += tsc.o
 libperf-y += pmu.o
 libperf-y += kvm-stat.o
+libperf-y += perf_regs.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
new file mode 100644
index 0000000..3c75faf
--- /dev/null
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -0,0 +1,31 @@
+#include "../../perf.h"
+#include "../../util/perf_regs.h"
+
+#define REG(n, b) { .name = #n, .mask = 1ULL << (b) }
+#define REG_END { .name = NULL }
+const struct sample_reg sample_reg_masks[] = {
+	REG(AX, PERF_REG_X86_AX),
+	REG(BX, PERF_REG_X86_BX),
+	REG(CX, PERF_REG_X86_CX),
+	REG(DX, PERF_REG_X86_DX),
+	REG(SI, PERF_REG_X86_SI),
+	REG(DI, PERF_REG_X86_DI),
+	REG(BP, PERF_REG_X86_BP),
+	REG(SP, PERF_REG_X86_SP),
+	REG(IP, PERF_REG_X86_IP),
+	REG(FLAGS, PERF_REG_X86_FLAGS),
+	REG(CS, PERF_REG_X86_CS),
+	REG(SS, PERF_REG_X86_SS),
+#ifdef HAVE_ARCH_X86_64_SUPPORT
+	REG(R8, PERF_REG_X86_R8),
+	REG(R9, PERF_REG_X86_R9),
+	REG(R10, PERF_REG_X86_R10),
+	REG(R11, PERF_REG_X86_R11),
+	REG(R12, PERF_REG_X86_R12),
+	REG(R13, PERF_REG_X86_R13),
+	REG(R14, PERF_REG_X86_R14),
+	REG(R15, PERF_REG_X86_R15),
+#endif
+	REG_END
+};
+
diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
index 980dbf7..92c1fff 100644
--- a/tools/perf/util/perf_regs.h
+++ b/tools/perf/util/perf_regs.h
@@ -5,6 +5,13 @@
 
 struct regs_dump;
 
+struct sample_reg {
+	const char *name;
+	uint64_t mask;
+};
+
+extern const struct sample_reg sample_reg_masks[];
+
 #ifdef HAVE_PERF_REGS_SUPPORT
 #include <perf_regs.h>
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/4] perf record: add ability to name registers to record
  2015-08-31 16:41 [PATCH v2 0/4] perf: improve script and record for iregs and brstack Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 1/4] perf script: enable printing of interrupted machine state Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 2/4] perf/x86: add list of register names Stephane Eranian
@ 2015-08-31 16:41 ` Stephane Eranian
  2015-08-31 21:02   ` Arnaldo Carvalho de Melo
  2015-09-01  8:32   ` [tip:perf/urgent] perf record: Add " tip-bot for Stephane Eranian
  2015-08-31 16:41 ` [PATCH v2 4/4] perf script: enable printing of branch stack Stephane Eranian
  3 siblings, 2 replies; 13+ messages in thread
From: Stephane Eranian @ 2015-08-31 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, ak, jolsa, namhyung, kan.liang, dsahern,
	adrian.hunter

This patch modifies the -I/--intr-regs option to enable passing the names
of the registers to sample on interrupt. Registers can be specified
by their symbolic names, for instance --intr-regs=ax,si on x86.

The motivation is to reduce the size of the perf.data file and the
overhead of sampling by only collecting the registers useful to
a specific analysis. For instance, value profiling only needs to sample
the registers used to pass arguments to functions.

With no parameter, --intr-regs still records all possible
registers for the architecture.

To name registers, it is necessary to use the long form of the
option, i.e., --intr-regs:

  $ perf record --intr-regs=si,di,r8,r9 .....

To record all possible registers:
  $ perf record -I .....
  $ perf record --intr-regs ...

To display the registers, one can use perf report -D.

To list the available registers:
  $ perf record --intr-regs=\?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
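
When only a subset is requested, a sample carries one value per set bit
of the register mask, lowest bit first, which is how the printing code
in patch 1/4 walks the array. The standalone sketch below shows that
pairing. The helper name format_packed_regs(), the bit_names table and
its bit positions are assumptions for illustration only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative names indexed by bit number; an assumption for this sketch. */
static const char *const bit_names[] = { "AX", "BX", "CX", "DX", "SI", "DI" };
#define NR_NAMES (sizeof(bit_names) / sizeof(bit_names[0]))

/*
 * Walk the set bits of mask from bit 0 upward; the k-th set bit owns vals[k].
 * Appends "NAME:0xVALUE " entries to buf and returns how many registers
 * were formatted (fewer than requested if buf is too small).
 */
static int format_packed_regs(uint64_t mask, const uint64_t *vals,
			      char *buf, size_t size)
{
	size_t used = 0;
	unsigned int bit;
	int k = 0;

	if (size)
		buf[0] = '\0';

	for (bit = 0; bit < 64; bit++) {
		const char *name;
		int w;

		if (!(mask & (1ULL << bit)))
			continue;	/* register not sampled */

		name = bit < NR_NAMES ? bit_names[bit] : "??";
		w = snprintf(buf + used, size - used, "%s:0x%" PRIx64 " ",
			     name, vals[k]);
		if (w < 0 || (size_t)w >= size - used)
			break;		/* stop when the buffer is full */
		used += (size_t)w;
		k++;
	}
	return k;
}
```

With a mask selecting bits 0 and 4 and two packed values, the first
value is attributed to the name at bit 0 and the second to the name at
bit 4, regardless of the order in which the names were given on the
command line.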

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-record.txt |  6 ++-
 tools/perf/builtin-record.c              |  7 +++-
 tools/perf/perf.h                        |  2 +-
 tools/perf/util/Build                    |  1 +
 tools/perf/util/evsel.c                  |  2 +-
 tools/perf/util/parse-regs-options.c     | 71 ++++++++++++++++++++++++++++++++
 tools/perf/util/parse-regs-options.h     |  5 +++
 7 files changed, 89 insertions(+), 5 deletions(-)
 create mode 100644 tools/perf/util/parse-regs-options.c
 create mode 100644 tools/perf/util/parse-regs-options.h

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 347a273..2e9ce77 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -276,7 +276,11 @@ filter out the startup phase of the program, which is often very different.
 --intr-regs::
 Capture machine state (registers) at interrupt, i.e., on counter overflows for
 each sample. List of captured registers depends on the architecture. This option
-is off by default.
+is off by default. It is possible to select the registers to sample using their
+symbolic names, e.g. on x86, ax, si. To list the available registers use
+--intr-regs=\?. To name registers, pass a comma separated list such as
+--intr-regs=ax,bx. The list of register is architecture dependent.
+
 
 --running-time::
 Record running and enabled time for read events (:S)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a660022..d3a5d91 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -27,8 +27,10 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/data.h"
+#include "util/perf_regs.h"
 #include "util/auxtrace.h"
 #include "util/parse-branch-options.h"
+#include "util/parse-regs-options.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -1080,8 +1082,9 @@ struct option __record_options[] = {
 		    "sample transaction flags (special events only)"),
 	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
 		    "use per-thread mmaps"),
-	OPT_BOOLEAN('I', "intr-regs", &record.opts.sample_intr_regs,
-		    "Sample machine registers on interrupt"),
+	OPT_CALLBACK_OPTARG('I', "intr-regs", &record.opts.sample_intr_regs, NULL, "any register",
+		    "sample selected machine registers on interrupt,"
+		    " use -I ? to list register names", parse_regs),
 	OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
 		    "Record running/enabled time of read (:S) events"),
 	OPT_CALLBACK('k', "clockid", &record.opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cccb4cf..90129ac 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -54,7 +54,6 @@ struct record_opts {
 	bool	     sample_time_set;
 	bool	     callgraph_set;
 	bool	     period;
-	bool	     sample_intr_regs;
 	bool	     running_time;
 	bool	     full_auxtrace;
 	bool	     auxtrace_snapshot_mode;
@@ -64,6 +63,7 @@ struct record_opts {
 	unsigned int auxtrace_mmap_pages;
 	unsigned int user_freq;
 	u64          branch_stack;
+	u64	     sample_intr_regs;
 	u64	     default_interval;
 	u64	     user_interval;
 	size_t	     auxtrace_snapshot_size;
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index e912856..7df4937 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -82,6 +82,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
 libperf-$(CONFIG_AUXTRACE) += intel-pt.o
 libperf-$(CONFIG_AUXTRACE) += intel-bts.o
 libperf-y += parse-branch-options.o
+libperf-y += parse-regs-options.o
 
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-file.o
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index fd53cc2..b049633 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -787,7 +787,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 		perf_evsel__config_callgraph(evsel, opts, &callchain_param);
 
 	if (opts->sample_intr_regs) {
-		attr->sample_regs_intr = PERF_REGS_MASK;
+		attr->sample_regs_intr = opts->sample_intr_regs;
 		perf_evsel__set_sample_bit(evsel, REGS_INTR);
 	}
 
diff --git a/tools/perf/util/parse-regs-options.c b/tools/perf/util/parse-regs-options.c
new file mode 100644
index 0000000..4f2c1c2
--- /dev/null
+++ b/tools/perf/util/parse-regs-options.c
@@ -0,0 +1,71 @@
+#include "perf.h"
+#include "util/util.h"
+#include "util/debug.h"
+#include "util/parse-options.h"
+#include "util/parse-regs-options.h"
+
+int
+parse_regs(const struct option *opt, const char *str, int unset)
+{
+	uint64_t *mode = (uint64_t *)opt->value;
+	const struct sample_reg *r;
+	char *s, *os = NULL, *p;
+	int ret = -1;
+
+	if (unset)
+		return 0;
+
+	/*
+	 * cannot set it twice
+	 */
+	if (*mode)
+		return -1;
+
+	/* str may be NULL in case no arg is passed to -I */
+	if (str) {
+		/* because str is read-only */
+		s = os = strdup(str);
+		if (!s)
+			return -1;
+
+		for (;;) {
+			p = strchr(s, ',');
+			if (p)
+				*p = '\0';
+
+			if (!strcmp(s, "?")) {
+				fprintf(stderr, "available registers: ");
+				for (r = sample_reg_masks; r->name; r++) {
+					fprintf(stderr, "%s ", r->name);
+				}
+				fputc('\n', stderr);
+				/* just printing available regs */
+				return -1;
+			}
+			for (r = sample_reg_masks; r->name; r++) {
+				if (!strcasecmp(s, r->name))
+					break;
+			}
+			if (!r->name) {
+				ui__warning("unknown register %s,"
+					    " check man page\n", s);
+				goto error;
+			}
+
+			*mode |= r->mask;
+
+			if (!p)
+				break;
+
+			s = p + 1;
+		}
+	}
+	ret = 0;
+
+	/* default to all possible regs */
+	if (*mode == 0)
+		*mode = PERF_REGS_MASK;
+error:
+	free(os);
+	return ret;
+}
diff --git a/tools/perf/util/parse-regs-options.h b/tools/perf/util/parse-regs-options.h
new file mode 100644
index 0000000..7d762b1
--- /dev/null
+++ b/tools/perf/util/parse-regs-options.h
@@ -0,0 +1,5 @@
+#ifndef _PERF_PARSE_REGS_OPTIONS_H
+#define _PERF_PARSE_REGS_OPTIONS_H 1
+struct option;
+int parse_regs(const struct option *opt, const char *str, int unset);
+#endif /* _PERF_PARSE_REGS_OPTIONS_H */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/4] perf script: enable printing of branch stack
  2015-08-31 16:41 [PATCH v2 0/4] perf: improve script and record for iregs and brstack Stephane Eranian
                   ` (2 preceding siblings ...)
  2015-08-31 16:41 ` [PATCH v2 3/4] perf record: add ability to name registers to record Stephane Eranian
@ 2015-08-31 16:41 ` Stephane Eranian
  2015-08-31 17:05   ` Andi Kleen
  2015-10-30  9:13   ` [tip:perf/core] perf script: Enable " tip-bot for Stephane Eranian
  3 siblings, 2 replies; 13+ messages in thread
From: Stephane Eranian @ 2015-08-31 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, ak, jolsa, namhyung, kan.liang, dsahern,
	adrian.hunter

This patch improves perf script by enabling printing of the
branch stack via the 'brstack' and 'brstacksym' arguments to
the field selection option -F. The option is off by default
and operates only if the perf.data file has branch stack content.

The branches are printed in from/to pairs. The most recent branch
is printed first. The number of branch entries varies based on the
underlying hardware and the filtering used.

The brstack field prints FROM/TO addresses in raw hexadecimal format.
The brstacksym field prints FROM/TO addresses in symbolic form wherever
possible.

 $ perf script -F ip,brstack
  5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ...

 $ perf script -F ip,brstacksym
  4011e0 noploop+0x0/noploop+0x0/P/-/-/0

Each entry has the form F/T/M/X/A/C, which describes the attributes of the branch:
F=from, T=to, M/P=mispredicted/predicted, X=in TSX transaction, A=TSX abort, C=cycles (Skylake)
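
To give an idea of how a consumer might decode one raw entry, here is a
small standalone sketch. It assumes the six-field layout produced by
the printf format string in this patch (0xFROM/0xTO/M/X/A/C); struct
br_entry and parse_brstack_entry() are hypothetical names and not part
of perf.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical decoded form of one brstack entry. */
struct br_entry {
	uint64_t from;
	uint64_t to;
	char pred;	/* 'M', 'P' or '-' */
	char in_tx;	/* 'X' or '-' */
	char abort;	/* 'A' or '-' */
	int cycles;
};

/*
 * Parse one raw entry such as "0x401aa0/0x5d2000/M/-/-/0".
 * Returns 0 when all six fields were converted, -1 otherwise.
 */
static int parse_brstack_entry(const char *s, struct br_entry *e)
{
	int n = sscanf(s, " 0x%" SCNx64 "/0x%" SCNx64 "/%c/%c/%c/%d",
		       &e->from, &e->to, &e->pred, &e->in_tx,
		       &e->abort, &e->cycles);

	return n == 6 ? 0 : -1;
}
```

A line holding several entries can be handled by splitting it on
whitespace first and passing each token to the function.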

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-script.txt | 14 +++++-
 tools/perf/builtin-script.c              | 82 +++++++++++++++++++++++++++++++-
 2 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index dc3ec78..22e7b4d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -112,11 +112,11 @@ OPTIONS
 --debug-mode::
         Do various checks like samples ordering and lost events.
 
--f::
+-F::
 --fields::
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
-	srcline, period, iregs, flags.
+	srcline, period, iregs, brstack, brstacksym, flags.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
@@ -175,6 +175,16 @@ OPTIONS
 	Finally, a user may not set fields to none for all event types.
 	i.e., -f "" is not allowed.
 
+	The brstack output includes branch related information with raw addresses using the
+	/v/v/v/v/ syntax in the following order:
+	FROM: branch source instruction
+	TO  : branch target instruction
+        M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
+	X/- : X=branch inside a transactional region, -=not in transaction region or not supported
+	A/- : A=TSX abort entry, -=not aborted region or not supported
+
+	The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.
+
 -k::
 --vmlinux=<file>::
         vmlinux pathname
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index eb51325..93c86b9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -48,6 +48,8 @@ enum perf_output_field {
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
 	PERF_OUTPUT_PERIOD          = 1U << 13,
 	PERF_OUTPUT_IREGS	    = 1U << 14,
+	PERF_OUTPUT_BRSTACK	    = 1U << 15,
+	PERF_OUTPUT_BRSTACKSYM	    = 1U << 16,
 };
 
 struct output_option {
@@ -69,6 +71,8 @@ struct output_option {
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
 	{.str = "period", .field = PERF_OUTPUT_PERIOD},
 	{.str = "iregs", .field = PERF_OUTPUT_IREGS},
+	{.str = "brstack", .field = PERF_OUTPUT_BRSTACK},
+	{.str = "brstacksym", .field = PERF_OUTPUT_BRSTACKSYM},
 };
 
 /* default set to maintain compatibility with current format */
@@ -419,6 +423,77 @@ static void print_sample_start(struct perf_sample *sample,
 	}
 }
 
+static inline char
+mispred_str(struct branch_entry *br)
+{
+	if (!(br->flags.mispred  || br->flags.predicted))
+		return '-';
+
+	return br->flags.predicted ? 'P' : 'M';
+}
+
+static void print_sample_brstack(union perf_event *event __maybe_unused,
+			  struct perf_sample *sample,
+			  struct thread *thread __maybe_unused,
+			  struct perf_event_attr *attr __maybe_unused)
+{
+	struct branch_stack *br = sample->branch_stack;
+	u64 i;
+
+	if (!(br && br->nr))
+		return;
+
+	for (i = 0; i < br->nr; i++) {
+		printf(" 0x%"PRIx64"/0x%"PRIx64"/%c/%c/%c/%d ",
+			br->entries[i].from,
+			br->entries[i].to,
+			mispred_str( br->entries + i),
+			br->entries[i].flags.in_tx? 'X' : '-',
+			br->entries[i].flags.abort? 'A' : '-',
+			br->entries[i].flags.cycles);
+	}
+}
+
+static void print_sample_brstacksym(union perf_event *event __maybe_unused,
+			  struct perf_sample *sample,
+			  struct thread *thread __maybe_unused,
+			  struct perf_event_attr *attr __maybe_unused)
+{
+	struct branch_stack *br = sample->branch_stack;
+	struct addr_location alf, alt;
+	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+	u64 i, from, to;
+
+	if (!(br && br->nr))
+		return;
+
+	for (i = 0; i < br->nr; i++) {
+
+		memset(&alf, 0, sizeof(alf));
+		memset(&alt, 0, sizeof(alt));
+		from = br->entries[i].from;
+		to   = br->entries[i].to;
+
+		thread__find_addr_map(thread, cpumode, MAP__FUNCTION, from, &alf);
+		if (alf.map)
+			alf.sym = map__find_symbol(alf.map, alf.addr, NULL);
+
+		thread__find_addr_map(thread, cpumode, MAP__FUNCTION, to, &alt);
+		if (alt.map)
+			alt.sym = map__find_symbol(alt.map, alt.addr, NULL);
+
+		symbol__fprintf_symname_offs(alf.sym, &alf, stdout);
+		putchar('/');
+		symbol__fprintf_symname_offs(alt.sym, &alt, stdout);
+		printf("/%c/%c/%c/%d ",
+			mispred_str( br->entries + i),
+			br->entries[i].flags.in_tx? 'X' : '-',
+			br->entries[i].flags.abort? 'A' : '-',
+			br->entries[i].flags.cycles);
+	}
+}
+
+
 static void print_sample_addr(union perf_event *event,
 			  struct perf_sample *sample,
 			  struct thread *thread,
@@ -554,6 +629,11 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 	if (PRINT_FIELD(IREGS))
 		print_sample_iregs(event, sample, thread, attr);
 
+	if (PRINT_FIELD(BRSTACK))
+		print_sample_brstack(event, sample, thread, attr);
+	else if (PRINT_FIELD(BRSTACKSYM))
+		print_sample_brstacksym(event, sample, thread, attr);
+
 	printf("\n");
 }
 
@@ -1672,7 +1752,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,iregs,flags", parse_output_fields),
+		     "addr,symoff,period,iregs,brstack,brstacksym,flags", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] perf script: enable printing of branch stack
  2015-08-31 16:41 ` [PATCH v2 4/4] perf script: enable printing of branch stack Stephane Eranian
@ 2015-08-31 17:05   ` Andi Kleen
  2015-08-31 17:08     ` Andi Kleen
  2015-10-30  9:13   ` [tip:perf/core] perf script: Enable " tip-bot for Stephane Eranian
  1 sibling, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2015-08-31 17:05 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, acme, peterz, mingo, jolsa, namhyung, kan.liang,
	dsahern, adrian.hunter

>  $ perf script -F ip,brstack
>   5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ...
> 
>  $ perf script -F ip,brstacksym
>   4011e0 noploop+0x0/noploop+0x0/P/-/-/0

That's a weird format that's hard to parse with standard tools like
awk, and also for humans. How about separating with spaces?

> @@ -175,6 +175,16 @@ OPTIONS
>  	Finally, a user may not set fields to none for all event types.
>  	i.e., -f "" is not allowed.
>  
> +	The brstack output includes branch related information with raw addresses using the
> +	/v/v/v/v/ syntax in the following order:
> +	FROM: branch source instruction
> +	TO  : branch target instruction
> +        M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
> +	X/- : X=branch inside a transactional region, -=not in transaction region or not supported
> +	A/- : A=TSX abort entry, -=not aborted region or not supported

Need to describe cycles here too.

The rest looks good to me. Should probably add brstacksrcline too, but that
can be done later.

-Andi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] perf script: enable printing of branch stack
  2015-08-31 17:05   ` Andi Kleen
@ 2015-08-31 17:08     ` Andi Kleen
  0 siblings, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2015-08-31 17:08 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, acme, peterz, mingo, jolsa, namhyung, kan.liang,
	dsahern, adrian.hunter

Andi Kleen <ak@linux.intel.com> writes:

>>  $ perf script -F ip,brstack
>>   5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ...
>> 
>>  $ perf script -F ip,brstacksym
>>   4011e0 noploop+0x0/noploop+0x0/P/-/-/0
>
> That's a weird format that's hard to parse with standard tools like
> awk, and also for humans. How about separating with spaces?

Also can you include the number of the LBR in the branch record?
That can be useful sometimes.

BTW I haven't checked, but it would be also good to make sure
the print order is the same as program order (perf stores
the LBRs reversed)

-Andi


-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/4] perf script: enable printing of interrupted machine state
  2015-08-31 16:41 ` [PATCH v2 1/4] perf script: enable printing of interrupted machine state Stephane Eranian
@ 2015-08-31 20:51   ` Arnaldo Carvalho de Melo
  2015-09-01  8:31   ` [tip:perf/urgent] perf script: Enable " tip-bot for Stephane Eranian
  1 sibling, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 20:51 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, ak, jolsa, namhyung, kan.liang,
	dsahern, adrian.hunter

On Mon, Aug 31, 2015 at 06:41:10PM +0200, Stephane Eranian wrote:
> This patch adds the output of the interrupted machine state (iregs)
> to perf script. It presents them  as NAME:VALUE so this is easy to
> parse during post processing.
> 
> To capture the interrupted machine state:
>    $ perf record -I ....
> 
> to display iregs, use the -F option:

Tested and applied,

- Arnaldo
 
>    $ perf script -F ip,iregs
>    40afc2   AX:0x6c5770    BX:0x1e    CX:0x5f4d80a    DX:0x101010101010101    SI:0x1
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/Documentation/perf-script.txt |  2 +-
>  tools/perf/builtin-script.c              | 31 ++++++++++++++++++++++++++++++-
>  2 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
> index 614b2c7..dc3ec78 100644
> --- a/tools/perf/Documentation/perf-script.txt
> +++ b/tools/perf/Documentation/perf-script.txt
> @@ -116,7 +116,7 @@ OPTIONS
>  --fields::
>          Comma separated list of fields to print. Options are:
>          comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
> -	srcline, period, flags.
> +	srcline, period, iregs, flags.
>          Field list can be prepended with the type, trace, sw or hw,
>          to indicate to which event type the field list applies.
>          e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 4430340..eb51325 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -6,6 +6,7 @@
>  #include "util/exec_cmd.h"
>  #include "util/header.h"
>  #include "util/parse-options.h"
> +#include "util/perf_regs.h"
>  #include "util/session.h"
>  #include "util/tool.h"
>  #include "util/symbol.h"
> @@ -46,6 +47,7 @@ enum perf_output_field {
>  	PERF_OUTPUT_SYMOFFSET       = 1U << 11,
>  	PERF_OUTPUT_SRCLINE         = 1U << 12,
>  	PERF_OUTPUT_PERIOD          = 1U << 13,
> +	PERF_OUTPUT_IREGS	    = 1U << 14,
>  };
>  
>  struct output_option {
> @@ -66,6 +68,7 @@ struct output_option {
>  	{.str = "symoff", .field = PERF_OUTPUT_SYMOFFSET},
>  	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
>  	{.str = "period", .field = PERF_OUTPUT_PERIOD},
> +	{.str = "iregs", .field = PERF_OUTPUT_IREGS},
>  };
>  
>  /* default set to maintain compatibility with current format */
> @@ -255,6 +258,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
>  					PERF_OUTPUT_PERIOD))
>  		return -EINVAL;
>  
> +	if (PRINT_FIELD(IREGS) &&
> +		perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS",
> +					PERF_OUTPUT_IREGS))
> +		return -EINVAL;
> +
>  	return 0;
>  }
>  
> @@ -352,6 +360,24 @@ static int perf_session__check_output_opt(struct perf_session *session)
>  	return 0;
>  }
>  
> +static void print_sample_iregs(union perf_event *event __maybe_unused,
> +			  struct perf_sample *sample,
> +			  struct thread *thread __maybe_unused,
> +			  struct perf_event_attr *attr)
> +{
> +	struct regs_dump *regs = &sample->intr_regs;
> +	uint64_t mask = attr->sample_regs_intr;
> +	unsigned i = 0, r;
> +
> +	if (!regs)
> +		return;
> +
> +	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
> +		u64 val = regs->regs[i++];
> +		printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
> +	}
> +}
> +
>  static void print_sample_start(struct perf_sample *sample,
>  			       struct thread *thread,
>  			       struct perf_evsel *evsel)
> @@ -525,6 +551,9 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
>  				     PERF_MAX_STACK_DEPTH);
>  	}
>  
> +	if (PRINT_FIELD(IREGS))
> +		print_sample_iregs(event, sample, thread, attr);
> +
>  	printf("\n");
>  }
>  
> @@ -1643,7 +1672,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
>  		     "comma separated output fields prepend with 'type:'. "
>  		     "Valid types: hw,sw,trace,raw. "
>  		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
> -		     "addr,symoff,period,flags", parse_output_fields),
> +		     "addr,symoff,period,iregs,flags", parse_output_fields),
>  	OPT_BOOLEAN('a', "all-cpus", &system_wide,
>  		    "system-wide collection from all CPUs"),
>  	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/4] perf record: add ability to name registers to record
  2015-08-31 16:41 ` [PATCH v2 3/4] perf record: add ability to name registers to record Stephane Eranian
@ 2015-08-31 21:02   ` Arnaldo Carvalho de Melo
  2015-09-01  8:32   ` [tip:perf/urgent] perf record: Add " tip-bot for Stephane Eranian
  1 sibling, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 21:02 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, ak, jolsa, namhyung, kan.liang,
	dsahern, adrian.hunter

Em Mon, Aug 31, 2015 at 06:41:12PM +0200, Stephane Eranian escreveu:
> This patch modifies the -I/--intr-regs option to enable passing the names
> of the registers to sample on interrupt. Registers can be specified
> by their symbolic names. For instance on x86, --intr-regs=ax,si.
> 
> The motivation is to reduce the size of the perf.data file and the
> overhead of sampling by only collecting the registers useful to
> a specific analysis. For instance, value profiling needs only the
> registers used to pass arguments to functions.
> 
> With no parameter, --intr-regs still records all possible
> registers for the architecture.

Applied and tested up to this one, waiting for the discussion with Andi
to proceed to the last one.

- Arnaldo
 
> To name registers, it is necessary to use the long form of the
> option, i.e., --intr-regs:
> 
>   $ perf record --intr-regs=si,di,r8,r9 .....
> 
> To record all possible registers:
>   $ perf record -I .....
>   $ perf record --intr-regs ...
> 
> To display the registers, one can use perf report -D
> 
> To list the available registers:
>   $ perf record --intr-regs=\?
>   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/Documentation/perf-record.txt |  6 ++-
>  tools/perf/builtin-record.c              |  7 +++-
>  tools/perf/perf.h                        |  2 +-
>  tools/perf/util/Build                    |  1 +
>  tools/perf/util/evsel.c                  |  2 +-
>  tools/perf/util/parse-regs-options.c     | 71 ++++++++++++++++++++++++++++++++
>  tools/perf/util/parse-regs-options.h     |  5 +++
>  7 files changed, 89 insertions(+), 5 deletions(-)
>  create mode 100644 tools/perf/util/parse-regs-options.c
>  create mode 100644 tools/perf/util/parse-regs-options.h
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 347a273..2e9ce77 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -276,7 +276,11 @@ filter out the startup phase of the program, which is often very different.
>  --intr-regs::
>  Capture machine state (registers) at interrupt, i.e., on counter overflows for
>  each sample. List of captured registers depends on the architecture. This option
> -is off by default.
> +is off by default. It is possible to select the registers to sample using their
> +symbolic names, e.g. on x86, ax, si. To list the available registers use
> +--intr-regs=\?. To name registers, pass a comma separated list such as
> +--intr-regs=ax,bx. The list of register is architecture dependent.
> +
>  
>  --running-time::
>  Record running and enabled time for read events (:S)
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index a660022..d3a5d91 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -27,8 +27,10 @@
>  #include "util/cpumap.h"
>  #include "util/thread_map.h"
>  #include "util/data.h"
> +#include "util/perf_regs.h"
>  #include "util/auxtrace.h"
>  #include "util/parse-branch-options.h"
> +#include "util/parse-regs-options.h"
>  
>  #include <unistd.h>
>  #include <sched.h>
> @@ -1080,8 +1082,9 @@ struct option __record_options[] = {
>  		    "sample transaction flags (special events only)"),
>  	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
>  		    "use per-thread mmaps"),
> -	OPT_BOOLEAN('I', "intr-regs", &record.opts.sample_intr_regs,
> -		    "Sample machine registers on interrupt"),
> +	OPT_CALLBACK_OPTARG('I', "intr-regs", &record.opts.sample_intr_regs, NULL, "any register",
> +		    "sample selected machine registers on interrupt,"
> +		    " use -I ? to list register names", parse_regs),
>  	OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
>  		    "Record running/enabled time of read (:S) events"),
>  	OPT_CALLBACK('k', "clockid", &record.opts,
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index cccb4cf..90129ac 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -54,7 +54,6 @@ struct record_opts {
>  	bool	     sample_time_set;
>  	bool	     callgraph_set;
>  	bool	     period;
> -	bool	     sample_intr_regs;
>  	bool	     running_time;
>  	bool	     full_auxtrace;
>  	bool	     auxtrace_snapshot_mode;
> @@ -64,6 +63,7 @@ struct record_opts {
>  	unsigned int auxtrace_mmap_pages;
>  	unsigned int user_freq;
>  	u64          branch_stack;
> +	u64	     sample_intr_regs;
>  	u64	     default_interval;
>  	u64	     user_interval;
>  	size_t	     auxtrace_snapshot_size;
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index e912856..7df4937 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -82,6 +82,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
>  libperf-$(CONFIG_AUXTRACE) += intel-pt.o
>  libperf-$(CONFIG_AUXTRACE) += intel-bts.o
>  libperf-y += parse-branch-options.o
> +libperf-y += parse-regs-options.o
>  
>  libperf-$(CONFIG_LIBELF) += symbol-elf.o
>  libperf-$(CONFIG_LIBELF) += probe-file.o
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index fd53cc2..b049633 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -787,7 +787,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
>  		perf_evsel__config_callgraph(evsel, opts, &callchain_param);
>  
>  	if (opts->sample_intr_regs) {
> -		attr->sample_regs_intr = PERF_REGS_MASK;
> +		attr->sample_regs_intr = opts->sample_intr_regs;
>  		perf_evsel__set_sample_bit(evsel, REGS_INTR);
>  	}
>  
> diff --git a/tools/perf/util/parse-regs-options.c b/tools/perf/util/parse-regs-options.c
> new file mode 100644
> index 0000000..4f2c1c2
> --- /dev/null
> +++ b/tools/perf/util/parse-regs-options.c
> @@ -0,0 +1,71 @@
> +#include "perf.h"
> +#include "util/util.h"
> +#include "util/debug.h"
> +#include "util/parse-options.h"
> +#include "util/parse-regs-options.h"
> +
> +int
> +parse_regs(const struct option *opt, const char *str, int unset)
> +{
> +	uint64_t *mode = (uint64_t *)opt->value;
> +	const struct sample_reg *r;
> +	char *s, *os = NULL, *p;
> +	int ret = -1;
> +
> +	if (unset)
> +		return 0;
> +
> +	/*
> +	 * cannot set it twice
> +	 */
> +	if (*mode)
> +		return -1;
> +
> +	/* str may be NULL in case no arg is passed to -I */
> +	if (str) {
> +		/* because str is read-only */
> +		s = os = strdup(str);
> +		if (!s)
> +			return -1;
> +
> +		for (;;) {
> +			p = strchr(s, ',');
> +			if (p)
> +				*p = '\0';
> +
> +			if (!strcmp(s, "?")) {
> +				fprintf(stderr, "available registers: ");
> +				for (r = sample_reg_masks; r->name; r++) {
> +					fprintf(stderr, "%s ", r->name);
> +				}
> +				fputc('\n', stderr);
> +				/* just printing available regs */
> +				return -1;
> +			}
> +			for (r = sample_reg_masks; r->name; r++) {
> +				if (!strcasecmp(s, r->name))
> +					break;
> +			}
> +			if (!r->name) {
> +				ui__warning("unknown register %s,"
> +					    " check man page\n", s);
> +				goto error;
> +			}
> +
> +			*mode |= r->mask;
> +
> +			if (!p)
> +				break;
> +
> +			s = p + 1;
> +		}
> +	}
> +	ret = 0;
> +
> +	/* default to all possible regs */
> +	if (*mode == 0)
> +		*mode = PERF_REGS_MASK;
> +error:
> +	free(os);
> +	return ret;
> +}
> diff --git a/tools/perf/util/parse-regs-options.h b/tools/perf/util/parse-regs-options.h
> new file mode 100644
> index 0000000..7d762b1
> --- /dev/null
> +++ b/tools/perf/util/parse-regs-options.h
> @@ -0,0 +1,5 @@
> +#ifndef _PERF_PARSE_REGS_OPTIONS_H
> +#define _PERF_PARSE_REGS_OPTIONS_H 1
> +struct option;
> +int parse_regs(const struct option *opt, const char *str, int unset);
> +#endif /* _PERF_PARSE_REGS_OPTIONS_H */
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip:perf/urgent] perf script: Enable printing of interrupted machine state
  2015-08-31 16:41 ` [PATCH v2 1/4] perf script: enable printing of interrupted machine state Stephane Eranian
  2015-08-31 20:51   ` Arnaldo Carvalho de Melo
@ 2015-09-01  8:31   ` tip-bot for Stephane Eranian
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-09-01  8:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, eranian, kan.liang, tglx, hpa, peterz, namhyung,
	jolsa, dsahern, acme, mingo, adrian.hunter, ak

Commit-ID:  fc36f9485aee3a62b22be1f561543a31bce6d48e
Gitweb:     http://git.kernel.org/tip/fc36f9485aee3a62b22be1f561543a31bce6d48e
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Mon, 31 Aug 2015 18:41:10 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 31 Aug 2015 17:51:07 -0300

perf script: Enable printing of interrupted machine state

This patch adds the output of the interrupted machine state (iregs) to
perf script. It presents them as NAME:VALUE pairs, which are easy to parse
during post-processing.

To capture the interrupted machine state:
   $ perf record -I ....

To display iregs, use the -F option:

   $ perf script -F ip,iregs
   40afc2   AX:0x6c5770    BX:0x1e    CX:0x5f4d80a    DX:0x101010101010101    SI:0x1

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441039273-16260-2-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c              | 31 ++++++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 614b2c7..dc3ec78 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,7 +116,7 @@ OPTIONS
 --fields::
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
-	srcline, period, flags.
+	srcline, period, iregs, flags.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4430340..eb51325 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -6,6 +6,7 @@
 #include "util/exec_cmd.h"
 #include "util/header.h"
 #include "util/parse-options.h"
+#include "util/perf_regs.h"
 #include "util/session.h"
 #include "util/tool.h"
 #include "util/symbol.h"
@@ -46,6 +47,7 @@ enum perf_output_field {
 	PERF_OUTPUT_SYMOFFSET       = 1U << 11,
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
 	PERF_OUTPUT_PERIOD          = 1U << 13,
+	PERF_OUTPUT_IREGS	    = 1U << 14,
 };
 
 struct output_option {
@@ -66,6 +68,7 @@ struct output_option {
 	{.str = "symoff", .field = PERF_OUTPUT_SYMOFFSET},
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
 	{.str = "period", .field = PERF_OUTPUT_PERIOD},
+	{.str = "iregs", .field = PERF_OUTPUT_IREGS},
 };
 
 /* default set to maintain compatibility with current format */
@@ -255,6 +258,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_PERIOD))
 		return -EINVAL;
 
+	if (PRINT_FIELD(IREGS) &&
+		perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS",
+					PERF_OUTPUT_IREGS))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -352,6 +360,24 @@ out:
 	return 0;
 }
 
+static void print_sample_iregs(union perf_event *event __maybe_unused,
+			  struct perf_sample *sample,
+			  struct thread *thread __maybe_unused,
+			  struct perf_event_attr *attr)
+{
+	struct regs_dump *regs = &sample->intr_regs;
+	uint64_t mask = attr->sample_regs_intr;
+	unsigned i = 0, r;
+
+	if (!regs)
+		return;
+
+	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
+		u64 val = regs->regs[i++];
+		printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+	}
+}
+
 static void print_sample_start(struct perf_sample *sample,
 			       struct thread *thread,
 			       struct perf_evsel *evsel)
@@ -525,6 +551,9 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 				     PERF_MAX_STACK_DEPTH);
 	}
 
+	if (PRINT_FIELD(IREGS))
+		print_sample_iregs(event, sample, thread, attr);
+
 	printf("\n");
 }
 
@@ -1643,7 +1672,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,flags", parse_output_fields),
+		     "addr,symoff,period,iregs,flags", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",

^ permalink raw reply related	[flat|nested] 13+ messages in thread
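
The loop in print_sample_iregs() above relies on the sample holding one
value per set bit of the register mask, stored in ascending bit order. The
following is a minimal standalone sketch of that pairing, not the perf code
itself: the name table demo_reg_names and the helper demo_format_iregs() are
hypothetical and exist only for illustration, and the bit positions in the
table are assumed rather than taken from the kernel headers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name table for illustration, indexed by bit position. */
static const char *const demo_reg_names[] = {
	"AX", "BX", "CX", "DX", "SI", "DI",
};
#define DEMO_NR_REGS (sizeof(demo_reg_names) / sizeof(demo_reg_names[0]))

/*
 * Format packed register values as "NAME:0xVALUE " pairs.
 * vals[] holds one entry per set bit of mask, lowest bit first.
 * Bits beyond the demo table are ignored.
 * Returns the number of registers written, or -1 if buf is too small.
 */
static int demo_format_iregs(uint64_t mask, const uint64_t *vals,
			     char *buf, size_t len)
{
	size_t off = 0;
	unsigned int bit, i = 0;

	assert(vals && buf && len > 0);
	buf[0] = '\0';

	for (bit = 0; bit < DEMO_NR_REGS; bit++) {
		int n;

		if (!(mask & (1ULL << bit)))
			continue;

		/* The i-th packed value belongs to the i-th set bit. */
		n = snprintf(buf + off, len - off, "%s:0x%" PRIx64 " ",
			     demo_reg_names[bit], vals[i++]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)i;
}
```

With a mask that selects bits 0 and 4 and two packed values, the helper
pairs the first value with AX and the second with SI, which mirrors how the
regs->regs[i++] index advances independently of the bit number r.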

* [tip:perf/urgent] perf/x86: Add list of register names
  2015-08-31 16:41 ` [PATCH v2 2/4] perf/x86: add list of register names Stephane Eranian
@ 2015-09-01  8:31   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-09-01  8:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, dsahern, tglx, peterz, acme, linux-kernel, kan.liang,
	adrian.hunter, namhyung, hpa, eranian, ak, mingo

Commit-ID:  c5e991ee9dff0f8136168ed2d0d1a8cc3620dac4
Gitweb:     http://git.kernel.org/tip/c5e991ee9dff0f8136168ed2d0d1a8cc3620dac4
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Mon, 31 Aug 2015 18:41:11 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 31 Aug 2015 17:56:37 -0300

perf/x86: Add list of register names

This patch adds a way to locate a register identifier (PERF_X86_REG_*)
based on its name, e.g., AX.

This will be used by a subsequent patch to improve the flexibility of perf
record.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441039273-16260-3-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/Build       |  1 +
 tools/perf/arch/x86/util/perf_regs.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/perf_regs.h          |  7 +++++++
 3 files changed, 38 insertions(+)

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 2c55e1b..ff63649 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -2,6 +2,7 @@ libperf-y += header.o
 libperf-y += tsc.o
 libperf-y += pmu.o
 libperf-y += kvm-stat.o
+libperf-y += perf_regs.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
new file mode 100644
index 0000000..087c84e
--- /dev/null
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -0,0 +1,30 @@
+#include "../../perf.h"
+#include "../../util/perf_regs.h"
+
+#define REG(n, b) { .name = #n, .mask = 1ULL << (b) }
+#define REG_END { .name = NULL }
+const struct sample_reg sample_reg_masks[] = {
+	REG(AX, PERF_REG_X86_AX),
+	REG(BX, PERF_REG_X86_BX),
+	REG(CX, PERF_REG_X86_CX),
+	REG(DX, PERF_REG_X86_DX),
+	REG(SI, PERF_REG_X86_SI),
+	REG(DI, PERF_REG_X86_DI),
+	REG(BP, PERF_REG_X86_BP),
+	REG(SP, PERF_REG_X86_SP),
+	REG(IP, PERF_REG_X86_IP),
+	REG(FLAGS, PERF_REG_X86_FLAGS),
+	REG(CS, PERF_REG_X86_CS),
+	REG(SS, PERF_REG_X86_SS),
+#ifdef HAVE_ARCH_X86_64_SUPPORT
+	REG(R8, PERF_REG_X86_R8),
+	REG(R9, PERF_REG_X86_R9),
+	REG(R10, PERF_REG_X86_R10),
+	REG(R11, PERF_REG_X86_R11),
+	REG(R12, PERF_REG_X86_R12),
+	REG(R13, PERF_REG_X86_R13),
+	REG(R14, PERF_REG_X86_R14),
+	REG(R15, PERF_REG_X86_R15),
+#endif
+	REG_END
+};
diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
index 980dbf7..92c1fff 100644
--- a/tools/perf/util/perf_regs.h
+++ b/tools/perf/util/perf_regs.h
@@ -5,6 +5,13 @@
 
 struct regs_dump;
 
+struct sample_reg {
+	const char *name;
+	uint64_t mask;
+};
+
+extern const struct sample_reg sample_reg_masks[];
+
 #ifdef HAVE_PERF_REGS_SUPPORT
 #include <perf_regs.h>
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread
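
The table added by this patch is a NULL-terminated array of name/mask
pairs, so a lookup is a linear scan that stops at the sentinel. A minimal
sketch of such a lookup follows. The struct demo_reg, the table demo_regs
and the function demo_reg_mask() are hypothetical stand-ins, and the bit
positions are assumed for the example; the real table takes them from the
PERF_REG_X86_* constants.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Hypothetical stand-in for struct sample_reg. */
struct demo_reg {
	const char *name;
	uint64_t mask;
};

/* Illustrative table; bit positions are assumed for the example. */
static const struct demo_reg demo_regs[] = {
	{ "AX", 1ULL << 0 },
	{ "BX", 1ULL << 1 },
	{ "CX", 1ULL << 2 },
	{ "DX", 1ULL << 3 },
	{ "SI", 1ULL << 4 },
	{ "DI", 1ULL << 5 },
	{ NULL, 0 },		/* sentinel, like REG_END */
};

/* Return the mask for a register name (case-insensitive), 0 if unknown. */
static uint64_t demo_reg_mask(const char *name)
{
	const struct demo_reg *r;

	assert(name);

	for (r = demo_regs; r->name; r++) {
		if (!strcasecmp(name, r->name))
			return r->mask;
	}
	return 0;
}
```

Returning 0 for an unknown name works here because every valid entry has
exactly one bit set, so 0 can never be a legitimate result.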

* [tip:perf/urgent] perf record: Add ability to name registers to record
  2015-08-31 16:41 ` [PATCH v2 3/4] perf record: add ability to name registers to record Stephane Eranian
  2015-08-31 21:02   ` Arnaldo Carvalho de Melo
@ 2015-09-01  8:32   ` tip-bot for Stephane Eranian
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-09-01  8:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: namhyung, hpa, acme, dsahern, peterz, linux-kernel, ak, tglx,
	jolsa, adrian.hunter, kan.liang, mingo, eranian

Commit-ID:  bcc84ec65ad1bd9f777a1fade6f8e5e0c5808fa5
Gitweb:     http://git.kernel.org/tip/bcc84ec65ad1bd9f777a1fade6f8e5e0c5808fa5
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Mon, 31 Aug 2015 18:41:12 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 31 Aug 2015 18:01:33 -0300

perf record: Add ability to name registers to record

This patch modifies the -I/--intr-regs option to enable passing the names
of the registers to sample on interrupt. Registers can be specified by
their symbolic names. For instance on x86, --intr-regs=ax,si.

The motivation is to reduce the size of the perf.data file and the
overhead of sampling by only collecting the registers useful to a
specific analysis. For instance, value profiling needs only the
registers used to pass arguments to functions.

With no parameter, --intr-regs still records all possible registers
for the architecture.

To name registers, it is necessary to use the long form of the option,
i.e., --intr-regs:

  $ perf record --intr-regs=si,di,r8,r9 .....

To record all possible registers:

  $ perf record -I .....
  $ perf record --intr-regs ...

To display the registers, one can use perf report -D

To list the available registers:

  $ perf record --intr-regs=\?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt |  6 ++-
 tools/perf/builtin-record.c              |  7 +++-
 tools/perf/perf.h                        |  2 +-
 tools/perf/util/Build                    |  1 +
 tools/perf/util/evsel.c                  |  2 +-
 tools/perf/util/parse-regs-options.c     | 71 ++++++++++++++++++++++++++++++++
 tools/perf/util/parse-regs-options.h     |  5 +++
 7 files changed, 89 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 347a273..2e9ce77 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -276,7 +276,11 @@ filter out the startup phase of the program, which is often very different.
 --intr-regs::
 Capture machine state (registers) at interrupt, i.e., on counter overflows for
 each sample. List of captured registers depends on the architecture. This option
-is off by default.
+is off by default. It is possible to select the registers to sample using their
+symbolic names, e.g. on x86, ax, si. To list the available registers use
+--intr-regs=\?. To name registers, pass a comma separated list such as
+--intr-regs=ax,bx. The list of register is architecture dependent.
+
 
 --running-time::
 Record running and enabled time for read events (:S)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1d14f38..142eeb3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -27,8 +27,10 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/data.h"
+#include "util/perf_regs.h"
 #include "util/auxtrace.h"
 #include "util/parse-branch-options.h"
+#include "util/parse-regs-options.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -1080,8 +1082,9 @@ struct option __record_options[] = {
 		    "sample transaction flags (special events only)"),
 	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
 		    "use per-thread mmaps"),
-	OPT_BOOLEAN('I', "intr-regs", &record.opts.sample_intr_regs,
-		    "Sample machine registers on interrupt"),
+	OPT_CALLBACK_OPTARG('I', "intr-regs", &record.opts.sample_intr_regs, NULL, "any register",
+		    "sample selected machine registers on interrupt,"
+		    " use -I ? to list register names", parse_regs),
 	OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
 		    "Record running/enabled time of read (:S) events"),
 	OPT_CALLBACK('k', "clockid", &record.opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cccb4cf..90129ac 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -54,7 +54,6 @@ struct record_opts {
 	bool	     sample_time_set;
 	bool	     callgraph_set;
 	bool	     period;
-	bool	     sample_intr_regs;
 	bool	     running_time;
 	bool	     full_auxtrace;
 	bool	     auxtrace_snapshot_mode;
@@ -64,6 +63,7 @@ struct record_opts {
 	unsigned int auxtrace_mmap_pages;
 	unsigned int user_freq;
 	u64          branch_stack;
+	u64	     sample_intr_regs;
 	u64	     default_interval;
 	u64	     user_interval;
 	size_t	     auxtrace_snapshot_size;
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index e79e452..349bc96 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -83,6 +83,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
 libperf-$(CONFIG_AUXTRACE) += intel-pt.o
 libperf-$(CONFIG_AUXTRACE) += intel-bts.o
 libperf-y += parse-branch-options.o
+libperf-y += parse-regs-options.o
 
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-file.o
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index bac25f4..c53f791 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -787,7 +787,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 		perf_evsel__config_callgraph(evsel, opts, &callchain_param);
 
 	if (opts->sample_intr_regs) {
-		attr->sample_regs_intr = PERF_REGS_MASK;
+		attr->sample_regs_intr = opts->sample_intr_regs;
 		perf_evsel__set_sample_bit(evsel, REGS_INTR);
 	}
 
diff --git a/tools/perf/util/parse-regs-options.c b/tools/perf/util/parse-regs-options.c
new file mode 100644
index 0000000..4f2c1c2
--- /dev/null
+++ b/tools/perf/util/parse-regs-options.c
@@ -0,0 +1,71 @@
+#include "perf.h"
+#include "util/util.h"
+#include "util/debug.h"
+#include "util/parse-options.h"
+#include "util/parse-regs-options.h"
+
+int
+parse_regs(const struct option *opt, const char *str, int unset)
+{
+	uint64_t *mode = (uint64_t *)opt->value;
+	const struct sample_reg *r;
+	char *s, *os = NULL, *p;
+	int ret = -1;
+
+	if (unset)
+		return 0;
+
+	/*
+	 * cannot set it twice
+	 */
+	if (*mode)
+		return -1;
+
+	/* str may be NULL in case no arg is passed to -I */
+	if (str) {
+		/* because str is read-only */
+		s = os = strdup(str);
+		if (!s)
+			return -1;
+
+		for (;;) {
+			p = strchr(s, ',');
+			if (p)
+				*p = '\0';
+
+			if (!strcmp(s, "?")) {
+				fprintf(stderr, "available registers: ");
+				for (r = sample_reg_masks; r->name; r++) {
+					fprintf(stderr, "%s ", r->name);
+				}
+				fputc('\n', stderr);
+				/* just printing available regs */
+				return -1;
+			}
+			for (r = sample_reg_masks; r->name; r++) {
+				if (!strcasecmp(s, r->name))
+					break;
+			}
+			if (!r->name) {
+				ui__warning("unknown register %s,"
+					    " check man page\n", s);
+				goto error;
+			}
+
+			*mode |= r->mask;
+
+			if (!p)
+				break;
+
+			s = p + 1;
+		}
+	}
+	ret = 0;
+
+	/* default to all possible regs */
+	if (*mode == 0)
+		*mode = PERF_REGS_MASK;
+error:
+	free(os);
+	return ret;
+}
diff --git a/tools/perf/util/parse-regs-options.h b/tools/perf/util/parse-regs-options.h
new file mode 100644
index 0000000..7d762b1
--- /dev/null
+++ b/tools/perf/util/parse-regs-options.h
@@ -0,0 +1,5 @@
+#ifndef _PERF_PARSE_REGS_OPTIONS_H
+#define _PERF_PARSE_REGS_OPTIONS_H 1
+struct option;
+int parse_regs(const struct option *opt, const char *str, int unset);
+#endif /* _PERF_PARSE_REGS_OPTIONS_H */

^ permalink raw reply related	[flat|nested] 13+ messages in thread
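
The option parsing in this patch boils down to: copy the argument, split it
on commas, look each name up, OR the masks together, and fall back to all
registers when no list is given. The sketch below shows that flow in
isolation under stated assumptions: demo_parse_regs(), demo_regs and
DEMO_ALL_REGS are hypothetical names, the six-register table is illustrative
only, and the "?" listing mode is left out. It releases the copy on every
exit path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

struct demo_reg {
	const char *name;
	uint64_t mask;
};

/* Illustrative table; bit positions are assumed for the example. */
static const struct demo_reg demo_regs[] = {
	{ "AX", 1ULL << 0 }, { "BX", 1ULL << 1 }, { "CX", 1ULL << 2 },
	{ "DX", 1ULL << 3 }, { "SI", 1ULL << 4 }, { "DI", 1ULL << 5 },
	{ NULL, 0 },
};
#define DEMO_ALL_REGS 0x3fULL	/* every bit in demo_regs */

/*
 * Parse "ax,si,..." into *mask. A NULL str selects all registers.
 * Returns 0 on success, -1 on an unknown name or allocation failure
 * (in which case *mask is left at 0).
 */
static int demo_parse_regs(const char *str, uint64_t *mask)
{
	const struct demo_reg *r;
	char *copy, *s, *p;
	int ret = -1;

	assert(mask);
	*mask = 0;

	if (!str) {
		*mask = DEMO_ALL_REGS;
		return 0;
	}

	/* str is read-only, so split a private copy in place */
	copy = strdup(str);
	if (!copy)
		return -1;

	for (s = copy; s; s = p) {
		p = strchr(s, ',');
		if (p)
			*p++ = '\0';	/* terminate token, step past comma */

		for (r = demo_regs; r->name; r++) {
			if (!strcasecmp(s, r->name))
				break;
		}
		if (!r->name)
			goto out;	/* unknown register name */

		*mask |= r->mask;
	}
	ret = 0;
out:
	if (ret)
		*mask = 0;
	free(copy);
	return ret;
}
```

For example, demo_parse_regs("ax,SI", &m) sets bits 0 and 4 in m, while a
list containing an unknown name fails as a whole rather than recording a
partial selection.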

* [tip:perf/core] perf script: Enable printing of branch stack
  2015-08-31 16:41 ` [PATCH v2 4/4] perf script: enable printing of branch stack Stephane Eranian
  2015-08-31 17:05   ` Andi Kleen
@ 2015-10-30  9:13   ` tip-bot for Stephane Eranian
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-10-30  9:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: cyfmxc, mingo, kan.liang, dsahern, namhyung, tglx, eranian, ak,
	acme, peterz, jolsa, linux-kernel, hpa, adrian.hunter

Commit-ID:  dc323ce8e72d6d1beb9af9bbd29c4d55ce3d7fb0
Gitweb:     http://git.kernel.org/tip/dc323ce8e72d6d1beb9af9bbd29c4d55ce3d7fb0
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Mon, 31 Aug 2015 18:41:13 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Oct 2015 17:16:20 -0300

perf script: Enable printing of branch stack

This patch improves perf script by enabling printing of the
branch stack via the 'brstack' and 'brstacksym' arguments to
the field selection option -F. The option is off by default
and operates only if the perf.data file has branch stack content.

The branches are printed in from/to pairs. The most recent branch
is printed first. The number of branch entries varies based on the
underlying hardware and filtering used.

The brstack prints FROM/TO addresses in raw hexadecimal format.
The brstacksym prints FROM/TO addresses in symbolic form wherever
possible.

 $ perf script -F ip,brstack
  5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ...

 $ perf script -F ip,brstacksym
  4011e0 noploop+0x0/noploop+0x0/P/-/-/0

The notation F/T/M/X/A/C describes the attributes of the branch.
F=from, T=to, M/P=misprediction/prediction, X=TSX, A=TSX abort, C=cycles (SKL)
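
For post-processing, one raw brstack entry can be split back into its
fields. The sketch below assumes the six-field FROM/TO/M/X/A/C layout
described above with hexadecimal addresses; struct demo_branch and
demo_parse_brstack() are hypothetical names used only for this example and
are not part of perf.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical decoded form of one brstack entry. */
struct demo_branch {
	uint64_t from;
	uint64_t to;
	char pred;	/* 'M', 'P' or '-' */
	int in_tx;	/* 1 if the X flag was printed */
	int aborted;	/* 1 if the A flag was printed */
	int cycles;
};

/*
 * Parse one entry such as "0x401aa0/0x5d2000/M/-/-/0".
 * Returns 0 on success, -1 if the entry does not have six fields.
 * On failure the contents of *b are unspecified.
 */
static int demo_parse_brstack(const char *s, struct demo_branch *b)
{
	char x, a;

	assert(s && b);

	/* The hex conversion accepts the leading "0x" on its own. */
	if (sscanf(s, "%" SCNx64 "/%" SCNx64 "/%c/%c/%c/%d",
		   &b->from, &b->to, &b->pred, &x, &a, &b->cycles) != 6)
		return -1;

	b->in_tx = (x == 'X');
	b->aborted = (a == 'A');
	return 0;
}
```

The brstacksym form would need a different parser, since FROM and TO are
then symbol+offset strings rather than numbers.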

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yuanfang Chen <cyfmxc@gmail.com>
Link: http://lkml.kernel.org/r/1441039273-16260-5-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 14 +++++-
 tools/perf/builtin-script.c              | 82 +++++++++++++++++++++++++++++++-
 2 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index b3b42f9..382ddfb 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -112,11 +112,11 @@ OPTIONS
 --debug-mode::
         Do various checks like samples ordering and lost events.
 
--f::
+-F::
 --fields::
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
-	srcline, period, iregs, flags.
+	srcline, period, iregs, brstack, brstacksym, flags.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
@@ -175,6 +175,16 @@ OPTIONS
 	Finally, a user may not set fields to none for all event types.
 	i.e., -f "" is not allowed.
 
+	The brstack output includes branch related information with raw addresses using the
+	/v/v/v/v/ syntax in the following order:
+	FROM: branch source instruction
+	TO  : branch target instruction
+        M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
+	X/- : X=branch inside a transactional region, -=not in transaction region or not supported
+	A/- : A=TSX abort entry, -=not aborted region or not supported
+
+	The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.
+
 -k::
 --vmlinux=<file>::
         vmlinux pathname
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 278acb2..72b5deb 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -51,6 +51,8 @@ enum perf_output_field {
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
 	PERF_OUTPUT_PERIOD          = 1U << 13,
 	PERF_OUTPUT_IREGS	    = 1U << 14,
+	PERF_OUTPUT_BRSTACK	    = 1U << 15,
+	PERF_OUTPUT_BRSTACKSYM	    = 1U << 16,
 };
 
 struct output_option {
@@ -72,6 +74,8 @@ struct output_option {
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
 	{.str = "period", .field = PERF_OUTPUT_PERIOD},
 	{.str = "iregs", .field = PERF_OUTPUT_IREGS},
+	{.str = "brstack", .field = PERF_OUTPUT_BRSTACK},
+	{.str = "brstacksym", .field = PERF_OUTPUT_BRSTACKSYM},
 };
 
 /* default set to maintain compatibility with current format */
@@ -425,6 +429,77 @@ static void print_sample_start(struct perf_sample *sample,
 	}
 }
 
+static inline char
+mispred_str(struct branch_entry *br)
+{
+	if (!(br->flags.mispred || br->flags.predicted))
+		return '-';
+
+	return br->flags.predicted ? 'P' : 'M';
+}
+
+static void print_sample_brstack(union perf_event *event __maybe_unused,
+			  struct perf_sample *sample,
+			  struct thread *thread __maybe_unused,
+			  struct perf_event_attr *attr __maybe_unused)
+{
+	struct branch_stack *br = sample->branch_stack;
+	u64 i;
+
+	if (!(br && br->nr))
+		return;
+
+	for (i = 0; i < br->nr; i++) {
+		printf(" 0x%"PRIx64"/0x%"PRIx64"/%c/%c/%c/%d ",
+			br->entries[i].from,
+			br->entries[i].to,
+			mispred_str(br->entries + i),
+			br->entries[i].flags.in_tx ? 'X' : '-',
+			br->entries[i].flags.abort ? 'A' : '-',
+			br->entries[i].flags.cycles);
+	}
+}
+
+static void print_sample_brstacksym(union perf_event *event,
+			  struct perf_sample *sample,
+			  struct thread *thread,
+			  struct perf_event_attr *attr __maybe_unused)
+{
+	struct branch_stack *br = sample->branch_stack;
+	struct addr_location alf, alt;
+	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+	u64 i, from, to;
+
+	if (!(br && br->nr))
+		return;
+
+	for (i = 0; i < br->nr; i++) {
+
+		memset(&alf, 0, sizeof(alf));
+		memset(&alt, 0, sizeof(alt));
+		from = br->entries[i].from;
+		to   = br->entries[i].to;
+
+		thread__find_addr_map(thread, cpumode, MAP__FUNCTION, from, &alf);
+		if (alf.map)
+			alf.sym = map__find_symbol(alf.map, alf.addr, NULL);
+
+		thread__find_addr_map(thread, cpumode, MAP__FUNCTION, to, &alt);
+		if (alt.map)
+			alt.sym = map__find_symbol(alt.map, alt.addr, NULL);
+
+		symbol__fprintf_symname_offs(alf.sym, &alf, stdout);
+		putchar('/');
+		symbol__fprintf_symname_offs(alt.sym, &alt, stdout);
+		printf("/%c/%c/%c/%d ",
+			mispred_str(br->entries + i),
+			br->entries[i].flags.in_tx ? 'X' : '-',
+			br->entries[i].flags.abort ? 'A' : '-',
+			br->entries[i].flags.cycles);
+	}
+}
+
+
 static void print_sample_addr(union perf_event *event,
 			  struct perf_sample *sample,
 			  struct thread *thread,
@@ -560,6 +635,11 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 	if (PRINT_FIELD(IREGS))
 		print_sample_iregs(event, sample, thread, attr);
 
+	if (PRINT_FIELD(BRSTACK))
+		print_sample_brstack(event, sample, thread, attr);
+	else if (PRINT_FIELD(BRSTACKSYM))
+		print_sample_brstacksym(event, sample, thread, attr);
+
 	printf("\n");
 }
 
@@ -1681,7 +1761,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,iregs,flags", parse_output_fields),
+		     "addr,symoff,period,iregs,brstack,brstacksym,flags", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
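
For postprocessing, one entry printed by -F brstack can be read back with a
single sscanf() call. The following is only a minimal sketch, assuming the
"0xFROM/0xTO/pred/in_tx/abort/cycles" layout produced by print_sample_brstack()
above. struct brstack_entry and brstack_parse() are hypothetical names used for
illustration; they are not part of perf.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for one decoded brstack entry (not a perf type). */
struct brstack_entry {
	uint64_t from;		/* branch source address */
	uint64_t to;		/* branch target address */
	char pred;		/* 'M', 'P' or '-' */
	bool in_tx;		/* 'X' was printed */
	bool abort;		/* 'A' was printed */
	unsigned int cycles;	/* trailing cycle count */
};

/*
 * Parse one "0xFROM/0xTO/P/X/A/CYCLES" token.
 * Leading whitespace is skipped by the first conversion.
 * Returns 0 on success, -1 if the token does not match the layout.
 */
static int brstack_parse(const char *tok, struct brstack_entry *e)
{
	uint64_t from, to;
	unsigned int cycles;
	char pred, tx, ab;

	if (sscanf(tok, "%" SCNx64 "/%" SCNx64 "/%c/%c/%c/%u",
		   &from, &to, &pred, &tx, &ab, &cycles) != 6)
		return -1;

	/* Reject flag characters the documented format never prints. */
	if (pred != 'M' && pred != 'P' && pred != '-')
		return -1;
	if ((tx != 'X' && tx != '-') || (ab != 'A' && ab != '-'))
		return -1;

	e->from = from;
	e->to = to;
	e->pred = pred;
	e->in_tx = (tx == 'X');
	e->abort = (ab == 'A');
	e->cycles = cycles;
	return 0;
}
```

A caller would split each output line on spaces, skip the leading ip field, and
pass every remaining token to brstack_parse(). The brstacksym variant prints
symbol+offset instead of raw addresses, so it would need a different parser.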

end of thread, other threads:[~2015-10-30  9:14 UTC | newest]

Thread overview: 13+ messages
2015-08-31 16:41 [PATCH v2 0/4] perf: improve script and record for iregs and brstack Stephane Eranian
2015-08-31 16:41 ` [PATCH v2 1/4] perf script: enable printing of interrupted machine state Stephane Eranian
2015-08-31 20:51   ` Arnaldo Carvalho de Melo
2015-09-01  8:31   ` [tip:perf/urgent] perf script: Enable " tip-bot for Stephane Eranian
2015-08-31 16:41 ` [PATCH v2 2/4] perf/x86: add list of register names Stephane Eranian
2015-09-01  8:31   ` [tip:perf/urgent] perf/x86: Add " tip-bot for Stephane Eranian
2015-08-31 16:41 ` [PATCH v2 3/4] perf record: add ability to name registers to record Stephane Eranian
2015-08-31 21:02   ` Arnaldo Carvalho de Melo
2015-09-01  8:32   ` [tip:perf/urgent] perf record: Add " tip-bot for Stephane Eranian
2015-08-31 16:41 ` [PATCH v2 4/4] perf script: enable printing of branch stack Stephane Eranian
2015-08-31 17:05   ` Andi Kleen
2015-08-31 17:08     ` Andi Kleen
2015-10-30  9:13   ` [tip:perf/core] perf script: Enable " tip-bot for Stephane Eranian
