linux-kernel.vger.kernel.org archive mirror
* [GIT PULL 00/27] perf/core improvements and fixes
@ 2014-06-01 13:31 Jiri Olsa
  2014-06-01 13:31 ` [PATCH 01/27] perf tools: Introduce hists__inc_nr_samples() Jiri Olsa
                   ` (27 more replies)
  0 siblings, 28 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo, Arun Sharma,
	David Ahern, Don Zickus, Frederic Weisbecker, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Rodrigo Campos, Stephane Eranian,
	Jiri Olsa

hi Ingo,
please consider pulling

thanks,
jirka


The following changes since commit e450f90e8c7d0bf70519223c1b848446ae63f313:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-05-22 11:37:40 +0200)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo

for you to fetch changes up to 0506aecce999d4370b979892f88cf1118cfe8dcb:

  perf tests: Add a test case for cumulating callchains (2014-06-01 14:35:11 +0200)

----------------------------------------------------------------
perf/core improvements and fixes:

. Add support to accumulate hist periods (Namhyung Kim)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>

----------------------------------------------------------------
Namhyung Kim (27):
      perf tools: Introduce hists__inc_nr_samples()
      perf tools: Introduce struct hist_entry_iter
      perf hists: Add support for accumulated stat of hist entry
      perf hists: Check if accumulated when adding a hist entry
      perf hists: Accumulate hist entry stat based on the callchain
      perf tools: Update cpumode for each cumulative entry
      perf report: Cache cumulative callchains
      perf callchain: Add callchain_cursor_snapshot()
      perf tools: Save callchain info for each cumulative entry
      perf ui/hist: Add support to accumulated hist stat
      perf ui/browser: Add support to accumulated hist stat
      perf ui/gtk: Add support to accumulated hist stat
      perf tools: Apply percent-limit to cumulative percentage
      perf tools: Add more hpp helper functions
      perf report: Add --children option
      perf report: Add report.children config option
      perf tools: Do not auto-remove Children column if --fields given
      perf tools: Add callback function to hist_entry_iter
      perf top: Convert to hist_entry_iter
      perf top: Add --children option
      perf top: Add top.children config option
      perf tools: Enable --children option by default
      perf ui/stdio: Fix invalid percentage value of cumulated hist entries
      perf ui/gtk: Fix callchain display
      perf tools: Reset output/sort order to default
      perf tests: Define and use symbolic names for fake symbols
      perf tests: Add a test case for cumulating callchains

 tools/perf/Documentation/perf-report.txt |   7 +-
 tools/perf/Documentation/perf-top.txt    |   8 +-
 tools/perf/Makefile.perf                 |   1 +
 tools/perf/builtin-annotate.c            |   5 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-report.c              | 210 +++------
 tools/perf/builtin-sched.c               |   2 +-
 tools/perf/builtin-top.c                 |  90 ++--
 tools/perf/tests/builtin-test.c          |   4 +
 tools/perf/tests/hists_common.c          |  52 ++-
 tools/perf/tests/hists_common.h          |  32 +-
 tools/perf/tests/hists_cumulate.c        | 726 +++++++++++++++++++++++++++++++
 tools/perf/tests/hists_filter.c          |  39 +-
 tools/perf/tests/hists_link.c            |  36 +-
 tools/perf/tests/hists_output.c          |  31 +-
 tools/perf/tests/tests.h                 |   1 +
 tools/perf/ui/browsers/hists.c           |  65 +--
 tools/perf/ui/gtk/hists.c                |  33 +-
 tools/perf/ui/hist.c                     | 119 +++++
 tools/perf/ui/stdio/hist.c               |   8 +-
 tools/perf/util/callchain.c              |  45 +-
 tools/perf/util/callchain.h              |  11 +
 tools/perf/util/hist.c                   | 481 +++++++++++++++++++-
 tools/perf/util/hist.h                   |  49 ++-
 tools/perf/util/sort.c                   |   4 +
 tools/perf/util/sort.h                   |  18 +-
 tools/perf/util/symbol.c                 |  11 +-
 tools/perf/util/symbol.h                 |   1 +
 28 files changed, 1768 insertions(+), 323 deletions(-)
 create mode 100644 tools/perf/tests/hists_cumulate.c


* [PATCH 01/27] perf tools: Introduce hists__inc_nr_samples()
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 02/27] perf tools: Introduce struct hist_entry_iter Jiri Olsa
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

There is some duplicated code for counting the number of samples.  Add
hists__inc_nr_samples() and reuse it.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1401335910-16832-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-annotate.c   | 2 +-
 tools/perf/builtin-report.c     | 4 +---
 tools/perf/builtin-sched.c      | 2 +-
 tools/perf/builtin-top.c        | 5 +----
 tools/perf/tests/hists_filter.c | 4 +---
 tools/perf/util/hist.c          | 7 +++++++
 tools/perf/util/hist.h          | 1 +
 7 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index d30d2c2..bf52461 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -70,7 +70,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		return -ENOMEM;
 
 	ret = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
-	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+	hists__inc_nr_samples(&evsel->hists, true);
 	return ret;
 }
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bc0eec1..4a3b84d 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -92,9 +92,7 @@ static void report__inc_stats(struct report *rep, struct hist_entry *he)
 	 * counted in perf_session_deliver_event().  The dump_trace
 	 * requires this info is ready before going to the output tree.
 	 */
-	hists__inc_nr_events(he->hists, PERF_RECORD_SAMPLE);
-	if (!he->filtered)
-		he->hists->stats.nr_non_filtered_samples++;
+	hists__inc_nr_samples(he->hists, he->filtered);
 }
 
 static int report__add_mem_hist_entry(struct report *rep, struct addr_location *al,
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index d717683..c38d06c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1428,7 +1428,7 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_
 	int err = 0;
 
 	evsel->hists.stats.total_period += sample->period;
-	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+	hists__inc_nr_samples(&evsel->hists, true);
 
 	if (evsel->handler != NULL) {
 		tracepoint_handler f = evsel->handler;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5b389ce..5130926 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -252,10 +252,7 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
 	if (he == NULL)
 		return NULL;
 
-	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
-	if (!he->filtered)
-		evsel->hists.stats.nr_non_filtered_samples++;
-
+	hists__inc_nr_samples(&evsel->hists, he->filtered);
 	return he;
 }
 
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index c5ba924..0a71ef4 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -85,9 +85,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 			fake_samples[i].map = al.map;
 			fake_samples[i].sym = al.sym;
 
-			hists__inc_nr_events(he->hists, PERF_RECORD_SAMPLE);
-			if (!he->filtered)
-				he->hists->stats.nr_non_filtered_samples++;
+			hists__inc_nr_samples(he->hists, he->filtered);
 		}
 	}
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b262b44..5943ba6 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -800,6 +800,13 @@ void hists__inc_nr_events(struct hists *hists, u32 type)
 	events_stats__inc(&hists->stats, type);
 }
 
+void hists__inc_nr_samples(struct hists *hists, bool filtered)
+{
+	events_stats__inc(&hists->stats, PERF_RECORD_SAMPLE);
+	if (!filtered)
+		hists->stats.nr_non_filtered_samples++;
+}
+
 static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 						 struct hist_entry *pair)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index a8418d1..03ae1db 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -119,6 +119,7 @@ u64 hists__total_period(struct hists *hists);
 void hists__reset_stats(struct hists *hists);
 void hists__inc_stats(struct hists *hists, struct hist_entry *h);
 void hists__inc_nr_events(struct hists *hists, u32 type);
+void hists__inc_nr_samples(struct hists *hists, bool filtered);
 void events_stats__inc(struct events_stats *stats, u32 type);
 size_t events_stats__fprintf(struct events_stats *stats, FILE *fp);
 
-- 
1.8.3.1



* [PATCH 02/27] perf tools: Introduce struct hist_entry_iter
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
  2014-06-01 13:31 ` [PATCH 01/27] perf tools: Introduce hists__inc_nr_samples() Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 03/27] perf hists: Add support for accumulated stat of hist entry Jiri Olsa
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, David Ahern, Frederic Weisbecker,
	Stephane Eranian, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

There is some duplicated code for adding hist entries.  The variants
differ in that some carry branch info or mem info, but they generally
do the same thing.  So introduce a new struct hist_entry_iter and add
callbacks to customize each case in a general way.

The new hist_entry_iter__add() function will look like:

  iter->prepare_entry();
  iter->add_single_entry();

  while (iter->next_entry())
    iter->add_next_entry();

  iter->finish_entry();

This will help further work like the cumulative callchain patchset.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1401335910-16832-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-report.c     | 192 ++++----------------------
 tools/perf/tests/hists_filter.c |  16 ++-
 tools/perf/tests/hists_output.c |  11 +-
 tools/perf/util/hist.c          | 299 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/hist.h          |  33 +++++
 5 files changed, 372 insertions(+), 179 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4a3b84d..3201bdf 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -76,163 +76,16 @@ static int report__config(const char *var, const char *value, void *cb)
 	return perf_default_config(var, value, cb);
 }
 
-static void report__inc_stats(struct report *rep, struct hist_entry *he)
+static void report__inc_stats(struct report *rep,
+			      struct hist_entry *he __maybe_unused)
 {
 	/*
-	 * The @he is either of a newly created one or an existing one
-	 * merging current sample.  We only want to count a new one so
-	 * checking ->nr_events being 1.
+	 * We cannot access @he at this time.  Just assume it's a new entry.
+	 * It'll be fixed once we have a callback mechanism in hist_iter.
 	 */
-	if (he->stat.nr_events == 1)
-		rep->nr_entries++;
-
-	/*
-	 * Only counts number of samples at this stage as it's more
-	 * natural to do it here and non-sample events are also
-	 * counted in perf_session_deliver_event().  The dump_trace
-	 * requires this info is ready before going to the output tree.
-	 */
-	hists__inc_nr_samples(he->hists, he->filtered);
-}
-
-static int report__add_mem_hist_entry(struct report *rep, struct addr_location *al,
-				      struct perf_sample *sample, struct perf_evsel *evsel)
-{
-	struct symbol *parent = NULL;
-	struct hist_entry *he;
-	struct mem_info *mi, *mx;
-	uint64_t cost;
-	int err = sample__resolve_callchain(sample, &parent, evsel, al, rep->max_stack);
-
-	if (err)
-		return err;
-
-	mi = sample__resolve_mem(sample, al);
-	if (!mi)
-		return -ENOMEM;
-
-	if (rep->hide_unresolved && !al->sym)
-		return 0;
-
-	cost = sample->weight;
-	if (!cost)
-		cost = 1;
-
-	/*
-	 * must pass period=weight in order to get the correct
-	 * sorting from hists__collapse_resort() which is solely
-	 * based on periods. We want sorting be done on nr_events * weight
-	 * and this is indirectly achieved by passing period=weight here
-	 * and the he_stat__add_period() function.
-	 */
-	he = __hists__add_entry(&evsel->hists, al, parent, NULL, mi,
-				cost, cost, 0);
-	if (!he)
-		return -ENOMEM;
-
-	if (ui__has_annotation()) {
-		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
-		if (err)
-			goto out;
-
-		mx = he->mem_info;
-		err = addr_map_symbol__inc_samples(&mx->daddr, evsel->idx);
-		if (err)
-			goto out;
-	}
-
-	report__inc_stats(rep, he);
-
-	err = hist_entry__append_callchain(he, sample);
-out:
-	return err;
-}
-
-static int report__add_branch_hist_entry(struct report *rep, struct addr_location *al,
-					 struct perf_sample *sample, struct perf_evsel *evsel)
-{
-	struct symbol *parent = NULL;
-	unsigned i;
-	struct hist_entry *he;
-	struct branch_info *bi, *bx;
-	int err = sample__resolve_callchain(sample, &parent, evsel, al, rep->max_stack);
-
-	if (err)
-		return err;
-
-	bi = sample__resolve_bstack(sample, al);
-	if (!bi)
-		return -ENOMEM;
-
-	for (i = 0; i < sample->branch_stack->nr; i++) {
-		if (rep->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
-			continue;
-
-		err = -ENOMEM;
-
-		/* overwrite the 'al' to branch-to info */
-		al->map = bi[i].to.map;
-		al->sym = bi[i].to.sym;
-		al->addr = bi[i].to.addr;
-		/*
-		 * The report shows the percentage of total branches captured
-		 * and not events sampled. Thus we use a pseudo period of 1.
-		 */
-		he = __hists__add_entry(&evsel->hists, al, parent, &bi[i], NULL,
-					1, 1, 0);
-		if (he) {
-			if (ui__has_annotation()) {
-				bx = he->branch_info;
-				err = addr_map_symbol__inc_samples(&bx->from,
-								   evsel->idx);
-				if (err)
-					goto out;
-
-				err = addr_map_symbol__inc_samples(&bx->to,
-								   evsel->idx);
-				if (err)
-					goto out;
-			}
-			report__inc_stats(rep, he);
-		} else
-			goto out;
-	}
-	err = 0;
-out:
-	free(bi);
-	return err;
+	rep->nr_entries++;
 }
 
-static int report__add_hist_entry(struct report *rep, struct perf_evsel *evsel,
-				  struct addr_location *al, struct perf_sample *sample)
-{
-	struct symbol *parent = NULL;
-	struct hist_entry *he;
-	int err = sample__resolve_callchain(sample, &parent, evsel, al, rep->max_stack);
-
-	if (err)
-		return err;
-
-	he = __hists__add_entry(&evsel->hists, al, parent, NULL, NULL,
-				sample->period, sample->weight,
-				sample->transaction);
-	if (he == NULL)
-		return -ENOMEM;
-
-	err = hist_entry__append_callchain(he, sample);
-	if (err)
-		goto out;
-
-	if (ui__has_annotation())
-		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
-
-	report__inc_stats(rep, he);
-
-out:
-	return err;
-}
-
-
 static int process_sample_event(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
@@ -241,6 +94,9 @@ static int process_sample_event(struct perf_tool *tool,
 {
 	struct report *rep = container_of(tool, struct report, tool);
 	struct addr_location al;
+	struct hist_entry_iter iter = {
+		.hide_unresolved = rep->hide_unresolved,
+	};
 	int ret;
 
 	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
@@ -255,22 +111,22 @@ static int process_sample_event(struct perf_tool *tool,
 	if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap))
 		return 0;
 
-	if (sort__mode == SORT_MODE__BRANCH) {
-		ret = report__add_branch_hist_entry(rep, &al, sample, evsel);
-		if (ret < 0)
-			pr_debug("problem adding lbr entry, skipping event\n");
-	} else if (rep->mem_mode == 1) {
-		ret = report__add_mem_hist_entry(rep, &al, sample, evsel);
-		if (ret < 0)
-			pr_debug("problem adding mem entry, skipping event\n");
-	} else {
-		if (al.map != NULL)
-			al.map->dso->hit = 1;
-
-		ret = report__add_hist_entry(rep, evsel, &al, sample);
-		if (ret < 0)
-			pr_debug("problem incrementing symbol period, skipping event\n");
-	}
+	if (sort__mode == SORT_MODE__BRANCH)
+		iter.ops = &hist_iter_branch;
+	else if (rep->mem_mode)
+		iter.ops = &hist_iter_mem;
+	else
+		iter.ops = &hist_iter_normal;
+
+	if (al.map != NULL)
+		al.map->dso->hit = 1;
+
+	report__inc_stats(rep, NULL);
+
+	ret = hist_entry_iter__add(&iter, &al, evsel, sample, rep->max_stack);
+	if (ret < 0)
+		pr_debug("problem adding hist entry, skipping event\n");
+
 	return ret;
 }
 
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 0a71ef4..76b02e1 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -42,11 +42,11 @@ static struct sample fake_samples[] = {
 	{ .pid = 300, .ip = 0xf0000 + 800, },
 };
 
-static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
+static int add_hist_entries(struct perf_evlist *evlist,
+			    struct machine *machine __maybe_unused)
 {
 	struct perf_evsel *evsel;
 	struct addr_location al;
-	struct hist_entry *he;
 	struct perf_sample sample = { .cpu = 0, };
 	size_t i;
 
@@ -62,6 +62,10 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 					.misc = PERF_RECORD_MISC_USER,
 				},
 			};
+			struct hist_entry_iter iter = {
+				.ops = &hist_iter_normal,
+				.hide_unresolved = false,
+			};
 
 			/* make sure it has no filter at first */
 			evsel->hists.thread_filter = NULL;
@@ -71,21 +75,19 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 			sample.pid = fake_samples[i].pid;
 			sample.tid = fake_samples[i].pid;
 			sample.ip = fake_samples[i].ip;
+			sample.period = 100;
 
 			if (perf_event__preprocess_sample(&event, machine, &al,
 							  &sample) < 0)
 				goto out;
 
-			he = __hists__add_entry(&evsel->hists, &al, NULL,
-						NULL, NULL, 100, 1, 0);
-			if (he == NULL)
+			if (hist_entry_iter__add(&iter, &al, evsel, &sample,
+						 PERF_MAX_STACK_DEPTH) < 0)
 				goto out;
 
 			fake_samples[i].thread = al.thread;
 			fake_samples[i].map = al.map;
 			fake_samples[i].sym = al.sym;
-
-			hists__inc_nr_samples(he->hists, he->filtered);
 		}
 	}
 
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index a168505..1308f88 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -46,7 +46,7 @@ static struct sample fake_samples[] = {
 static int add_hist_entries(struct hists *hists, struct machine *machine)
 {
 	struct addr_location al;
-	struct hist_entry *he;
+	struct perf_evsel *evsel = hists_to_evsel(hists);
 	struct perf_sample sample = { .period = 100, };
 	size_t i;
 
@@ -56,6 +56,10 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 				.misc = PERF_RECORD_MISC_USER,
 			},
 		};
+		struct hist_entry_iter iter = {
+			.ops = &hist_iter_normal,
+			.hide_unresolved = false,
+		};
 
 		sample.cpu = fake_samples[i].cpu;
 		sample.pid = fake_samples[i].pid;
@@ -66,9 +70,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 						  &sample) < 0)
 			goto out;
 
-		he = __hists__add_entry(hists, &al, NULL, NULL, NULL,
-					sample.period, 1, 0);
-		if (he == NULL)
+		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
+					 PERF_MAX_STACK_DEPTH) < 0)
 			goto out;
 
 		fake_samples[i].thread = al.thread;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 5943ba6..d866235 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -4,6 +4,7 @@
 #include "session.h"
 #include "sort.h"
 #include "evsel.h"
+#include "annotate.h"
 #include <math.h>
 
 static bool hists__filter_entry_by_dso(struct hists *hists,
@@ -429,6 +430,304 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 	return add_hist_entry(hists, &entry, al);
 }
 
+static int
+iter_next_nop_entry(struct hist_entry_iter *iter __maybe_unused,
+		    struct addr_location *al __maybe_unused)
+{
+	return 0;
+}
+
+static int
+iter_add_next_nop_entry(struct hist_entry_iter *iter __maybe_unused,
+			struct addr_location *al __maybe_unused)
+{
+	return 0;
+}
+
+static int
+iter_prepare_mem_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	struct perf_sample *sample = iter->sample;
+	struct mem_info *mi;
+
+	mi = sample__resolve_mem(sample, al);
+	if (mi == NULL)
+		return -ENOMEM;
+
+	iter->priv = mi;
+	return 0;
+}
+
+static int
+iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	u64 cost;
+	struct mem_info *mi = iter->priv;
+	struct hist_entry *he;
+
+	if (mi == NULL)
+		return -EINVAL;
+
+	cost = iter->sample->weight;
+	if (!cost)
+		cost = 1;
+
+	/*
+	 * must pass period=weight in order to get the correct
+	 * sorting from hists__collapse_resort() which is solely
+	 * based on periods. We want sorting be done on nr_events * weight
+	 * and this is indirectly achieved by passing period=weight here
+	 * and the he_stat__add_period() function.
+	 */
+	he = __hists__add_entry(&iter->evsel->hists, al, iter->parent, NULL, mi,
+				cost, cost, 0);
+	if (!he)
+		return -ENOMEM;
+
+	iter->he = he;
+	return 0;
+}
+
+static int
+iter_finish_mem_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct hist_entry *he = iter->he;
+	struct mem_info *mx;
+	int err = -EINVAL;
+
+	if (he == NULL)
+		goto out;
+
+	if (ui__has_annotation()) {
+		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+		if (err)
+			goto out;
+
+		mx = he->mem_info;
+		err = addr_map_symbol__inc_samples(&mx->daddr, evsel->idx);
+		if (err)
+			goto out;
+	}
+
+	hists__inc_nr_samples(&evsel->hists, he->filtered);
+
+	err = hist_entry__append_callchain(he, iter->sample);
+
+out:
+	/*
+	 * We don't need to free iter->priv (mem_info) here since
+	 * the mem info was either already freed in add_hist_entry() or
+	 * passed to a new hist entry by hist_entry__new().
+	 */
+	iter->priv = NULL;
+
+	iter->he = NULL;
+	return err;
+}
+
+static int
+iter_prepare_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	struct branch_info *bi;
+	struct perf_sample *sample = iter->sample;
+
+	bi = sample__resolve_bstack(sample, al);
+	if (!bi)
+		return -ENOMEM;
+
+	iter->curr = 0;
+	iter->total = sample->branch_stack->nr;
+
+	iter->priv = bi;
+	return 0;
+}
+
+static int
+iter_add_single_branch_entry(struct hist_entry_iter *iter __maybe_unused,
+			     struct addr_location *al __maybe_unused)
+{
+	return 0;
+}
+
+static int
+iter_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	struct branch_info *bi = iter->priv;
+	int i = iter->curr;
+
+	if (bi == NULL)
+		return 0;
+
+	if (iter->curr >= iter->total)
+		return 0;
+
+	al->map = bi[i].to.map;
+	al->sym = bi[i].to.sym;
+	al->addr = bi[i].to.addr;
+	return 1;
+}
+
+static int
+iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	struct branch_info *bi, *bx;
+	struct perf_evsel *evsel = iter->evsel;
+	struct hist_entry *he = NULL;
+	int i = iter->curr;
+	int err = 0;
+
+	bi = iter->priv;
+
+	if (iter->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
+		goto out;
+
+	/*
+	 * The report shows the percentage of total branches captured
+	 * and not events sampled. Thus we use a pseudo period of 1.
+	 */
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, &bi[i], NULL,
+				1, 1, 0);
+	if (he == NULL)
+		return -ENOMEM;
+
+	if (ui__has_annotation()) {
+		bx = he->branch_info;
+		err = addr_map_symbol__inc_samples(&bx->from, evsel->idx);
+		if (err)
+			goto out;
+
+		err = addr_map_symbol__inc_samples(&bx->to, evsel->idx);
+		if (err)
+			goto out;
+	}
+
+	hists__inc_nr_samples(&evsel->hists, he->filtered);
+
+out:
+	iter->he = he;
+	iter->curr++;
+	return err;
+}
+
+static int
+iter_finish_branch_entry(struct hist_entry_iter *iter,
+			 struct addr_location *al __maybe_unused)
+{
+	zfree(&iter->priv);
+	iter->he = NULL;
+
+	return iter->curr >= iter->total ? 0 : -1;
+}
+
+static int
+iter_prepare_normal_entry(struct hist_entry_iter *iter __maybe_unused,
+			  struct addr_location *al __maybe_unused)
+{
+	return 0;
+}
+
+static int
+iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+	struct hist_entry *he;
+
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
+				sample->period, sample->weight,
+				sample->transaction);
+	if (he == NULL)
+		return -ENOMEM;
+
+	iter->he = he;
+	return 0;
+}
+
+static int
+iter_finish_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
+{
+	int err;
+	struct hist_entry *he = iter->he;
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+
+	if (he == NULL)
+		return 0;
+
+	iter->he = NULL;
+
+	if (ui__has_annotation()) {
+		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+		if (err)
+			return err;
+	}
+
+	hists__inc_nr_samples(&evsel->hists, he->filtered);
+
+	return hist_entry__append_callchain(he, sample);
+}
+
+const struct hist_iter_ops hist_iter_mem = {
+	.prepare_entry 		= iter_prepare_mem_entry,
+	.add_single_entry 	= iter_add_single_mem_entry,
+	.next_entry 		= iter_next_nop_entry,
+	.add_next_entry 	= iter_add_next_nop_entry,
+	.finish_entry 		= iter_finish_mem_entry,
+};
+
+const struct hist_iter_ops hist_iter_branch = {
+	.prepare_entry 		= iter_prepare_branch_entry,
+	.add_single_entry 	= iter_add_single_branch_entry,
+	.next_entry 		= iter_next_branch_entry,
+	.add_next_entry 	= iter_add_next_branch_entry,
+	.finish_entry 		= iter_finish_branch_entry,
+};
+
+const struct hist_iter_ops hist_iter_normal = {
+	.prepare_entry 		= iter_prepare_normal_entry,
+	.add_single_entry 	= iter_add_single_normal_entry,
+	.next_entry 		= iter_next_nop_entry,
+	.add_next_entry 	= iter_add_next_nop_entry,
+	.finish_entry 		= iter_finish_normal_entry,
+};
+
+int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
+			 struct perf_evsel *evsel, struct perf_sample *sample,
+			 int max_stack_depth)
+{
+	int err, err2;
+
+	err = sample__resolve_callchain(sample, &iter->parent, evsel, al,
+					max_stack_depth);
+	if (err)
+		return err;
+
+	iter->evsel = evsel;
+	iter->sample = sample;
+
+	err = iter->ops->prepare_entry(iter, al);
+	if (err)
+		goto out;
+
+	err = iter->ops->add_single_entry(iter, al);
+	if (err)
+		goto out;
+
+	while (iter->ops->next_entry(iter, al)) {
+		err = iter->ops->add_next_entry(iter, al);
+		if (err)
+			break;
+	}
+
+out:
+	err2 = iter->ops->finish_entry(iter, al);
+	if (!err)
+		err = err2;
+
+	return err;
+}
+
 int64_t
 hist_entry__cmp(struct hist_entry *left, struct hist_entry *right)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 03ae1db..8894f18 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -96,12 +96,45 @@ struct hists {
 	u16			col_len[HISTC_NR_COLS];
 };
 
+struct hist_entry_iter;
+
+struct hist_iter_ops {
+	int (*prepare_entry)(struct hist_entry_iter *, struct addr_location *);
+	int (*add_single_entry)(struct hist_entry_iter *, struct addr_location *);
+	int (*next_entry)(struct hist_entry_iter *, struct addr_location *);
+	int (*add_next_entry)(struct hist_entry_iter *, struct addr_location *);
+	int (*finish_entry)(struct hist_entry_iter *, struct addr_location *);
+};
+
+struct hist_entry_iter {
+	int total;
+	int curr;
+
+	bool hide_unresolved;
+
+	struct perf_evsel *evsel;
+	struct perf_sample *sample;
+	struct hist_entry *he;
+	struct symbol *parent;
+	void *priv;
+
+	const struct hist_iter_ops *ops;
+};
+
+extern const struct hist_iter_ops hist_iter_normal;
+extern const struct hist_iter_ops hist_iter_branch;
+extern const struct hist_iter_ops hist_iter_mem;
+
 struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct addr_location *al,
 				      struct symbol *parent,
 				      struct branch_info *bi,
 				      struct mem_info *mi, u64 period,
 				      u64 weight, u64 transaction);
+int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
+			 struct perf_evsel *evsel, struct perf_sample *sample,
+			 int max_stack_depth);
+
 int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
 int64_t hist_entry__collapse(struct hist_entry *left, struct hist_entry *right);
 int hist_entry__transaction_len(void);
-- 
1.8.3.1



* [PATCH 03/27] perf hists: Add support for accumulated stat of hist entry
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
  2014-06-01 13:31 ` [PATCH 01/27] perf tools: Introduce hists__inc_nr_samples() Jiri Olsa
  2014-06-01 13:31 ` [PATCH 02/27] perf tools: Introduce struct hist_entry_iter Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 04/27] perf hists: Check if accumulated when adding a " Jiri Olsa
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Maintain accumulated stat information in hist_entry->stat_acc if
symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have the
same values as ->stat initially, and will be updated as the callchain
is processed later.
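
A minimal standalone sketch of that bookkeeping is shown below.  The
*_sketch types and the "cumulate" parameter are stand-ins invented for
this example (the patch itself checks symbol_conf.cumulate_callchain);
the sketch covers only the initial copy and the mirrored period update,
not the later callchain-driven accumulation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct he_stat. */
struct he_stat_sketch {
	uint64_t period;
	uint32_t nr_events;
};

/* Simplified stand-in for struct hist_entry. */
struct hist_entry_sketch {
	struct he_stat_sketch stat;
	struct he_stat_sketch *stat_acc;	/* NULL unless cumulating */
};

/* Copy the template; when cumulating, start stat_acc as a copy of stat. */
struct hist_entry_sketch *
hist_entry_sketch__new(const struct hist_entry_sketch *tmpl, bool cumulate)
{
	struct hist_entry_sketch *he = calloc(1, sizeof(*he));

	if (he == NULL)
		return NULL;

	*he = *tmpl;
	he->stat_acc = NULL;

	if (cumulate) {
		he->stat_acc = malloc(sizeof(he->stat));
		if (he->stat_acc == NULL) {
			free(he);
			return NULL;
		}
		memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
	}
	return he;
}

/* Update the self stat and, when cumulating, the accumulated stat too. */
void hist_entry_sketch__add_period(struct hist_entry_sketch *he,
				   uint64_t period, bool cumulate)
{
	assert(he != NULL);

	he->stat.period += period;
	he->stat.nr_events++;

	if (cumulate && he->stat_acc != NULL) {
		he->stat_acc->period += period;
		he->stat_acc->nr_events++;
	}
}

void hist_entry_sketch__free(struct hist_entry_sketch *he)
{
	if (he == NULL)
		return;
	free(he->stat_acc);
	free(he);
}
```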

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/hist.c   | 28 ++++++++++++++++++++++++++--
 tools/perf/util/sort.h   |  1 +
 tools/perf/util/symbol.h |  1 +
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index d866235..dfff2ee 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -232,6 +232,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
 		return true;
 
 	he_stat__decay(&he->stat);
+	if (symbol_conf.cumulate_callchain)
+		he_stat__decay(he->stat_acc);
 
 	diff = prev_period - he->stat.period;
 
@@ -279,12 +281,26 @@ void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel)
 
 static struct hist_entry *hist_entry__new(struct hist_entry *template)
 {
-	size_t callchain_size = symbol_conf.use_callchain ? sizeof(struct callchain_root) : 0;
-	struct hist_entry *he = zalloc(sizeof(*he) + callchain_size);
+	size_t callchain_size = 0;
+	struct hist_entry *he;
+
+	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain)
+		callchain_size = sizeof(struct callchain_root);
+
+	he = zalloc(sizeof(*he) + callchain_size);
 
 	if (he != NULL) {
 		*he = *template;
 
+		if (symbol_conf.cumulate_callchain) {
+			he->stat_acc = malloc(sizeof(he->stat));
+			if (he->stat_acc == NULL) {
+				free(he);
+				return NULL;
+			}
+			memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
+		}
+
 		if (he->ms.map)
 			he->ms.map->referenced = true;
 
@@ -296,6 +312,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
 			 */
 			he->branch_info = malloc(sizeof(*he->branch_info));
 			if (he->branch_info == NULL) {
+				free(he->stat_acc);
 				free(he);
 				return NULL;
 			}
@@ -359,6 +376,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 
 		if (!cmp) {
 			he_stat__add_period(&he->stat, period, weight);
+			if (symbol_conf.cumulate_callchain)
+				he_stat__add_period(he->stat_acc, period, weight);
 
 			/*
 			 * This mem info was allocated from sample__resolve_mem
@@ -394,6 +413,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 	rb_insert_color(&he->rb_node_in, hists->entries_in);
 out:
 	he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
+	if (symbol_conf.cumulate_callchain)
+		he_stat__add_cpumode_period(he->stat_acc, al->cpumode, period);
 	return he;
 }
 
@@ -768,6 +789,7 @@ void hist_entry__free(struct hist_entry *he)
 {
 	zfree(&he->branch_info);
 	zfree(&he->mem_info);
+	zfree(&he->stat_acc);
 	free_srcline(he->srcline);
 	free(he);
 }
@@ -793,6 +815,8 @@ static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
 
 		if (!cmp) {
 			he_stat__add_stat(&iter->stat, &he->stat);
+			if (symbol_conf.cumulate_callchain)
+				he_stat__add_stat(iter->stat_acc, he->stat_acc);
 
 			if (symbol_conf.use_callchain) {
 				callchain_cursor_reset(&callchain_cursor);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 5f38d92..c9ffa03 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -82,6 +82,7 @@ struct hist_entry {
 		struct list_head head;
 	} pairs;
 	struct he_stat		stat;
+	struct he_stat		*stat_acc;
 	struct map_symbol	ms;
 	struct thread		*thread;
 	struct comm		*comm;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 33ede53..615c752 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -109,6 +109,7 @@ struct symbol_conf {
 			show_nr_samples,
 			show_total_period,
 			use_callchain,
+			cumulate_callchain,
 			exclude_other,
 			show_cpu_utilization,
 			initialized,
-- 
1.8.3.1



* [PATCH 04/27] perf hists: Check if accumulated when adding a hist entry
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (2 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 03/27] perf hists: Add support for accumulated stat of hist entry Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 05/27] perf hists: Accumulate hist entry stat based on the callchain Jiri Olsa
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

To support callchain accumulation, add_hist_entry() needs to know
whether @entry is an accumulated entry or not.  The period of an
accumulated entry should be added to ->stat_acc but not to ->stat.  Add
a @sample_self argument for that.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-5-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-annotate.c |  3 ++-
 tools/perf/builtin-diff.c     |  2 +-
 tools/perf/builtin-top.c      |  2 +-
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c        | 29 ++++++++++++++++++-----------
 tools/perf/util/hist.h        |  3 ++-
 6 files changed, 26 insertions(+), 17 deletions(-)
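
Note (illustration only, not part of the patch): the effect of
@sample_self on the two period counters can be shown with a small
standalone sketch.  struct period_pair and period_pair__add() are
hypothetical names, and the sketch assumes accumulation is enabled
(symbol_conf.cumulate_callchain set), so the accumulated counter is
always updated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of counters for one histogram entry. */
struct period_pair {
	uint64_t self;	/* stand-in for he->stat.period */
	uint64_t acc;	/* stand-in for he->stat_acc->period */
};

/*
 * Add a sample period.  The self counter only grows when the entry is
 * the sampled location itself; the accumulated counter grows in both
 * cases.
 */
static void period_pair__add(struct period_pair *p, uint64_t period,
			     bool sample_self)
{
	if (sample_self)
		p->self += period;
	p->acc += period;
}
```

An entry that only ever appears as a caller therefore keeps a self
period of zero while its accumulated period grows.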

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index bf52461..1ec429f 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -65,7 +65,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		return 0;
 	}
 
-	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL, 1, 1, 0);
+	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL, 1, 1, 0,
+				true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 8bff543..9a5a035 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -315,7 +315,7 @@ static int hists__add_entry(struct hists *hists,
 			    u64 weight, u64 transaction)
 {
 	if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-			       transaction) != NULL)
+			       transaction, true) != NULL)
 		return 0;
 	return -ENOMEM;
 }
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5130926..12e2e12 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -247,7 +247,7 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
 	pthread_mutex_lock(&evsel->hists.lock);
 	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction);
+				sample->transaction, true);
 	pthread_mutex_unlock(&evsel->hists.lock);
 	if (he == NULL)
 		return NULL;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 5ffa2c3..ca6693b 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -88,7 +88,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(&evsel->hists, &al, NULL,
-						NULL, NULL, 1, 1, 0);
+						NULL, NULL, 1, 1, 0, true);
 			if (he == NULL)
 				goto out;
 
@@ -112,7 +112,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(&evsel->hists, &al, NULL,
-						NULL, NULL, 1, 1, 0);
+						NULL, NULL, 1, 1, 0, true);
 			if (he == NULL)
 				goto out;
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index dfff2ee..b9facf3 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -279,7 +279,8 @@ void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel)
  * histogram, sorted on item, collects periods
  */
 
-static struct hist_entry *hist_entry__new(struct hist_entry *template)
+static struct hist_entry *hist_entry__new(struct hist_entry *template,
+					  bool sample_self)
 {
 	size_t callchain_size = 0;
 	struct hist_entry *he;
@@ -299,6 +300,8 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
 				return NULL;
 			}
 			memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
+			if (!sample_self)
+				memset(&he->stat, 0, sizeof(he->stat));
 		}
 
 		if (he->ms.map)
@@ -351,7 +354,8 @@ static u8 symbol__parent_filter(const struct symbol *parent)
 
 static struct hist_entry *add_hist_entry(struct hists *hists,
 					 struct hist_entry *entry,
-					 struct addr_location *al)
+					 struct addr_location *al,
+					 bool sample_self)
 {
 	struct rb_node **p;
 	struct rb_node *parent = NULL;
@@ -375,7 +379,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 		cmp = hist_entry__cmp(he, entry);
 
 		if (!cmp) {
-			he_stat__add_period(&he->stat, period, weight);
+			if (sample_self)
+				he_stat__add_period(&he->stat, period, weight);
 			if (symbol_conf.cumulate_callchain)
 				he_stat__add_period(he->stat_acc, period, weight);
 
@@ -405,14 +410,15 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 			p = &(*p)->rb_right;
 	}
 
-	he = hist_entry__new(entry);
+	he = hist_entry__new(entry, sample_self);
 	if (!he)
 		return NULL;
 
 	rb_link_node(&he->rb_node_in, parent, p);
 	rb_insert_color(&he->rb_node_in, hists->entries_in);
 out:
-	he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
+	if (sample_self)
+		he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
 	if (symbol_conf.cumulate_callchain)
 		he_stat__add_cpumode_period(he->stat_acc, al->cpumode, period);
 	return he;
@@ -423,7 +429,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct symbol *sym_parent,
 				      struct branch_info *bi,
 				      struct mem_info *mi,
-				      u64 period, u64 weight, u64 transaction)
+				      u64 period, u64 weight, u64 transaction,
+				      bool sample_self)
 {
 	struct hist_entry entry = {
 		.thread	= al->thread,
@@ -448,7 +455,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 		.transaction = transaction,
 	};
 
-	return add_hist_entry(hists, &entry, al);
+	return add_hist_entry(hists, &entry, al, sample_self);
 }
 
 static int
@@ -501,7 +508,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	 * and the he_stat__add_period() function.
 	 */
 	he = __hists__add_entry(&iter->evsel->hists, al, iter->parent, NULL, mi,
-				cost, cost, 0);
+				cost, cost, 0, true);
 	if (!he)
 		return -ENOMEM;
 
@@ -608,7 +615,7 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	 * and not events sampled. Thus we use a pseudo period of 1.
 	 */
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, &bi[i], NULL,
-				1, 1, 0);
+				1, 1, 0, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -657,7 +664,7 @@ iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location
 
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction);
+				sample->transaction, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -1161,7 +1168,7 @@ static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 			p = &(*p)->rb_right;
 	}
 
-	he = hist_entry__new(pair);
+	he = hist_entry__new(pair, true);
 	if (he) {
 		memset(&he->stat, 0, sizeof(he->stat));
 		he->hists = hists;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 8894f18..bedb24d 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -130,7 +130,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct symbol *parent,
 				      struct branch_info *bi,
 				      struct mem_info *mi, u64 period,
-				      u64 weight, u64 transaction);
+				      u64 weight, u64 transaction,
+				      bool sample_self);
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 struct perf_evsel *evsel, struct perf_sample *sample,
 			 int max_stack_depth);
-- 
1.8.3.1



* [PATCH 05/27] perf hists: Accumulate hist entry stat based on the callchain
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (3 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 04/27] perf hists: Check if accumulated when adding a " Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 06/27] perf tools: Update cpumode for each cumulative entry Jiri Olsa
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Call __hists__add_entry() for each callchain node to get an
accumulated stat for an entry.  Introduce new hist_iter_cumulative ops
to process them properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-report.c |  2 +
 tools/perf/util/callchain.c |  3 +-
 tools/perf/util/hist.c      | 96 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/hist.h      |  1 +
 4 files changed, 101 insertions(+), 1 deletion(-)
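
Note (illustration only, not part of the patch): the accounting done by
the cumulative iterator can be sketched with plain arrays.  The names
struct func_stat, NR_FUNCS and account_sample() are hypothetical, and
functions are identified by small integer indices instead of hist
entries.  The sketch assumes the `callers` array excludes the sampled
function itself and contains no repeated entries.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_FUNCS 4	/* hypothetical: number of distinct functions */

struct func_stat {
	uint64_t self;	/* period sampled in the function itself */
	uint64_t acc;	/* period of the function plus its callees */
};

static void account_sample(struct func_stat stats[NR_FUNCS], int sampled,
			   const int *callers, size_t nr_callers,
			   uint64_t period)
{
	size_t i;

	/* Like ->add_single_entry: the sampled function gets both. */
	stats[sampled].self += period;
	stats[sampled].acc += period;

	/* Like ->next_entry/->add_next_entry: callers get acc only. */
	for (i = 0; i < nr_callers; i++)
		stats[callers[i]].acc += period;
}
```

With two samples, one in function 0 called via 1 and 2, and one in
function 1 called via 2, function 2 ends up with no self period but
the full accumulated period of both samples.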

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3201bdf..e8fa9fe 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -115,6 +115,8 @@ static int process_sample_event(struct perf_tool *tool,
 		iter.ops = &hist_iter_branch;
 	else if (rep->mem_mode)
 		iter.ops = &hist_iter_mem;
+	else if (symbol_conf.cumulate_callchain)
+		iter.ops = &hist_iter_cumulative;
 	else
 		iter.ops = &hist_iter_normal;
 
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 9a42382..2af69c4 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -616,7 +616,8 @@ int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent
 	if (sample->callchain == NULL)
 		return 0;
 
-	if (symbol_conf.use_callchain || sort__has_parent) {
+	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain ||
+	    sort__has_parent) {
 		return machine__resolve_callchain(al->machine, evsel, al->thread,
 						  sample, parent, al, max_stack);
 	}
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b9facf3..6079b5a 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -696,6 +696,94 @@ iter_finish_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
 	return hist_entry__append_callchain(he, sample);
 }
 
+static int
+iter_prepare_cumulative_entry(struct hist_entry_iter *iter __maybe_unused,
+			      struct addr_location *al __maybe_unused)
+{
+	callchain_cursor_commit(&callchain_cursor);
+	return 0;
+}
+
+static int
+iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
+				 struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+	struct hist_entry *he;
+	int err = 0;
+
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
+				sample->period, sample->weight,
+				sample->transaction, true);
+	if (he == NULL)
+		return -ENOMEM;
+
+	iter->he = he;
+
+	/*
+	 * The iter->he will be over-written after ->add_next_entry()
+	 * called so inc stats for the original entry now.
+	 */
+	if (ui__has_annotation())
+		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+
+	hists__inc_nr_samples(&evsel->hists, he->filtered);
+
+	return err;
+}
+
+static int
+iter_next_cumulative_entry(struct hist_entry_iter *iter,
+			   struct addr_location *al)
+{
+	struct callchain_cursor_node *node;
+
+	node = callchain_cursor_current(&callchain_cursor);
+	if (node == NULL)
+		return 0;
+
+	al->map = node->map;
+	al->sym = node->sym;
+	if (node->map)
+		al->addr = node->map->map_ip(node->map, node->ip);
+	else
+		al->addr = node->ip;
+
+	if (iter->hide_unresolved && al->sym == NULL)
+		return 0;
+
+	callchain_cursor_advance(&callchain_cursor);
+	return 1;
+}
+
+static int
+iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
+			       struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+	struct hist_entry *he;
+
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
+				sample->period, sample->weight,
+				sample->transaction, false);
+	if (he == NULL)
+		return -ENOMEM;
+
+	iter->he = he;
+
+	return 0;
+}
+
+static int
+iter_finish_cumulative_entry(struct hist_entry_iter *iter,
+			     struct addr_location *al __maybe_unused)
+{
+	iter->he = NULL;
+	return 0;
+}
+
 const struct hist_iter_ops hist_iter_mem = {
 	.prepare_entry 		= iter_prepare_mem_entry,
 	.add_single_entry 	= iter_add_single_mem_entry,
@@ -720,6 +808,14 @@ const struct hist_iter_ops hist_iter_normal = {
 	.finish_entry 		= iter_finish_normal_entry,
 };
 
+const struct hist_iter_ops hist_iter_cumulative = {
+	.prepare_entry 		= iter_prepare_cumulative_entry,
+	.add_single_entry 	= iter_add_single_cumulative_entry,
+	.next_entry 		= iter_next_cumulative_entry,
+	.add_next_entry 	= iter_add_next_cumulative_entry,
+	.finish_entry 		= iter_finish_cumulative_entry,
+};
+
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 struct perf_evsel *evsel, struct perf_sample *sample,
 			 int max_stack_depth)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index bedb24d..78409f9 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -124,6 +124,7 @@ struct hist_entry_iter {
 extern const struct hist_iter_ops hist_iter_normal;
 extern const struct hist_iter_ops hist_iter_branch;
 extern const struct hist_iter_ops hist_iter_mem;
+extern const struct hist_iter_ops hist_iter_cumulative;
 
 struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct addr_location *al,
-- 
1.8.3.1



* [PATCH 06/27] perf tools: Update cpumode for each cumulative entry
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (4 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 05/27] perf hists: Accumulate hist entry stat based on the callchain Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 07/27] perf report: Cache cumulative callchains Jiri Olsa
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

The cpumode and level in struct addr_location were set for a sample
but not updated as cumulative callchain entries were added.  This led
to non-matching symbol and cpumode in the output.

Update them based on whether the map is part of the kernel or not.
This is the reverse of what thread__find_addr_map() does.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/callchain.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  2 ++
 tools/perf/util/hist.c      | 13 ++-----------
 3 files changed, 46 insertions(+), 11 deletions(-)
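
Note (illustration only, not part of the patch): the level character
chosen by fill_callchain_info() depends on three conditions.  The
helper below, level_for(), is a hypothetical reduction of that decision
to booleans: `kernel_map` stands for al->map->groups being the
machine's kmaps, `host_machine` for machine__is_host(), and
`guest_enabled` for perf_guest.

```c
#include <stdbool.h>

/* Return the level character for a resolved callchain node. */
static char level_for(bool kernel_map, bool host_machine, bool guest_enabled)
{
	if (kernel_map)
		return host_machine ? 'k' : 'g';	/* host or guest kernel */

	if (host_machine)
		return '.';				/* host user space */

	return guest_enabled ? 'u' : 'H';		/* guest user or hypervisor */
}
```

The cpumode values in the patch follow the same branches, with
PERF_RECORD_MISC_KERNEL paired with 'k', PERF_RECORD_MISC_GUEST_KERNEL
with 'g', PERF_RECORD_MISC_USER with '.', PERF_RECORD_MISC_GUEST_USER
with 'u' and PERF_RECORD_MISC_HYPERVISOR with 'H'.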

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 2af69c4..48b6d3f 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -630,3 +630,45 @@ int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *samp
 		return 0;
 	return callchain_append(he->callchain, &callchain_cursor, sample->period);
 }
+
+int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *node,
+			bool hide_unresolved)
+{
+	al->map = node->map;
+	al->sym = node->sym;
+	if (node->map)
+		al->addr = node->map->map_ip(node->map, node->ip);
+	else
+		al->addr = node->ip;
+
+	if (al->sym == NULL) {
+		if (hide_unresolved)
+			return 0;
+		if (al->map == NULL)
+			goto out;
+	}
+
+	if (al->map->groups == &al->machine->kmaps) {
+		if (machine__is_host(al->machine)) {
+			al->cpumode = PERF_RECORD_MISC_KERNEL;
+			al->level = 'k';
+		} else {
+			al->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+			al->level = 'g';
+		}
+	} else {
+		if (machine__is_host(al->machine)) {
+			al->cpumode = PERF_RECORD_MISC_USER;
+			al->level = '.';
+		} else if (perf_guest) {
+			al->cpumode = PERF_RECORD_MISC_GUEST_USER;
+			al->level = 'u';
+		} else {
+			al->cpumode = PERF_RECORD_MISC_HYPERVISOR;
+			al->level = 'H';
+		}
+	}
+
+out:
+	return 1;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index bde2b0c..24a53d5 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -162,6 +162,8 @@ int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent
 			      struct perf_evsel *evsel, struct addr_location *al,
 			      int max_stack);
 int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample);
+int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *node,
+			bool hide_unresolved);
 
 extern const char record_callchain_help[];
 int parse_callchain_report_opt(const char *arg);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6079b5a..37c28fc 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -743,18 +743,9 @@ iter_next_cumulative_entry(struct hist_entry_iter *iter,
 	if (node == NULL)
 		return 0;
 
-	al->map = node->map;
-	al->sym = node->sym;
-	if (node->map)
-		al->addr = node->map->map_ip(node->map, node->ip);
-	else
-		al->addr = node->ip;
-
-	if (iter->hide_unresolved && al->sym == NULL)
-		return 0;
-
 	callchain_cursor_advance(&callchain_cursor);
-	return 1;
+
+	return fill_callchain_info(al, node, iter->hide_unresolved);
 }
 
 static int
-- 
1.8.3.1



* [PATCH 07/27] perf report: Cache cumulative callchains
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (5 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 06/27] perf tools: Update cpumode for each cumulative entry Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 08/27] perf callchain: Add callchain_cursor_snapshot() Jiri Olsa
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

It is possible that a callchain has cycles or recursive calls.  In that
case the output ends up with entries showing more than 100% overhead.
To prevent such entries, cache each callchain node and skip a node if
the same entry was already accumulated for the sample.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-8-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/hist.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
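
Note (illustration only, not part of the patch): the per-sample cache
can be sketched with integer function ids in place of hist entries.
seen_before(), accumulate_once(), MAX_DEPTH and NR_FUNCS are
hypothetical names, and a plain equality test replaces
hist_entry__cmp().  The point is only that each function is credited at
most once per sample, however often it recurs in the chain.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 8	/* hypothetical cap, like PERF_MAX_STACK_DEPTH */
#define NR_FUNCS  4	/* hypothetical: number of distinct functions */

/* Linear scan of the entries already credited for this sample. */
static bool seen_before(const int *cache, size_t nr, int func)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (cache[i] == func)
			return true;
	}
	return false;
}

/* Credit `period` to every distinct function in the chain, once each. */
static void accumulate_once(uint64_t acc[NR_FUNCS], const int *chain,
			    size_t nr, uint64_t period)
{
	int cache[MAX_DEPTH];
	size_t cached = 0;
	size_t i;

	/* Stop early if the cache is full; the sketch never grows it. */
	for (i = 0; i < nr && cached < MAX_DEPTH; i++) {
		if (seen_before(cache, cached, chain[i]))
			continue;	/* recursion or cycle: already counted */

		acc[chain[i]] += period;
		cache[cached++] = chain[i];
	}
}
```

For a chain where function 2 recurses three times before reaching
function 0, a sample of period 10 credits 10 to function 2 rather than
30, so its accumulated share cannot exceed the total.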

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 37c28fc..bf03db5 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -700,7 +700,22 @@ static int
 iter_prepare_cumulative_entry(struct hist_entry_iter *iter __maybe_unused,
 			      struct addr_location *al __maybe_unused)
 {
+	struct hist_entry **he_cache;
+
 	callchain_cursor_commit(&callchain_cursor);
+
+	/*
+	 * This is for detecting cycles or recursions so that they're
+	 * cumulated only one time to prevent entries more than 100%
+	 * overhead.
+	 */
+	he_cache = malloc(sizeof(*he_cache) * (PERF_MAX_STACK_DEPTH + 1));
+	if (he_cache == NULL)
+		return -ENOMEM;
+
+	iter->priv = he_cache;
+	iter->curr = 0;
+
 	return 0;
 }
 
@@ -710,6 +725,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 {
 	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
+	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
 	int err = 0;
 
@@ -720,6 +736,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 		return -ENOMEM;
 
 	iter->he = he;
+	he_cache[iter->curr++] = he;
 
 	/*
 	 * The iter->he will be over-written after ->add_next_entry()
@@ -754,7 +771,29 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 {
 	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
+	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
+	struct hist_entry he_tmp = {
+		.cpu = al->cpu,
+		.thread = al->thread,
+		.comm = thread__comm(al->thread),
+		.ip = al->addr,
+		.ms = {
+			.map = al->map,
+			.sym = al->sym,
+		},
+		.parent = iter->parent,
+	};
+	int i;
+
+	/*
+	 * Check if there's duplicate entries in the callchain.
+	 * It's possible that it has cycles or recursive calls.
+	 */
+	for (i = 0; i < iter->curr; i++) {
+		if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
+			return 0;
+	}
 
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
@@ -763,6 +802,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 		return -ENOMEM;
 
 	iter->he = he;
+	he_cache[iter->curr++] = he;
 
 	return 0;
 }
@@ -771,7 +811,9 @@ static int
 iter_finish_cumulative_entry(struct hist_entry_iter *iter,
 			     struct addr_location *al __maybe_unused)
 {
+	zfree(&iter->priv);
 	iter->he = NULL;
+
 	return 0;
 }
 
-- 
1.8.3.1



* [PATCH 08/27] perf callchain: Add callchain_cursor_snapshot()
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (6 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 07/27] perf report: Cache cumulative callchains Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 09/27] perf tools: Save callchain info for each cumulative entry Jiri Olsa
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

callchain_cursor_snapshot() saves the current state of the callchain
cursor.  It'll be used to accumulate callchain information for each node.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-9-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/callchain.h | 9 +++++++++
 1 file changed, 9 insertions(+)
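
Note (illustration only, not part of the patch): what the snapshot does
to the cursor fields can be tried out on a tiny linked list.  struct
node_sketch, struct cursor_sketch and the cursor_sketch__*() helpers
are hypothetical stand-ins for the callchain cursor API, and the
advance helper adds a NULL check for the sake of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct callchain_cursor_node. */
struct node_sketch {
	uint64_t ip;
	struct node_sketch *next;
};

/* Hypothetical stand-in for struct callchain_cursor. */
struct cursor_sketch {
	uint64_t nr;			/* nodes reachable from first */
	struct node_sketch *first;
	uint64_t pos;			/* how far curr is past first */
	struct node_sketch *curr;
};

/* Rewind the cursor to the start of the chain. */
static void cursor_sketch__commit(struct cursor_sketch *c)
{
	c->curr = c->first;
	c->pos = 0;
}

/* Step to the next node, if any. */
static void cursor_sketch__advance(struct cursor_sketch *c)
{
	if (c->curr != NULL) {
		c->curr = c->curr->next;
		c->pos++;
	}
}

/* Make dest describe only the part of the chain from src->curr on. */
static void cursor_sketch__snapshot(struct cursor_sketch *dest,
				    const struct cursor_sketch *src)
{
	*dest = *src;

	dest->first = src->curr;
	dest->nr -= src->pos;
}
```

After one advance on a three-node chain, the snapshot starts at the
second node and reports two nodes, while the source cursor still
describes the full chain.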

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 24a53d5..8f84423 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -167,4 +167,13 @@ int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *
 
 extern const char record_callchain_help[];
 int parse_callchain_report_opt(const char *arg);
+
+static inline void callchain_cursor_snapshot(struct callchain_cursor *dest,
+					     struct callchain_cursor *src)
+{
+	*dest = *src;
+
+	dest->first = src->curr;
+	dest->nr -= src->pos;
+}
 #endif	/* __PERF_CALLCHAIN_H */
-- 
1.8.3.1



* [PATCH 09/27] perf tools: Save callchain info for each cumulative entry
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (7 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 08/27] perf callchain: Add callchain_cursor_snapshot() Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 10/27] perf ui/hist: Add support to accumulated hist stat Jiri Olsa
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

When accumulating a callchain entry, also save a snapshot of the
current chain so that the entry can show the rest of the chain.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-10-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/hist.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index bf03db5..c6f5f52 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -738,6 +738,14 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 	iter->he = he;
 	he_cache[iter->curr++] = he;
 
+	callchain_append(he->callchain, &callchain_cursor, sample->period);
+
+	/*
+	 * We need to re-initialize the cursor since callchain_append()
+	 * advanced the cursor to the end.
+	 */
+	callchain_cursor_commit(&callchain_cursor);
+
 	/*
 	 * The iter->he will be over-written after ->add_next_entry()
 	 * called so inc stats for the original entry now.
@@ -760,8 +768,6 @@ iter_next_cumulative_entry(struct hist_entry_iter *iter,
 	if (node == NULL)
 		return 0;
 
-	callchain_cursor_advance(&callchain_cursor);
-
 	return fill_callchain_info(al, node, iter->hide_unresolved);
 }
 
@@ -785,6 +791,11 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 		.parent = iter->parent,
 	};
 	int i;
+	struct callchain_cursor cursor;
+
+	callchain_cursor_snapshot(&cursor, &callchain_cursor);
+
+	callchain_cursor_advance(&callchain_cursor);
 
 	/*
 	 * Check if there's duplicate entries in the callchain.
@@ -804,6 +815,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 	iter->he = he;
 	he_cache[iter->curr++] = he;
 
+	callchain_append(he->callchain, &cursor, sample->period);
 	return 0;
 }
 
-- 
1.8.3.1



* [PATCH 10/27] perf ui/hist: Add support to accumulated hist stat
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (8 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 09/27] perf tools: Save callchain info for each cumulative entry Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 11/27] perf ui/browser: " Jiri Olsa
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Print accumulated stat of a hist entry if requested.

To do that, add a new HPP_PERCENT_ACC_FNS macro and generate a
perf_hpp_fmt using it.  The __hpp__sort_acc() function sorts entries
by accumulated period value.  When the accumulated periods of two
entries are the same (i.e. a single path callchain), put the caller
above the callee, since accumulation naturally tends to place callers
higher.

Also add an "overhead_children" output field that can be selected by
the user.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-11-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/hist.c   | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/hist.h |  4 ++
 tools/perf/util/sort.c |  1 +
 3 files changed, 104 insertions(+)
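
Note (illustration only, not part of the patch): the ordering rule of
__hpp__sort_acc() can be sketched as a standalone comparator.  struct
acc_entry and acc_entry__cmp() are hypothetical names.  The sketch
assumes that a positive result ranks `a` above `b`, and that a caller
has a shallower saved chain (smaller max_depth) than its callee.

```c
#include <stdint.h>

/* Hypothetical reduction of a hist entry to the two sort keys. */
struct acc_entry {
	uint64_t acc_period;	/* stand-in for he->stat_acc->period */
	uint64_t max_depth;	/* stand-in for he->callchain->max_depth */
};

/*
 * Compare by accumulated period first; on a tie, the entry with the
 * shallower chain (assumed to be the caller) compares greater.
 */
static int64_t acc_entry__cmp(const struct acc_entry *a,
			      const struct acc_entry *b)
{
	if (a->acc_period > b->acc_period)
		return 1;
	if (a->acc_period < b->acc_period)
		return -1;

	return (int64_t)b->max_depth - (int64_t)a->max_depth;
}
```

In a single path callchain every entry has the same accumulated period,
so the tie-break alone decides the order and callers end up on top.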

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 4484f5b..0ce3e79 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -104,6 +104,18 @@ int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
 	return ret;
 }
 
+int __hpp__fmt_acc(struct perf_hpp *hpp, struct hist_entry *he,
+		   hpp_field_fn get_field, const char *fmt,
+		   hpp_snprint_fn print_fn, bool fmt_percent)
+{
+	if (!symbol_conf.cumulate_callchain) {
+		return snprintf(hpp->buf, hpp->size, "%*s",
+				fmt_percent ? 8 : 12, "N/A");
+	}
+
+	return __hpp__fmt(hpp, he, get_field, fmt, print_fn, fmt_percent);
+}
+
 static int field_cmp(u64 field_a, u64 field_b)
 {
 	if (field_a > field_b)
@@ -160,6 +172,24 @@ out:
 	return ret;
 }
 
+static int __hpp__sort_acc(struct hist_entry *a, struct hist_entry *b,
+			   hpp_field_fn get_field)
+{
+	s64 ret = 0;
+
+	if (symbol_conf.cumulate_callchain) {
+		/*
+		 * Put caller above callee when they have equal period.
+		 */
+		ret = field_cmp(get_field(a), get_field(b));
+		if (ret)
+			return ret;
+
+		ret = b->callchain->max_depth - a->callchain->max_depth;
+	}
+	return ret;
+}
+
 #define __HPP_HEADER_FN(_type, _str, _min_width, _unit_width) 		\
 static int hpp__header_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
 			       struct perf_hpp *hpp,			\
@@ -242,6 +272,34 @@ static int64_t hpp__sort_##_type(struct hist_entry *a, struct hist_entry *b)	\
 	return __hpp__sort(a, b, he_get_##_field);				\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
+static u64 he_get_acc_##_field(struct hist_entry *he)				\
+{										\
+	return he->stat_acc->_field;						\
+}										\
+										\
+static int hpp__color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,		\
+			      struct perf_hpp *hpp, struct hist_entry *he) 	\
+{										\
+	return __hpp__fmt_acc(hpp, he, he_get_acc_##_field, " %6.2f%%",		\
+			      hpp_color_scnprintf, true);			\
+}
+
+#define __HPP_ENTRY_ACC_PERCENT_FN(_type, _field)				\
+static int hpp__entry_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,		\
+			      struct perf_hpp *hpp, struct hist_entry *he) 	\
+{										\
+	const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%";		\
+	return __hpp__fmt_acc(hpp, he, he_get_acc_##_field, fmt,		\
+			      hpp_entry_scnprintf, true);			\
+}
+
+#define __HPP_SORT_ACC_FN(_type, _field)					\
+static int64_t hpp__sort_##_type(struct hist_entry *a, struct hist_entry *b)	\
+{										\
+	return __hpp__sort_acc(a, b, he_get_acc_##_field);			\
+}
+
 #define __HPP_ENTRY_RAW_FN(_type, _field)					\
 static u64 he_get_raw_##_field(struct hist_entry *he)				\
 {										\
@@ -270,18 +328,27 @@ __HPP_COLOR_PERCENT_FN(_type, _field)					\
 __HPP_ENTRY_PERCENT_FN(_type, _field)					\
 __HPP_SORT_FN(_type, _field)
 
+#define HPP_PERCENT_ACC_FNS(_type, _str, _field, _min_width, _unit_width)\
+__HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
+__HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
+__HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
+__HPP_ENTRY_ACC_PERCENT_FN(_type, _field)				\
+__HPP_SORT_ACC_FN(_type, _field)
+
 #define HPP_RAW_FNS(_type, _str, _field, _min_width, _unit_width)	\
 __HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
 __HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
 __HPP_ENTRY_RAW_FN(_type, _field)					\
 __HPP_SORT_RAW_FN(_type, _field)
 
+__HPP_HEADER_FN(overhead_self, "Self", 8, 8)
 
 HPP_PERCENT_FNS(overhead, "Overhead", period, 8, 8)
 HPP_PERCENT_FNS(overhead_sys, "sys", period_sys, 8, 8)
 HPP_PERCENT_FNS(overhead_us, "usr", period_us, 8, 8)
 HPP_PERCENT_FNS(overhead_guest_sys, "guest sys", period_guest_sys, 9, 8)
 HPP_PERCENT_FNS(overhead_guest_us, "guest usr", period_guest_us, 9, 8)
+HPP_PERCENT_ACC_FNS(overhead_acc, "Children", period, 8, 8)
 
 HPP_RAW_FNS(samples, "Samples", nr_events, 12, 12)
 HPP_RAW_FNS(period, "Period", period, 12, 12)
@@ -303,6 +370,17 @@ static int64_t hpp__nop_cmp(struct hist_entry *a __maybe_unused,
 		.sort	= hpp__sort_ ## _name,		\
 	}
 
+#define HPP__COLOR_ACC_PRINT_FNS(_name)			\
+	{						\
+		.header	= hpp__header_ ## _name,	\
+		.width	= hpp__width_ ## _name,		\
+		.color	= hpp__color_ ## _name,		\
+		.entry	= hpp__entry_ ## _name,		\
+		.cmp	= hpp__nop_cmp,			\
+		.collapse = hpp__nop_cmp,		\
+		.sort	= hpp__sort_ ## _name,		\
+	}
+
 #define HPP__PRINT_FNS(_name)				\
 	{						\
 		.header	= hpp__header_ ## _name,	\
@@ -319,6 +397,7 @@ struct perf_hpp_fmt perf_hpp__format[] = {
 	HPP__COLOR_PRINT_FNS(overhead_us),
 	HPP__COLOR_PRINT_FNS(overhead_guest_sys),
 	HPP__COLOR_PRINT_FNS(overhead_guest_us),
+	HPP__COLOR_ACC_PRINT_FNS(overhead_acc),
 	HPP__PRINT_FNS(samples),
 	HPP__PRINT_FNS(period)
 };
@@ -328,16 +407,23 @@ LIST_HEAD(perf_hpp__sort_list);
 
 
 #undef HPP__COLOR_PRINT_FNS
+#undef HPP__COLOR_ACC_PRINT_FNS
 #undef HPP__PRINT_FNS
 
 #undef HPP_PERCENT_FNS
+#undef HPP_PERCENT_ACC_FNS
 #undef HPP_RAW_FNS
 
 #undef __HPP_HEADER_FN
 #undef __HPP_WIDTH_FN
 #undef __HPP_COLOR_PERCENT_FN
 #undef __HPP_ENTRY_PERCENT_FN
+#undef __HPP_COLOR_ACC_PERCENT_FN
+#undef __HPP_ENTRY_ACC_PERCENT_FN
 #undef __HPP_ENTRY_RAW_FN
+#undef __HPP_SORT_FN
+#undef __HPP_SORT_ACC_FN
+#undef __HPP_SORT_RAW_FN
 
 
 void perf_hpp__init(void)
@@ -361,6 +447,13 @@ void perf_hpp__init(void)
 	if (field_order)
 		return;
 
+	if (symbol_conf.cumulate_callchain) {
+		perf_hpp__column_enable(PERF_HPP__OVERHEAD_ACC);
+
+		perf_hpp__format[PERF_HPP__OVERHEAD].header =
+						hpp__header_overhead_self;
+	}
+
 	perf_hpp__column_enable(PERF_HPP__OVERHEAD);
 
 	if (symbol_conf.show_cpu_utilization) {
@@ -383,6 +476,12 @@ void perf_hpp__init(void)
 	list = &perf_hpp__format[PERF_HPP__OVERHEAD].sort_list;
 	if (list_empty(list))
 		list_add(list, &perf_hpp__sort_list);
+
+	if (symbol_conf.cumulate_callchain) {
+		list = &perf_hpp__format[PERF_HPP__OVERHEAD_ACC].sort_list;
+		if (list_empty(list))
+			list_add(list, &perf_hpp__sort_list);
+	}
 }
 
 void perf_hpp__column_register(struct perf_hpp_fmt *format)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 78409f9..efd73e4 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -228,6 +228,7 @@ enum {
 	PERF_HPP__OVERHEAD_US,
 	PERF_HPP__OVERHEAD_GUEST_SYS,
 	PERF_HPP__OVERHEAD_GUEST_US,
+	PERF_HPP__OVERHEAD_ACC,
 	PERF_HPP__SAMPLES,
 	PERF_HPP__PERIOD,
 
@@ -254,6 +255,9 @@ typedef int (*hpp_snprint_fn)(struct perf_hpp *hpp, const char *fmt, ...);
 int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
 	       hpp_field_fn get_field, const char *fmt,
 	       hpp_snprint_fn print_fn, bool fmt_percent);
+int __hpp__fmt_acc(struct perf_hpp *hpp, struct hist_entry *he,
+		   hpp_field_fn get_field, const char *fmt,
+		   hpp_snprint_fn print_fn, bool fmt_percent);
 
 static inline void advance_hpp(struct perf_hpp *hpp, int inc)
 {
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 901b9be..9da8931 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1061,6 +1061,7 @@ static struct hpp_dimension hpp_sort_dimensions[] = {
 	DIM(PERF_HPP__OVERHEAD_US, "overhead_us"),
 	DIM(PERF_HPP__OVERHEAD_GUEST_SYS, "overhead_guest_sys"),
 	DIM(PERF_HPP__OVERHEAD_GUEST_US, "overhead_guest_us"),
+	DIM(PERF_HPP__OVERHEAD_ACC, "overhead_children"),
 	DIM(PERF_HPP__SAMPLES, "sample"),
 	DIM(PERF_HPP__PERIOD, "period"),
 };
-- 
1.8.3.1


* [PATCH 11/27] perf ui/browser: Add support to accumulated hist stat
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (9 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 10/27] perf ui/hist: Add support to accumulated hist stat Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 12/27] perf ui/gtk: " Jiri Olsa
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Print the accumulated stat of a hist entry if requested.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-12-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/browsers/hists.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 1c331b9..2dcbe3d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -651,13 +651,36 @@ hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
 			  __hpp__slsmg_color_printf, true);		\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)			\
+static u64 __hpp_get_acc_##_field(struct hist_entry *he)		\
+{									\
+	return he->stat_acc->_field;					\
+}									\
+									\
+static int								\
+hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
+				struct perf_hpp *hpp,			\
+				struct hist_entry *he)			\
+{									\
+	if (!symbol_conf.cumulate_callchain) {				\
+		int ret = scnprintf(hpp->buf, hpp->size, "%8s", "N/A");	\
+		slsmg_printf("%s", hpp->buf);				\
+									\
+		return ret;						\
+	}								\
+	return __hpp__fmt(hpp, he, __hpp_get_acc_##_field, " %6.2f%%",	\
+			  __hpp__slsmg_color_printf, true);		\
+}
+
 __HPP_COLOR_PERCENT_FN(overhead, period)
 __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys)
 __HPP_COLOR_PERCENT_FN(overhead_us, period_us)
 __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys)
 __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us)
+__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period)
 
 #undef __HPP_COLOR_PERCENT_FN
+#undef __HPP_COLOR_ACC_PERCENT_FN
 
 void hist_browser__init_hpp(void)
 {
@@ -671,6 +694,8 @@ void hist_browser__init_hpp(void)
 				hist_browser__hpp_color_overhead_guest_sys;
 	perf_hpp__format[PERF_HPP__OVERHEAD_GUEST_US].color =
 				hist_browser__hpp_color_overhead_guest_us;
+	perf_hpp__format[PERF_HPP__OVERHEAD_ACC].color =
+				hist_browser__hpp_color_overhead_acc;
 }
 
 static int hist_browser__show_entry(struct hist_browser *browser,
-- 
1.8.3.1


* [PATCH 12/27] perf ui/gtk: Add support to accumulated hist stat
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (10 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 11/27] perf ui/browser: " Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 13/27] perf tools: Apply percent-limit to cumulative percentage Jiri Olsa
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Print the accumulated stat of a hist entry if requested.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-13-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/gtk/hists.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 9d90683..7e5da4a 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -47,11 +47,26 @@ static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,
 			  __percent_color_snprintf, true);			\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
+static u64 he_get_acc_##_field(struct hist_entry *he)				\
+{										\
+	return he->stat_acc->_field;						\
+}										\
+										\
+static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
+				       struct perf_hpp *hpp,			\
+				       struct hist_entry *he)			\
+{										\
+	return __hpp__fmt_acc(hpp, he, he_get_acc_##_field, " %6.2f%%",		\
+			      __percent_color_snprintf, true);			\
+}
+
 __HPP_COLOR_PERCENT_FN(overhead, period)
 __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys)
 __HPP_COLOR_PERCENT_FN(overhead_us, period_us)
 __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys)
 __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us)
+__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period)
 
 #undef __HPP_COLOR_PERCENT_FN
 
@@ -68,6 +83,8 @@ void perf_gtk__init_hpp(void)
 				perf_gtk__hpp_color_overhead_guest_sys;
 	perf_hpp__format[PERF_HPP__OVERHEAD_GUEST_US].color =
 				perf_gtk__hpp_color_overhead_guest_us;
+	perf_hpp__format[PERF_HPP__OVERHEAD_ACC].color =
+				perf_gtk__hpp_color_overhead_acc;
 }
 
 static void callchain_list__sym_name(struct callchain_list *cl,
-- 
1.8.3.1


* [PATCH 13/27] perf tools: Apply percent-limit to cumulative percentage
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (11 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 12/27] perf ui/gtk: " Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 14/27] perf tools: Add more hpp helper functions Jiri Olsa
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

If callchain accumulation is enabled (symbol_conf.cumulate_callchain),
entries that have no self overhead still need to be shown.  So apply
the percent limit to the accumulated overhead percentage in this case.
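
As a rough illustration of the rule above, the check can be pictured as
below.  This is a standalone sketch with hypothetical names
(percent_for_limit(), passes_limit()) and plain integer arguments
instead of a struct hist_entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Percentage compared against the limit: the accumulated period when
 * accumulation is on, the self period otherwise.
 */
static double percent_for_limit(uint64_t self_period, uint64_t acc_period,
				uint64_t total_period, bool cumulate)
{
	uint64_t period = cumulate ? acc_period : self_period;

	if (total_period == 0)
		return 0.0;

	return period * 100.0 / total_period;
}

/* An entry is shown only if its percentage reaches min_pcnt. */
static bool passes_limit(uint64_t self_period, uint64_t acc_period,
			 uint64_t total_period, bool cumulate,
			 double min_pcnt)
{
	assert(min_pcnt >= 0.0 && min_pcnt <= 100.0);

	return percent_for_limit(self_period, acc_period,
				 total_period, cumulate) >= min_pcnt;
}
```

An entry with no self period but an accumulated period of 50 out of 200
yields 25% with accumulation and 0% without, so only the former
survives a 5% limit.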

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-14-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/browsers/hists.c | 40 +++++++++++-----------------------------
 tools/perf/ui/gtk/hists.c      |  6 ++----
 tools/perf/ui/stdio/hist.c     |  4 ++--
 tools/perf/util/sort.h         | 17 ++++++++++++++++-
 4 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 2dcbe3d..5905acd 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -37,7 +37,6 @@ static int hists__browser_title(struct hists *hists, char *bf, size_t size,
 static void hist_browser__update_nr_entries(struct hist_browser *hb);
 
 static struct rb_node *hists__filter_entries(struct rb_node *nd,
-					     struct hists *hists,
 					     float min_pcnt);
 
 static bool hist_browser__has_filter(struct hist_browser *hb)
@@ -319,7 +318,7 @@ __hist_browser__set_folding(struct hist_browser *browser, bool unfold)
 	struct hists *hists = browser->hists;
 
 	for (nd = rb_first(&hists->entries);
-	     (nd = hists__filter_entries(nd, hists, browser->min_pcnt)) != NULL;
+	     (nd = hists__filter_entries(nd, browser->min_pcnt)) != NULL;
 	     nd = rb_next(nd)) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
 		hist_entry__set_folding(he, unfold);
@@ -808,15 +807,12 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 
 	for (nd = browser->top; nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		u64 total = hists__total_period(h->hists);
-		float percent = 0.0;
+		float percent;
 
 		if (h->filtered)
 			continue;
 
-		if (total)
-			percent = h->stat.period * 100.0 / total;
-
+		percent = hist_entry__get_percent_limit(h);
 		if (percent < hb->min_pcnt)
 			continue;
 
@@ -829,16 +825,11 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 }
 
 static struct rb_node *hists__filter_entries(struct rb_node *nd,
-					     struct hists *hists,
 					     float min_pcnt)
 {
 	while (nd != NULL) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		u64 total = hists__total_period(hists);
-		float percent = 0.0;
-
-		if (total)
-			percent = h->stat.period * 100.0 / total;
+		float percent = hist_entry__get_percent_limit(h);
 
 		if (!h->filtered && percent >= min_pcnt)
 			return nd;
@@ -850,16 +841,11 @@ static struct rb_node *hists__filter_entries(struct rb_node *nd,
 }
 
 static struct rb_node *hists__filter_prev_entries(struct rb_node *nd,
-						  struct hists *hists,
 						  float min_pcnt)
 {
 	while (nd != NULL) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		u64 total = hists__total_period(hists);
-		float percent = 0.0;
-
-		if (total)
-			percent = h->stat.period * 100.0 / total;
+		float percent = hist_entry__get_percent_limit(h);
 
 		if (!h->filtered && percent >= min_pcnt)
 			return nd;
@@ -888,14 +874,14 @@ static void ui_browser__hists_seek(struct ui_browser *browser,
 	switch (whence) {
 	case SEEK_SET:
 		nd = hists__filter_entries(rb_first(browser->entries),
-					   hb->hists, hb->min_pcnt);
+					   hb->min_pcnt);
 		break;
 	case SEEK_CUR:
 		nd = browser->top;
 		goto do_offset;
 	case SEEK_END:
 		nd = hists__filter_prev_entries(rb_last(browser->entries),
-						hb->hists, hb->min_pcnt);
+						hb->min_pcnt);
 		first = false;
 		break;
 	default:
@@ -938,8 +924,7 @@ do_offset:
 					break;
 				}
 			}
-			nd = hists__filter_entries(rb_next(nd), hb->hists,
-						   hb->min_pcnt);
+			nd = hists__filter_entries(rb_next(nd), hb->min_pcnt);
 			if (nd == NULL)
 				break;
 			--offset;
@@ -972,7 +957,7 @@ do_offset:
 				}
 			}
 
-			nd = hists__filter_prev_entries(rb_prev(nd), hb->hists,
+			nd = hists__filter_prev_entries(rb_prev(nd),
 							hb->min_pcnt);
 			if (nd == NULL)
 				break;
@@ -1151,7 +1136,6 @@ static int hist_browser__fprintf_entry(struct hist_browser *browser,
 static int hist_browser__fprintf(struct hist_browser *browser, FILE *fp)
 {
 	struct rb_node *nd = hists__filter_entries(rb_first(browser->b.entries),
-						   browser->hists,
 						   browser->min_pcnt);
 	int printed = 0;
 
@@ -1159,8 +1143,7 @@ static int hist_browser__fprintf(struct hist_browser *browser, FILE *fp)
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 
 		printed += hist_browser__fprintf_entry(browser, h, fp);
-		nd = hists__filter_entries(rb_next(nd), browser->hists,
-					   browser->min_pcnt);
+		nd = hists__filter_entries(rb_next(nd), browser->min_pcnt);
 	}
 
 	return printed;
@@ -1397,8 +1380,7 @@ static void hist_browser__update_nr_entries(struct hist_browser *hb)
 		return;
 	}
 
-	while ((nd = hists__filter_entries(nd, hb->hists,
-					   hb->min_pcnt)) != NULL) {
+	while ((nd = hists__filter_entries(nd, hb->min_pcnt)) != NULL) {
 		nr_entries++;
 		nd = rb_next(nd);
 	}
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 7e5da4a..03d6812 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -226,14 +226,12 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 		GtkTreeIter iter;
 		u64 total = hists__total_period(h->hists);
-		float percent = 0.0;
+		float percent;
 
 		if (h->filtered)
 			continue;
 
-		if (total)
-			percent = h->stat.period * 100.0 / total;
-
+		percent = hist_entry__get_percent_limit(h);
 		if (percent < min_pcnt)
 			continue;
 
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 9f57991..475d2f5 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -461,12 +461,12 @@ print_entries:
 
 	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		float percent = h->stat.period * 100.0 /
-					hists->stats.total_period;
+		float percent;
 
 		if (h->filtered)
 			continue;
 
+		percent = hist_entry__get_percent_limit(h);
 		if (percent < min_pcnt)
 			continue;
 
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index c9ffa03..426b873 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -20,7 +20,7 @@
 
 #include "parse-options.h"
 #include "parse-events.h"
-
+#include "hist.h"
 #include "thread.h"
 
 extern regex_t parent_regex;
@@ -131,6 +131,21 @@ static inline void hist_entry__add_pair(struct hist_entry *pair,
 	list_add_tail(&pair->pairs.node, &he->pairs.head);
 }
 
+static inline float hist_entry__get_percent_limit(struct hist_entry *he)
+{
+	u64 period = he->stat.period;
+	u64 total_period = hists__total_period(he->hists);
+
+	if (unlikely(total_period == 0))
+		return 0;
+
+	if (symbol_conf.cumulate_callchain)
+		period = he->stat_acc->period;
+
+	return period * 100.0 / total_period;
+}
+
+
 enum sort_mode {
 	SORT_MODE__NORMAL,
 	SORT_MODE__BRANCH,
-- 
1.8.3.1


* [PATCH 14/27] perf tools: Add more hpp helper functions
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (12 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 13/27] perf tools: Apply percent-limit to cumulative percentage Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 15/27] perf report: Add --children option Jiri Olsa
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Sometimes columns need to be disabled at runtime.  Add helper
functions to support that.
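
The enable/disable pairing can be sketched with a small circular list
as below.  This is a simplified, standalone illustration: the column
enum, struct column and the "linked" guard are assumptions of the
sketch and are not part of the helpers added here, which operate on
perf_hpp__format[] through list_add_tail() and list_del().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { COL_OVERHEAD, COL_CHILDREN, COL_SAMPLES, COL_MAX };

/* Hypothetical column descriptor kept on a circular doubly linked list. */
struct column {
	struct column *prev, *next;
	bool linked;
};

static struct column columns[COL_MAX];
static struct column head = { &head, &head, false };

static void column_enable(unsigned int col)
{
	struct column *c;

	assert(col < COL_MAX);
	c = &columns[col];
	if (c->linked)
		return;

	/* Append at the tail, i.e. just before the list head. */
	c->prev = head.prev;
	c->next = &head;
	head.prev->next = c;
	head.prev = c;
	c->linked = true;
}

static void column_disable(unsigned int col)
{
	struct column *c;

	assert(col < COL_MAX);
	c = &columns[col];
	if (!c->linked)
		return;

	/* Unlink the column from its neighbours. */
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = NULL;
	c->linked = false;
}

static size_t columns_enabled(void)
{
	size_t n = 0;

	for (const struct column *c = head.next; c != &head; c = c->next)
		n++;
	return n;
}
```

The guard makes a repeated disable harmless in the sketch; whether the
real helpers need such a guard depends on how their callers use them.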

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-15-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/hist.c   | 17 +++++++++++++++++
 tools/perf/util/hist.h |  4 ++++
 2 files changed, 21 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 0ce3e79..8ca6387 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -489,6 +489,11 @@ void perf_hpp__column_register(struct perf_hpp_fmt *format)
 	list_add_tail(&format->list, &perf_hpp__list);
 }
 
+void perf_hpp__column_unregister(struct perf_hpp_fmt *format)
+{
+	list_del(&format->list);
+}
+
 void perf_hpp__register_sort_field(struct perf_hpp_fmt *format)
 {
 	list_add_tail(&format->sort_list, &perf_hpp__sort_list);
@@ -500,6 +505,18 @@ void perf_hpp__column_enable(unsigned col)
 	perf_hpp__column_register(&perf_hpp__format[col]);
 }
 
+void perf_hpp__column_disable(unsigned col)
+{
+	BUG_ON(col >= PERF_HPP__MAX_INDEX);
+	perf_hpp__column_unregister(&perf_hpp__format[col]);
+}
+
+void perf_hpp__cancel_cumulate(void)
+{
+	perf_hpp__column_disable(PERF_HPP__OVERHEAD_ACC);
+	perf_hpp__format[PERF_HPP__OVERHEAD].header = hpp__header_overhead;
+}
+
 void perf_hpp__setup_output_field(void)
 {
 	struct perf_hpp_fmt *fmt;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index efd73e4..99ad3cb 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -237,7 +237,11 @@ enum {
 
 void perf_hpp__init(void);
 void perf_hpp__column_register(struct perf_hpp_fmt *format);
+void perf_hpp__column_unregister(struct perf_hpp_fmt *format);
 void perf_hpp__column_enable(unsigned col);
+void perf_hpp__column_disable(unsigned col);
+void perf_hpp__cancel_cumulate(void);
+
 void perf_hpp__register_sort_field(struct perf_hpp_fmt *format);
 void perf_hpp__setup_output_field(void);
 void perf_hpp__reset_output_field(void);
-- 
1.8.3.1


* [PATCH 15/27] perf report: Add --children option
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (13 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 14/27] perf tools: Add more hpp helper functions Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 16/27] perf report: Add report.children config option Jiri Olsa
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

The --children option shows the accumulated overhead (period) of an
entry in addition to its self overhead.
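
To illustrate the difference between the two values, here is a minimal
standalone sketch.  The symbol enum, struct sym_stat and add_sample()
are hypothetical, and the sketch assumes a symbol never appears twice
in one callchain (no recursion).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SYM_MAIN, SYM_FOO, SYM_BAR, SYM_MAX };

struct sym_stat {
	uint64_t self;		/* period of samples taken in the symbol */
	uint64_t children;	/* self plus period of everything it called */
};

static struct sym_stat stats[SYM_MAX];

/*
 * chain[0] is the sampled symbol (the callee); the following elements
 * are its callers, innermost first.
 */
static void add_sample(const int *chain, size_t depth, uint64_t period)
{
	assert(chain != NULL && depth > 0);

	for (size_t i = 0; i < depth; i++) {
		assert(chain[i] >= 0 && chain[i] < SYM_MAX);
		/* Every symbol on the chain gets the accumulated period. */
		stats[chain[i]].children += period;
	}

	/* Only the sampled symbol gets the self period. */
	stats[chain[0]].self += period;
}
```

For a sample of period 10 in bar (called via main and foo) plus a
sample of period 5 in foo (called from main), main ends up with self 0
and children 15, foo with self 5 and children 15, and bar with 10 for
both.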

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-16-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-report.txt |  7 ++++++-
 tools/perf/builtin-report.c              | 15 ++++++++++++++-
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index a1b5185..cefdf43 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -111,7 +111,7 @@ OPTIONS
 --fields=::
 	Specify output field - multiple keys can be specified in CSV format.
 	Following fields are available:
-	overhead, overhead_sys, overhead_us, sample and period.
+	overhead, overhead_sys, overhead_us, overhead_children, sample and period.
 	Also it can contain any sort key(s).
 
 	By default, every sort keys not specified in -F will be appended
@@ -163,6 +163,11 @@ OPTIONS
 
 	Default: fractal,0.5,callee,function.
 
+--children::
+	Accumulate callchain of children to parent entry so that they can
+	show up in the output.  The output will have a new "Children" column
+	and will be sorted on the data.  It requires callchains are recorded.
+
 --max-stack::
 	Set the stack depth limit when parsing the callchain, anything
 	beyond the specified depth will be ignored. This is a trade-off
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index e8fa9fe..f27a8aa 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -185,6 +185,14 @@ static int report__setup_sample_type(struct report *rep)
 			}
 	}
 
+	if (symbol_conf.cumulate_callchain) {
+		/* Silently ignore if callchain is missing */
+		if (!(sample_type & PERF_SAMPLE_CALLCHAIN)) {
+			symbol_conf.cumulate_callchain = false;
+			perf_hpp__cancel_cumulate();
+		}
+	}
+
 	if (sort__mode == SORT_MODE__BRANCH) {
 		if (!is_pipe &&
 		    !(sample_type & PERF_SAMPLE_BRANCH_STACK)) {
@@ -568,6 +576,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order",
 		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address). "
 		     "Default: fractal,0.5,callee,function", &report_parse_callchain_opt, callchain_default_opt),
+	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
+		    "Accumulate callchains of children and show total overhead as well"),
 	OPT_INTEGER(0, "max-stack", &report.max_stack,
 		    "Set the maximum stack depth when parsing the callchain, "
 		    "anything beyond the specified depth will be ignored. "
@@ -660,8 +670,10 @@ repeat:
 	has_br_stack = perf_header__has_feat(&session->header,
 					     HEADER_BRANCH_STACK);
 
-	if (branch_mode == -1 && has_br_stack)
+	if (branch_mode == -1 && has_br_stack) {
 		sort__mode = SORT_MODE__BRANCH;
+		symbol_conf.cumulate_callchain = false;
+	}
 
 	if (report.mem_mode) {
 		if (sort__mode == SORT_MODE__BRANCH) {
@@ -669,6 +681,7 @@ repeat:
 			goto error;
 		}
 		sort__mode = SORT_MODE__MEMORY;
+		symbol_conf.cumulate_callchain = false;
 	}
 
 	if (setup_sorting() < 0) {
-- 
1.8.3.1


* [PATCH 16/27] perf report: Add report.children config option
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (14 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 15/27] perf report: Add --children option Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 17/27] perf tools: Do not auto-remove Children column if --fields given Jiri Olsa
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Add a report.children config option that sets the default value of
callchain accumulation.  It affects the report output only if
perf.data contains callchain info.

A user can write a .perfconfig file like the one below to enable
accumulation by default:

  $ cat ~/.perfconfig
  [report]
  children = true

It can then be disabled on the command line:

  $ perf report --no-children

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-17-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-report.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index f27a8aa..6cac509 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -72,6 +72,10 @@ static int report__config(const char *var, const char *value, void *cb)
 		rep->min_percent = strtof(value, NULL);
 		return 0;
 	}
+	if (!strcmp(var, "report.children")) {
+		symbol_conf.cumulate_callchain = perf_config_bool(var, value);
+		return 0;
+	}
 
 	return perf_default_config(var, value, cb);
 }
-- 
1.8.3.1


* [PATCH 17/27] perf tools: Do not auto-remove Children column if --fields given
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (15 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 16/27] perf report: Add report.children config option Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 18/27] perf tools: Add callback function to hist_entry_iter Jiri Olsa
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, Arun Sharma, Frederic Weisbecker,
	Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Depending on the configuration, perf inserts or removes the Children
column in the output automatically.  But that might not be what the
user wants if the --fields option is given explicitly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-18-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/hist.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 8ca6387..498adb2 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -513,6 +513,9 @@ void perf_hpp__column_disable(unsigned col)
 
 void perf_hpp__cancel_cumulate(void)
 {
+	if (field_order)
+		return;
+
 	perf_hpp__column_disable(PERF_HPP__OVERHEAD_ACC);
 	perf_hpp__format[PERF_HPP__OVERHEAD].header = hpp__header_overhead;
 }
-- 
1.8.3.1


* [PATCH 18/27] perf tools: Add callback function to hist_entry_iter
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (16 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 17/27] perf tools: Do not auto-remove Children column if --fields given Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 19/27] perf top: Convert " Jiri Olsa
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

The new ->add_entry_cb() is called after an entry has been added to
the histogram.  It's used for code sharing between perf report and
perf top.  Note that ops->add_*_entry() must set iter->he properly
for the ->add_entry_cb() to be called; a NULL iter->he skips it.

Also pass @arg to the callback function.  It'll be used by perf top
later.
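
As an illustration only, the calling convention described above can be
sketched in standalone C.  The types and names below (struct entry,
struct entry_iter, iter_add, count_new_entries) are simplified
stand-ins invented for this sketch, not perf's real definitions.  A
NULL slot in the input models an add step that cleared iter->he to
suppress the callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct hist_entry. */
struct entry {
	int nr_events;
};

/* Hypothetical, simplified stand-in for struct hist_entry_iter. */
struct entry_iter {
	struct entry *he;	/* set by the add step; NULL skips the callback */
	int (*add_entry_cb)(struct entry_iter *iter, bool single, void *arg);
};

/* Example callback: count only entries seen for the first time. */
static int count_new_entries(struct entry_iter *iter, bool single, void *arg)
{
	int *nr_entries = arg;

	(void)single;
	if (iter->he->nr_events == 1)
		(*nr_entries)++;
	return 0;
}

/*
 * Add one sample.  entries[0] plays the role of the "single" entry,
 * the remaining slots play the role of the "next" entries.
 */
static int iter_add(struct entry_iter *iter, struct entry **entries,
		    size_t n, void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		iter->he = entries[i];
		if (iter->he)
			iter->he->nr_events++;

		/* The callback is optional and needs a valid entry. */
		if (iter->he && iter->add_entry_cb) {
			int err = iter->add_entry_cb(iter, i == 0, arg);

			if (err)
				return err;
		}
	}
	return 0;
}
```

Passing the caller's context through the opaque arg pointer is what
lets two different tools share the same iteration code while keeping
their own bookkeeping.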

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/87k393g999.fsf@sejong.aot.lge.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-report.c     | 61 ++++++++++++++++++++++++++++++++-----
 tools/perf/tests/hists_filter.c |  2 +-
 tools/perf/tests/hists_output.c |  2 +-
 tools/perf/util/hist.c          | 67 +++++++++++++++--------------------------
 tools/perf/util/hist.h          |  5 ++-
 5 files changed, 84 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6cac509..21d830b 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -80,14 +80,59 @@ static int report__config(const char *var, const char *value, void *cb)
 	return perf_default_config(var, value, cb);
 }
 
-static void report__inc_stats(struct report *rep,
-			      struct hist_entry *he __maybe_unused)
+static void report__inc_stats(struct report *rep, struct hist_entry *he)
 {
 	/*
-	 * We cannot access @he at this time.  Just assume it's a new entry.
-	 * It'll be fixed once we have a callback mechanism in hist_iter.
+	 * The @he is either of a newly created one or an existing one
+	 * merging current sample.  We only want to count a new one so
+	 * checking ->nr_events being 1.
 	 */
-	rep->nr_entries++;
+	if (he->stat.nr_events == 1)
+		rep->nr_entries++;
+}
+
+static int hist_iter__report_callback(struct hist_entry_iter *iter,
+				      struct addr_location *al, bool single,
+				      void *arg)
+{
+	int err = 0;
+	struct report *rep = arg;
+	struct hist_entry *he = iter->he;
+	struct perf_evsel *evsel = iter->evsel;
+	struct mem_info *mi;
+	struct branch_info *bi;
+
+	report__inc_stats(rep, he);
+
+	if (!ui__has_annotation())
+		return 0;
+
+	if (sort__mode == SORT_MODE__BRANCH) {
+		bi = he->branch_info;
+		err = addr_map_symbol__inc_samples(&bi->from, evsel->idx);
+		if (err)
+			goto out;
+
+		err = addr_map_symbol__inc_samples(&bi->to, evsel->idx);
+
+	} else if (rep->mem_mode) {
+		mi = he->mem_info;
+		err = addr_map_symbol__inc_samples(&mi->daddr, evsel->idx);
+		if (err)
+			goto out;
+
+		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+
+	} else if (symbol_conf.cumulate_callchain) {
+		if (single)
+			err = hist_entry__inc_addr_samples(he, evsel->idx,
+							   al->addr);
+	} else {
+		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+	}
+
+out:
+	return err;
 }
 
 static int process_sample_event(struct perf_tool *tool,
@@ -100,6 +145,7 @@ static int process_sample_event(struct perf_tool *tool,
 	struct addr_location al;
 	struct hist_entry_iter iter = {
 		.hide_unresolved = rep->hide_unresolved,
+		.add_entry_cb = hist_iter__report_callback,
 	};
 	int ret;
 
@@ -127,9 +173,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	report__inc_stats(rep, NULL);
-
-	ret = hist_entry_iter__add(&iter, &al, evsel, sample, rep->max_stack);
+	ret = hist_entry_iter__add(&iter, &al, evsel, sample, rep->max_stack,
+				   rep);
 	if (ret < 0)
 		pr_debug("problem adding hist entry, skipping event\n");
 
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 76b02e1..3539403 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -82,7 +82,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 				goto out;
 
 			if (hist_entry_iter__add(&iter, &al, evsel, &sample,
-						 PERF_MAX_STACK_DEPTH) < 0)
+						 PERF_MAX_STACK_DEPTH, NULL) < 0)
 				goto out;
 
 			fake_samples[i].thread = al.thread;
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index 1308f88..d40461e 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -71,7 +71,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 			goto out;
 
 		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
-					 PERF_MAX_STACK_DEPTH) < 0)
+					 PERF_MAX_STACK_DEPTH, NULL) < 0)
 			goto out;
 
 		fake_samples[i].thread = al.thread;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index c6f5f52..5a0a4b2 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -517,27 +517,16 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 }
 
 static int
-iter_finish_mem_entry(struct hist_entry_iter *iter, struct addr_location *al)
+iter_finish_mem_entry(struct hist_entry_iter *iter,
+		      struct addr_location *al __maybe_unused)
 {
 	struct perf_evsel *evsel = iter->evsel;
 	struct hist_entry *he = iter->he;
-	struct mem_info *mx;
 	int err = -EINVAL;
 
 	if (he == NULL)
 		goto out;
 
-	if (ui__has_annotation()) {
-		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
-		if (err)
-			goto out;
-
-		mx = he->mem_info;
-		err = addr_map_symbol__inc_samples(&mx->daddr, evsel->idx);
-		if (err)
-			goto out;
-	}
-
 	hists__inc_nr_samples(&evsel->hists, he->filtered);
 
 	err = hist_entry__append_callchain(he, iter->sample);
@@ -575,6 +564,9 @@ static int
 iter_add_single_branch_entry(struct hist_entry_iter *iter __maybe_unused,
 			     struct addr_location *al __maybe_unused)
 {
+	/* to avoid calling callback function */
+	iter->he = NULL;
+
 	return 0;
 }
 
@@ -599,7 +591,7 @@ iter_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
 static int
 iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
-	struct branch_info *bi, *bx;
+	struct branch_info *bi;
 	struct perf_evsel *evsel = iter->evsel;
 	struct hist_entry *he = NULL;
 	int i = iter->curr;
@@ -619,17 +611,6 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	if (he == NULL)
 		return -ENOMEM;
 
-	if (ui__has_annotation()) {
-		bx = he->branch_info;
-		err = addr_map_symbol__inc_samples(&bx->from, evsel->idx);
-		if (err)
-			goto out;
-
-		err = addr_map_symbol__inc_samples(&bx->to, evsel->idx);
-		if (err)
-			goto out;
-	}
-
 	hists__inc_nr_samples(&evsel->hists, he->filtered);
 
 out:
@@ -673,9 +654,9 @@ iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location
 }
 
 static int
-iter_finish_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
+iter_finish_normal_entry(struct hist_entry_iter *iter,
+			 struct addr_location *al __maybe_unused)
 {
-	int err;
 	struct hist_entry *he = iter->he;
 	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
@@ -685,12 +666,6 @@ iter_finish_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
 
 	iter->he = NULL;
 
-	if (ui__has_annotation()) {
-		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
-		if (err)
-			return err;
-	}
-
 	hists__inc_nr_samples(&evsel->hists, he->filtered);
 
 	return hist_entry__append_callchain(he, sample);
@@ -746,13 +721,6 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 	 */
 	callchain_cursor_commit(&callchain_cursor);
 
-	/*
-	 * The iter->he will be over-written after ->add_next_entry()
-	 * called so inc stats for the original entry now.
-	 */
-	if (ui__has_annotation())
-		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
-
 	hists__inc_nr_samples(&evsel->hists, he->filtered);
 
 	return err;
@@ -802,8 +770,11 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 	 * It's possible that it has cycles or recursive calls.
 	 */
 	for (i = 0; i < iter->curr; i++) {
-		if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
+		if (hist_entry__cmp(he_cache[i], &he_tmp) == 0) {
+			/* to avoid calling callback function */
+			iter->he = NULL;
 			return 0;
+		}
 	}
 
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
@@ -863,7 +834,7 @@ const struct hist_iter_ops hist_iter_cumulative = {
 
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 struct perf_evsel *evsel, struct perf_sample *sample,
-			 int max_stack_depth)
+			 int max_stack_depth, void *arg)
 {
 	int err, err2;
 
@@ -883,10 +854,22 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 	if (err)
 		goto out;
 
+	if (iter->he && iter->add_entry_cb) {
+		err = iter->add_entry_cb(iter, al, true, arg);
+		if (err)
+			goto out;
+	}
+
 	while (iter->ops->next_entry(iter, al)) {
 		err = iter->ops->add_next_entry(iter, al);
 		if (err)
 			break;
+
+		if (iter->he && iter->add_entry_cb) {
+			err = iter->add_entry_cb(iter, al, false, arg);
+			if (err)
+				goto out;
+		}
 	}
 
 out:
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 99ad3cb..82b28ff 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -119,6 +119,9 @@ struct hist_entry_iter {
 	void *priv;
 
 	const struct hist_iter_ops *ops;
+	/* user-defined callback function (optional) */
+	int (*add_entry_cb)(struct hist_entry_iter *iter,
+			    struct addr_location *al, bool single, void *arg);
 };
 
 extern const struct hist_iter_ops hist_iter_normal;
@@ -135,7 +138,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      bool sample_self);
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 struct perf_evsel *evsel, struct perf_sample *sample,
-			 int max_stack_depth);
+			 int max_stack_depth, void *arg);
 
 int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
 int64_t hist_entry__collapse(struct hist_entry *left, struct hist_entry *right);
-- 
1.8.3.1



* [PATCH 19/27] perf top: Convert to hist_entry_iter
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (17 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 18/27] perf tools: Add callback function to hist_entry_iter Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 20/27] perf top: Add --children option Jiri Olsa
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Reuse the hist_entry_iter__add() function to share code with perf
report.  Note that it needs to be called with hists.lock held, so
tweak some internal functions to avoid deadlocking or holding the
lock for too long.
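
The locking pattern can be sketched in isolation as follows.  This is
an assumption-laden illustration, not the code from this patch:
hists_lock, record_sample_locked() and process_sample() are
hypothetical names, and unlike the patch the sketch only drops the
lock when the slow path is actually taken.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for evsel->hists.lock. */
static pthread_mutex_t hists_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how often the slow path ran; atomic because it is
 * updated while the lock is dropped. */
static atomic_int slow_path_calls;

/* Called with hists_lock held; returns with hists_lock held. */
static void record_sample_locked(bool need_slow_path)
{
	if (!need_slow_path)
		return;

	/* Drop the lock before doing anything that may block. */
	pthread_mutex_unlock(&hists_lock);

	/* A warning or a sleep would go here. */
	atomic_fetch_add(&slow_path_calls, 1);

	/* Re-take it so the caller's unlock stays balanced. */
	pthread_mutex_lock(&hists_lock);
}

static void process_sample(bool need_slow_path)
{
	pthread_mutex_lock(&hists_lock);
	record_sample_locked(need_slow_path);
	pthread_mutex_unlock(&hists_lock);
}
```

Any state read before the unlock has to be treated as stale after the
lock is re-acquired, which is why the slow path sits at the very end
of the helper.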

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Link: http://lkml.kernel.org/r/1401335910-16832-20-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-top.c | 76 ++++++++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 35 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 12e2e12..b1cb5f5 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -196,6 +196,12 @@ static void perf_top__record_precise_ip(struct perf_top *top,
 
 	pthread_mutex_unlock(&notes->lock);
 
+	/*
+	 * This function is now called with he->hists->lock held.
+	 * Release it before going to sleep.
+	 */
+	pthread_mutex_unlock(&he->hists->lock);
+
 	if (err == -ERANGE && !he->ms.map->erange_warned)
 		ui__warn_map_erange(he->ms.map, sym, ip);
 	else if (err == -ENOMEM) {
@@ -203,6 +209,8 @@ static void perf_top__record_precise_ip(struct perf_top *top,
 		       sym->name);
 		sleep(1);
 	}
+
+	pthread_mutex_lock(&he->hists->lock);
 }
 
 static void perf_top__show_details(struct perf_top *top)
@@ -238,24 +246,6 @@ out_unlock:
 	pthread_mutex_unlock(&notes->lock);
 }
 
-static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
-						     struct addr_location *al,
-						     struct perf_sample *sample)
-{
-	struct hist_entry *he;
-
-	pthread_mutex_lock(&evsel->hists.lock);
-	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL,
-				sample->period, sample->weight,
-				sample->transaction, true);
-	pthread_mutex_unlock(&evsel->hists.lock);
-	if (he == NULL)
-		return NULL;
-
-	hists__inc_nr_samples(&evsel->hists, he->filtered);
-	return he;
-}
-
 static void perf_top__print_sym_table(struct perf_top *top)
 {
 	char bf[160];
@@ -659,6 +649,26 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
 	return 0;
 }
 
+static int hist_iter__top_callback(struct hist_entry_iter *iter,
+				   struct addr_location *al, bool single,
+				   void *arg)
+{
+	struct perf_top *top = arg;
+	struct hist_entry *he = iter->he;
+	struct perf_evsel *evsel = iter->evsel;
+
+	if (sort__has_sym && single) {
+		u64 ip = al->addr;
+
+		if (al->map)
+			ip = al->map->unmap_ip(al->map, ip);
+
+		perf_top__record_precise_ip(top, he, evsel->idx, ip);
+	}
+
+	return 0;
+}
+
 static void perf_event__process_sample(struct perf_tool *tool,
 				       const union perf_event *event,
 				       struct perf_evsel *evsel,
@@ -666,8 +676,6 @@ static void perf_event__process_sample(struct perf_tool *tool,
 				       struct machine *machine)
 {
 	struct perf_top *top = container_of(tool, struct perf_top, tool);
-	struct symbol *parent = NULL;
-	u64 ip = sample->ip;
 	struct addr_location al;
 	int err;
 
@@ -742,25 +750,23 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	}
 
 	if (al.sym == NULL || !al.sym->ignore) {
-		struct hist_entry *he;
+		struct hist_entry_iter iter = {
+			.add_entry_cb = hist_iter__top_callback,
+		};
 
-		err = sample__resolve_callchain(sample, &parent, evsel, &al,
-						top->max_stack);
-		if (err)
-			return;
+		if (symbol_conf.cumulate_callchain)
+			iter.ops = &hist_iter_cumulative;
+		else
+			iter.ops = &hist_iter_normal;
 
-		he = perf_evsel__add_hist_entry(evsel, &al, sample);
-		if (he == NULL) {
-			pr_err("Problem incrementing symbol period, skipping event\n");
-			return;
-		}
+		pthread_mutex_lock(&evsel->hists.lock);
 
-		err = hist_entry__append_callchain(he, sample);
-		if (err)
-			return;
+		err = hist_entry_iter__add(&iter, &al, evsel, sample,
+					   top->max_stack, top);
+		if (err < 0)
+			pr_err("Problem incrementing symbol period, skipping event\n");
 
-		if (sort__has_sym)
-			perf_top__record_precise_ip(top, he, evsel->idx, ip);
+		pthread_mutex_unlock(&evsel->hists.lock);
 	}
 
 	return;
-- 
1.8.3.1



* [PATCH 20/27] perf top: Add --children option
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (18 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 19/27] perf top: Convert " Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 21/27] perf top: Add top.children config option Jiri Olsa
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

The --children option shows the accumulated overhead (period) value
in addition to the self overhead.  It should be used together with
the -g or --call-graph option.
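
To make the difference between the two values concrete, here is a toy
model of the accumulation, assuming it works as described above: the
sample period is credited as self overhead to the sampled symbol only,
and as accumulated overhead to every symbol on the callchain, once per
sample even when recursion repeats a symbol.  The struct sym_stat,
find_or_add() and add_sample() names are made up for this sketch and
are not part of perf.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Hypothetical per-symbol totals, not perf's struct hist_entry. */
struct sym_stat {
	const char *name;
	unsigned long self;	/* period sampled in this symbol */
	unsigned long children;	/* self plus period of its callees */
};

static struct sym_stat stats[MAX_SYMS];
static size_t nr_stats;

static struct sym_stat *find_or_add(const char *name)
{
	size_t i;

	for (i = 0; i < nr_stats; i++)
		if (!strcmp(stats[i].name, name))
			return &stats[i];

	if (nr_stats == MAX_SYMS)
		return NULL;

	stats[nr_stats].name = name;
	return &stats[nr_stats++];
}

/* chain[0] is the sampled symbol, chain[1..] are its callers. */
static int add_sample(const char **chain, size_t depth, unsigned long period)
{
	struct sym_stat *seen[MAX_SYMS];
	size_t nr_seen = 0, i, j;

	for (i = 0; i < depth; i++) {
		struct sym_stat *st = find_or_add(chain[i]);

		if (!st)
			return -1;
		if (i == 0)
			st->self += period;

		/* Credit each symbol at most once per sample. */
		for (j = 0; j < nr_seen; j++)
			if (seen[j] == st)
				break;
		if (j < nr_seen)
			continue;

		st->children += period;
		seen[nr_seen++] = st;
	}
	return 0;
}
```

With such totals, a function that spends little time itself but calls
expensive functions ends up with a small self value and a large
accumulated value.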

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-21-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-top.txt | 8 +++++++-
 tools/perf/builtin-top.c              | 7 +++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index dcfa54c..180ae02 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -119,7 +119,7 @@ Default is to monitor all CPUS.
 --fields=::
 	Specify output field - multiple keys can be specified in CSV format.
 	Following fields are available:
-	overhead, overhead_sys, overhead_us, sample and period.
+	overhead, overhead_sys, overhead_us, overhead_children, sample and period.
 	Also it can contain any sort key(s).
 
 	By default, every sort keys not specified in --field will be appended
@@ -161,6 +161,12 @@ Default is to monitor all CPUS.
 	Setup and enable call-graph (stack chain/backtrace) recording,
 	implies -g.
 
+--children::
+	Accumulate callchain of children to parent entry so that they can
+	show up in the output.  The output will have a new "Children" column
+	and will be sorted on the data.  It requires -g/--call-graph option
+	enabled.
+
 --max-stack::
 	Set the stack depth limit when parsing the callchain, anything
 	beyond the specified depth will be ignored. This is a trade-off
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index b1cb5f5..fea55e3 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1098,6 +1098,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_CALLBACK(0, "call-graph", &top.record_opts,
 		     "mode[,dump_size]", record_callchain_help,
 		     &parse_callchain_opt),
+	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
+		    "Accumulate callchains of children and show total overhead as well"),
 	OPT_INTEGER(0, "max-stack", &top.max_stack,
 		    "Set the maximum stack depth when parsing the callchain. "
 		    "Default: " __stringify(PERF_MAX_STACK_DEPTH)),
@@ -1203,6 +1205,11 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	top.sym_evsel = perf_evlist__first(top.evlist);
 
+	if (!symbol_conf.use_callchain) {
+		symbol_conf.cumulate_callchain = false;
+		perf_hpp__cancel_cumulate();
+	}
+
 	symbol_conf.priv_size = sizeof(struct annotation);
 
 	symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
-- 
1.8.3.1



* [PATCH 21/27] perf top: Add top.children config option
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (19 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 20/27] perf top: Add --children option Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 22/27] perf tools: Enable --children option by default Jiri Olsa
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Add a top.children config option for setting the default value of
callchain accumulation.  It affects the output only if the -g or
--call-graph option is given as well.

A user can write a .perfconfig file like the one below to enable
accumulation by default:

  $ cat ~/.perfconfig
  [top]
  children = true

And it can be disabled through command line:

  $ perf top --no-children
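
The precedence between the two can be sketched as below, assuming the
config file is read before the command line is parsed so that the
latter wins.  parse_bool(), apply_config() and apply_cmdline() are
hypothetical helpers written for this illustration; in particular
parse_bool() is a reduced stand-in for perf_config_bool() and accepts
fewer spellings.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for symbol_conf.cumulate_callchain. */
static bool cumulate_callchain;

/* Reduced boolean parser for this sketch only. */
static bool parse_bool(const char *value)
{
	return value && (!strcmp(value, "true") || !strcmp(value, "yes") ||
			 !strcmp(value, "on") || !strcmp(value, "1"));
}

/* Step 1: the config file sets the default. */
static void apply_config(const char *var, const char *value)
{
	if (!strcmp(var, "top.children"))
		cumulate_callchain = parse_bool(value);
	/* Other keys are ignored in this sketch. */
}

/* Step 2: options parsed afterwards override the default. */
static void apply_cmdline(int argc, const char **argv)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--children"))
			cumulate_callchain = true;
		else if (!strcmp(argv[i], "--no-children"))
			cumulate_callchain = false;
	}
}
```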

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-22-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-top.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index fea55e3..377971d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1004,6 +1004,10 @@ static int perf_top_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "top.call-graph"))
 		return record_parse_callchain(value, &top->record_opts);
+	if (!strcmp(var, "top.children")) {
+		symbol_conf.cumulate_callchain = perf_config_bool(var, value);
+		return 0;
+	}
 
 	return perf_default_config(var, value, cb);
 }
-- 
1.8.3.1



* [PATCH 22/27] perf tools: Enable --children option by default
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (20 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 21/27] perf top: Add top.children config option Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 23/27] perf ui/stdio: Fix invalid percentage value of cumulated hist entries Jiri Olsa
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Namhyung Kim, Frederic Weisbecker, Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Now perf top and perf report will show the Children column by default
if the data has callchain information.

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Tested-by: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-23-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/symbol.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 95e2497..7b9096f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -29,11 +29,12 @@ int vmlinux_path__nr_entries;
 char **vmlinux_path;
 
 struct symbol_conf symbol_conf = {
-	.use_modules	  = true,
-	.try_vmlinux_path = true,
-	.annotate_src	  = true,
-	.demangle	  = true,
-	.symfs            = "",
+	.use_modules		= true,
+	.try_vmlinux_path	= true,
+	.annotate_src		= true,
+	.demangle		= true,
+	.cumulate_callchain	= true,
+	.symfs			= "",
 };
 
 static enum dso_binary_type binary_type_symtab[] = {
-- 
1.8.3.1



* [PATCH 23/27] perf ui/stdio: Fix invalid percentage value of cumulated hist entries
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (21 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 22/27] perf tools: Enable --children option by default Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 24/27] perf ui/gtk: Fix callchain display Jiri Olsa
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, Arun Sharma, Frederic Weisbecker,
	Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

The stdio output shows invalid percentages for callchains in
cumulated hist entries because it only considers the self period.
With the --children behavior, callchain info is always added to the
cumulated entries, so the accumulated period should be used as the
total in that case.

Before:

  # Children      Self  Command      Shared Object            Symbol
  # ........  ........  .......  .................  ................
  #
      61.22%     0.32%  swapper  [kernel.kallsyms]      [k] cpu_idle
                    |
                    --- cpu_idle
                       |
                       |--16530.76%-- start_secondary
                       |
                       |--2758.70%-- rest_init
                       |          start_kernel
                       |          x86_64_start_reservations
                       |          x86_64_start_kernel
                        --6837850969203030.00%-- [...]

After:

  # Children      Self  Command      Shared Object            Symbol
  # ........  ........  .......  .................  ................
  #
      61.22%     0.32%  swapper  [kernel.kallsyms]      [k] cpu_idle
                    |
                    --- cpu_idle
                       |
                       |--85.70%-- start_secondary
                       |
                        --14.30%-- rest_init
                                  start_kernel
                                  x86_64_start_reservations
                                  x86_64_start_kernel
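
The arithmetic behind the bogus values can be reproduced with a small
sketch.  The helpers graph_total() and branch_percent() are
hypothetical, and the periods used in the usage note are made-up
numbers chosen only to resemble the ratios in the output above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick the total that relative callchain percentages divide by. */
static uint64_t graph_total(bool cumulate, uint64_t self_period,
			    uint64_t acc_period)
{
	return cumulate ? acc_period : self_period;
}

/* Share of one callchain branch relative to the chosen total. */
static double branch_percent(uint64_t branch_period, uint64_t total_period)
{
	if (total_period == 0)
		return 0.0;

	return 100.0 * (double)branch_period / (double)total_period;
}
```

For an entry with a self period of 32 and an accumulated period of
6122, a branch with period 5247 stays below 100% when divided by the
accumulated period, but comes out far above 100% when divided by the
self period, which is the kind of value seen in the "Before" output.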

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-24-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/stdio/hist.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 475d2f5..90122ab 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -271,7 +271,9 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
 {
 	switch (callchain_param.mode) {
 	case CHAIN_GRAPH_REL:
-		return callchain__fprintf_graph(fp, &he->sorted_chain, he->stat.period,
+		return callchain__fprintf_graph(fp, &he->sorted_chain,
+						symbol_conf.cumulate_callchain ?
+						he->stat_acc->period : he->stat.period,
 						left_margin);
 		break;
 	case CHAIN_GRAPH_ABS:
-- 
1.8.3.1



* [PATCH 24/27] perf ui/gtk: Fix callchain display
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (22 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 23/27] perf ui/stdio: Fix invalid percentage value of cumulated hist entries Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 25/27] perf tools: Reset output/sort order to default Jiri Olsa
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, Arun Sharma, Frederic Weisbecker,
	Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

With the current output field change, the GTK browser cannot display
callchain information correctly since it cannot determine where the
symbol column is.  As a stopgap, use the last column, since that
works for most cases.

It also has the same percentage problem as the stdio code.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-25-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/ui/gtk/hists.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 03d6812..6ca60e4 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -198,6 +198,13 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
 		if (perf_hpp__should_skip(fmt))
 			continue;
 
+		/*
+		 * XXX no way to determine where symcol column is..
+		 *     Just use last column for now.
+		 */
+		if (perf_hpp__is_sort_entry(fmt))
+			sym_col = col_idx;
+
 		fmt->header(fmt, &hpp, hists_to_evsel(hists));
 
 		gtk_tree_view_insert_column_with_attributes(GTK_TREE_VIEW(view),
@@ -253,7 +260,8 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
 
 		if (symbol_conf.use_callchain && sort__has_sym) {
 			if (callchain_param.mode == CHAIN_GRAPH_REL)
-				total = h->stat.period;
+				total = symbol_conf.cumulate_callchain ?
+					h->stat_acc->period : h->stat.period;
 
 			perf_gtk__add_callchain(&h->sorted_chain, store, &iter,
 						sym_col, total);
-- 
1.8.3.1



* [PATCH 25/27] perf tools: Reset output/sort order to default
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (23 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 24/27] perf ui/gtk: Fix callchain display Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 26/27] perf tests: Define and use symbolic names for fake symbols Jiri Olsa
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, Arun Sharma, Frederic Weisbecker,
	Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

When reset_output_field() is called, also reset the field/sort order
to NULL so that they fall back to the default values.  This is needed
for testing.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
CC: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-26-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/sort.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 9da8931..254f583 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1582,6 +1582,9 @@ void reset_output_field(void)
 	sort__has_sym = 0;
 	sort__has_dso = 0;
 
+	field_order = NULL;
+	sort_order = NULL;
+
 	reset_dimensions();
 	perf_hpp__reset_output_field();
 }
-- 
1.8.3.1



* [PATCH 26/27] perf tests: Define and use symbolic names for fake symbols
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (24 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 25/27] perf tools: Reset output/sort order to default Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-01 13:31 ` [PATCH 27/27] perf tests: Add a test case for cumulating callchains Jiri Olsa
  2014-06-03 18:23 ` [GIT PULL 00/27] perf/core improvements and fixes Ingo Molnar
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, Arun Sharma, Frederic Weisbecker,
	Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

In various histogram test cases, fake pids, maps and symbols are
given as raw numbers.  Define macros for each pid, map and symbol to
improve readability.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-27-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/tests/hists_common.c | 47 ++++++++++++++++++++++-------------------
 tools/perf/tests/hists_common.h | 32 ++++++++++++++++++++++++++--
 tools/perf/tests/hists_filter.c | 23 ++++++++++----------
 tools/perf/tests/hists_link.c   | 32 ++++++++++++++--------------
 tools/perf/tests/hists_output.c | 20 +++++++++---------
 5 files changed, 92 insertions(+), 62 deletions(-)

diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index e4e01aad..e4e120d 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -12,9 +12,9 @@ static struct {
 	u32 pid;
 	const char *comm;
 } fake_threads[] = {
-	{ 100, "perf" },
-	{ 200, "perf" },
-	{ 300, "bash" },
+	{ FAKE_PID_PERF1, "perf" },
+	{ FAKE_PID_PERF2, "perf" },
+	{ FAKE_PID_BASH,  "bash" },
 };
 
 static struct {
@@ -22,15 +22,15 @@ static struct {
 	u64 start;
 	const char *filename;
 } fake_mmap_info[] = {
-	{ 100, 0x40000, "perf" },
-	{ 100, 0x50000, "libc" },
-	{ 100, 0xf0000, "[kernel]" },
-	{ 200, 0x40000, "perf" },
-	{ 200, 0x50000, "libc" },
-	{ 200, 0xf0000, "[kernel]" },
-	{ 300, 0x40000, "bash" },
-	{ 300, 0x50000, "libc" },
-	{ 300, 0xf0000, "[kernel]" },
+	{ FAKE_PID_PERF1, FAKE_MAP_PERF,   "perf" },
+	{ FAKE_PID_PERF1, FAKE_MAP_LIBC,   "libc" },
+	{ FAKE_PID_PERF1, FAKE_MAP_KERNEL, "[kernel]" },
+	{ FAKE_PID_PERF2, FAKE_MAP_PERF,   "perf" },
+	{ FAKE_PID_PERF2, FAKE_MAP_LIBC,   "libc" },
+	{ FAKE_PID_PERF2, FAKE_MAP_KERNEL, "[kernel]" },
+	{ FAKE_PID_BASH,  FAKE_MAP_BASH,   "bash" },
+	{ FAKE_PID_BASH,  FAKE_MAP_LIBC,   "libc" },
+	{ FAKE_PID_BASH,  FAKE_MAP_KERNEL, "[kernel]" },
 };
 
 struct fake_sym {
@@ -40,27 +40,30 @@ struct fake_sym {
 };
 
 static struct fake_sym perf_syms[] = {
-	{ 700, 100, "main" },
-	{ 800, 100, "run_command" },
-	{ 900, 100, "cmd_record" },
+	{ FAKE_SYM_OFFSET1, FAKE_SYM_LENGTH, "main" },
+	{ FAKE_SYM_OFFSET2, FAKE_SYM_LENGTH, "run_command" },
+	{ FAKE_SYM_OFFSET3, FAKE_SYM_LENGTH, "cmd_record" },
 };
 
 static struct fake_sym bash_syms[] = {
-	{ 700, 100, "main" },
-	{ 800, 100, "xmalloc" },
-	{ 900, 100, "xfree" },
+	{ FAKE_SYM_OFFSET1, FAKE_SYM_LENGTH, "main" },
+	{ FAKE_SYM_OFFSET2, FAKE_SYM_LENGTH, "xmalloc" },
+	{ FAKE_SYM_OFFSET3, FAKE_SYM_LENGTH, "xfree" },
 };
 
 static struct fake_sym libc_syms[] = {
 	{ 700, 100, "malloc" },
 	{ 800, 100, "free" },
 	{ 900, 100, "realloc" },
+	{ FAKE_SYM_OFFSET1, FAKE_SYM_LENGTH, "malloc" },
+	{ FAKE_SYM_OFFSET2, FAKE_SYM_LENGTH, "free" },
+	{ FAKE_SYM_OFFSET3, FAKE_SYM_LENGTH, "realloc" },
 };
 
 static struct fake_sym kernel_syms[] = {
-	{ 700, 100, "schedule" },
-	{ 800, 100, "page_fault" },
-	{ 900, 100, "sys_perf_event_open" },
+	{ FAKE_SYM_OFFSET1, FAKE_SYM_LENGTH, "schedule" },
+	{ FAKE_SYM_OFFSET2, FAKE_SYM_LENGTH, "page_fault" },
+	{ FAKE_SYM_OFFSET3, FAKE_SYM_LENGTH, "sys_perf_event_open" },
 };
 
 static struct {
@@ -102,7 +105,7 @@ struct machine *setup_fake_machine(struct machines *machines)
 				.pid = fake_mmap_info[i].pid,
 				.tid = fake_mmap_info[i].pid,
 				.start = fake_mmap_info[i].start,
-				.len = 0x1000ULL,
+				.len = FAKE_MAP_LENGTH,
 				.pgoff = 0ULL,
 			},
 		};
diff --git a/tools/perf/tests/hists_common.h b/tools/perf/tests/hists_common.h
index 1415ae6..888254e 100644
--- a/tools/perf/tests/hists_common.h
+++ b/tools/perf/tests/hists_common.h
@@ -4,6 +4,34 @@
 struct machine;
 struct machines;
 
+#define FAKE_PID_PERF1  100
+#define FAKE_PID_PERF2  200
+#define FAKE_PID_BASH   300
+
+#define FAKE_MAP_PERF    0x400000
+#define FAKE_MAP_BASH    0x400000
+#define FAKE_MAP_LIBC    0x500000
+#define FAKE_MAP_KERNEL  0xf00000
+#define FAKE_MAP_LENGTH  0x100000
+
+#define FAKE_SYM_OFFSET1  700
+#define FAKE_SYM_OFFSET2  800
+#define FAKE_SYM_OFFSET3  900
+#define FAKE_SYM_LENGTH   100
+
+#define FAKE_IP_PERF_MAIN  FAKE_MAP_PERF + FAKE_SYM_OFFSET1
+#define FAKE_IP_PERF_RUN_COMMAND  FAKE_MAP_PERF + FAKE_SYM_OFFSET2
+#define FAKE_IP_PERF_CMD_RECORD  FAKE_MAP_PERF + FAKE_SYM_OFFSET3
+#define FAKE_IP_BASH_MAIN  FAKE_MAP_BASH + FAKE_SYM_OFFSET1
+#define FAKE_IP_BASH_XMALLOC  FAKE_MAP_BASH + FAKE_SYM_OFFSET2
+#define FAKE_IP_BASH_XFREE  FAKE_MAP_BASH + FAKE_SYM_OFFSET3
+#define FAKE_IP_LIBC_MALLOC  FAKE_MAP_LIBC + FAKE_SYM_OFFSET1
+#define FAKE_IP_LIBC_FREE  FAKE_MAP_LIBC + FAKE_SYM_OFFSET2
+#define FAKE_IP_LIBC_REALLOC  FAKE_MAP_LIBC + FAKE_SYM_OFFSET3
+#define FAKE_IP_KERNEL_SCHEDULE  FAKE_MAP_KERNEL + FAKE_SYM_OFFSET1
+#define FAKE_IP_KERNEL_PAGE_FAULT  FAKE_MAP_KERNEL + FAKE_SYM_OFFSET2
+#define FAKE_IP_KERNEL_SYS_PERF_EVENT_OPEN  FAKE_MAP_KERNEL + FAKE_SYM_OFFSET3
+
 /*
  * The setup_fake_machine() provides a test environment which consists
  * of 3 processes that have 3 mappings and in turn, have 3 symbols
@@ -13,7 +41,7 @@ struct machines;
  * .............  .............  ...................
  *    perf:  100           perf  main
  *    perf:  100           perf  run_command
- *    perf:  100           perf  comd_record
+ *    perf:  100           perf  cmd_record
  *    perf:  100           libc  malloc
  *    perf:  100           libc  free
  *    perf:  100           libc  realloc
@@ -22,7 +50,7 @@ struct machines;
  *    perf:  100       [kernel]  sys_perf_event_open
  *    perf:  200           perf  main
  *    perf:  200           perf  run_command
- *    perf:  200           perf  comd_record
+ *    perf:  200           perf  cmd_record
  *    perf:  200           libc  malloc
  *    perf:  200           libc  free
  *    perf:  200           libc  realloc
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 3539403..821f581 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -21,25 +21,25 @@ struct sample {
 /* For the numbers, see hists_common.c */
 static struct sample fake_samples[] = {
 	/* perf [kernel] schedule() */
-	{ .pid = 100, .ip = 0xf0000 + 700, },
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_KERNEL_SCHEDULE, },
 	/* perf [perf]   main() */
-	{ .pid = 100, .ip = 0x40000 + 700, },
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_PERF_MAIN, },
 	/* perf [libc]   malloc() */
-	{ .pid = 100, .ip = 0x50000 + 700, },
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_LIBC_MALLOC, },
 	/* perf [perf]   main() */
-	{ .pid = 200, .ip = 0x40000 + 700, }, /* will be merged */
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_PERF_MAIN, }, /* will be merged */
 	/* perf [perf]   cmd_record() */
-	{ .pid = 200, .ip = 0x40000 + 900, },
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_PERF_CMD_RECORD, },
 	/* perf [kernel] page_fault() */
-	{ .pid = 200, .ip = 0xf0000 + 800, },
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
 	/* bash [bash]   main() */
-	{ .pid = 300, .ip = 0x40000 + 700, },
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_MAIN, },
 	/* bash [bash]   xmalloc() */
-	{ .pid = 300, .ip = 0x40000 + 800, },
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_XMALLOC, },
 	/* bash [libc]   malloc() */
-	{ .pid = 300, .ip = 0x50000 + 700, },
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_LIBC_MALLOC, },
 	/* bash [kernel] page_fault() */
-	{ .pid = 300, .ip = 0xf0000 + 800, },
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
 };
 
 static int add_hist_entries(struct perf_evlist *evlist,
@@ -47,7 +47,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 {
 	struct perf_evsel *evsel;
 	struct addr_location al;
-	struct perf_sample sample = { .cpu = 0, };
+	struct perf_sample sample = { .period = 100, };
 	size_t i;
 
 	/*
@@ -75,7 +75,6 @@ static int add_hist_entries(struct perf_evlist *evlist,
 			sample.pid = fake_samples[i].pid;
 			sample.tid = fake_samples[i].pid;
 			sample.ip = fake_samples[i].ip;
-			sample.period = 100;
 
 			if (perf_event__preprocess_sample(&event, machine, &al,
 							  &sample) < 0)
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index ca6693b..d4b34b0 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -21,41 +21,41 @@ struct sample {
 /* For the numbers, see hists_common.c */
 static struct sample fake_common_samples[] = {
 	/* perf [kernel] schedule() */
-	{ .pid = 100, .ip = 0xf0000 + 700, },
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_KERNEL_SCHEDULE, },
 	/* perf [perf]   main() */
-	{ .pid = 200, .ip = 0x40000 + 700, },
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_PERF_MAIN, },
 	/* perf [perf]   cmd_record() */
-	{ .pid = 200, .ip = 0x40000 + 900, },
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_PERF_CMD_RECORD, },
 	/* bash [bash]   xmalloc() */
-	{ .pid = 300, .ip = 0x40000 + 800, },
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_XMALLOC, },
 	/* bash [libc]   malloc() */
-	{ .pid = 300, .ip = 0x50000 + 700, },
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_LIBC_MALLOC, },
 };
 
 static struct sample fake_samples[][5] = {
 	{
 		/* perf [perf]   run_command() */
-		{ .pid = 100, .ip = 0x40000 + 800, },
+		{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_PERF_RUN_COMMAND, },
 		/* perf [libc]   malloc() */
-		{ .pid = 100, .ip = 0x50000 + 700, },
+		{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_LIBC_MALLOC, },
 		/* perf [kernel] page_fault() */
-		{ .pid = 100, .ip = 0xf0000 + 800, },
+		{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
 		/* perf [kernel] sys_perf_event_open() */
-		{ .pid = 200, .ip = 0xf0000 + 900, },
+		{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_KERNEL_SYS_PERF_EVENT_OPEN, },
 		/* bash [libc]   free() */
-		{ .pid = 300, .ip = 0x50000 + 800, },
+		{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_LIBC_FREE, },
 	},
 	{
 		/* perf [libc]   free() */
-		{ .pid = 200, .ip = 0x50000 + 800, },
+		{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_LIBC_FREE, },
 		/* bash [libc]   malloc() */
-		{ .pid = 300, .ip = 0x50000 + 700, }, /* will be merged */
+		{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_LIBC_MALLOC, }, /* will be merged */
 		/* bash [bash]   xfee() */
-		{ .pid = 300, .ip = 0x40000 + 900, },
+		{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_XFREE, },
 		/* bash [libc]   realloc() */
-		{ .pid = 300, .ip = 0x50000 + 900, },
+		{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_LIBC_REALLOC, },
 		/* bash [kernel] page_fault() */
-		{ .pid = 300, .ip = 0xf0000 + 800, },
+		{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
 	},
 };
 
@@ -64,7 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 	struct perf_evsel *evsel;
 	struct addr_location al;
 	struct hist_entry *he;
-	struct perf_sample sample = { .cpu = 0, };
+	struct perf_sample sample = { .period = 1, };
 	size_t i = 0, k;
 
 	/*
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index d40461e..e3bbd6c 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -22,25 +22,25 @@ struct sample {
 /* For the numbers, see hists_common.c */
 static struct sample fake_samples[] = {
 	/* perf [kernel] schedule() */
-	{ .cpu = 0, .pid = 100, .ip = 0xf0000 + 700, },
+	{ .cpu = 0, .pid = FAKE_PID_PERF1, .ip = FAKE_IP_KERNEL_SCHEDULE, },
 	/* perf [perf]   main() */
-	{ .cpu = 1, .pid = 100, .ip = 0x40000 + 700, },
+	{ .cpu = 1, .pid = FAKE_PID_PERF1, .ip = FAKE_IP_PERF_MAIN, },
 	/* perf [perf]   cmd_record() */
-	{ .cpu = 1, .pid = 100, .ip = 0x40000 + 900, },
+	{ .cpu = 1, .pid = FAKE_PID_PERF1, .ip = FAKE_IP_PERF_CMD_RECORD, },
 	/* perf [libc]   malloc() */
-	{ .cpu = 1, .pid = 100, .ip = 0x50000 + 700, },
+	{ .cpu = 1, .pid = FAKE_PID_PERF1, .ip = FAKE_IP_LIBC_MALLOC, },
 	/* perf [libc]   free() */
-	{ .cpu = 2, .pid = 100, .ip = 0x50000 + 800, },
+	{ .cpu = 2, .pid = FAKE_PID_PERF1, .ip = FAKE_IP_LIBC_FREE, },
 	/* perf [perf]   main() */
-	{ .cpu = 2, .pid = 200, .ip = 0x40000 + 700, },
+	{ .cpu = 2, .pid = FAKE_PID_PERF2, .ip = FAKE_IP_PERF_MAIN, },
 	/* perf [kernel] page_fault() */
-	{ .cpu = 2, .pid = 200, .ip = 0xf0000 + 800, },
+	{ .cpu = 2, .pid = FAKE_PID_PERF2, .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
 	/* bash [bash]   main() */
-	{ .cpu = 3, .pid = 300, .ip = 0x40000 + 700, },
+	{ .cpu = 3, .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_MAIN, },
 	/* bash [bash]   xmalloc() */
-	{ .cpu = 0, .pid = 300, .ip = 0x40000 + 800, },
+	{ .cpu = 0, .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_XMALLOC, },
 	/* bash [kernel] page_fault() */
-	{ .cpu = 1, .pid = 300, .ip = 0xf0000 + 800, },
+	{ .cpu = 1, .pid = FAKE_PID_BASH,  .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
 };
 
 static int add_hist_entries(struct hists *hists, struct machine *machine)
-- 
1.8.3.1



* [PATCH 27/27] perf tests: Add a test case for cumulating callchains
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (25 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 26/27] perf tests: Define and use symbolic names for fake symbols Jiri Olsa
@ 2014-06-01 13:31 ` Jiri Olsa
  2014-06-03 18:23 ` [GIT PULL 00/27] perf/core improvements and fixes Ingo Molnar
  27 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2014-06-01 13:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Namhyung Kim, Arun Sharma, Frederic Weisbecker,
	Jiri Olsa

From: Namhyung Kim <namhyung@kernel.org>

Add a new test case to verify that the --children option works
correctly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-28-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Makefile.perf          |   1 +
 tools/perf/tests/builtin-test.c   |   4 +
 tools/perf/tests/hists_common.c   |   5 +-
 tools/perf/tests/hists_cumulate.c | 726 ++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h          |   1 +
 5 files changed, 735 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/tests/hists_cumulate.c
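
Note for readers: the expected "Children" and "Self" columns in the new
test follow from simple accounting over the ten fake samples.  Below is a
minimal sketch of that accounting, not the perf implementation.  It
assumes entries are keyed by (comm, dso, symbol) and that a function
appearing several times in one callchain is counted once for that sample.
All names in it (struct chain_sample, struct table, cumulate(),
table_find()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH   8
#define MAX_ENTRIES 32

struct frame { const char *dso, *sym; };

struct chain_sample {
	const char *comm;
	uint64_t period;
	size_t nr;                       /* number of valid frames */
	struct frame frames[MAX_DEPTH];  /* frames[0] is the sampled function */
};

struct entry {
	const char *comm, *dso, *sym;
	uint64_t self, children;
};

struct table {
	struct entry e[MAX_ENTRIES];
	size_t nr;
};

static int same_frame(const struct frame *a, const struct frame *b)
{
	return !strcmp(a->dso, b->dso) && !strcmp(a->sym, b->sym);
}

/* Linear lookup by (comm, dso, sym); returns NULL when absent. */
static struct entry *table_find(struct table *t, const char *comm,
				const char *dso, const char *sym)
{
	for (size_t i = 0; i < t->nr; i++) {
		struct entry *e = &t->e[i];

		if (!strcmp(e->comm, comm) && !strcmp(e->dso, dso) &&
		    !strcmp(e->sym, sym))
			return e;
	}
	return NULL;
}

/* Returns 0 on success, -1 if the fixed-size table overflows. */
static int cumulate(struct table *t, const struct chain_sample *s, size_t nr)
{
	for (size_t i = 0; i < nr; i++, s++) {
		for (size_t k = 0; k < s->nr; k++) {
			const struct frame *f = &s->frames[k];
			struct entry *e;
			size_t j;

			/* Count a function once per sample, even if it recurs. */
			for (j = 0; j < k && !same_frame(&s->frames[j], f); j++)
				;
			if (j < k)
				continue;

			e = table_find(t, s->comm, f->dso, f->sym);
			if (!e) {
				if (t->nr == MAX_ENTRIES)
					return -1;
				e = &t->e[t->nr++];
				*e = (struct entry){ s->comm, f->dso, f->sym, 0, 0 };
			}
			if (k == 0)
				e->self += s->period;   /* the sampled function */
			e->children += s->period;       /* it or one of its callees */
		}
	}
	return 0;
}

/* Shorthand for frames in each fake DSO. */
#define K(s) { "[kernel]", s }
#define P(s) { "perf", s }
#define B(s) { "bash", s }
#define C(s) { "libc", s }

/* The ten fake samples of hists_cumulate.c, each with period 1000. */
static const struct chain_sample demo[] = {
	{ "perf", 1000, 3, { K("schedule"), P("run_command"), P("main") } },
	{ "perf", 1000, 1, { P("main") } },
	{ "perf", 1000, 3, { P("cmd_record"), P("run_command"), P("main") } },
	{ "perf", 1000, 4, { C("malloc"), P("cmd_record"), P("run_command"), P("main") } },
	{ "perf", 1000, 4, { C("free"), P("cmd_record"), P("run_command"), P("main") } },
	{ "perf", 1000, 1, { P("main") } },
	{ "perf", 1000, 4, { K("page_fault"), K("sys_perf_event_open"), P("run_command"), P("main") } },
	{ "bash", 1000, 1, { B("main") } },
	{ "bash", 1000, 6, { B("xmalloc"), C("malloc"), B("xmalloc"), C("malloc"), B("xmalloc"), B("main") } },
	{ "bash", 1000, 3, { K("page_fault"), C("malloc"), B("main") } },
};
```

Running cumulate() over demo[] gives perf/main children 7000 and self
2000, the first row of the expected tables in test3 and test4.  The
recursive xmalloc/malloc chain contributes only once to each of its
functions, which is why bash/xmalloc stays at 1000 for both columns.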

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 02f0a4d..67f7c05 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -400,6 +400,7 @@ LIB_OBJS += $(OUTPUT)tests/hists_common.o
 LIB_OBJS += $(OUTPUT)tests/hists_link.o
 LIB_OBJS += $(OUTPUT)tests/hists_filter.o
 LIB_OBJS += $(OUTPUT)tests/hists_output.o
+LIB_OBJS += $(OUTPUT)tests/hists_cumulate.o
 LIB_OBJS += $(OUTPUT)tests/python-use.o
 LIB_OBJS += $(OUTPUT)tests/bp_signal.o
 LIB_OBJS += $(OUTPUT)tests/bp_signal_overflow.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 831f52c..802e3cd 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -140,6 +140,10 @@ static struct test {
 		.func = test__hists_output,
 	},
 	{
+		.desc = "Test cumulation of child hist entries",
+		.func = test__hists_cumulate,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index e4e120d..a62c091 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -196,10 +196,11 @@ void print_hists_out(struct hists *hists)
 		he = rb_entry(node, struct hist_entry, rb_node);
 
 		if (!he->filtered) {
-			pr_info("%2d: entry: %8s:%5d [%-8s] %20s: period = %"PRIu64"\n",
+			pr_info("%2d: entry: %8s:%5d [%-8s] %20s: period = %"PRIu64"/%"PRIu64"\n",
 				i, thread__comm_str(he->thread), he->thread->tid,
 				he->ms.map->dso->short_name,
-				he->ms.sym->name, he->stat.period);
+				he->ms.sym->name, he->stat.period,
+				he->stat_acc ? he->stat_acc->period : 0);
 		}
 
 		i++;
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
new file mode 100644
index 0000000..0ac240d
--- /dev/null
+++ b/tools/perf/tests/hists_cumulate.c
@@ -0,0 +1,726 @@
+#include "perf.h"
+#include "util/debug.h"
+#include "util/symbol.h"
+#include "util/sort.h"
+#include "util/evsel.h"
+#include "util/evlist.h"
+#include "util/machine.h"
+#include "util/thread.h"
+#include "util/parse-events.h"
+#include "tests/tests.h"
+#include "tests/hists_common.h"
+
+struct sample {
+	u32 pid;
+	u64 ip;
+	struct thread *thread;
+	struct map *map;
+	struct symbol *sym;
+};
+
+/* For the numbers, see hists_common.c */
+static struct sample fake_samples[] = {
+	/* perf [kernel] schedule() */
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_KERNEL_SCHEDULE, },
+	/* perf [perf]   main() */
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_PERF_MAIN, },
+	/* perf [perf]   cmd_record() */
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_PERF_CMD_RECORD, },
+	/* perf [libc]   malloc() */
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_LIBC_MALLOC, },
+	/* perf [libc]   free() */
+	{ .pid = FAKE_PID_PERF1, .ip = FAKE_IP_LIBC_FREE, },
+	/* perf [perf]   main() */
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_PERF_MAIN, },
+	/* perf [kernel] page_fault() */
+	{ .pid = FAKE_PID_PERF2, .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
+	/* bash [bash]   main() */
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_MAIN, },
+	/* bash [bash]   xmalloc() */
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_BASH_XMALLOC, },
+	/* bash [kernel] page_fault() */
+	{ .pid = FAKE_PID_BASH,  .ip = FAKE_IP_KERNEL_PAGE_FAULT, },
+};
+
+/*
+ * Will be cast to struct ip_callchain, which consists solely of 64-bit
+ * entries: nr followed by ips[].
+ */
+static u64 fake_callchains[][10] = {
+	/*   schedule => run_command => main */
+	{ 3, FAKE_IP_KERNEL_SCHEDULE, FAKE_IP_PERF_RUN_COMMAND, FAKE_IP_PERF_MAIN, },
+	/*   main  */
+	{ 1, FAKE_IP_PERF_MAIN, },
+	/*   cmd_record => run_command => main */
+	{ 3, FAKE_IP_PERF_CMD_RECORD, FAKE_IP_PERF_RUN_COMMAND, FAKE_IP_PERF_MAIN, },
+	/*   malloc => cmd_record => run_command => main */
+	{ 4, FAKE_IP_LIBC_MALLOC, FAKE_IP_PERF_CMD_RECORD, FAKE_IP_PERF_RUN_COMMAND,
+	     FAKE_IP_PERF_MAIN, },
+	/*   free => cmd_record => run_command => main */
+	{ 4, FAKE_IP_LIBC_FREE, FAKE_IP_PERF_CMD_RECORD, FAKE_IP_PERF_RUN_COMMAND,
+	     FAKE_IP_PERF_MAIN, },
+	/*   main */
+	{ 1, FAKE_IP_PERF_MAIN, },
+	/*   page_fault => sys_perf_event_open => run_command => main */
+	{ 4, FAKE_IP_KERNEL_PAGE_FAULT, FAKE_IP_KERNEL_SYS_PERF_EVENT_OPEN,
+	     FAKE_IP_PERF_RUN_COMMAND, FAKE_IP_PERF_MAIN, },
+	/*   main */
+	{ 1, FAKE_IP_BASH_MAIN, },
+	/*   xmalloc => malloc => xmalloc => malloc => xmalloc => main */
+	{ 6, FAKE_IP_BASH_XMALLOC, FAKE_IP_LIBC_MALLOC, FAKE_IP_BASH_XMALLOC,
+	     FAKE_IP_LIBC_MALLOC, FAKE_IP_BASH_XMALLOC, FAKE_IP_BASH_MAIN, },
+	/*   page_fault => malloc => main */
+	{ 3, FAKE_IP_KERNEL_PAGE_FAULT, FAKE_IP_LIBC_MALLOC, FAKE_IP_BASH_MAIN, },
+};
+
+static int add_hist_entries(struct hists *hists, struct machine *machine)
+{
+	struct addr_location al;
+	struct perf_evsel *evsel = hists_to_evsel(hists);
+	struct perf_sample sample = { .period = 1000, };
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(fake_samples); i++) {
+		const union perf_event event = {
+			.header = {
+				.misc = PERF_RECORD_MISC_USER,
+			},
+		};
+		struct hist_entry_iter iter = {
+			.hide_unresolved = false,
+		};
+
+		if (symbol_conf.cumulate_callchain)
+			iter.ops = &hist_iter_cumulative;
+		else
+			iter.ops = &hist_iter_normal;
+
+		sample.pid = fake_samples[i].pid;
+		sample.tid = fake_samples[i].pid;
+		sample.ip = fake_samples[i].ip;
+		sample.callchain = (struct ip_callchain *)fake_callchains[i];
+
+		if (perf_event__preprocess_sample(&event, machine, &al,
+						  &sample) < 0)
+			goto out;
+
+		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
+					 PERF_MAX_STACK_DEPTH, NULL) < 0)
+			goto out;
+
+		fake_samples[i].thread = al.thread;
+		fake_samples[i].map = al.map;
+		fake_samples[i].sym = al.sym;
+	}
+
+	return TEST_OK;
+
+out:
+	pr_debug("Not enough memory for adding a hist entry\n");
+	return TEST_FAIL;
+}
+
+static void del_hist_entries(struct hists *hists)
+{
+	struct hist_entry *he;
+	struct rb_root *root_in;
+	struct rb_root *root_out;
+	struct rb_node *node;
+
+	if (sort__need_collapse)
+		root_in = &hists->entries_collapsed;
+	else
+		root_in = hists->entries_in;
+
+	root_out = &hists->entries;
+
+	while (!RB_EMPTY_ROOT(root_out)) {
+		node = rb_first(root_out);
+
+		he = rb_entry(node, struct hist_entry, rb_node);
+		rb_erase(node, root_out);
+		rb_erase(&he->rb_node_in, root_in);
+		hist_entry__free(he);
+	}
+}
+
+typedef int (*test_fn_t)(struct perf_evsel *, struct machine *);
+
+#define COMM(he)  (thread__comm_str(he->thread))
+#define DSO(he)   (he->ms.map->dso->short_name)
+#define SYM(he)   (he->ms.sym->name)
+#define CPU(he)   (he->cpu)
+#define PID(he)   (he->thread->tid)
+#define DEPTH(he) (he->callchain->max_depth)
+#define CDSO(cl)  (cl->ms.map->dso->short_name)
+#define CSYM(cl)  (cl->ms.sym->name)
+
+struct result {
+	u64 children;
+	u64 self;
+	const char *comm;
+	const char *dso;
+	const char *sym;
+};
+
+struct callchain_result {
+	u64 nr;
+	struct {
+		const char *dso;
+		const char *sym;
+	} node[10];
+};
+
+static int do_test(struct hists *hists, struct result *expected, size_t nr_expected,
+		   struct callchain_result *expected_callchain, size_t nr_callchain)
+{
+	char buf[32];
+	size_t i, c;
+	struct hist_entry *he;
+	struct rb_root *root;
+	struct rb_node *node;
+	struct callchain_node *cnode;
+	struct callchain_list *clist;
+
+	/*
+	 * adding and deleting hist entries must be done outside of this
+	 * function since TEST_ASSERT_VAL() returns in case of failure.
+	 */
+	hists__collapse_resort(hists, NULL);
+	hists__output_resort(hists);
+
+	if (verbose > 2) {
+		pr_info("use callchain: %d, cumulate callchain: %d\n",
+			symbol_conf.use_callchain,
+			symbol_conf.cumulate_callchain);
+		print_hists_out(hists);
+	}
+
+	root = &hists->entries;
+	for (node = rb_first(root), i = 0;
+	     node && (he = rb_entry(node, struct hist_entry, rb_node));
+	     node = rb_next(node), i++) {
+		scnprintf(buf, sizeof(buf), "Invalid hist entry #%zd", i);
+
+		TEST_ASSERT_VAL("Incorrect number of hist entry",
+				i < nr_expected);
+		TEST_ASSERT_VAL(buf, he->stat.period == expected[i].self &&
+				!strcmp(COMM(he), expected[i].comm) &&
+				!strcmp(DSO(he), expected[i].dso) &&
+				!strcmp(SYM(he), expected[i].sym));
+
+		if (symbol_conf.cumulate_callchain)
+			TEST_ASSERT_VAL(buf, he->stat_acc->period == expected[i].children);
+
+		if (!symbol_conf.use_callchain)
+			continue;
+
+		/* check callchain entries */
+		root = &he->callchain->node.rb_root;
+		cnode = rb_entry(rb_first(root), struct callchain_node, rb_node);
+
+		c = 0;
+		list_for_each_entry(clist, &cnode->val, list) {
+			scnprintf(buf, sizeof(buf), "Invalid callchain entry #%zd/%zd", i, c);
+
+			TEST_ASSERT_VAL("Incorrect number of callchain entry",
+					c < expected_callchain[i].nr);
+			TEST_ASSERT_VAL(buf,
+				!strcmp(CDSO(clist), expected_callchain[i].node[c].dso) &&
+				!strcmp(CSYM(clist), expected_callchain[i].node[c].sym));
+			c++;
+		}
+		/* TODO: handle multiple child nodes properly */
+		TEST_ASSERT_VAL("Incorrect number of callchain entry",
+				c <= expected_callchain[i].nr);
+	}
+	TEST_ASSERT_VAL("Incorrect number of hist entry",
+			i == nr_expected);
+	TEST_ASSERT_VAL("Incorrect number of callchain entry",
+			!symbol_conf.use_callchain || nr_expected == nr_callchain);
+	return 0;
+}
+
+/* NO callchain + NO children */
+static int test1(struct perf_evsel *evsel, struct machine *machine)
+{
+	int err;
+	struct hists *hists = &evsel->hists;
+	/*
+	 * expected output:
+	 *
+	 * Overhead  Command  Shared Object          Symbol
+	 * ========  =======  =============  ==============
+	 *   20.00%     perf  perf           [.] main
+	 *   10.00%     bash  [kernel]       [k] page_fault
+	 *   10.00%     bash  bash           [.] main
+	 *   10.00%     bash  bash           [.] xmalloc
+	 *   10.00%     perf  [kernel]       [k] page_fault
+	 *   10.00%     perf  [kernel]       [k] schedule
+	 *   10.00%     perf  libc           [.] free
+	 *   10.00%     perf  libc           [.] malloc
+	 *   10.00%     perf  perf           [.] cmd_record
+	 */
+	struct result expected[] = {
+		{ 0, 2000, "perf", "perf",     "main" },
+		{ 0, 1000, "bash", "[kernel]", "page_fault" },
+		{ 0, 1000, "bash", "bash",     "main" },
+		{ 0, 1000, "bash", "bash",     "xmalloc" },
+		{ 0, 1000, "perf", "[kernel]", "page_fault" },
+		{ 0, 1000, "perf", "[kernel]", "schedule" },
+		{ 0, 1000, "perf", "libc",     "free" },
+		{ 0, 1000, "perf", "libc",     "malloc" },
+		{ 0, 1000, "perf", "perf",     "cmd_record" },
+	};
+
+	symbol_conf.use_callchain = false;
+	symbol_conf.cumulate_callchain = false;
+
+	setup_sorting();
+	callchain_register_param(&callchain_param);
+
+	err = add_hist_entries(hists, machine);
+	if (err < 0)
+		goto out;
+
+	err = do_test(hists, expected, ARRAY_SIZE(expected), NULL, 0);
+
+out:
+	del_hist_entries(hists);
+	reset_output_field();
+	return err;
+}
+
+/* callchain + NO children */
+static int test2(struct perf_evsel *evsel, struct machine *machine)
+{
+	int err;
+	struct hists *hists = &evsel->hists;
+	/*
+	 * expected output:
+	 *
+	 * Overhead  Command  Shared Object          Symbol
+	 * ========  =======  =============  ==============
+	 *   20.00%     perf  perf           [.] main
+	 *              |
+	 *              --- main
+	 *
+	 *   10.00%     bash  [kernel]       [k] page_fault
+	 *              |
+	 *              --- page_fault
+	 *                  malloc
+	 *                  main
+	 *
+	 *   10.00%     bash  bash           [.] main
+	 *              |
+	 *              --- main
+	 *
+	 *   10.00%     bash  bash           [.] xmalloc
+	 *              |
+	 *              --- xmalloc
+	 *                  malloc
+	 *                  xmalloc     <--- NOTE: there's a cycle
+	 *                  malloc
+	 *                  xmalloc
+	 *                  main
+	 *
+	 *   10.00%     perf  [kernel]       [k] page_fault
+	 *              |
+	 *              --- page_fault
+	 *                  sys_perf_event_open
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%     perf  [kernel]       [k] schedule
+	 *              |
+	 *              --- schedule
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%     perf  libc           [.] free
+	 *              |
+	 *              --- free
+	 *                  cmd_record
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%     perf  libc           [.] malloc
+	 *              |
+	 *              --- malloc
+	 *                  cmd_record
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%     perf  perf           [.] cmd_record
+	 *              |
+	 *              --- cmd_record
+	 *                  run_command
+	 *                  main
+	 *
+	 */
+	struct result expected[] = {
+		{ 0, 2000, "perf", "perf",     "main" },
+		{ 0, 1000, "bash", "[kernel]", "page_fault" },
+		{ 0, 1000, "bash", "bash",     "main" },
+		{ 0, 1000, "bash", "bash",     "xmalloc" },
+		{ 0, 1000, "perf", "[kernel]", "page_fault" },
+		{ 0, 1000, "perf", "[kernel]", "schedule" },
+		{ 0, 1000, "perf", "libc",     "free" },
+		{ 0, 1000, "perf", "libc",     "malloc" },
+		{ 0, 1000, "perf", "perf",     "cmd_record" },
+	};
+	struct callchain_result expected_callchain[] = {
+		{
+			1, {	{ "perf",     "main" }, },
+		},
+		{
+			3, {	{ "[kernel]", "page_fault" },
+				{ "libc",     "malloc" },
+				{ "bash",     "main" }, },
+		},
+		{
+			1, {	{ "bash",     "main" }, },
+		},
+		{
+			6, {	{ "bash",     "xmalloc" },
+				{ "libc",     "malloc" },
+				{ "bash",     "xmalloc" },
+				{ "libc",     "malloc" },
+				{ "bash",     "xmalloc" },
+				{ "bash",     "main" }, },
+		},
+		{
+			4, {	{ "[kernel]", "page_fault" },
+				{ "[kernel]", "sys_perf_event_open" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			3, {	{ "[kernel]", "schedule" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			4, {	{ "libc",     "free" },
+				{ "perf",     "cmd_record" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			4, {	{ "libc",     "malloc" },
+				{ "perf",     "cmd_record" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			3, {	{ "perf",     "cmd_record" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+	};
+
+	symbol_conf.use_callchain = true;
+	symbol_conf.cumulate_callchain = false;
+
+	setup_sorting();
+	callchain_register_param(&callchain_param);
+
+	err = add_hist_entries(hists, machine);
+	if (err < 0)
+		goto out;
+
+	err = do_test(hists, expected, ARRAY_SIZE(expected),
+		      expected_callchain, ARRAY_SIZE(expected_callchain));
+
+out:
+	del_hist_entries(hists);
+	reset_output_field();
+	return err;
+}
+
+/* NO callchain + children */
+static int test3(struct perf_evsel *evsel, struct machine *machine)
+{
+	int err;
+	struct hists *hists = &evsel->hists;
+	/*
+	 * expected output:
+	 *
+	 * Children      Self  Command  Shared Object                   Symbol
+	 * ========  ========  =======  =============  =======================
+	 *   70.00%    20.00%     perf  perf           [.] main
+	 *   50.00%     0.00%     perf  perf           [.] run_command
+	 *   30.00%    10.00%     bash  bash           [.] main
+	 *   30.00%    10.00%     perf  perf           [.] cmd_record
+	 *   20.00%     0.00%     bash  libc           [.] malloc
+	 *   10.00%    10.00%     bash  [kernel]       [k] page_fault
+	 *   10.00%    10.00%     perf  [kernel]       [k] schedule
+	 *   10.00%     0.00%     perf  [kernel]       [k] sys_perf_event_open
+	 *   10.00%    10.00%     perf  [kernel]       [k] page_fault
+	 *   10.00%    10.00%     perf  libc           [.] free
+	 *   10.00%    10.00%     perf  libc           [.] malloc
+	 *   10.00%    10.00%     bash  bash           [.] xmalloc
+	 */
+	struct result expected[] = {
+		{ 7000, 2000, "perf", "perf",     "main" },
+		{ 5000,    0, "perf", "perf",     "run_command" },
+		{ 3000, 1000, "bash", "bash",     "main" },
+		{ 3000, 1000, "perf", "perf",     "cmd_record" },
+		{ 2000,    0, "bash", "libc",     "malloc" },
+		{ 1000, 1000, "bash", "[kernel]", "page_fault" },
+		{ 1000, 1000, "perf", "[kernel]", "schedule" },
+		{ 1000,    0, "perf", "[kernel]", "sys_perf_event_open" },
+		{ 1000, 1000, "perf", "[kernel]", "page_fault" },
+		{ 1000, 1000, "perf", "libc",     "free" },
+		{ 1000, 1000, "perf", "libc",     "malloc" },
+		{ 1000, 1000, "bash", "bash",     "xmalloc" },
+	};
+
+	symbol_conf.use_callchain = false;
+	symbol_conf.cumulate_callchain = true;
+
+	setup_sorting();
+	callchain_register_param(&callchain_param);
+
+	err = add_hist_entries(hists, machine);
+	if (err < 0)
+		goto out;
+
+	err = do_test(hists, expected, ARRAY_SIZE(expected), NULL, 0);
+
+out:
+	del_hist_entries(hists);
+	reset_output_field();
+	return err;
+}
+
+/* callchain + children */
+static int test4(struct perf_evsel *evsel, struct machine *machine)
+{
+	int err;
+	struct hists *hists = &evsel->hists;
+	/*
+	 * expected output:
+	 *
+	 * Children      Self  Command  Shared Object                   Symbol
+	 * ========  ========  =======  =============  =======================
+	 *   70.00%    20.00%     perf  perf           [.] main
+	 *              |
+	 *              --- main
+	 *
+	 *   50.00%     0.00%     perf  perf           [.] run_command
+	 *              |
+	 *              --- run_command
+	 *                  main
+	 *
+	 *   30.00%    10.00%     bash  bash           [.] main
+	 *              |
+	 *              --- main
+	 *
+	 *   30.00%    10.00%     perf  perf           [.] cmd_record
+	 *              |
+	 *              --- cmd_record
+	 *                  run_command
+	 *                  main
+	 *
+	 *   20.00%     0.00%     bash  libc           [.] malloc
+	 *              |
+	 *              --- malloc
+	 *                 |
+	 *                 |--50.00%-- xmalloc
+	 *                 |           main
+	 *                  --50.00%-- main
+	 *
+	 *   10.00%    10.00%     bash  [kernel]       [k] page_fault
+	 *              |
+	 *              --- page_fault
+	 *                  malloc
+	 *                  main
+	 *
+	 *   10.00%    10.00%     perf  [kernel]       [k] schedule
+	 *              |
+	 *              --- schedule
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%     0.00%     perf  [kernel]       [k] sys_perf_event_open
+	 *              |
+	 *              --- sys_perf_event_open
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%    10.00%     perf  [kernel]       [k] page_fault
+	 *              |
+	 *              --- page_fault
+	 *                  sys_perf_event_open
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%    10.00%     perf  libc           [.] free
+	 *              |
+	 *              --- free
+	 *                  cmd_record
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%    10.00%     perf  libc           [.] malloc
+	 *              |
+	 *              --- malloc
+	 *                  cmd_record
+	 *                  run_command
+	 *                  main
+	 *
+	 *   10.00%    10.00%     bash  bash           [.] xmalloc
+	 *              |
+	 *              --- xmalloc
+	 *                  malloc
+	 *                  xmalloc     <--- NOTE: there's a cycle
+	 *                  malloc
+	 *                  xmalloc
+	 *                  main
+	 *
+	 */
+	struct result expected[] = {
+		{ 7000, 2000, "perf", "perf",     "main" },
+		{ 5000,    0, "perf", "perf",     "run_command" },
+		{ 3000, 1000, "bash", "bash",     "main" },
+		{ 3000, 1000, "perf", "perf",     "cmd_record" },
+		{ 2000,    0, "bash", "libc",     "malloc" },
+		{ 1000, 1000, "bash", "[kernel]", "page_fault" },
+		{ 1000, 1000, "perf", "[kernel]", "schedule" },
+		{ 1000,    0, "perf", "[kernel]", "sys_perf_event_open" },
+		{ 1000, 1000, "perf", "[kernel]", "page_fault" },
+		{ 1000, 1000, "perf", "libc",     "free" },
+		{ 1000, 1000, "perf", "libc",     "malloc" },
+		{ 1000, 1000, "bash", "bash",     "xmalloc" },
+	};
+	struct callchain_result expected_callchain[] = {
+		{
+			1, {	{ "perf",     "main" }, },
+		},
+		{
+			2, {	{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			1, {	{ "bash",     "main" }, },
+		},
+		{
+			3, {	{ "perf",     "cmd_record" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			4, {	{ "libc",     "malloc" },
+				{ "bash",     "xmalloc" },
+				{ "bash",     "main" },
+				{ "bash",     "main" }, },
+		},
+		{
+			3, {	{ "[kernel]", "page_fault" },
+				{ "libc",     "malloc" },
+				{ "bash",     "main" }, },
+		},
+		{
+			3, {	{ "[kernel]", "schedule" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			3, {	{ "[kernel]", "sys_perf_event_open" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			4, {	{ "[kernel]", "page_fault" },
+				{ "[kernel]", "sys_perf_event_open" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			4, {	{ "libc",     "free" },
+				{ "perf",     "cmd_record" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			4, {	{ "libc",     "malloc" },
+				{ "perf",     "cmd_record" },
+				{ "perf",     "run_command" },
+				{ "perf",     "main" }, },
+		},
+		{
+			6, {	{ "bash",     "xmalloc" },
+				{ "libc",     "malloc" },
+				{ "bash",     "xmalloc" },
+				{ "libc",     "malloc" },
+				{ "bash",     "xmalloc" },
+				{ "bash",     "main" }, },
+		},
+	};
+
+	symbol_conf.use_callchain = true;
+	symbol_conf.cumulate_callchain = true;
+
+	setup_sorting();
+	callchain_register_param(&callchain_param);
+
+	err = add_hist_entries(hists, machine);
+	if (err < 0)
+		goto out;
+
+	err = do_test(hists, expected, ARRAY_SIZE(expected),
+		      expected_callchain, ARRAY_SIZE(expected_callchain));
+
+out:
+	del_hist_entries(hists);
+	reset_output_field();
+	return err;
+}
+
+int test__hists_cumulate(void)
+{
+	int err = TEST_FAIL;
+	struct machines machines;
+	struct machine *machine;
+	struct perf_evsel *evsel;
+	struct perf_evlist *evlist = perf_evlist__new();
+	size_t i;
+	test_fn_t testcases[] = {
+		test1,
+		test2,
+		test3,
+		test4,
+	};
+
+	TEST_ASSERT_VAL("No memory", evlist);
+
+	err = parse_events(evlist, "cpu-clock");
+	if (err)
+		goto out;
+
+	machines__init(&machines);
+
+	/* setup threads/dso/map/symbols also */
+	machine = setup_fake_machine(&machines);
+	if (!machine)
+		goto out;
+
+	if (verbose > 1)
+		machine__fprintf(machine, stderr);
+
+	evsel = perf_evlist__first(evlist);
+
+	for (i = 0; i < ARRAY_SIZE(testcases); i++) {
+		err = testcases[i](evsel, machine);
+		if (err < 0)
+			break;
+	}
+
+out:
+	/* tear down everything */
+	perf_evlist__delete(evlist);
+	machines__exit(&machines);
+
+	return err;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index d76c0e2..022bb68 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -45,6 +45,7 @@ int test__hists_filter(void);
 int test__mmap_thread_lookup(void);
 int test__thread_mg_share(void);
 int test__hists_output(void);
+int test__hists_cumulate(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
1.8.3.1
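
For readers following the expected[] table in the test above, here is a
minimal standalone sketch of how the Children and Self periods could be
derived from callchains. It is not the perf implementation: struct frame,
struct entry, struct table, table_get() and table_add_sample() are
hypothetical names, and the fixed-size linear table only keeps the example
short. The sketch assumes each sample carries a chain whose first frame is
the sampled function, and that a frame repeated in one chain (the xmalloc
cycle noted in the comment) is credited only once per sample.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 32

/* Hypothetical, simplified stand-ins; these are not the perf structures. */
struct frame { const char *comm, *dso, *sym; };

struct entry {
	struct frame key;
	unsigned long children;	/* samples with key anywhere in the chain */
	unsigned long self;	/* samples with key as the sampled function */
};

struct table { struct entry e[MAX_ENTRIES]; size_t nr; };

static int same_frame(const struct frame *a, const struct frame *b)
{
	return !strcmp(a->comm, b->comm) && !strcmp(a->dso, b->dso) &&
	       !strcmp(a->sym, b->sym);
}

/* Return the entry for key, creating a zeroed one on first use. */
static struct entry *table_get(struct table *t, const struct frame *key)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (same_frame(&t->e[i].key, key))
			return &t->e[i];

	assert(t->nr < MAX_ENTRIES);
	t->e[t->nr].key = *key;
	t->e[t->nr].children = 0;
	t->e[t->nr].self = 0;
	return &t->e[t->nr++];
}

/*
 * Account one sample. chain[0] is the sampled function, followed by its
 * callers. A frame that repeats within the chain is credited only once.
 */
static void table_add_sample(struct table *t, const struct frame *chain,
			     size_t depth, unsigned long period)
{
	size_t i, j;

	for (i = 0; i < depth; i++) {
		for (j = 0; j < i; j++)
			if (same_frame(&chain[j], &chain[i]))
				break;
		if (j < i)
			continue;	/* already credited for this sample */

		struct entry *he = table_get(t, &chain[i]);

		he->children += period;
		if (i == 0)
			he->self += period;
	}
}
```

Feeding the three bash chains from the comment (main; page_fault, malloc,
main; and the six-frame xmalloc cycle) with a period of 1000 each gives
bash main 3000/1000, libc malloc 2000/0 and bash xmalloc 1000/1000, which
matches the corresponding rows of expected[].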



* Re: [GIT PULL 00/27] perf/core improvements and fixes
  2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
                   ` (26 preceding siblings ...)
  2014-06-01 13:31 ` [PATCH 27/27] perf tests: Add a test case for cumulating callchains Jiri Olsa
@ 2014-06-03 18:23 ` Ingo Molnar
  27 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2014-06-03 18:23 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo, Arun Sharma,
	David Ahern, Don Zickus, Frederic Weisbecker, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Rodrigo Campos, Stephane Eranian


* Jiri Olsa <jolsa@kernel.org> wrote:

> hi Ingo,
> please consider pulling
> 
> thanks,
> jirka
> 
> 
> The following changes since commit e450f90e8c7d0bf70519223c1b848446ae63f313:
> 
>   Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-05-22 11:37:40 +0200)
> 
> are available in the git repository at:
> 
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo
> 
> for you to fetch changes up to 0506aecce999d4370b979892f88cf1118cfe8dcb:
> 
>   perf tests: Add a test case for cumulating callchains (2014-06-01 14:35:11 +0200)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> . Add support to accumulate hist periods (Namhyung Kim)
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> 
> ----------------------------------------------------------------
> Namhyung Kim (27):
>       perf tools: Introduce hists__inc_nr_samples()
>       perf tools: Introduce struct hist_entry_iter
>       perf hists: Add support for accumulated stat of hist entry
>       perf hists: Check if accumulated when adding a hist entry
>       perf hists: Accumulate hist entry stat based on the callchain
>       perf tools: Update cpumode for each cumulative entry
>       perf report: Cache cumulative callchains
>       perf callchain: Add callchain_cursor_snapshot()
>       perf tools: Save callchain info for each cumulative entry
>       perf ui/hist: Add support to accumulated hist stat
>       perf ui/browser: Add support to accumulated hist stat
>       perf ui/gtk: Add support to accumulated hist stat
>       perf tools: Apply percent-limit to cumulative percentage
>       perf tools: Add more hpp helper functions
>       perf report: Add --children option
>       perf report: Add report.children config option
>       perf tools: Do not auto-remove Children column if --fields given
>       perf tools: Add callback function to hist_entry_iter
>       perf top: Convert to hist_entry_iter
>       perf top: Add --children option
>       perf top: Add top.children config option
>       perf tools: Enable --children option by default
>       perf ui/stdio: Fix invalid percentage value of cumulated hist entries
>       perf ui/gtk: Fix callchain display
>       perf tools: Reset output/sort order to default
>       perf tests: Define and use symbolic names for fake symbols
>       perf tests: Add a test case for cumulating callchains
> 
>  tools/perf/Documentation/perf-report.txt |   7 +-
>  tools/perf/Documentation/perf-top.txt    |   8 +-
>  tools/perf/Makefile.perf                 |   1 +
>  tools/perf/builtin-annotate.c            |   5 +-
>  tools/perf/builtin-diff.c                |   2 +-
>  tools/perf/builtin-report.c              | 210 +++------
>  tools/perf/builtin-sched.c               |   2 +-
>  tools/perf/builtin-top.c                 |  90 ++--
>  tools/perf/tests/builtin-test.c          |   4 +
>  tools/perf/tests/hists_common.c          |  52 ++-
>  tools/perf/tests/hists_common.h          |  32 +-
>  tools/perf/tests/hists_cumulate.c        | 726 +++++++++++++++++++++++++++++++
>  tools/perf/tests/hists_filter.c          |  39 +-
>  tools/perf/tests/hists_link.c            |  36 +-
>  tools/perf/tests/hists_output.c          |  31 +-
>  tools/perf/tests/tests.h                 |   1 +
>  tools/perf/ui/browsers/hists.c           |  65 +--
>  tools/perf/ui/gtk/hists.c                |  33 +-
>  tools/perf/ui/hist.c                     | 119 +++++
>  tools/perf/ui/stdio/hist.c               |   8 +-
>  tools/perf/util/callchain.c              |  45 +-
>  tools/perf/util/callchain.h              |  11 +
>  tools/perf/util/hist.c                   | 481 +++++++++++++++++++-
>  tools/perf/util/hist.h                   |  49 ++-
>  tools/perf/util/sort.c                   |   4 +
>  tools/perf/util/sort.h                   |  18 +-
>  tools/perf/util/symbol.c                 |  11 +-
>  tools/perf/util/symbol.h                 |   1 +
>  28 files changed, 1768 insertions(+), 323 deletions(-)
>  create mode 100644 tools/perf/tests/hists_cumulate.c

Pulled, thanks a lot Jiri!

	Ingo


* [GIT PULL 00/27] perf/core improvements and fixes
@ 2014-07-25 15:36 Arnaldo Carvalho de Melo
  2014-07-28  8:10 ` Ingo Molnar
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-07-25 15:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen,
	Corey Ashford, David Ahern, Don Zickus, Frederic Weisbecker,
	Jean Pihet, Jiri Olsa, Michael Ellerman, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Sukadev Bhattiprolu, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 2336ebc32676df5b794acfe0c980583ec6c05f34:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-07-18 12:19:20 +0200)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to dcabb507fd3a2b19aed6b4068e2a954f5fd8de45:

  perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds (2014-07-25 12:17:36 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

Developer stuff:

o More prep work to support Intel PT: (Adrian Hunter)
  - Polishing 'script' BTS output
  - 'inject' can specify --kallsym
  - VDSO is per machine, not a global var
  - Expose data addr lookup functions previously private to 'script'
  - Large mmap fixes in events processing

o Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo)

o Event ordering fixes (Jiri Olsa)

o Include standard stringify macros in powerpc code (Sukadev Bhattiprolu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (22):
      perf tools: Fix incorrect fd error comparison
      perf tools: Fix jump label always changing during tracing
      perf script: Improve srcline display for BTS
      perf script: Do not print dangling '=>' for BTS
      perf tools: Record whether a dso has data
      perf tools: Add dso__data_status_seen()
      perf tools: Add dsos__hit_all()
      perf tools: Add cpu to struct thread
      perf machine: Add ability to record the current tid for each cpu
      perf tools: Move rdtsc() function
      perf tools: Add dso__data_size()
      perf tools: Pass machine to vdso__dso_findnew()
      perf session: Add ability to 'skip' a non-piped event stream
      perf session: Add ability to skip 4GiB or more
      perf tools: Group VDSO global variables into a structure
      perf machine: Fix the lifetime of the VDSO temporary file
      perf tools: Add vdso__new()
      perf tools: Separate the VDSO map name from the VDSO dso name
      perf tools: Add dso__type()
      perf tools: Add thread parameter to vdso__dso_findnew()
      perf tools: Expose 'addr' functions so they can be reused
      perf inject: Add --kallsyms parameter

Arnaldo Carvalho de Melo (1):
      perf tools: Fix build on gcc 4.4.7

Jiri Olsa (3):
      perf session: Fix accounting of ordered samples queue
      perf record: Always force PERF_RECORD_FINISHED_ROUND event
      perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds

Sukadev Bhattiprolu (1):
      perf powerpc: Include util/util.h and remove stringify macros

 tools/perf/Documentation/perf-inject.txt |  3 +
 tools/perf/arch/powerpc/util/header.c    |  4 +-
 tools/perf/arch/x86/util/tsc.c           |  9 +++
 tools/perf/builtin-inject.c              |  2 +
 tools/perf/builtin-record.c              |  7 ++-
 tools/perf/builtin-script.c              | 60 +++++++-------------
 tools/perf/builtin-trace.c               |  2 +-
 tools/perf/tests/perf-time-to-tsc.c      |  9 ---
 tools/perf/util/cloexec.c                |  9 ++-
 tools/perf/util/dso.c                    | 70 ++++++++++++++++++++---
 tools/perf/util/dso.h                    | 25 +++++++++
 tools/perf/util/event.c                  | 42 ++++++++++++++
 tools/perf/util/event.h                  | 10 ++++
 tools/perf/util/header.c                 | 51 +++++++++++++++--
 tools/perf/util/header.h                 |  2 +
 tools/perf/util/machine.c                | 58 +++++++++++++++++--
 tools/perf/util/machine.h                |  8 +++
 tools/perf/util/map.c                    |  9 +--
 tools/perf/util/map.h                    |  5 +-
 tools/perf/util/session.c                | 25 +++++----
 tools/perf/util/symbol-elf.c             | 35 +++++++++++-
 tools/perf/util/symbol-minimal.c         | 21 +++++++
 tools/perf/util/symbol.h                 |  2 +
 tools/perf/util/thread.c                 |  1 +
 tools/perf/util/thread.h                 |  1 +
 tools/perf/util/tsc.c                    |  5 ++
 tools/perf/util/tsc.h                    |  1 +
 tools/perf/util/vdso.c                   | 96 +++++++++++++++++++++++++-------
 tools/perf/util/vdso.h                   | 13 ++++-
 29 files changed, 470 insertions(+), 115 deletions(-)


* Re: [GIT PULL 00/27] perf/core improvements and fixes
  2014-07-25 15:36 Arnaldo Carvalho de Melo
@ 2014-07-28  8:10 ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2014-07-28  8:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Jean Pihet,
	Jiri Olsa, Michael Ellerman, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Sukadev Bhattiprolu, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit 2336ebc32676df5b794acfe0c980583ec6c05f34:
> 
>   Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-07-18 12:19:20 +0200)
> 
> are available in the git repository at:
> 
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
> 
> for you to fetch changes up to dcabb507fd3a2b19aed6b4068e2a954f5fd8de45:
> 
>   perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds (2014-07-25 12:17:36 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> Developer stuff:
> 
> o More prep work to support Intel PT: (Adrian Hunter)
>   - Polishing 'script' BTS output
>   - 'inject' can specify --kallsyms
>   - VDSO is per machine, not a global var
>   - Expose data addr lookup functions previously private to 'script'
>   - Large mmap fixes in events processing
> 
> o Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo)
> 
> o Event ordering fixes (Jiri Olsa)
> 
> o Include standard stringify macros in powerpc code (Sukadev Bhattiprolu)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (22):
>       perf tools: Fix incorrect fd error comparison
>       perf tools: Fix jump label always changing during tracing
>       perf script: Improve srcline display for BTS
>       perf script: Do not print dangling '=>' for BTS
>       perf tools: Record whether a dso has data
>       perf tools: Add dso__data_status_seen()
>       perf tools: Add dsos__hit_all()
>       perf tools: Add cpu to struct thread
>       perf machine: Add ability to record the current tid for each cpu
>       perf tools: Move rdtsc() function
>       perf tools: Add dso__data_size()
>       perf tools: Pass machine to vdso__dso_findnew()
>       perf session: Add ability to 'skip' a non-piped event stream
>       perf session: Add ability to skip 4GiB or more
>       perf tools: Group VDSO global variables into a structure
>       perf machine: Fix the lifetime of the VDSO temporary file
>       perf tools: Add vdso__new()
>       perf tools: Separate the VDSO map name from the VDSO dso name
>       perf tools: Add dso__type()
>       perf tools: Add thread parameter to vdso__dso_findnew()
>       perf tools: Expose 'addr' functions so they can be reused
>       perf inject: Add --kallsyms parameter
> 
> Arnaldo Carvalho de Melo (1):
>       perf tools: Fix build on gcc 4.4.7
> 
> Jiri Olsa (3):
>       perf session: Fix accounting of ordered samples queue
>       perf record: Always force PERF_RECORD_FINISHED_ROUND event
>       perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds
> 
> Sukadev Bhattiprolu (1):
>       perf powerpc: Include util/util.h and remove stringify macros
> 
>  tools/perf/Documentation/perf-inject.txt |  3 +
>  tools/perf/arch/powerpc/util/header.c    |  4 +-
>  tools/perf/arch/x86/util/tsc.c           |  9 +++
>  tools/perf/builtin-inject.c              |  2 +
>  tools/perf/builtin-record.c              |  7 ++-
>  tools/perf/builtin-script.c              | 60 +++++++-------------
>  tools/perf/builtin-trace.c               |  2 +-
>  tools/perf/tests/perf-time-to-tsc.c      |  9 ---
>  tools/perf/util/cloexec.c                |  9 ++-
>  tools/perf/util/dso.c                    | 70 ++++++++++++++++++++---
>  tools/perf/util/dso.h                    | 25 +++++++++
>  tools/perf/util/event.c                  | 42 ++++++++++++++
>  tools/perf/util/event.h                  | 10 ++++
>  tools/perf/util/header.c                 | 51 +++++++++++++++--
>  tools/perf/util/header.h                 |  2 +
>  tools/perf/util/machine.c                | 58 +++++++++++++++++--
>  tools/perf/util/machine.h                |  8 +++
>  tools/perf/util/map.c                    |  9 +--
>  tools/perf/util/map.h                    |  5 +-
>  tools/perf/util/session.c                | 25 +++++----
>  tools/perf/util/symbol-elf.c             | 35 +++++++++++-
>  tools/perf/util/symbol-minimal.c         | 21 +++++++
>  tools/perf/util/symbol.h                 |  2 +
>  tools/perf/util/thread.c                 |  1 +
>  tools/perf/util/thread.h                 |  1 +
>  tools/perf/util/tsc.c                    |  5 ++
>  tools/perf/util/tsc.h                    |  1 +
>  tools/perf/util/vdso.c                   | 96 +++++++++++++++++++++++++-------
>  tools/perf/util/vdso.h                   | 13 ++++-
>  29 files changed, 470 insertions(+), 115 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo


* [GIT PULL 00/27] perf/core improvements and fixes
@ 2016-06-23 21:23 Arnaldo Carvalho de Melo
  2016-06-26 10:43 ` Ingo Molnar
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-06-23 21:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, Brendan Gregg, David Ahern,
	Ekaterina Tumanova, He Kuang, Jiri Olsa, Josh Poimboeuf,
	Kan Liang, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
	Paolo Bonzini, Pekka Enberg, Peter Zijlstra, Stephane Eranian,
	Sukadev Bhattiprolu, Taeung Song, Wang Nan,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 4330b439bbe16b48dd2fe9a379bd58a07b97aab8:

  Merge tag 'perf-core-for-mingo-20160621' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-06-22 09:34:19 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160623

for you to fetch changes up to 4a35b3497c413de8b409f9d75700eeb4772b21b8:

  perf config: Reimplement show_config() using config_set__for_each (2016-06-23 17:23:00 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Add 'callindent' option to 'perf script -F', to indent the Intel PT
  call stack, making this output more ftrace-like (Adrian Hunter, Andi Kleen)

User visible:

- Enlarge 'pid' column width, to cope with large pids (Jiri Olsa)

Infrastructure:

- Cross platform unwind fixes (He Kuang)

- Make destructors accept NULL, behaving like free() (Arnaldo Carvalho de Melo)

- Remove reference to the perl interpreter in the recently added 'perf script'
  stackcollapse python script (Arnaldo Carvalho de Melo)

- Rename CLASS__for_each() macros to CLASS__for_each_entry(), to use the
  list_for_each_entry() semantics, as most of these class specific loop helpers
  are list_for_each_entry*() wrappers (Arnaldo Carvalho de Melo)

- Expose the hist_browser code, which will be used with data structures other
  than perf_evsel (Jiri Olsa)

- 'perf config' refactorings (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
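
As an illustration of the "destructors accept NULL" item in the summary
above, here is a minimal sketch of the pattern. struct widget,
widget__new() and widget__delete() are hypothetical names chosen for the
example and are not taken from the perf sources; only the free()-like
behaviour of the destructor is what the summary describes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object, standing in for something like an evlist. */
struct widget {
	char *name;
};

/* Allocate a widget with a private copy of name; NULL on failure. */
static struct widget *widget__new(const char *name)
{
	struct widget *w;
	size_t len;

	assert(name != NULL);

	w = calloc(1, sizeof(*w));
	if (w == NULL)
		return NULL;

	len = strlen(name) + 1;
	w->name = malloc(len);
	if (w->name == NULL) {
		free(w);
		return NULL;
	}
	memcpy(w->name, name, len);
	return w;
}

/* Like free(), deleting a NULL pointer is a no-op. */
static void widget__delete(struct widget *w)
{
	if (w == NULL)
		return;

	free(w->name);
	free(w);
}
```

With this shape, an error path can call widget__delete() unconditionally
on a pointer that may never have been allocated, which is the convenience
the free()-like behaviour is meant to provide.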

----------------------------------------------------------------
Adrian Hunter (3):
      perf script: Print sample flags more nicely
      perf auxtrace: Add option to feed branches to the thread stack
      perf script: Add callindent option

Arnaldo Carvalho de Melo (9):
      perf script stackcollapse: Remove reference to the perl interpreter
      perf evlist: Destructors should accept NULL
      perf session: Destructors should accept NULL
      perf tests time-to-tsc: No need to disable an event before deleting it
      perf machine: Destructors should accept NULL
      perf evlist: Rename for_each() macros to for_each_entry()
      perf tools: Rename strlist_for_each() macros to for_each_entry()
      perf rb_resort: Rename for_each() macros to for_each_entry()
      perf intlist: Rename for_each() macros to for_each_entry()

He Kuang (5):
      perf tools: Let python use correct gcc for build_ext
      perf tools: Find right DSO taking into account if binary is 32 or 64-bit
      perf unwind: Change macro names of perf register
      perf unwind: Fix wrongly used regs for x86_32 unwind
      perf unwind: Fix wrongly used regs for aarch64 unwind

Jiri Olsa (7):
      perf hists browser: Move hist_browser into header file
      perf hists browser: Make (new|delete|run) public
      perf hists browser: Introduce struct hist_browser title callback
      perf hists browser: Move horizontal scroll init to new()
      perf hists browser: Introduce perf_evsel_browser constructor
      perf hists browser: Introduce init()
      perf hists: Enlarge pid sort entry size

Taeung Song (3):
      perf config: Move config declarations from util/cache.h to util/config.h
      perf config: Introduce new init() and exit()
      perf config: Reimplement show_config() using config_set__for_each

 tools/perf/Documentation/perf-script.txt     |  11 ++-
 tools/perf/Makefile.perf                     |   3 +-
 tools/perf/arch/x86/tests/perf-time-to-tsc.c |   6 +-
 tools/perf/arch/x86/util/auxtrace.c          |   2 +-
 tools/perf/arch/x86/util/intel-bts.c         |   8 +-
 tools/perf/arch/x86/util/intel-pt.c          |  10 +--
 tools/perf/builtin-annotate.c                |   2 +-
 tools/perf/builtin-buildid-cache.c           |  13 ++-
 tools/perf/builtin-config.c                  |  21 +++--
 tools/perf/builtin-diff.c                    |  10 +--
 tools/perf/builtin-evlist.c                  |   2 +-
 tools/perf/builtin-help.c                    |   2 +-
 tools/perf/builtin-inject.c                  |   8 +-
 tools/perf/builtin-kmem.c                    |   4 +-
 tools/perf/builtin-kvm.c                     |   8 +-
 tools/perf/builtin-probe.c                   |   4 +-
 tools/perf/builtin-record.c                  |   3 +-
 tools/perf/builtin-report.c                  |  12 +--
 tools/perf/builtin-script.c                  | 115 +++++++++++++++++++++++++--
 tools/perf/builtin-stat.c                    |  22 ++---
 tools/perf/builtin-top.c                     |  10 +--
 tools/perf/builtin-trace.c                   |  10 +--
 tools/perf/perf.c                            |   4 +-
 tools/perf/scripts/python/stackcollapse.py   |   2 -
 tools/perf/tests/backward-ring-buffer.c      |   2 +-
 tools/perf/tests/event-times.c               |   3 +-
 tools/perf/tests/evsel-roundtrip-name.c      |   2 +-
 tools/perf/tests/hists_filter.c              |   4 +-
 tools/perf/tests/hists_link.c                |   4 +-
 tools/perf/tests/mmap-basic.c                |   2 +-
 tools/perf/tests/parse-events.c              |   4 +-
 tools/perf/tests/parse-no-sample-id-all.c    |   3 +-
 tools/perf/tests/switch-tracking.c           |   2 +-
 tools/perf/ui/browser.c                      |   2 +-
 tools/perf/ui/browsers/annotate.c            |   1 +
 tools/perf/ui/browsers/hists.c               | 109 ++++++++++++-------------
 tools/perf/ui/browsers/hists.h               |  32 ++++++++
 tools/perf/ui/gtk/hists.c                    |   2 +-
 tools/perf/ui/hist.c                         |   2 +-
 tools/perf/util/alias.c                      |   1 +
 tools/perf/util/auxtrace.h                   |   2 +
 tools/perf/util/cache.h                      |  11 ---
 tools/perf/util/cgroup.c                     |   4 +-
 tools/perf/util/color.c                      |   1 +
 tools/perf/util/config.c                     |  92 ++++++++++-----------
 tools/perf/util/config.h                     |  40 ++++++++++
 tools/perf/util/data-convert-bt.c            |   4 +-
 tools/perf/util/evlist.c                     |  59 +++++++-------
 tools/perf/util/evlist.h                     |  40 +++++-----
 tools/perf/util/header.c                     |  18 ++---
 tools/perf/util/help-unknown-cmd.c           |   1 +
 tools/perf/util/hist.c                       |   4 +-
 tools/perf/util/intel-bts.c                  |  24 ++++--
 tools/perf/util/intel-pt.c                   |  26 +++---
 tools/perf/util/intlist.h                    |   8 +-
 tools/perf/util/jitdump.c                    |   2 +-
 tools/perf/util/libunwind/arm64.c            |   5 ++
 tools/perf/util/libunwind/x86_32.c           |   6 ++
 tools/perf/util/llvm-utils.c                 |   1 +
 tools/perf/util/machine.c                    |   6 +-
 tools/perf/util/parse-events.c               |   4 +-
 tools/perf/util/probe-event.c                |  12 ++-
 tools/perf/util/probe-file.c                 |   8 +-
 tools/perf/util/python.c                     |   2 +-
 tools/perf/util/rb_resort.h                  |   4 +-
 tools/perf/util/record.c                     |   8 +-
 tools/perf/util/session.c                    |  12 +--
 tools/perf/util/sort.c                       |  14 ++--
 tools/perf/util/stat.c                       |   6 +-
 tools/perf/util/strlist.h                    |   4 +-
 tools/perf/util/symbol.c                     |   2 +-
 tools/perf/util/thread-stack.c               |   7 ++
 tools/perf/util/thread-stack.h               |   1 +
 tools/perf/util/thread_map.c                 |   4 +-
 tools/perf/util/unwind-libunwind-local.c     |   6 +-
 tools/perf/util/unwind.h                     |   9 +++
 tools/perf/util/vdso.c                       |  40 +++++++++-
 77 files changed, 606 insertions(+), 358 deletions(-)
 create mode 100644 tools/perf/ui/browsers/hists.h
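
The CLASS__for_each_entry() renaming described in the summary above can
be pictured with a small standalone sketch. This is not the perf or
kernel code: the list helpers are reduced to the minimum needed here, and
struct item, struct itemlist and itemlist__for_each_entry() are
hypothetical names. It assumes a compiler that provides __typeof__ (GCC
or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of the kernel's. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Iterate over the structures that embed the list node, not the nodes. */
#define list_for_each_entry(pos, head, member)				\
	for (pos = container_of((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Hypothetical class with an embedded list node. */
struct item { int value; struct list_head node; };
struct itemlist { struct list_head entries; };

/* Class specific helper: a thin list_for_each_entry() wrapper. */
#define itemlist__for_each_entry(list, pos) \
	list_for_each_entry(pos, &(list)->entries, node)

static int itemlist__sum(struct itemlist *list)
{
	struct item *pos;
	int sum = 0;

	assert(list != NULL);

	itemlist__for_each_entry(list, pos)
		sum += pos->value;
	return sum;
}
```

Because the wrapper hands back entries rather than list nodes, the
_entry suffix tells the reader which of the list iteration flavours the
helper follows.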


* Re: [GIT PULL 00/27] perf/core improvements and fixes
  2016-06-23 21:23 Arnaldo Carvalho de Melo
@ 2016-06-26 10:43 ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2016-06-26 10:43 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Brendan Gregg, David Ahern, Ekaterina Tumanova, He Kuang,
	Jiri Olsa, Josh Poimboeuf, Kan Liang, Masami Hiramatsu,
	Milian Wolff, Namhyung Kim, Paolo Bonzini, Pekka Enberg,
	Peter Zijlstra, Stephane Eranian, Sukadev Bhattiprolu,
	Taeung Song, Wang Nan, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit 4330b439bbe16b48dd2fe9a379bd58a07b97aab8:
> 
>   Merge tag 'perf-core-for-mingo-20160621' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-06-22 09:34:19 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160623
> 
> for you to fetch changes up to 4a35b3497c413de8b409f9d75700eeb4772b21b8:
> 
>   perf config: Reimplement show_config() using config_set__for_each (2016-06-23 17:23:00 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> New features:
> 
> - Add 'callindent' option to 'perf script -F', to indent the Intel PT
>   call stack, making this output more ftrace-like (Adrian Hunter, Andi Kleen)
> 
> User visible:
> 
> - Enlarge 'pid' column width, to cope with large pids (Jiri Olsa)
> 
> Infrastructure:
> 
> - Cross platform unwind fixes (He Kuang)
> 
> - Make destructors accept NULL, behaving like free() (Arnaldo Carvalho de Melo)
> 
> - Remove reference to the perl interpreter in the recently added 'perf script'
>   stackcollapse python script (Arnaldo Carvalho de Melo)
> 
> - Rename CLASS__for_each() macros to CLASS__for_each_entry(), to use the
>   list_for_each_entry() semantics, as most of these class specific loop helpers
>   are list_for_each_entry*() wrappers (Arnaldo Carvalho de Melo)
> 
> - Expose the hist_browser code, which will be used with data structures other
>   than perf_evsel (Jiri Olsa)
> 
> - 'perf config' refactorings (Taeung Song)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (3):
>       perf script: Print sample flags more nicely
>       perf auxtrace: Add option to feed branches to the thread stack
>       perf script: Add callindent option
> 
> Arnaldo Carvalho de Melo (9):
>       perf script stackcollapse: Remove reference to the perl interpreter
>       perf evlist: Destructors should accept NULL
>       perf session: Destructors should accept NULL
>       perf tests time-to-tsc: No need to disable an event before deleting it
>       perf machine: Destructors should accept NULL
>       perf evlist: Rename for_each() macros to for_each_entry()
>       perf tools: Rename strlist_for_each() macros to for_each_entry()
>       perf rb_resort: Rename for_each() macros to for_each_entry()
>       perf intlist: Rename for_each() macros to for_each_entry()
> 
> He Kuang (5):
>       perf tools: Let python use correct gcc for build_ext
>       perf tools: Find right DSO taking into account if binary is 32 or 64-bit
>       perf unwind: Change macro names of perf register
>       perf unwind: Fix wrongly used regs for x86_32 unwind
>       perf unwind: Fix wrongly used regs for aarch64 unwind
> 
> Jiri Olsa (7):
>       perf hists browser: Move hist_browser into header file
>       perf hists browser: Make (new|delete|run) public
>       perf hists browser: Introduce struct hist_browser title callback
>       perf hists browser: Move horizontal scroll init to new()
>       perf hists browser: Introduce perf_evsel_browser constructor
>       perf hists browser: Introduce init()
>       perf hists: Enlarge pid sort entry size
> 
> Taeung Song (3):
>       perf config: Move config declarations from util/cache.h to util/config.h
>       perf config: Introduce new init() and exit()
>       perf config: Reimplement show_config() using config_set__for_each
> 
>  tools/perf/Documentation/perf-script.txt     |  11 ++-
>  tools/perf/Makefile.perf                     |   3 +-
>  tools/perf/arch/x86/tests/perf-time-to-tsc.c |   6 +-
>  tools/perf/arch/x86/util/auxtrace.c          |   2 +-
>  tools/perf/arch/x86/util/intel-bts.c         |   8 +-
>  tools/perf/arch/x86/util/intel-pt.c          |  10 +--
>  tools/perf/builtin-annotate.c                |   2 +-
>  tools/perf/builtin-buildid-cache.c           |  13 ++-
>  tools/perf/builtin-config.c                  |  21 +++--
>  tools/perf/builtin-diff.c                    |  10 +--
>  tools/perf/builtin-evlist.c                  |   2 +-
>  tools/perf/builtin-help.c                    |   2 +-
>  tools/perf/builtin-inject.c                  |   8 +-
>  tools/perf/builtin-kmem.c                    |   4 +-
>  tools/perf/builtin-kvm.c                     |   8 +-
>  tools/perf/builtin-probe.c                   |   4 +-
>  tools/perf/builtin-record.c                  |   3 +-
>  tools/perf/builtin-report.c                  |  12 +--
>  tools/perf/builtin-script.c                  | 115 +++++++++++++++++++++++++--
>  tools/perf/builtin-stat.c                    |  22 ++---
>  tools/perf/builtin-top.c                     |  10 +--
>  tools/perf/builtin-trace.c                   |  10 +--
>  tools/perf/perf.c                            |   4 +-
>  tools/perf/scripts/python/stackcollapse.py   |   2 -
>  tools/perf/tests/backward-ring-buffer.c      |   2 +-
>  tools/perf/tests/event-times.c               |   3 +-
>  tools/perf/tests/evsel-roundtrip-name.c      |   2 +-
>  tools/perf/tests/hists_filter.c              |   4 +-
>  tools/perf/tests/hists_link.c                |   4 +-
>  tools/perf/tests/mmap-basic.c                |   2 +-
>  tools/perf/tests/parse-events.c              |   4 +-
>  tools/perf/tests/parse-no-sample-id-all.c    |   3 +-
>  tools/perf/tests/switch-tracking.c           |   2 +-
>  tools/perf/ui/browser.c                      |   2 +-
>  tools/perf/ui/browsers/annotate.c            |   1 +
>  tools/perf/ui/browsers/hists.c               | 109 ++++++++++++-------------
>  tools/perf/ui/browsers/hists.h               |  32 ++++++++
>  tools/perf/ui/gtk/hists.c                    |   2 +-
>  tools/perf/ui/hist.c                         |   2 +-
>  tools/perf/util/alias.c                      |   1 +
>  tools/perf/util/auxtrace.h                   |   2 +
>  tools/perf/util/cache.h                      |  11 ---
>  tools/perf/util/cgroup.c                     |   4 +-
>  tools/perf/util/color.c                      |   1 +
>  tools/perf/util/config.c                     |  92 ++++++++++-----------
>  tools/perf/util/config.h                     |  40 ++++++++++
>  tools/perf/util/data-convert-bt.c            |   4 +-
>  tools/perf/util/evlist.c                     |  59 +++++++-------
>  tools/perf/util/evlist.h                     |  40 +++++-----
>  tools/perf/util/header.c                     |  18 ++---
>  tools/perf/util/help-unknown-cmd.c           |   1 +
>  tools/perf/util/hist.c                       |   4 +-
>  tools/perf/util/intel-bts.c                  |  24 ++++--
>  tools/perf/util/intel-pt.c                   |  26 +++---
>  tools/perf/util/intlist.h                    |   8 +-
>  tools/perf/util/jitdump.c                    |   2 +-
>  tools/perf/util/libunwind/arm64.c            |   5 ++
>  tools/perf/util/libunwind/x86_32.c           |   6 ++
>  tools/perf/util/llvm-utils.c                 |   1 +
>  tools/perf/util/machine.c                    |   6 +-
>  tools/perf/util/parse-events.c               |   4 +-
>  tools/perf/util/probe-event.c                |  12 ++-
>  tools/perf/util/probe-file.c                 |   8 +-
>  tools/perf/util/python.c                     |   2 +-
>  tools/perf/util/rb_resort.h                  |   4 +-
>  tools/perf/util/record.c                     |   8 +-
>  tools/perf/util/session.c                    |  12 +--
>  tools/perf/util/sort.c                       |  14 ++--
>  tools/perf/util/stat.c                       |   6 +-
>  tools/perf/util/strlist.h                    |   4 +-
>  tools/perf/util/symbol.c                     |   2 +-
>  tools/perf/util/thread-stack.c               |   7 ++
>  tools/perf/util/thread-stack.h               |   1 +
>  tools/perf/util/thread_map.c                 |   4 +-
>  tools/perf/util/unwind-libunwind-local.c     |   6 +-
>  tools/perf/util/unwind.h                     |   9 +++
>  tools/perf/util/vdso.c                       |  40 +++++++++-
>  77 files changed, 606 insertions(+), 358 deletions(-)
>  create mode 100644 tools/perf/ui/browsers/hists.h

Pulled, thanks a lot Arnaldo!

	Ingo
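
Editorial illustration of the "Make destructors accept NULL, behaving like
free()" item in the quoted summary. This is only a minimal sketch of the
general pattern, not the code from the patches themselves: `struct widget`,
`widget__new()` and `widget__delete()` are hypothetical names invented for
this example, loosely following the perf naming style.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object, standing in for something like an evlist or session. */
struct widget {
	char *name;
};

/* Constructor: returns NULL if any allocation fails. */
struct widget *widget__new(const char *name)
{
	struct widget *w = calloc(1, sizeof(*w));
	size_t len;

	if (w == NULL)
		return NULL;

	len = strlen(name) + 1;
	w->name = malloc(len);
	if (w->name == NULL) {
		free(w);
		return NULL;
	}
	memcpy(w->name, name, len);
	return w;
}

/* Destructor: a NULL argument is a no-op, mirroring free(NULL). */
void widget__delete(struct widget *w)
{
	if (w == NULL)
		return;

	free(w->name);
	free(w);
}
```

With a destructor of this shape, an error path can call `widget__delete(w)`
unconditionally, whether or not `w` was ever allocated, which is presumably
why callers no longer need their own NULL checks.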

* [GIT PULL 00/27] perf/core improvements and fixes
@ 2016-09-29 14:35 Arnaldo Carvalho de Melo
  2016-09-29 17:11 ` Ingo Molnar
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-29 14:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Linux Weekly News, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Anju T Sudhakar,
	Chong Jiang, Clark Williams, Daniel Bristot de Oliveira,
	David Ahern, Jiri Olsa, Josh Poimboeuf, linux-arm-kernel,
	linuxppc-dev, Masami Hiramatsu, Mathieu Poirier, Matt Fleming,
	Michael Ellerman, Namhyung Kim, Peter Zijlstra, pi3orama,
	Ravi Bangoria, Simon Que, Steven Rostedt, Thomas Gleixner,
	Wang Nan, Zefan Li, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling, more to come soon,

- Arnaldo

Build and test results at the end of this message.

The following changes since commit 6b652de2b27c0a4020ce0e8f277e782b6af76096:

  Merge tag 'perf-core-for-mingo-20160922' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-09-23 07:21:38 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160929

for you to fetch changes up to d18019a53a07e009899ff6b8dc5ec30f249360d9:

  perf tests: Add dwarf unwind test for powerpc (2016-09-29 11:18:21 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:
-------------

New features:

- Add support for using symbols in address filters with Intel PT and ARM
  CoreSight (hardware assisted tracing facilities) (Adrian Hunter, Mathieu Poirier)

Fixes:

- Fix MMAP event synthesis for pre-existing threads when no hugetlbfs
  mount is in place (Adrian Hunter)

- Don't ignore kernel idle symbols in 'perf script' (Adrian Hunter)

- Assorted Intel PT fixes (Adrian Hunter)

Improvements:

- Fix handling of C++ symbols in 'perf probe' (Masami Hiramatsu)

- Beautify sched_[gs]et_attr return value in 'perf trace' (Arnaldo Carvalho de Melo)

Infrastructure:
---------------

New features:

- Add dwarf unwind 'perf test' for powerpc (Ravi Bangoria)

Fixes:

- Fix error paths in 'perf record' (Adrian Hunter)

Documentation:

- Update documentation info about quipper, a C++ parser for converting
  to/from perf.data/chromium profiling format (Simon Que)

Build Fixes:

- Fix building in 32 bit platform with libbabeltrace (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (16):
      perf record: Fix documentation 'event_sources' -> 'event_source'
      perf tools: Fix MMAP event synthesis broken by MAP_HUGETLB change
      perf script: Fix vanished idle symbols
      perf record: Rename label 'out_symbol_exit'
      perf record: Fix error paths
      perf symbols: Add dso__last_symbol()
      perf record: Add support for using symbols in address filters
      perf probe: Increase debug level of SDT debug messages
      perf intel-pt: Fix snapshot overlap detection decoder errors
      perf intel-pt: Add support for recording the max non-turbo ratio
      perf intel-pt: Fix missing error codes processing auxtrace_info
      perf intel-pt: Add a helper function for processing AUXTRACE_INFO
      perf intel-pt: Record address filter in AUXTRACE_INFO event
      perf intel-pt: Read address filter from AUXTRACE_INFO event
      perf intel-pt: Enable decoder to handle TIP.PGD with missing IP
      perf intel-pt: Fix decoding when there are address filters

Arnaldo Carvalho de Melo (1):
      perf trace: Beautify sched_[gs]et_attr return value

Masami Hiramatsu (4):
      perf probe: Ignore the error of finding inline instance
      perf probe: Skip if the function address is 0
      perf probe: Fix to cut off incompatible chars from group name
      perf probe: Match linkage name with mangled name

Mathieu Poirier (3):
      perf tools: Make perf_evsel__append_filter() generic
      perf evsel: New tracepoint specific function
      perf evsel: Add support for address filters

Ravi Bangoria (1):
      perf tests: Add dwarf unwind test for powerpc

Simon Que (1):
      perf tools: Update documentation info about quipper

Wang Nan (1):
      perf data: Fix building in 32 bit platform with libbabeltrace

 tools/perf/Documentation/perf-record.txt           |  61 +-
 tools/perf/Documentation/perf.data-file-format.txt |   6 +-
 tools/perf/arch/powerpc/Build                      |   1 +
 tools/perf/arch/powerpc/include/arch-tests.h       |  13 +
 tools/perf/arch/powerpc/include/perf_regs.h        |   2 +
 tools/perf/arch/powerpc/tests/Build                |   4 +
 tools/perf/arch/powerpc/tests/arch-tests.c         |  15 +
 tools/perf/arch/powerpc/tests/dwarf-unwind.c       |  62 ++
 tools/perf/arch/powerpc/tests/regs_load.S          |  94 +++
 tools/perf/arch/x86/util/intel-pt.c                |  57 +-
 tools/perf/builtin-record.c                        |  32 +-
 tools/perf/builtin-trace.c                         |  10 +-
 tools/perf/tests/Build                             |   2 +-
 tools/perf/tests/dwarf-unwind.c                    |   2 +-
 tools/perf/util/auxtrace.c                         | 737 +++++++++++++++++++++
 tools/perf/util/auxtrace.h                         |  54 ++
 tools/perf/util/build-id.c                         |   4 +-
 tools/perf/util/data-convert-bt.c                  |   2 +-
 tools/perf/util/dwarf-aux.c                        |  28 +-
 tools/perf/util/dwarf-aux.h                        |   3 +
 tools/perf/util/event.c                            |   3 +-
 tools/perf/util/evsel.c                            |  16 +-
 tools/perf/util/evsel.h                            |   5 +-
 tools/perf/util/evsel_fprintf.c                    |   7 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |  30 +
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
 tools/perf/util/intel-pt.c                         | 172 ++++-
 tools/perf/util/intel-pt.h                         |   4 +-
 tools/perf/util/parse-events.c                     |  41 +-
 tools/perf/util/probe-event.c                      |  10 +-
 tools/perf/util/probe-file.c                       |   2 +-
 tools/perf/util/probe-finder.c                     |  17 +-
 tools/perf/util/symbol.c                           |  15 +
 tools/perf/util/symbol.h                           |   1 +
 34 files changed, 1451 insertions(+), 62 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/include/arch-tests.h
 create mode 100644 tools/perf/arch/powerpc/tests/Build
 create mode 100644 tools/perf/arch/powerpc/tests/arch-tests.c
 create mode 100644 tools/perf/arch/powerpc/tests/dwarf-unwind.c
 create mode 100644 tools/perf/arch/powerpc/tests/regs_load.S

  # time dm
   1  alpine:3.4: Ok
   2 android-ndk:r12b-arm: Ok
   3 archlinux:latest: Ok
   4 centos:5: Ok
   5 centos:6: Ok
   6 centos:7: Ok
   7 debian:7: Ok
   8 debian:8: Ok
   9 debian:experimental: Ok
  10 fedora:20: Ok
  11 fedora:21: Ok
  12 fedora:22: Ok
  13 fedora:23: Ok
  14 fedora:24: Ok
  15 fedora:24-x-ARC-uClibc: Ok
  16 fedora:rawhide: Ok
  17 mageia:5: Ok
  18 opensuse:13.2: Ok
  19 opensuse:42.1: Ok
  20 opensuse:tumbleweed: Ok
  21 ubuntu:12.04.5: Ok
  22 ubuntu:14.04: Ok
  23 ubuntu:14.04.4: Ok
  24 ubuntu:15.10: Ok
  25 ubuntu:16.04: Ok
  26 ubuntu:16.04-x-arm: Ok
  27 ubuntu:16.04-x-arm64: Ok
  28 ubuntu:16.04-x-powerpc: Ok
  29 ubuntu:16.04-x-powerpc64: Ok
  30 ubuntu:16.04-x-powerpc64el: Ok
  31 ubuntu:16.04-x-s390: Ok
  32 ubuntu:16.10: Ok
  33 2246.21

  real	37m26.862s
  user	0m2.148s
  sys	0m2.256s
  # 

  # perf test
   1: vmlinux symtab matches kallsyms                          : Ok
   2: detect openat syscall event                              : Ok
   3: detect openat syscall event on all cpus                  : Ok
   4: read samples using the mmap interface                    : Ok
   5: parse events tests                                       : Ok
   6: Validate PERF_RECORD_* events & perf_sample fields       : Ok
   7: Test perf pmu format parsing                             : Ok
   8: Test dso data read                                       : Ok
   9: Test dso data cache                                      : Ok
  10: Test dso data reopen                                     : Ok
  11: roundtrip evsel->name check                              : Ok
  12: Check parsing of sched tracepoints fields                : Ok
  13: Generate and check syscalls:sys_enter_openat event fields: Ok
  14: struct perf_event_attr setup                             : Ok
  15: Test matching and linking multiple hists                 : Ok
  16: Try 'import perf' in python, checking link problems      : Ok
  17: Test breakpoint overflow signal handler                  : Ok
  18: Test breakpoint overflow sampling                        : Ok
  19: Test number of exit event of a simple workload           : Ok
  20: Test software clock events have valid period values      : Ok
  21: Test object code reading                                 : Ok
  22: Test sample parsing                                      : Ok
  23: Test using a dummy software event to keep tracking       : Ok
  24: Test parsing with no sample_id_all bit set               : Ok
  25: Test filtering hist entries                              : Ok
  26: Test mmap thread lookup                                  : Ok
  27: Test thread mg sharing                                   : Ok
  28: Test output sorting of hist entries                      : Ok
  29: Test cumulation of child hist entries                    : Ok
  30: Test tracking with sched_switch                          : Ok
  31: Filter fds with revents mask in a fdarray                : Ok
  32: Add fd to a fdarray, making it autogrow                  : Ok
  33: Test kmod_path__parse function                           : Ok
  34: Test thread map                                          : Ok
  35: Test LLVM searching and compiling                        :
  35.1: Basic BPF llvm compiling test                          : Ok
  35.2: Test kbuild searching                                  : Ok
  35.3: Compile source for BPF prologue generation test        : Ok
  35.4: Compile source for BPF relocation test                 : Ok
  36: Test topology in session                                 : Ok
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : Ok
  37.2: Test BPF prologue generation                           : Ok
  37.3: Test BPF relocation checker                            : Ok
  38: Test thread map synthesize                               : Ok
  39: Test cpu map synthesize                                  : Ok
  40: Test stat config synthesize                              : Ok
  41: Test stat synthesize                                     : Ok
  42: Test stat round synthesize                               : Ok
  43: Test attr update synthesize                              : Ok
  44: Test events times                                        : Ok
  45: Test backward reading from ring buffer                   : Ok
  46: Test cpu map print                                       : Ok
  47: Test SDT event probing                                   : Ok
  48: Test is_printable_array function                         : Ok
  49: Test bitmap print                                        : Ok
  50: x86 rdpmc test                                           : Ok
  51: Test converting perf time to TSC                         : Ok
  52: Test dwarf unwind                                        : Ok
  53: Test x86 instruction decoder - new instructions          : Ok
  54: Test intel cqm nmi context read                          : Skip
  #

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/linux/tools/perf'
                        tarpkg: ./tests/perf-targz-src-pkg .
                  make_debug_O: make DEBUG=1
             make_no_libnuma_O: make NO_LIBNUMA=1
               make_no_slang_O: make NO_SLANG=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
              make_no_libbpf_O: make NO_LIBBPF=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
                   make_tags_O: make tags
                    make_doc_O: make doc
           make_no_libunwind_O: make NO_LIBUNWIND=1
            make_install_bin_O: make install-bin
           make_no_libbionic_O: make NO_LIBBIONIC=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
            make_no_demangle_O: make NO_DEMANGLE=1
                 make_perf_o_O: make perf.o
            make_no_auxtrace_O: make NO_AUXTRACE=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                   make_pure_O: make
             make_util_map_o_O: make util/map.o
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                make_no_newt_O: make NO_NEWT=1
           make_no_libpython_O: make NO_LIBPYTHON=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
                   make_help_O: make help
         make_install_prefix_O: make install prefix=/tmp/krava
                 make_static_O: make LDFLAGS=-static
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
           make_no_backtrace_O: make NO_BACKTRACE=1
              make_clean_all_O: make clean all
                make_install_O: make install
              make_no_libelf_O: make NO_LIBELF=1
             make_no_libperl_O: make NO_LIBPERL=1
                make_no_gtk2_O: make NO_GTK2=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
  OK
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $

* Re: [GIT PULL 00/27] perf/core improvements and fixes
  2016-09-29 14:35 Arnaldo Carvalho de Melo
@ 2016-09-29 17:11 ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2016-09-29 17:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Linux Weekly News, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, Anju T Sudhakar, Chong Jiang,
	Clark Williams, Daniel Bristot de Oliveira, David Ahern,
	Jiri Olsa, Josh Poimboeuf, linux-arm-kernel, linuxppc-dev,
	Masami Hiramatsu, Mathieu Poirier, Matt Fleming, Michael Ellerman,
	Namhyung Kim, Peter Zijlstra, pi3orama, Ravi Bangoria, Simon Que,
	Steven Rostedt, Thomas Gleixner, Wang Nan, Zefan Li,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling, more to come soon,
> 
> - Arnaldo
> 
> Build and test results at the end of this message.
> 
> The following changes since commit 6b652de2b27c0a4020ce0e8f277e782b6af76096:
> 
>   Merge tag 'perf-core-for-mingo-20160922' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-09-23 07:21:38 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160929
> 
> for you to fetch changes up to d18019a53a07e009899ff6b8dc5ec30f249360d9:
> 
>   perf tests: Add dwarf unwind test for powerpc (2016-09-29 11:18:21 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> User visible:
> -------------
> 
> New features:
> 
> - Add support for using symbols in address filters with Intel PT and ARM
>   CoreSight (hardware assisted tracing facilities) (Adrian Hunter, Mathieu Poirier)
> 
> Fixes:
> 
> - Fix MMAP event synthesis for pre-existing threads when no hugetlbfs
>   mount is in place (Adrian Hunter)
> 
> - Don't ignore kernel idle symbols in 'perf script' (Adrian Hunter)
> 
> - Assorted Intel PT fixes (Adrian Hunter)
> 
> Improvements:
> 
> - Fix handling of C++ symbols in 'perf probe' (Masami Hiramatsu)
> 
> - Beautify sched_[gs]et_attr return value in 'perf trace' (Arnaldo Carvalho de Melo)
> 
> Infrastructure:
> ---------------
> 
> New features:
> 
> - Add dwarf unwind 'perf test' for powerpc (Ravi Bangoria)
> 
> Fixes:
> 
> - Fix error paths in 'perf record' (Adrian Hunter)
> 
> Documentation:
> 
> - Update documentation info about quipper, a C++ parser for converting
>   to/from perf.data/chromium profiling format (Simon Que)
> 
> Build Fixes:
> 
> - Fix building in 32 bit platform with libbabeltrace (Wang Nan)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (16):
>       perf record: Fix documentation 'event_sources' -> 'event_source'
>       perf tools: Fix MMAP event synthesis broken by MAP_HUGETLB change
>       perf script: Fix vanished idle symbols
>       perf record: Rename label 'out_symbol_exit'
>       perf record: Fix error paths
>       perf symbols: Add dso__last_symbol()
>       perf record: Add support for using symbols in address filters
>       perf probe: Increase debug level of SDT debug messages
>       perf intel-pt: Fix snapshot overlap detection decoder errors
>       perf intel-pt: Add support for recording the max non-turbo ratio
>       perf intel-pt: Fix missing error codes processing auxtrace_info
>       perf intel-pt: Add a helper function for processing AUXTRACE_INFO
>       perf intel-pt: Record address filter in AUXTRACE_INFO event
>       perf intel-pt: Read address filter from AUXTRACE_INFO event
>       perf intel-pt: Enable decoder to handle TIP.PGD with missing IP
>       perf intel-pt: Fix decoding when there are address filters
> 
> Arnaldo Carvalho de Melo (1):
>       perf trace: Beautify sched_[gs]et_attr return value
> 
> Masami Hiramatsu (4):
>       perf probe: Ignore the error of finding inline instance
>       perf probe: Skip if the function address is 0
>       perf probe: Fix to cut off incompatible chars from group name
>       perf probe: Match linkage name with mangled name
> 
> Mathieu Poirier (3):
>       perf tools: Make perf_evsel__append_filter() generic
>       perf evsel: New tracepoint specific function
>       perf evsel: Add support for address filters
> 
> Ravi Bangoria (1):
>       perf tests: Add dwarf unwind test for powerpc
> 
> Simon Que (1):
>       perf tools: Update documentation info about quipper
> 
> Wang Nan (1):
>       perf data: Fix building in 32 bit platform with libbabeltrace
> 
>  tools/perf/Documentation/perf-record.txt           |  61 +-
>  tools/perf/Documentation/perf.data-file-format.txt |   6 +-
>  tools/perf/arch/powerpc/Build                      |   1 +
>  tools/perf/arch/powerpc/include/arch-tests.h       |  13 +
>  tools/perf/arch/powerpc/include/perf_regs.h        |   2 +
>  tools/perf/arch/powerpc/tests/Build                |   4 +
>  tools/perf/arch/powerpc/tests/arch-tests.c         |  15 +
>  tools/perf/arch/powerpc/tests/dwarf-unwind.c       |  62 ++
>  tools/perf/arch/powerpc/tests/regs_load.S          |  94 +++
>  tools/perf/arch/x86/util/intel-pt.c                |  57 +-
>  tools/perf/builtin-record.c                        |  32 +-
>  tools/perf/builtin-trace.c                         |  10 +-
>  tools/perf/tests/Build                             |   2 +-
>  tools/perf/tests/dwarf-unwind.c                    |   2 +-
>  tools/perf/util/auxtrace.c                         | 737 +++++++++++++++++++++
>  tools/perf/util/auxtrace.h                         |  54 ++
>  tools/perf/util/build-id.c                         |   4 +-
>  tools/perf/util/data-convert-bt.c                  |   2 +-
>  tools/perf/util/dwarf-aux.c                        |  28 +-
>  tools/perf/util/dwarf-aux.h                        |   3 +
>  tools/perf/util/event.c                            |   3 +-
>  tools/perf/util/evsel.c                            |  16 +-
>  tools/perf/util/evsel.h                            |   5 +-
>  tools/perf/util/evsel_fprintf.c                    |   7 +-
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |  30 +
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
>  tools/perf/util/intel-pt.c                         | 172 ++++-
>  tools/perf/util/intel-pt.h                         |   4 +-
>  tools/perf/util/parse-events.c                     |  41 +-
>  tools/perf/util/probe-event.c                      |  10 +-
>  tools/perf/util/probe-file.c                       |   2 +-
>  tools/perf/util/probe-finder.c                     |  17 +-
>  tools/perf/util/symbol.c                           |  15 +
>  tools/perf/util/symbol.h                           |   1 +
>  34 files changed, 1451 insertions(+), 62 deletions(-)
>  create mode 100644 tools/perf/arch/powerpc/include/arch-tests.h
>  create mode 100644 tools/perf/arch/powerpc/tests/Build
>  create mode 100644 tools/perf/arch/powerpc/tests/arch-tests.c
>  create mode 100644 tools/perf/arch/powerpc/tests/dwarf-unwind.c
>  create mode 100644 tools/perf/arch/powerpc/tests/regs_load.S
> 
>   # time dm
>    1  alpine:3.4: Ok
>    2 android-ndk:r12b-arm: Ok
>    3 archlinux:latest: Ok
>    4 centos:5: Ok
>    5 centos:6: Ok
>    6 centos:7: Ok
>    7 debian:7: Ok
>    8 debian:8: Ok
>    9 debian:experimental: Ok
>   10 fedora:20: Ok
>   11 fedora:21: Ok
>   12 fedora:22: Ok
>   13 fedora:23: Ok
>   14 fedora:24: Ok
>   15 fedora:24-x-ARC-uClibc: Ok
>   16 fedora:rawhide: Ok
>   17 mageia:5: Ok
>   18 opensuse:13.2: Ok
>   19 opensuse:42.1: Ok
>   20 opensuse:tumbleweed: Ok
>   21 ubuntu:12.04.5: Ok
>   22 ubuntu:14.04: Ok
>   23 ubuntu:14.04.4: Ok
>   24 ubuntu:15.10: Ok
>   25 ubuntu:16.04: Ok
>   26 ubuntu:16.04-x-arm: Ok
>   27 ubuntu:16.04-x-arm64: Ok
>   28 ubuntu:16.04-x-powerpc: Ok
>   29 ubuntu:16.04-x-powerpc64: Ok
>   30 ubuntu:16.04-x-powerpc64el: Ok
>   31 ubuntu:16.04-x-s390: Ok
>   32 ubuntu:16.10: Ok
>   33 2246.21
> 
>   real	37m26.862s
>   user	0m2.148s
>   sys	0m2.256s
>   # 
> 
>   # perf test
>    1: vmlinux symtab matches kallsyms                          : Ok
>    2: detect openat syscall event                              : Ok
>    3: detect openat syscall event on all cpus                  : Ok
>    4: read samples using the mmap interface                    : Ok
>    5: parse events tests                                       : Ok
>    6: Validate PERF_RECORD_* events & perf_sample fields       : Ok
>    7: Test perf pmu format parsing                             : Ok
>    8: Test dso data read                                       : Ok
>    9: Test dso data cache                                      : Ok
>   10: Test dso data reopen                                     : Ok
>   11: roundtrip evsel->name check                              : Ok
>   12: Check parsing of sched tracepoints fields                : Ok
>   13: Generate and check syscalls:sys_enter_openat event fields: Ok
>   14: struct perf_event_attr setup                             : Ok
>   15: Test matching and linking multiple hists                 : Ok
>   16: Try 'import perf' in python, checking link problems      : Ok
>   17: Test breakpoint overflow signal handler                  : Ok
>   18: Test breakpoint overflow sampling                        : Ok
>   19: Test number of exit event of a simple workload           : Ok
>   20: Test software clock events have valid period values      : Ok
>   21: Test object code reading                                 : Ok
>   22: Test sample parsing                                      : Ok
>   23: Test using a dummy software event to keep tracking       : Ok
>   24: Test parsing with no sample_id_all bit set               : Ok
>   25: Test filtering hist entries                              : Ok
>   26: Test mmap thread lookup                                  : Ok
>   27: Test thread mg sharing                                   : Ok
>   28: Test output sorting of hist entries                      : Ok
>   29: Test cumulation of child hist entries                    : Ok
>   30: Test tracking with sched_switch                          : Ok
>   31: Filter fds with revents mask in a fdarray                : Ok
>   32: Add fd to a fdarray, making it autogrow                  : Ok
>   33: Test kmod_path__parse function                           : Ok
>   34: Test thread map                                          : Ok
>   35: Test LLVM searching and compiling                        :
>   35.1: Basic BPF llvm compiling test                          : Ok
>   35.2: Test kbuild searching                                  : Ok
>   35.3: Compile source for BPF prologue generation test        : Ok
>   35.4: Compile source for BPF relocation test                 : Ok
>   36: Test topology in session                                 : Ok
>   37: Test BPF filter                                          :
>   37.1: Test basic BPF filtering                               : Ok
>   37.2: Test BPF prologue generation                           : Ok
>   37.3: Test BPF relocation checker                            : Ok
>   38: Test thread map synthesize                               : Ok
>   39: Test cpu map synthesize                                  : Ok
>   40: Test stat config synthesize                              : Ok
>   41: Test stat synthesize                                     : Ok
>   42: Test stat round synthesize                               : Ok
>   43: Test attr update synthesize                              : Ok
>   44: Test events times                                        : Ok
>   45: Test backward reading from ring buffer                   : Ok
>   46: Test cpu map print                                       : Ok
>   47: Test SDT event probing                                   : Ok
>   48: Test is_printable_array function                         : Ok
>   49: Test bitmap print                                        : Ok
>   50: x86 rdpmc test                                           : Ok
>   51: Test converting perf time to TSC                         : Ok
>   52: Test dwarf unwind                                        : Ok
>   53: Test x86 instruction decoder - new instructions          : Ok
>   54: Test intel cqm nmi context read                          : Skip
>   #
> 
>   $ make -C tools/perf build-test
>   make: Entering directory '/home/acme/git/linux/tools/perf'
>                         tarpkg: ./tests/perf-targz-src-pkg .
>                   make_debug_O: make DEBUG=1
>              make_no_libnuma_O: make NO_LIBNUMA=1
>                make_no_slang_O: make NO_SLANG=1
>             make_no_libaudit_O: make NO_LIBAUDIT=1
>               make_no_libbpf_O: make NO_LIBBPF=1
>    make_install_prefix_slash_O: make install prefix=/tmp/krava/
>                    make_tags_O: make tags
>                     make_doc_O: make doc
>            make_no_libunwind_O: make NO_LIBUNWIND=1
>             make_install_bin_O: make install-bin
>            make_no_libbionic_O: make NO_LIBBIONIC=1
>         make_with_babeltrace_O: make LIBBABELTRACE=1
>             make_no_demangle_O: make NO_DEMANGLE=1
>                  make_perf_o_O: make perf.o
>             make_no_auxtrace_O: make NO_AUXTRACE=1
>              make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
>                    make_pure_O: make
>              make_util_map_o_O: make util/map.o
>   make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
>                 make_no_newt_O: make NO_NEWT=1
>            make_no_libpython_O: make NO_LIBPYTHON=1
>        make_util_pmu_bison_o_O: make util/pmu-bison.o
>                    make_help_O: make help
>          make_install_prefix_O: make install prefix=/tmp/krava
>                  make_static_O: make LDFLAGS=-static
>                   make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
>            make_no_backtrace_O: make NO_BACKTRACE=1
>               make_clean_all_O: make clean all
>                 make_install_O: make install
>               make_no_libelf_O: make NO_LIBELF=1
>              make_no_libperl_O: make NO_LIBPERL=1
>                 make_no_gtk2_O: make NO_GTK2=1
>                 make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
>   OK
>   make: Leaving directory '/home/acme/git/linux/tools/perf'
>   $

Pulled, thanks a lot Arnaldo!

	Ingo


* [GIT PULL 00/27] perf/core improvements and fixes
@ 2018-01-10 21:28 Arnaldo Carvalho de Melo
  2018-01-11  5:54 ` Ingo Molnar
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-10 21:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern,
	Jin Yao, Jiri Olsa, Kan Liang, Namhyung Kim, Peter Zijlstra,
	Thomas Gleixner, Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9128d3ed9de3882c83b927eb553d5d44c84505f5:

  perf/x86/msr: Clean up the code (2018-01-06 12:18:40 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.16-20180110

for you to fetch changes up to 5d64db2966e38bfd99114ecf0b54f97d33023dcd:

  tools headers: Synchronize kernel <-> tooling headers (2018-01-10 12:46:54 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

- The 'perf test bpf' entry hooked an eBPF proggie to the
  SyS_epoll_wait() kernel function and expected it to be hit when calling
  the epoll_wait() libc wrapper. That changed recently in systems such
  as Fedora 27, where the glibc wrapper calls the epoll_pwait() syscall
  instead, so switch to epoll_pwait() for both the kernel and libc
  function, getting it to work in both old and new systems (Arnaldo Carvalho de Melo)

- Beautify 'gettid' syscall result in 'perf trace'. In doing so, noticed
  that we need to handle namespaces in 'perf trace'; this will be
  dealt with in follow-up patches, where we'll try to figure out if
  the recent support for namespaces in tools/perf/ can be used for this
  purpose as well (Arnaldo Carvalho de Melo)

- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
  info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)

- Synchronize kernel <-> tooling headers wrt meltdown/spectre changes
  (Arnaldo Carvalho de Melo)

- Fix a wrong offset issue when using /proc/kcore (Jin Yao)

- Fix bug that prevented annotating symbols in perf.data files
  generated with 'perf record --branch-any' (Jin Yao)

- Add infrastructure to record the first and last sample times in the
  perf.data file header. This is done when all samples are processed in
  a 'perf record' session, such as when doing build-id processing,
  or when specifically requesting that this info be recorded.
  'perf report --time' uses it, and also got support for percent
  slices in addition to absolute ones.

  I.e. now it is possible to ask for the samples in the 10%-20%
  time slice of a perf.data file (Jin Yao)

- Enable building with libbabeltrace by default (Jiri Olsa)

- Display perf_event_attr::namespaces when duping the attributes
  in verbose mode (Jiri Olsa)

- Allocate context task_ctx_data for child event (Jiri Olsa)

- Update comments for PERF_RECORD_ITRACE_START and PERF_RECORD_MISC_* (Jiri Olsa)

- Add support for showing PERF_RECORD_LOST events in 'perf script' (Jiri Olsa)

- Add 'perf report --stats' option to display quick statistics about
  metadata events (PERF_RECORD_*) i.e. what we get at the end of 'perf
  report -D' (Jiri Olsa)

- Fix compile error with libunwind x86 (Wang Nan)
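
To illustrate the percent time slices mentioned above: a minimal sketch of how
a slice such as 10%-20% could be mapped onto absolute timestamps once the first
and last sample times are known. The function names, the nanosecond integer
timestamps and the 'N%-M%' string form are assumptions made for this
illustration; this is not the code in tools/perf/util/time-utils.c.

```python
def parse_percent_range(spec):
    """Parse a string such as '10%-20%' into the integer pair (10, 20)."""
    lo, hi = spec.split("-")
    return int(lo.rstrip("%")), int(hi.rstrip("%"))


def percent_slice(first_ns, last_ns, start_pct, end_pct):
    """Map a percent range onto the [first_ns, last_ns] sample time span."""
    if not (0 <= start_pct <= end_pct <= 100):
        raise ValueError("expected 0 <= start <= end <= 100")
    span = last_ns - first_ns
    # Integer arithmetic keeps the result exact for nanosecond timestamps.
    start = first_ns + span * start_pct // 100
    end = first_ns + span * end_pct // 100
    return start, end


if __name__ == "__main__":
    # Hypothetical first/last sample times, in nanoseconds.
    first, last = 1000, 2000
    print(percent_slice(first, last, *parse_percent_range("10%-20%")))
```

Samples whose timestamp falls inside the returned range would be the ones
kept for that slice.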

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (6):
      perf test bpf: Improve message about expected samples
      perf test bpf: Use designated struct field initializers
      perf test bpf: Hook on epoll_pwait()
      perf trace: Beautify 'gettid' syscall result
      perf report: Introduce --mmaps
      tools headers: Synchronize kernel <-> tooling headers

Jin Yao (8):
      perf report: Fix a wrong offset issue when using /proc/kcore
      perf report: Fix a no annotate browser displayed issue
      perf header: Add infrastructure to record first and last sample time
      perf record: Record the first and last sample time in the header
      perf tools: Create function to parse time percent
      perf tools: Create function to perform multiple time range checking
      perf report: Support time percent and multiple time ranges
      perf script: Support time percent and multiple time ranges

Jiri Olsa (12):
      perf tools: Enable LIBBABELTRACE by default
      perf tools: Display perf_event_attr::namespaces debug info
      perf: Allocate context task_ctx_data for child event
      perf: Add sample_id to PERF_RECORD_ITRACE_START event comment
      perf: Make perf_callchain function static
      perf: Return empty callchain instead of NULL
      perf: Update PERF_RECORD_MISC_* comment for perf_event_header::misc bit 13
      perf script: Add support to display sample misc field
      perf script: Add support to display lost events
      perf tools: Make the tool's warning messages optional
      perf report: Add --stats option to display quick data statistics
      perf report: Add --tasks option to display monitored tasks

Wang Nan (1):
      perf tools: Fix compile error with libunwind x86

 include/uapi/linux/perf_event.h                    |  10 +-
 kernel/events/callchain.c                          |  15 --
 kernel/events/core.c                               |  54 +++--
 kernel/events/internal.h                           |   4 -
 tools/arch/x86/include/asm/cpufeatures.h           |   4 +-
 tools/arch/x86/include/asm/disabled-features.h     |   8 +-
 tools/include/uapi/linux/perf_event.h              |  10 +-
 tools/perf/Documentation/perf-record.txt           |   3 +
 tools/perf/Documentation/perf-report.txt           |  37 ++-
 tools/perf/Documentation/perf-script.txt           |  39 +++-
 tools/perf/Documentation/perf.data-file-format.txt |   4 +
 tools/perf/Makefile.config                         |   2 +-
 tools/perf/Makefile.perf                           |   2 +-
 tools/perf/arch/x86/util/unwind-libunwind.c        |   2 +-
 tools/perf/builtin-record.c                        |  18 +-
 tools/perf/builtin-report.c                        | 249 ++++++++++++++++++++-
 tools/perf/builtin-script.c                        | 136 +++++++++--
 tools/perf/builtin-trace.c                         |   1 +
 tools/perf/tests/bpf-script-example.c              |   4 +-
 tools/perf/tests/bpf.c                             |  65 +++---
 tools/perf/util/annotate.c                         |   3 +-
 tools/perf/util/event.c                            |   8 +
 tools/perf/util/event.h                            |   1 +
 tools/perf/util/evlist.h                           |   2 +
 tools/perf/util/evsel.c                            |   2 +
 tools/perf/util/header.c                           |  60 +++++
 tools/perf/util/header.h                           |   1 +
 tools/perf/util/machine.c                          |   2 +-
 tools/perf/util/map.c                              |   2 +-
 tools/perf/util/session.c                          |   6 +-
 tools/perf/util/sort.c                             |  16 +-
 tools/perf/util/srcline.c                          |   9 +-
 tools/perf/util/srcline.h                          |   5 +-
 tools/perf/util/time-utils.c                       | 233 ++++++++++++++++++-
 tools/perf/util/time-utils.h                       |   6 +
 tools/perf/util/tool.h                             |   1 +
 36 files changed, 884 insertions(+), 140 deletions(-)

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support.  Where clang is available, it is also used to build
perf with/without libelf.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from an HTTP server to
build them inside the container, to make it easier to build in a container
cluster. Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to the lack of multi-arch devel
packages, which are available and used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications and then intercept the
sys_perf_event_open syscall to check that the perf_event_attr fields are set up
as expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build
tools/perf/ with a variety of feature sets, exercising the build with an
incomplete set of features as well as with a complete one. It is planned to
have them run on each of the containers mentioned above, using some container
orchestration infrastructure. Get in contact if you are interested in helping
to put this in place.

  # dm
   1 36.42 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 43.36 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 42.02 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 39.44 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 34.12 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
   6 43.12 amazonlinux:2                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
   7 25.83 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   8 27.52 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   9 21.87 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  10 31.57 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  11 37.87 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
  12 35.19 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  13 36.66 debian:8                      : Ok   gcc (Debian 4.9.2-10) 4.9.2
  14 60.91 debian:9                      : Ok   gcc (Debian 6.3.0-18) 6.3.0 20170516
  15 63.71 debian:experimental           : Ok   gcc (Debian 7.2.0-18) 7.2.0
  16 41.01 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  17 37.53 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  18 35.28 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 7.2.0-11) 7.2.0
  19 38.06 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  20 37.48 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  21 40.09 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  22 38.66 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  23 38.56 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  24 40.34 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  25 33.60 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  26 76.30 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  27 83.91 fedora:26                     : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
  28 76.94 fedora:27                     : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
  29 84.12 fedora:rawhide                : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-4)
  30 39.65 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 6.4.0 p1.1) 6.4.0
  31 40.31 mageia:5                      : Ok   gcc (GCC) 4.9.2
  32 40.96 mageia:6                      : Ok   gcc (Mageia 5.4.0-5.mga6) 5.4.0
  33 40.56 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  34 44.99 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  35 39.41 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  36 82.57 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.2.1 20171020 [gcc-7-branch revision 253932]
  37 31.39 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  38 37.82 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
  39 29.52 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  40 35.32 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  41 31.84 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
  42 59.60 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
  43 32.43 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  44 30.82 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  45 30.35 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  46 32.15 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609
  47 31.77 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  48 31.49 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  49 62.81 ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  50 64.87 ubuntu:17.04                  : Ok   gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
  51 63.93 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
  52 64.43 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.2.0-18ubuntu2) 7.2.0
  # 

  # uname -a
  Linux seventh 4.15.0-rc7+ #2 SMP Wed Jan 10 11:53:43 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Number of exit events of a simple workload            : Ok
  22: Software clock events period values                   : Ok
  23: Object code reading                                   : Ok
  24: Sample parsing                                        : Ok
  25: Use a dummy software event to keep tracking           : Ok
  26: Parse with no sample_id_all bit set                   : Ok
  27: Filter hist entries                                   : Ok
  28: Lookup mmap thread                                    : Ok
  29: Share thread mg                                       : Ok
  30: Sort output of hist entries                           : Ok
  31: Cumulate child hist entries                           : Ok
  32: Track with sched_switch                               : Ok
  33: Filter fds with revents mask in a fdarray             : Ok
  34: Add fd to a fdarray, making it autogrow               : Ok
  35: kmod_path__parse                                      : Ok
  36: Thread map                                            : Ok
  37: LLVM search and compile                               :
  37.1: Basic BPF llvm compile                              : Ok
  37.2: kbuild searching                                    : Ok
  37.3: Compile source for BPF prologue generation          : Ok
  37.4: Compile source for BPF relocation                   : Ok
  38: Session topology                                      : Ok
  39: BPF filter                                            :
  39.1: Basic BPF filtering                                 : Ok
  39.2: BPF pinning                                         : Ok
  39.3: BPF prologue generation                             : Ok
  39.4: BPF relocation checker                              : Ok
  40: Synthesize thread map                                 : Ok
  41: Remove thread map                                     : Ok
  42: Synthesize cpu map                                    : Ok
  43: Synthesize stat config                                : Ok
  44: Synthesize stat                                       : Ok
  45: Synthesize stat round                                 : Ok
  46: Synthesize attr update                                : Ok
  47: Event times                                           : Ok
  48: Read backward ring buffer                             : Ok
  49: Print cpu map                                         : Ok
  50: Probe SDT events                                      : Ok
  51: is_printable_array                                    : Ok
  52: Print bitmap                                          : Ok
  53: perf hooks                                            : Ok
  54: builtin clang support                                 : Skip (not compiled in)
  55: unit_number__scnprintf                                : Ok
  56: x86 rdpmc                                             : Ok
  57: Convert perf time to TSC                              : Ok
  58: DWARF unwind                                          : Ok
  59: x86 instruction decoder - new instructions            : Ok
  60: Use vfs_getname probe to get syscall args filenames   : Ok
  61: Check open filename arg using perf trace + vfs_getname: Ok
  62: probe libc's inet_pton & backtrace it with ping       : Ok
  63: Add vfs_getname probe to get syscall args filenames   : Ok
  #

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                 make_perf_o_O: make perf.o
                 make_static_O: make LDFLAGS=-static
                    make_doc_O: make doc
             make_no_libperl_O: make NO_LIBPERL=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
                   make_tags_O: make tags
       make_util_pmu_bison_o_O: make util/pmu-bison.o
         make_install_prefix_O: make install prefix=/tmp/krava
              make_clean_all_O: make clean all
              make_no_libelf_O: make NO_LIBELF=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
                make_no_gtk2_O: make NO_GTK2=1
            make_no_demangle_O: make NO_DEMANGLE=1
           make_no_libpython_O: make NO_LIBPYTHON=1
              make_no_libbpf_O: make NO_LIBBPF=1
             make_util_map_o_O: make util/map.o
                  make_debug_O: make DEBUG=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
           make_no_backtrace_O: make NO_BACKTRACE=1
                   make_help_O: make help
                make_no_newt_O: make NO_NEWT=1
            make_no_auxtrace_O: make NO_AUXTRACE=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
                make_install_O: make install
            make_no_libaudit_O: make NO_LIBAUDIT=1
                   make_pure_O: make
             make_no_libnuma_O: make NO_LIBNUMA=1
               make_no_slang_O: make NO_SLANG=1
            make_install_bin_O: make install-bin
         make_with_clangllvm_O: make LIBCLANGLLVM=1
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $


* Re: [GIT PULL 00/27] perf/core improvements and fixes
  2018-01-10 21:28 Arnaldo Carvalho de Melo
@ 2018-01-11  5:54 ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2018-01-11  5:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, David Ahern, Jin Yao, Jiri Olsa, Kan Liang,
	Namhyung Kim, Peter Zijlstra, Thomas Gleixner, Wang Nan,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 9128d3ed9de3882c83b927eb553d5d44c84505f5:
> 
>   perf/x86/msr: Clean up the code (2018-01-06 12:18:40 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.16-20180110
> 
> for you to fetch changes up to 5d64db2966e38bfd99114ecf0b54f97d33023dcd:
> 
>   tools headers: Synchronize kernel <-> tooling headers (2018-01-10 12:46:54 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> - The 'perf test bpf' entry hooked a eBPF proggie to the
>   SyS_epoll_wait() kernel function and expected it to be hit when calling
>   the epoll_wait() libc wrapper, which changed recently, in systems such
>   as Fedora 27, with the glibc wrapper calling instead the epoll_pwait()
>   syscall, so switch to epoll_pwait() for both the kernel and libc
>   function, getting it to work both in old and new systems (Arnaldo Carvalho de Melo)
> 
> - Beautify 'gettid' syscall result in 'perf trace', and in doing so
>   noticed that we need to handle namespaces in 'perf trace', will be
>   dealt with in follow up patches where we'll try to figure out if
>   the recent support for namespace in tools/perf/ can be used for this
>   purpose as well. (Arnaldo Carvalho de Melo)
> 
> - Introduce 'perf report --mmaps' and 'perf report --tasks' to show
>   info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
> 
> - Synchronize kernel <-> tooling headers wrt meltdown/spectre changes
>   (Arnaldo Carvalho de Melo)
> 
> - Fix a wrong offset issue when using /proc/kcore (Jin Yao)
> 
> - Fix bug that prevented annotating symbols in perf.data files
>   generated with 'perf record --branch-any'  (Jin Yao)
> 
> - Add infrastructure to record first and last sample time to the
>   perf.data file header, so that when processing all samples in
>   a 'perf record' session, such as when doing build-id processing,
>   or when specifically requesting that that info be recorded, use
>   that in 'perf report --time', that also got support for percent
>   slices in addition to absolute ones.
> 
>   I.e. now it is possible to ask for the samples in the 10%-20%
>   time slice of a perf.data file (Jin Yao)
> 
> - Enable building with libbabeltrace by default (Jiri Olsa)
> 
> - Display perf_event_attr::namespaces when duping the attributes
>   in verbose mode (Jiri Olsa)
> 
> - Allocate context task_ctx_data for child event (Jiri Olsa)
> 
> - Update comments for PERF_RECORD_ITRACE_START and PERF_RECORD_MISC_* (Jiri Olsa)
> 
> - Add support for showing PERF_RECORD_LOST events in 'perf script' (Jiri Olsa)
> 
> - Add 'perf report --stats' option to display quick statistics about
>   metadata events (PERF_RECORD_*) i.e. what we get at the end of 'perf
>   report -D' (Jiri Olsa)
> 
> - Fix compile error with libunwind x86 (Wang Nan)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (6):
>       perf test bpf: Improve message about expected samples
>       perf test bpf: Use designated struct field initializers
>       perf test bpf: Hook on epoll_pwait()
>       perf trace: Beautify 'gettid' syscall result
>       perf report: Introduce --mmaps
>       tools headers: Synchronize kernel <-> tooling headers
> 
> Jin Yao (8):
>       perf report: Fix a wrong offset issue when using /proc/kcore
>       perf report: Fix a no annotate browser displayed issue
>       perf header: Add infrastructure to record first and last sample time
>       perf record: Record the first and last sample time in the header
>       perf tools: Create function to parse time percent
>       perf tools: Create function to perform multiple time range checking
>       perf report: Support time percent and multiple time ranges
>       perf script: Support time percent and multiple time ranges
> 
> Jiri Olsa (12):
>       perf tools: Enable LIBBABELTRACE by default
>       perf tools: Display perf_event_attr::namespaces debug info
>       perf: Allocate context task_ctx_data for child event
>       perf: Add sample_id to PERF_RECORD_ITRACE_START event comment
>       perf: Make perf_callchain function static
>       perf: Return empty callchain instead of NULL
>       perf: Update PERF_RECORD_MISC_* comment for perf_event_header::misc bit 13
>       perf script: Add support to display sample misc field
>       perf script: Add support to display lost events
>       perf tools: Make the tool's warning messages optional
>       perf report: Add --stats option to display quick data statistics
>       perf report: Add --tasks option to display monitored tasks
> 
> Wang Nan (1):
>       perf tools: Fix compile error with libunwind x86
> 
>  include/uapi/linux/perf_event.h                    |  10 +-
>  kernel/events/callchain.c                          |  15 --
>  kernel/events/core.c                               |  54 +++--
>  kernel/events/internal.h                           |   4 -
>  tools/arch/x86/include/asm/cpufeatures.h           |   4 +-
>  tools/arch/x86/include/asm/disabled-features.h     |   8 +-
>  tools/include/uapi/linux/perf_event.h              |  10 +-
>  tools/perf/Documentation/perf-record.txt           |   3 +
>  tools/perf/Documentation/perf-report.txt           |  37 ++-
>  tools/perf/Documentation/perf-script.txt           |  39 +++-
>  tools/perf/Documentation/perf.data-file-format.txt |   4 +
>  tools/perf/Makefile.config                         |   2 +-
>  tools/perf/Makefile.perf                           |   2 +-
>  tools/perf/arch/x86/util/unwind-libunwind.c        |   2 +-
>  tools/perf/builtin-record.c                        |  18 +-
>  tools/perf/builtin-report.c                        | 249 ++++++++++++++++++++-
>  tools/perf/builtin-script.c                        | 136 +++++++++--
>  tools/perf/builtin-trace.c                         |   1 +
>  tools/perf/tests/bpf-script-example.c              |   4 +-
>  tools/perf/tests/bpf.c                             |  65 +++---
>  tools/perf/util/annotate.c                         |   3 +-
>  tools/perf/util/event.c                            |   8 +
>  tools/perf/util/event.h                            |   1 +
>  tools/perf/util/evlist.h                           |   2 +
>  tools/perf/util/evsel.c                            |   2 +
>  tools/perf/util/header.c                           |  60 +++++
>  tools/perf/util/header.h                           |   1 +
>  tools/perf/util/machine.c                          |   2 +-
>  tools/perf/util/map.c                              |   2 +-
>  tools/perf/util/session.c                          |   6 +-
>  tools/perf/util/sort.c                             |  16 +-
>  tools/perf/util/srcline.c                          |   9 +-
>  tools/perf/util/srcline.h                          |   5 +-
>  tools/perf/util/time-utils.c                       | 233 ++++++++++++++++++-
>  tools/perf/util/time-utils.h                       |   6 +
>  tools/perf/util/tool.h                             |   1 +
>  36 files changed, 884 insertions(+), 140 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo


* [GIT PULL 00/27] perf/core improvements and fixes
@ 2018-07-25 17:59 Arnaldo Carvalho de Melo
  2018-07-25 20:34 ` Ingo Molnar
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-07-25 17:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin,
	Alexei Starovoitov, Alexey Budankov, Andi Kleen,
	Christian Borntraeger, David Ahern, David Carrillo-Cisneros,
	Daniel Borkmann, Heiko Carstens, Hendrik Brueckner, Jiri Olsa,
	Kan Liang, kernel-team, Kim Phillips, Leo Yan, linux-arm-kernel,
	Lukasz Odzioba, Martin Schwidefsky, Mathieu Poirier,
	Maynard Johnson, Michael Ellerman, Mike Leach, Milian Wolff,
	Namhyung Kim, Naveen N . Rao, Peter Zijlstra, Ravi Bangoria,
	Robert Walker, Sandipan Das, Sangwon Hong, stable, Stefan Raspl,
	Stephane Eranian, Sukadev Bhattiprolu, Thomas Richter, Wang Nan,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling. I'm now investigating why these failed:

  38: LLVM search and compile                               :
  38.1: Basic BPF llvm compile                              : Ok
  38.2: kbuild searching                                    : Ok
  38.3: Compile source for BPF prologue generation          : Ok
  38.4: Compile source for BPF relocation                   : FAILED!
  40: BPF filter                                            :
  40.1: Basic BPF filtering                                 : Ok
  40.2: BPF pinning                                         : Ok
  40.3: BPF prologue generation                             : Ok
  40.4: BPF relocation checker                              : FAILED!

	I think these failures are not related to changes in this patch
kit. Details about the test environment, versions, etc. are below.

Regards,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 1d59d16e9b4d5be80c9786a8b129c0f2af0e9522:

  Merge remote-tracking branch 'tip/perf/urgent' into perf/core (2018-07-24 14:34:32 -0300)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.19-20180725

for you to fetch changes up to 9ef0112442bdddef5fb55adf20b3a5464b33de75:

  perf test: Fix subtest number when showing results (2018-07-24 14:55:51 -0300)

----------------------------------------------------------------
perf/core fixes and improvements:

Tools:

top:

- Fix 'struct comm_str' removal crash race, detected with refcount_t
  debugging (Jiri Olsa)

- Use last_match threads cache only in single threaded mode, fixing
  a crash (Jiri Olsa)

record:

- Synthesize GROUP_DESC feature in pipe mode fixing display of
  event groups (Jiri Olsa)

stat:

- Get rid of extra clock display function (Jiri Olsa)

perf script:

- Show correct offsets for DWARF-based unwinding (Sandipan Das)

test:

- Check that complex event name is parsed correctly (Alexey Budankov)

- Fix subtest number when showing results (Thomas Richter)

Arch specific:

arm64:

- Generate syscall table from the kernel sources (asm/unistd.h) like
  other arches do, speeding up the support for new system calls in
  tools such as 'perf trace' (Kim Phillips)

arm:

- Bail out immediately on CoreSight hardware tracing instruction sample failure (Leo Yan)

PowerPC:

- Fix record+probe_libc_inet_pton.sh 'perf test' entry (Sandipan Das)

- Callchain IP filtering fixes (Sandipan Das)

S/390:

- Add support for detailed S/390 PMU event description in 'perf list' (Thomas Richter)

- Add transaction flag (-T) support in 'perf stat' for S/390 (Thomas Richter)

- Fix 'perf kvm' S/390 subcommands (Thomas Richter)

Infrastructure:

hists:

- Clarify callchain disabling when available (Arnaldo Carvalho de Melo)

evsel:

- Use perf_evsel__match instead of open coded equivalent (Jiri Olsa)

Documentation:

- Add missing documentation for 'perf list' --desc and --debug options (Sangwon Hong)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Alexey Budankov (1):
      perf tests: Check that complex event name is parsed correctly

Arnaldo Carvalho de Melo (1):
      perf hists: Clarify callchain disabling when available

Jiri Olsa (7):
      perf tools: Synthesize GROUP_DESC feature in pipe mode
      perf machine: Add threads__get_last_match function
      perf machine: Add threads__set_last_match function
      perf machine: Use last_match threads cache only in single thread mode
      perf tools: Fix struct comm_str removal crash
      perf tools: Use perf_evsel__match instead of open coded equivalent
      perf stat: Get rid of extra clock display function

Kim Phillips (3):
      tools include: Grab copies of arm64 dependent unistd.h files
      perf arm64: Generate system call table from asm/unistd.h
      perf trace arm64: Use generated syscall table

Leo Yan (2):
      perf cs-etm: Introduce invalid address macro
      perf cs-etm: Bail out immediately for instruction sample failure

Sandipan Das (6):
      perf powerpc: Fix callchain ip filtering
      perf powerpc: Fix callchain ip filtering when return address is in a register
      perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64
      perf tests: Fix record+probe_libc_inet_pton.sh to ensure cleanups
      perf tests: Fix record+probe_libc_inet_pton.sh when event exists
      perf script: Show correct offsets for DWARF-based unwinding

Sangwon Hong (1):
      perf list: Add missing documentation for --desc and --debug options

Thomas Richter (6):
      Revert "perf list: Add s390 support for detailed/verbose PMU event description"
      perf list: Add s390 support for detailed PMU event description
      perf json: Add s390 transaction counter definition
      perf stat: Add transaction flag (-T) support for s390
      perf kvm: Fix subcommands on s390
      perf test: Fix subtest number when showing results

 tools/arch/arm64/include/uapi/asm/unistd.h         |  20 +
 tools/include/uapi/asm-generic/unistd.h            | 783 +++++++++++++++++++++
 tools/perf/Documentation/perf-list.txt             |   8 +-
 tools/perf/Makefile.config                         |   2 +
 tools/perf/arch/arm64/Makefile                     |  21 +
 tools/perf/arch/arm64/entry/syscalls/mksyscalltbl  |  62 ++
 tools/perf/arch/powerpc/util/skip-callchain-idx.c  |  10 +-
 tools/perf/arch/s390/util/kvm-stat.c               |   2 +-
 tools/perf/builtin-c2c.c                           |   4 +-
 tools/perf/builtin-diff.c                          |   2 +-
 tools/perf/builtin-report.c                        |   4 +-
 tools/perf/builtin-stat.c                          |  60 +-
 tools/perf/builtin-top.c                           |   2 +-
 tools/perf/check-headers.sh                        |   2 +
 tools/perf/pmu-events/arch/s390/cf_z10/basic.json  |  12 +
 tools/perf/pmu-events/arch/s390/cf_z10/crypto.json |  16 +
 .../perf/pmu-events/arch/s390/cf_z10/extended.json |  18 +
 tools/perf/pmu-events/arch/s390/cf_z13/basic.json  |  12 +
 tools/perf/pmu-events/arch/s390/cf_z13/crypto.json |  16 +
 .../perf/pmu-events/arch/s390/cf_z13/extended.json |  56 ++
 .../pmu-events/arch/s390/cf_z13/transaction.json   |   7 +
 tools/perf/pmu-events/arch/s390/cf_z14/basic.json  |   8 +
 tools/perf/pmu-events/arch/s390/cf_z14/crypto.json |  16 +
 .../perf/pmu-events/arch/s390/cf_z14/extended.json |  53 ++
 .../pmu-events/arch/s390/cf_z14/transaction.json   |   7 +
 tools/perf/pmu-events/arch/s390/cf_z196/basic.json |  12 +
 .../perf/pmu-events/arch/s390/cf_z196/crypto.json  |  16 +
 .../pmu-events/arch/s390/cf_z196/extended.json     |  24 +
 .../perf/pmu-events/arch/s390/cf_zec12/basic.json  |  12 +
 .../perf/pmu-events/arch/s390/cf_zec12/crypto.json |  16 +
 .../pmu-events/arch/s390/cf_zec12/extended.json    |  35 +
 .../pmu-events/arch/s390/cf_zec12/transaction.json |   7 +
 tools/perf/pmu-events/jevents.c                    |   2 +
 tools/perf/tests/builtin-test.c                    |   2 +-
 tools/perf/tests/parse-events.c                    |  18 +
 .../tests/shell/record+probe_libc_inet_pton.sh     |  36 +-
 tools/perf/ui/stdio/hist.c                         |   8 +-
 tools/perf/util/comm.c                             |  16 +-
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c    |  10 +-
 tools/perf/util/cs-etm.c                           |   3 +
 tools/perf/util/evsel.c                            |  11 +
 tools/perf/util/evsel.h                            |   9 +-
 tools/perf/util/header.c                           |   2 +-
 tools/perf/util/hist.h                             |   2 +-
 tools/perf/util/machine.c                          |  79 ++-
 tools/perf/util/metricgroup.c                      |  22 +
 tools/perf/util/metricgroup.h                      |   1 +
 tools/perf/util/pmu.c                              |   6 -
 tools/perf/util/stat-shadow.c                      |   5 +-
 tools/perf/util/syscalltbl.c                       |   4 +
 tools/perf/util/unwind-libdw.c                     |   2 +-
 tools/perf/util/unwind-libunwind-local.c           |   2 +-
 52 files changed, 1456 insertions(+), 109 deletions(-)
 create mode 100644 tools/arch/arm64/include/uapi/asm/unistd.h
 create mode 100644 tools/include/uapi/asm-generic/unistd.h
 create mode 100755 tools/perf/arch/arm64/entry/syscalls/mksyscalltbl
 create mode 100644 tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
 create mode 100644 tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
 create mode 100644 tools/perf/pmu-events/arch/s390/cf_zec12/transaction.json

Test results:

The first ones are container (docker) based builds of tools/perf with
and without libelf support.  Where clang is available, it is also used
to build perf with/without libelf, and building with LIBCLANGLLVM=1
(built-in clang) with gcc and clang when clang and its devel libraries
are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from an HTTP server to
build them inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in touch if you are interested in helping to put this in place.

  # dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   6 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
   7 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   8 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   9 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  11 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  12 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
  13 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  14 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  15 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  16 debian:experimental           : Ok   gcc (Debian 7.3.0-15) 7.3.0
  17 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 7.3.0-15) 7.3.0
  18 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 7.3.0-19) 7.3.0
  19 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 7.3.0-18) 7.3.0
  20 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 7.3.0-20) 7.3.0
  21 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  22 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  23 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  24 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  25 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  26 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  27 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  28 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  29 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
  30 fedora:28                     : Ok   gcc (GCC) 8.1.1 20180712 (Red Hat 8.1.1-5)
  31 fedora:rawhide                : Ok   gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20)
  32 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
  33 mageia:5                      : Ok   gcc (GCC) 4.9.2
  34 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  35 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  36 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  37 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  38 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
  39 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  40 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28.0.1)
  41 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  42 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  43 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.4-2017.05) 5.4.1 20170404
  44 ubuntu:15.04                  : Ok   gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2
  45 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  46 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  47 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  48 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  49 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  50 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  51 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  52 ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  53 ubuntu:17.04                  : Ok   gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
  54 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
  55 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  #
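
For readers who want to pick the cross builds out of a listing like the one
above, here is a minimal Python sketch. It assumes the row layout shown
("<n> <image> : <status> <compiler>") and treats an image as a cross build
when its name contains "-x-" or starts with "android", as described earlier in
this message. The names ROW_RE, parse_rows and is_cross_build are made up for
this illustration and are not part of any perf tooling.

```python
import re

# A few rows copied from the container build listing above.
SAMPLE_ROWS = """\
   8 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  12 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
  17 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 7.3.0-15) 7.3.0
  26 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  55 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
"""

# Assumed layout: index, image name, " : ", status word, compiler banner.
ROW_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+:\s+(\S+)\s+(.*)$")


def parse_rows(text):
    """Return (index, image, status, compiler) tuples for matching lines."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:
            number, image, status, compiler = match.groups()
            rows.append((int(number), image, status, compiler))
    return rows


def is_cross_build(image):
    """Heuristic from the text: '-x-ARCH' images and the android ones."""
    return "-x-" in image or image.startswith("android")


rows = parse_rows(SAMPLE_ROWS)
cross_builds = [image for _, image, _, _ in rows if is_cross_build(image)]
not_ok = [image for _, image, status, _ in rows if status != "Ok"]

print("cross builds:", cross_builds)
print("not Ok:", not_ok)
```

Feeding the full listing instead of SAMPLE_ROWS should work the same way, as
long as every row keeps this layout.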

  Investigation is underway for the BPF related failures below.

  # git log --oneline -1
  9ef0112442bd (HEAD -> perf/core, jouet/perf/core) perf test: Fix subtest number when showing results
  # perf version --build-options
  perf version 4.18.rc6.g9ef0112
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  # uname -a
  Linux seventh 4.18.0-rc6-00093-g9981b4fb8684 #2 SMP Wed Jul 25 12:31:40 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Number of exit events of a simple workload            : Ok
  23: Software clock events period values                   : Ok
  24: Object code reading                                   : Ok
  25: Sample parsing                                        : Ok
  26: Use a dummy software event to keep tracking           : Ok
  27: Parse with no sample_id_all bit set                   : Ok
  28: Filter hist entries                                   : Ok
  29: Lookup mmap thread                                    : Ok
  30: Share thread mg                                       : Ok
  31: Sort output of hist entries                           : Ok
  32: Cumulate child hist entries                           : Ok
  33: Track with sched_switch                               : Ok
  34: Filter fds with revents mask in a fdarray             : Ok
  35: Add fd to a fdarray, making it autogrow               : Ok
  36: kmod_path__parse                                      : Ok
  37: Thread map                                            : Ok
  38: LLVM search and compile                               :
  38.1: Basic BPF llvm compile                              : Ok
  38.2: kbuild searching                                    : Ok
  38.3: Compile source for BPF prologue generation          : Ok
  38.4: Compile source for BPF relocation                   : FAILED!
  39: Session topology                                      : Ok
  40: BPF filter                                            :
  40.1: Basic BPF filtering                                 : Ok
  40.2: BPF pinning                                         : Ok
  40.3: BPF prologue generation                             : Ok
  40.4: BPF relocation checker                              : FAILED!
  41: Synthesize thread map                                 : Ok
  42: Remove thread map                                     : Ok
  43: Synthesize cpu map                                    : Ok
  44: Synthesize stat config                                : Ok
  45: Synthesize stat                                       : Ok
  46: Synthesize stat round                                 : Ok
  47: Synthesize attr update                                : Ok
  48: Event times                                           : Ok
  49: Read backward ring buffer                             : Ok
  50: Print cpu map                                         : Ok
  51: Probe SDT events                                      : Ok
  52: is_printable_array                                    : Ok
  53: Print bitmap                                          : Ok
  54: perf hooks                                            : Ok
  55: builtin clang support                                 : Skip (not compiled in)
  56: unit_number__scnprintf                                : Ok
  57: mem2node                                              : Ok
  58: x86 rdpmc                                             : Ok
  59: Convert perf time to TSC                              : Ok
  60: DWARF unwind                                          : Ok
  61: x86 instruction decoder - new instructions            : Ok
  62: probe libc's inet_pton & backtrace it with ping       : Ok
  63: Check open filename arg using perf trace + vfs_getname: Ok
  64: Use vfs_getname probe to get syscall args filenames   : Ok
  65: Add vfs_getname probe to get syscall args filenames   : Ok
  #
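
To spot the failing entries in a long 'perf test' run such as the one above,
something like the following Python sketch could be used. It is only an
illustration under the assumption that each result line has the form
"<id>: <description> : <result>". It splits on the last colon because some
descriptions (entry 15, for example) contain a colon themselves. The function
name parse_perf_test is hypothetical.

```python
# Lines copied from the 'perf test' output above (abridged).
PERF_TEST_OUTPUT = """\
  15: syscalls:sys_enter_openat event fields                : Ok
  37: Thread map                                            : Ok
  38: LLVM search and compile                               :
  38.1: Basic BPF llvm compile                              : Ok
  38.4: Compile source for BPF relocation                   : FAILED!
  39: Session topology                                      : Ok
  40: BPF filter                                            :
  40.4: BPF relocation checker                              : FAILED!
  55: builtin clang support                                 : Skip (not compiled in)
  63: Check open filename arg using perf trace + vfs_getname: Ok
  #
"""


def parse_perf_test(output):
    """Return a list of (test_id, description, result) tuples."""
    entries = []
    for line in output.splitlines():
        # The result is whatever follows the last colon; it is empty on the
        # header line of a test that has subtests.
        head, sep, result = line.rpartition(":")
        if not sep:
            continue  # no colon at all, e.g. a shell prompt line
        # The test id is whatever precedes the first colon.
        test_id, sep, description = head.strip().partition(":")
        if not sep or not test_id.replace(".", "").isdigit():
            continue  # not a numbered test line
        entries.append((test_id, description.strip(), result.strip()))
    return entries


failed = [
    (test_id, description)
    for test_id, description, result in parse_perf_test(PERF_TEST_OUTPUT)
    if result.startswith("FAILED")
]

for test_id, description in failed:
    print(f"{test_id}: {description}")
```

The ids collected this way (38.4 and 40.4 here) are the two BPF relocation
entries quoted at the top of this message.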

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git2/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
        make_with_babeltrace_O: make LIBBABELTRACE=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
              make_clean_all_O: make clean all
           make_no_libunwind_O: make NO_LIBUNWIND=1
             make_util_map_o_O: make util/map.o
            make_no_auxtrace_O: make NO_AUXTRACE=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
                make_install_O: make install
                   make_pure_O: make
                    make_doc_O: make doc
                   make_help_O: make help
                make_no_gtk2_O: make NO_GTK2=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
               make_no_slang_O: make NO_SLANG=1
            make_install_bin_O: make install-bin
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
             make_no_libnuma_O: make NO_LIBNUMA=1
                make_no_newt_O: make NO_NEWT=1
            make_no_demangle_O: make NO_DEMANGLE=1
              make_no_libelf_O: make NO_LIBELF=1
                 make_cscope_O: make cscope
                 make_static_O: make LDFLAGS=-static
                  make_debug_O: make DEBUG=1
                 make_perf_o_O: make perf.o
           make_no_backtrace_O: make NO_BACKTRACE=1
                   make_tags_O: make tags
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
         make_install_prefix_O: make install prefix=/tmp/krava
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
              make_no_libbpf_O: make NO_LIBBPF=1
             make_no_libperl_O: make NO_LIBPERL=1
           make_no_libpython_O: make NO_LIBPYTHON=1
  OK
  make: Leaving directory '/home/acme/git2/perf/tools/perf'
  $


* Re: [GIT PULL 00/27] perf/core improvements and fixes
  2018-07-25 17:59 Arnaldo Carvalho de Melo
@ 2018-07-25 20:34 ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2018-07-25 20:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Alexei Starovoitov, Alexey Budankov,
	Andi Kleen, Christian Borntraeger, David Ahern,
	David Carrillo-Cisneros, Daniel Borkmann, Heiko Carstens,
	Hendrik Brueckner, Jiri Olsa, Kan Liang, kernel-team,
	Kim Phillips, Leo Yan, linux-arm-kernel, Lukasz Odzioba,
	Martin Schwidefsky, Mathieu Poirier, Maynard Johnson,
	Michael Ellerman, Mike Leach, Milian Wolff, Namhyung Kim,
	Naveen N . Rao, Peter Zijlstra, Ravi Bangoria, Robert Walker,
	Sandipan Das, Sangwon Hong, stable, Stefan Raspl,
	Stephane Eranian, Sukadev Bhattiprolu, Thomas Richter, Wang Nan,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling, I'm now investigating why these failed:
> 
>   38: LLVM search and compile                               :
>   38.1: Basic BPF llvm compile                              : Ok
>   38.2: kbuild searching                                    : Ok
>   38.3: Compile source for BPF prologue generation          : Ok
>   38.4: Compile source for BPF relocation                   : FAILED!
>   40: BPF filter                                            :
>   40.1: Basic BPF filtering                                 : Ok
>   40.2: BPF pinning                                         : Ok
>   40.3: BPF prologue generation                             : Ok
>   40.4: BPF relocation checker                              : FAILED!
> 
> 	I think these failures are not related to changes in this patch
> kit. Details about the test environment, versions, etc.

Ok!

> The following changes since commit 1d59d16e9b4d5be80c9786a8b129c0f2af0e9522:
> 
>   Merge remote-tracking branch 'tip/perf/urgent' into perf/core (2018-07-24 14:34:32 -0300)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.19-20180725

>  52 files changed, 1456 insertions(+), 109 deletions(-)

Pulled, thanks a lot Arnaldo!

Could we please also fix these before v4.18 is released, which trigger in 
perf/urgent:

Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/unistd.h' differs from latest version at 'arch/powerpc/include/uapi/asm/unistd.h'
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'

?

Thanks!

	Ingo
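
As an aside, the four warnings quoted above each name a copy under tools/ and
the kernel file it has drifted from. A small Python sketch, assuming the
warning wording stays exactly as quoted, can turn them into (copy, original)
pairs and print a 'diff -u' command for each so the drift can be inspected
from the top of a kernel tree. WARNING_RE and stale_headers are names
invented for this example.

```python
import re

# The warnings quoted in the message above.
WARNINGS = """\
Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/unistd.h' differs from latest version at 'arch/powerpc/include/uapi/asm/unistd.h'
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'
"""

# Assumes the exact wording of the warnings shown above.
WARNING_RE = re.compile(
    r"^Warning: Kernel ABI header at '([^']+)' "
    r"differs from latest version at '([^']+)'$"
)


def stale_headers(text):
    """Return (tools_copy, kernel_file) pairs found in the warning text."""
    pairs = []
    for line in text.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


pairs = stale_headers(WARNINGS)
for tools_copy, kernel_file in pairs:
    # Only prints the command; nothing is executed or modified.
    print(f"diff -u {tools_copy} {kernel_file}")
```

This only lists what to look at. How each copy gets resynced is up to the
maintainers and is not covered by the sketch.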


* [GIT PULL 00/27] perf/core improvements and fixes
@ 2018-09-24 15:02 Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-24 15:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen,
	Andrew Morton, Jin Yao, Jiri Olsa, John Garry, Josh Poimboeuf,
	Kim Phillips, linux-arm-kernel, linux-trace-devel, Namhyung Kim,
	Sangwon Hong, Sean V Kelley, Steven Rostedt, Tzvetomir Stoyanov,
	William Cohen, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling, this is on top of
perf-core-for-mingo-4.20-20180919, that is not yet in tip.

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 24ef0fd0a1f389b156e6ef0edd71072728831bd9:

  perf python: Use -Wno-redundant-decls to build with PYTHON=python3 (2018-09-19 10:25:13 -0300)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.20-20180919

for you to fetch changes up to 24ef0fd0a1f389b156e6ef0edd71072728831bd9:

  perf python: Use -Wno-redundant-decls to build with PYTHON=python3 (2018-09-19 10:25:13 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

perf test:

- Add watchpoint entry (Ravi Bangoria)

Build fixes:

- Initialize perf_data_file fd field to fix building the CTF (trace format)
  converter with gcc 4.8.4 on Ubuntu 14.04 (Jérémie Galarneau)

- Use -Wno-redundant-decls to build with PYTHON=python3 to
  build the python binding, fixing the build in systems such
  as Clear Linux (Arnaldo Carvalho de Melo)

Hardware tracing:

- Suppress AUX/OVERWRITE records (Alexander Shishkin)

Infrastructure:

- Adopt PTR_ERR_OR_ZERO from the kernel and use it in
  the bpf-loader instead of open coded equivalent (Ding Xiang)

- Improve the event ordering code to make it clear and fix
  a bug related to freeing of events when using pipe mode
  from 'record' to 'inject' (Jiri Olsa)

- Some prep work to facilitate per-cpu threads to write
  record data to per-cpu files (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------

Test results:

The first ones are container (docker) based builds of tools/perf with
and without libelf support.  Where clang is available, it is also used
to build perf with/without libelf, and building with LIBCLANGLLVM=1
(built-in clang) with gcc and clang when clang and its devel libraries
are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from an HTTP server to
build them inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in touch if you are interested in helping to put this in place.

The Clear Linux container is building with NO_CLANG=1, the problem preventing
its use when building for python3 has been identified and the next builds will
build in ClearLinux with both gcc and clang. This time around only gcc was
used.

  # dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   6 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   7 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
   8 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   9 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  11 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  12 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  13 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
  14 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
  15 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  16 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  17 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  18 debian:experimental           : Ok   gcc (Debian 8.2.0-4) 8.2.0
  19 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-4) 8.2.0
  20 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.2.0-4) 8.2.0
  21 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.1.0-12) 8.1.0
  22 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.2.0-4) 8.2.0
  23 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  24 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  25 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  26 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  27 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  28 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  29 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  30 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  31 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
  32 fedora:28                     : Ok   gcc (GCC) 8.1.1 20180712 (Red Hat 8.1.1-5)
  33 fedora:rawhide                : Ok   gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3)
  34 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
  35 mageia:5                      : Ok   gcc (GCC) 4.9.2
  36 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  37 opensuse:13.2                 : Ok   gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
  38 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  39 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  40 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  41 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
  42 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  43 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28.0.1)
  44 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  45 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  46 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
  47 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  48 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  49 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  50 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  51 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  52 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  53 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  54 ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  55 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
  56 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  57 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0
  58 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0
  59 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  60 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  61 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  62 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  63 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  64 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  65 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  66 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  67 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.2.0-4ubuntu1) 8.2.0
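
A table like the one above is easy to check mechanically. The following is a minimal sketch, not part of the `dm` script (which is not shown in this message). The row layout is inferred from the output above, the names `BuildResult`, `parse_build_line` and `failed_images` are hypothetical, and treating any status other than "Ok" as a failure is an assumption.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class BuildResult(NamedTuple):
    index: int
    image: str
    status: str
    compiler: str


# Assumed row layout, inferred from the table above:
#   <index> <image:tag> : <status> <compiler version string>
_ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s*:\s+(\S+)\s*(.*)$")


def parse_build_line(line: str) -> Optional[BuildResult]:
    """Return the parsed row, or None if the line is not a result row."""
    m = _ROW.match(line)
    if m is None:
        return None
    index, image, status, compiler = m.groups()
    return BuildResult(int(index), image, status, compiler.strip())


def failed_images(lines: Iterable[str]) -> List[str]:
    """List the images whose status is anything other than 'Ok' (assumption)."""
    results = (parse_build_line(line) for line in lines)
    return [r.image for r in results if r is not None and r.status != "Ok"]
```

Feeding the saved output to `failed_images()` would return an empty list for a run like this one, where every container reports "Ok".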

  # uname -a
  Linux jouet 4.19.0-rc4-00022-gad3273d5f1b9 #1 SMP Mon Sep 17 17:18:22 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  d35c595bf005 perf vendor events arm64: Revise core JSON events for eMAG
  # perf version --build-options
  perf version 4.19.rc2.gd35c595
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Watchpoint                                            :
  22.1: Read Only Watchpoint                                : Skip
  22.2: Write Only Watchpoint                               : Ok
  22.3: Read / Write Watchpoint                             : Ok
  22.4: Modify Watchpoint                                   : Ok
  23: Number of exit events of a simple workload            : Ok
  24: Software clock events period values                   : Ok
  25: Object code reading                                   : Ok
  26: Sample parsing                                        : Ok
  27: Use a dummy software event to keep tracking           : Ok
  28: Parse with no sample_id_all bit set                   : Ok
  29: Filter hist entries                                   : Ok
  30: Lookup mmap thread                                    : Ok
  31: Share thread mg                                       : Ok
  32: Sort output of hist entries                           : Ok
  33: Cumulate child hist entries                           : Ok
  34: Track with sched_switch                               : Ok
  35: Filter fds with revents mask in a fdarray             : Ok
  36: Add fd to a fdarray, making it autogrow               : Ok
  37: kmod_path__parse                                      : Ok
  38: Thread map                                            : Ok
  39: LLVM search and compile                               :
  39.1: Basic BPF llvm compile                              : Ok
  39.2: kbuild searching                                    : Ok
  39.3: Compile source for BPF prologue generation          : Ok
  39.4: Compile source for BPF relocation                   : Ok
  40: Session topology                                      : Ok
  41: BPF filter                                            :
  41.1: Basic BPF filtering                                 : Ok
  41.2: BPF pinning                                         : Ok
  41.3: BPF prologue generation                             : Ok
  41.4: BPF relocation checker                              : Ok
  42: Synthesize thread map                                 : Ok
  43: Remove thread map                                     : Ok
  44: Synthesize cpu map                                    : Ok
  45: Synthesize stat config                                : Ok
  46: Synthesize stat                                       : Ok
  47: Synthesize stat round                                 : Ok
  48: Synthesize attr update                                : Ok
  49: Event times                                           : Ok
  50: Read backward ring buffer                             : Ok
  51: Print cpu map                                         : Ok
  52: Probe SDT events                                      : Ok
  53: is_printable_array                                    : Ok
  54: Print bitmap                                          : Ok
  55: perf hooks                                            : Ok
  56: builtin clang support                                 : Skip (not compiled in)
  57: unit_number__scnprintf                                : Ok
  58: mem2node                                              : Ok
  59: x86 rdpmc                                             : Ok
  60: Convert perf time to TSC                              : Ok
  61: DWARF unwind                                          : Ok
  62: x86 instruction decoder - new instructions            : Ok
  63: x86 bp modify                                         : Ok
  64: Use vfs_getname probe to get syscall args filenames   : Ok
  65: Check open filename arg using perf trace + vfs_getname: Ok
  66: probe libc's inet_pton & backtrace it with ping       : Ok
  67: Add vfs_getname probe to get syscall args filenames   : Ok
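
The `perf test` listing above can be tallied in a similar way. This is a sketch under stated assumptions: the line layout is taken from the output above, the helper names `parse_test_line` and `tally` are hypothetical, and the status is taken to be whatever follows the last colon, since a description may itself contain one (as in `syscalls:sys_enter_openat`). Parent entries such as "22: Watchpoint" carry no status of their own and are not counted.

```python
import re
from collections import Counter

# "<id>: <description> : <status>", where <id> is "N" or "N.M" (assumed layout)
_ID = re.compile(r"^\s*(\d+(?:\.\d+)?):\s+(.*)$")


def parse_test_line(line):
    """Return (id, description, status) or None for non-test lines."""
    m = _ID.match(line)
    if m is None:
        return None
    test_id, rest = m.groups()
    # Split on the LAST colon: descriptions may contain colons themselves.
    desc, sep, status = rest.rpartition(":")
    if not sep:
        return None
    return test_id, desc.strip(), status.strip()


def tally(lines):
    """Count results by the first word of the status, e.g. 'Ok' or 'Skip'."""
    counts = Counter()
    for line in lines:
        parsed = parse_test_line(line)
        if parsed and parsed[2]:
            counts[parsed[2].split()[0]] += 1
    return counts
```

Keying on the first word of the status means "Skip (not compiled in)" is counted together with a plain "Skip".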
  
  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
            make_install_bin_O: make install-bin
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
                make_no_gtk2_O: make NO_GTK2=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
                   make_pure_O: make
             make_no_libnuma_O: make NO_LIBNUMA=1
                make_install_O: make install
           make_no_libunwind_O: make NO_LIBUNWIND=1
             make_no_libperl_O: make NO_LIBPERL=1
              make_no_libbpf_O: make NO_LIBBPF=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                   make_help_O: make help
            make_no_auxtrace_O: make NO_AUXTRACE=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
               make_no_slang_O: make NO_SLANG=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
              make_clean_all_O: make clean all
       make_util_pmu_bison_o_O: make util/pmu-bison.o
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
            make_no_libaudit_O: make NO_LIBAUDIT=1
             make_util_map_o_O: make util/map.o
                  make_debug_O: make DEBUG=1
           make_no_libpython_O: make NO_LIBPYTHON=1
           make_no_backtrace_O: make NO_BACKTRACE=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
            make_no_demangle_O: make NO_DEMANGLE=1
                   make_tags_O: make tags
                 make_static_O: make LDFLAGS=-static
                 make_perf_o_O: make perf.o
              make_no_libelf_O: make NO_LIBELF=1
                    make_doc_O: make doc
                make_no_newt_O: make NO_NEWT=1
         make_install_prefix_O: make install prefix=/tmp/krava
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $ 

Thread overview: 40+ messages
2014-06-01 13:31 [GIT PULL 00/27] perf/core improvements and fixes Jiri Olsa
2014-06-01 13:31 ` [PATCH 01/27] perf tools: Introduce hists__inc_nr_samples() Jiri Olsa
2014-06-01 13:31 ` [PATCH 02/27] perf tools: Introduce struct hist_entry_iter Jiri Olsa
2014-06-01 13:31 ` [PATCH 03/27] perf hists: Add support for accumulated stat of hist entry Jiri Olsa
2014-06-01 13:31 ` [PATCH 04/27] perf hists: Check if accumulated when adding a " Jiri Olsa
2014-06-01 13:31 ` [PATCH 05/27] perf hists: Accumulate hist entry stat based on the callchain Jiri Olsa
2014-06-01 13:31 ` [PATCH 06/27] perf tools: Update cpumode for each cumulative entry Jiri Olsa
2014-06-01 13:31 ` [PATCH 07/27] perf report: Cache cumulative callchains Jiri Olsa
2014-06-01 13:31 ` [PATCH 08/27] perf callchain: Add callchain_cursor_snapshot() Jiri Olsa
2014-06-01 13:31 ` [PATCH 09/27] perf tools: Save callchain info for each cumulative entry Jiri Olsa
2014-06-01 13:31 ` [PATCH 10/27] perf ui/hist: Add support to accumulated hist stat Jiri Olsa
2014-06-01 13:31 ` [PATCH 11/27] perf ui/browser: " Jiri Olsa
2014-06-01 13:31 ` [PATCH 12/27] perf ui/gtk: " Jiri Olsa
2014-06-01 13:31 ` [PATCH 13/27] perf tools: Apply percent-limit to cumulative percentage Jiri Olsa
2014-06-01 13:31 ` [PATCH 14/27] perf tools: Add more hpp helper functions Jiri Olsa
2014-06-01 13:31 ` [PATCH 15/27] perf report: Add --children option Jiri Olsa
2014-06-01 13:31 ` [PATCH 16/27] perf report: Add report.children config option Jiri Olsa
2014-06-01 13:31 ` [PATCH 17/27] perf tools: Do not auto-remove Children column if --fields given Jiri Olsa
2014-06-01 13:31 ` [PATCH 18/27] perf tools: Add callback function to hist_entry_iter Jiri Olsa
2014-06-01 13:31 ` [PATCH 19/27] perf top: Convert " Jiri Olsa
2014-06-01 13:31 ` [PATCH 20/27] perf top: Add --children option Jiri Olsa
2014-06-01 13:31 ` [PATCH 21/27] perf top: Add top.children config option Jiri Olsa
2014-06-01 13:31 ` [PATCH 22/27] perf tools: Enable --children option by default Jiri Olsa
2014-06-01 13:31 ` [PATCH 23/27] perf ui/stdio: Fix invalid percentage value of cumulated hist entries Jiri Olsa
2014-06-01 13:31 ` [PATCH 24/27] perf ui/gtk: Fix callchain display Jiri Olsa
2014-06-01 13:31 ` [PATCH 25/27] perf tools: Reset output/sort order to default Jiri Olsa
2014-06-01 13:31 ` [PATCH 26/27] perf tests: Define and use symbolic names for fake symbols Jiri Olsa
2014-06-01 13:31 ` [PATCH 27/27] perf tests: Add a test case for cumulating callchains Jiri Olsa
2014-06-03 18:23 ` [GIT PULL 00/27] perf/core improvements and fixes Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2014-07-25 15:36 Arnaldo Carvalho de Melo
2014-07-28  8:10 ` Ingo Molnar
2016-06-23 21:23 Arnaldo Carvalho de Melo
2016-06-26 10:43 ` Ingo Molnar
2016-09-29 14:35 Arnaldo Carvalho de Melo
2016-09-29 17:11 ` Ingo Molnar
2018-01-10 21:28 Arnaldo Carvalho de Melo
2018-01-11  5:54 ` Ingo Molnar
2018-07-25 17:59 Arnaldo Carvalho de Melo
2018-07-25 20:34 ` Ingo Molnar
2018-09-24 15:02 Arnaldo Carvalho de Melo
