linux-kernel.vger.kernel.org archive mirror
* perf, tools: Refactor and support interval and CSV metrics
@ 2016-02-17 22:43 Andi Kleen
  2016-02-17 22:44 ` [PATCH 1/6] perf, tools, stat: Handled scaled == -1 case for counters Andi Kleen
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:43 UTC
  To: acme; +Cc: jolsa, linux-kernel, eranian

Rebased tree and fixed Jiri's last feedback.

[v5: Mainly fix bisect problems: no more regressions introduced by one
patch and fixed again by a later one. Some additional minor fixes.]
[v6: Fix running/noise printing patch.]
[v7: Reorder and merge two patches to avoid a bisect hole where unsupported was
printed as 0]
[v8: Minor fixes for review feedback. See changelog in patches.]
[v9: Fix newline bug. Add support for -A for --metric-only]
[v10: Remove extra "noise" printing (Jiri)
      Fix fields in documentation (Jiri)]
[v11: Fix manpage again. Avoid extra metric output in CSV mode.]
[v12: Move CSV metrics fields to after running/enabled/variance.
      Fix regression with not counted counters.
      Minor fixes.]

Currently perf stat does not support printing computed metrics for interval (-I xxx)
or CSV (-x,) mode. For example IPC or TSX metrics over time are quite useful to know.

This patchkit implements them. The main obstacle was that the
metrics printing was open coded throughout the metrics computation code.
The second patch refactors the metrics printing to go through callbacks that
can be changed more easily. This also cleans up the metrics printing significantly:
the indentation is now handled through printf, so there is no more need to manually count spaces.

Then based on that it implements metrics printing for CSV and interval mode,
and finally a --metric-only mode.
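
The callback idea can be sketched roughly as follows. This is only a minimal, self-contained illustration, not the perf code: struct metric_out, struct sink, print_std, print_csv and emit_ipc are hypothetical names invented for this example, and the output goes to a fixed buffer rather than a FILE. The point is that the computation code calls a single print_metric callback and never needs to know which output format is active.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output context: the computation code only sees the callback. */
struct metric_out {
	void (*print_metric)(void *ctx, const char *fmt, const char *unit, double val);
	void *ctx;
};

/* Hypothetical sink that collects output in a fixed buffer. */
struct sink {
	char buf[256];
	size_t len;
};

static void sink_append(struct sink *s, const char *str)
{
	size_t n = strlen(str);

	assert(s->len + n < sizeof(s->buf));
	memcpy(s->buf + s->len, str, n + 1);
	s->len += n;
}

/* Human-readable style: "# <value> <unit>" */
static void print_std(void *ctx, const char *fmt, const char *unit, double val)
{
	char num[64], line[128];

	snprintf(num, sizeof(num), fmt, val);
	snprintf(line, sizeof(line), "# %s %s", num, unit);
	sink_append(ctx, line);
}

/* CSV style: ",<value>,<unit>" */
static void print_csv(void *ctx, const char *fmt, const char *unit, double val)
{
	char num[64], line[128];

	snprintf(num, sizeof(num), fmt, val);
	snprintf(line, sizeof(line), ",%s,%s", num, unit);
	sink_append(ctx, line);
}

/* Metric computation: knows nothing about the output format. */
static void emit_ipc(const struct metric_out *out, double insns, double cycles)
{
	if (cycles > 0)
		out->print_metric(out->ctx, "%.2f", "insn per cycle", insns / cycles);
}
```

Switching between the two formats then only means pointing print_metric at a different function; emit_ipc stays unchanged.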

Example output:

% perf stat  -I1000 -a sleep 1
#          time              counts unit events                    metric                              multiplex
     1.001301370       12020.049593      task-clock (msec)                                             (100.00%)
     1.001301370              3,952      context-switches          #    0.329 K/sec                    (100.00%)
     1.001301370                 69      cpu-migrations            #    0.006 K/sec                    (100.00%)
     1.001301370                 76      page-faults               #    0.006 K/sec                  
     1.001301370        386,582,789      cycles                    #    0.032 GHz                      (100.00%)
     1.001301370        716,441,544      stalled-cycles-frontend   #  185.33% frontend cycles idle     (100.00%)
     1.001301370    <not supported>      stalled-cycles-backend   
     1.001301370        101,751,678      instructions              #    0.26  insn per cycle         
     1.001301370                                                   #    7.04  stalled cycles per insn  (100.00%)
     1.001301370         20,914,692      branches                  #    1.740 M/sec                    (100.00%)
     1.001301370          1,943,630      branch-misses             #    9.29% of all branches        

CSV mode:

% perf stat  -x, -I1000 -a sleep 1
     1.000982778,12006.549977,,task-clock,12006547787,100.00,,,,
     1.000982778,12822,,context-switches,12007100604,100.00,0.001,M/sec
     1.000982778,175,,cpu-migrations,12007180306,100.00,0.015,K/sec
     1.000982778,3404,,page-faults,12007185482,100.00,0.284,K/sec
     1.000982778,1930307489,,cycles,12007018233,100.00,0.161,GHz
     1.000982778,6971803638,,stalled-cycles-frontend,12006902870,100.00,361.18,frontend cycles idle
     1.000982778,<not supported>,,stalled-cycles-backend,0,100.00,,,,
     1.000982778,464493941,,instructions,12006873327,100.00,0.24,insn per cycle
     1.000982778,,,,,,15.01,stalled cycles per insn
     1.000982778,86548409,,branches,12006758420,100.00,7.208,M/sec
     1.000982778,4933638,,branch-misses,12006648104,100.00,5.70,of all branches

Now includes metrics

Metric only mode:

Concise information if you only care about computed metrics, not raw values

% perf stat --metric-only  -a -I 1000
     1.001750901 frontend cycles idle backend cycles idle  insn per cycle       stalled cycles per insn branch-misses of all branches 
     1.001750901  188.78%                                   0.53                3.56                    4.19%                      
     2.002625926  233.68%                                   0.86                2.30                    2.84%                      
     3.003296456  236.16%                                   1.18                1.58                    2.87%                      
     4.004095913  129.87%                                   0.24                7.82                    2.08%                      
     5.004964861  116.26%                                   0.17               11.35                    1.43%                      
     6.005802242  148.16%                                   0.19               10.05                    1.54%                      
     7.006485273  151.76%                                   0.18               11.25                    1.88%                     

Metric only mode in CSV (flat format, easy to plot and analyze in statistical tools like JMP, R, pandas, gnuplot):

% perf stat -x, --metric-only  -a -I 1000
     1.001381652,frontend cycles idle,backend cycles idle,insn per cycle,stalled cycles per insn,branch-misses of all branches,
     1.001381652,173.32,,0.83,2.09,1.73,
     2.002073343,199.47,,1.07,1.60,2.14,
     3.002875524,109.52,,0.22,7.83,1.63,
     4.003970059,132.10,,0.17,10.85,1.51,
     5.004818754,181.60,,0.22,8.87,2.22,


Available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6 perf/stat-metrics-16


* [PATCH 1/6] perf, tools, stat: Handled scaled == -1 case for counters
  2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
@ 2016-02-17 22:44 ` Andi Kleen
  2016-02-20 11:35   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2016-02-17 22:44 ` [PATCH 2/6] perf, tools, stat: Implement CSV metrics output Andi Kleen
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:44 UTC
  To: acme; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Arnaldo pointed out that the earlier
"Move noise/running printing into printout"
change changed behavior for not counted counters. This patch fixes it again.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-stat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 15e4fcf..86289df 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -860,7 +860,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	nl = new_line_std;
 
-	if (run == 0 || ena == 0) {
+	if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
 		aggr_printout(counter, id, nr);
 
 		fprintf(stat_config.output, "%*s%s",
-- 
2.5.0


* [PATCH 2/6] perf, tools, stat: Implement CSV metrics output
  2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
  2016-02-17 22:44 ` [PATCH 1/6] perf, tools, stat: Handled scaled == -1 case for counters Andi Kleen
@ 2016-02-17 22:44 ` Andi Kleen
  2016-02-18 17:00   ` Arnaldo Carvalho de Melo
  2016-02-21 16:39   ` Jiri Olsa
  2016-02-17 22:44 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:44 UTC
  To: acme; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Now support CSV output for metrics. With the new output callbacks
this is relatively straightforward: it only requires creating new callbacks.

This makes it easy to plot metrics from CSV files.

The new line callback needs to know the number of fields so that it
can skip them correctly.
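
A minimal sketch of the field skipping, loosely modeled on new_line_csv in this patch but writing into a buffer so it can be checked in isolation. The names append and csv_continuation are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to buf at offset len; returns the new length. */
static size_t append(char *buf, size_t size, size_t len, const char *s)
{
	size_t n = strlen(s);

	assert(len + n < size);
	memcpy(buf + len, s, n + 1);
	return len + n;
}

/*
 * Build the start of a continuation line: a newline, the optional
 * prefix followed by a separator, then one separator per field to skip.
 * Returns the number of characters written (excluding the NUL).
 */
static size_t csv_continuation(char *buf, size_t size, const char *prefix,
			       const char *sep, int nfields)
{
	size_t len = 0;
	int i;

	len = append(buf, size, len, "\n");
	if (prefix) {
		len = append(buf, size, len, prefix);
		len = append(buf, size, len, sep);
	}
	for (i = 0; i < nfields; i++)
		len = append(buf, size, len, sep);
	return len;
}
```

With an empty prefix and three skipped fields this yields a newline followed by four separators. The metric callback then adds its own leading separator, which is consistent with the five leading commas on the "stalled cycles per insn" line in the example output of this patch.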

Example output before:

% perf stat -x, true
0.200687,,task-clock,200687,100.00
0,,context-switches,200687,100.00
0,,cpu-migrations,200687,100.00
40,,page-faults,200687,100.00
730871,,cycles,203601,100.00
551056,,stalled-cycles-frontend,203601,100.00
<not supported>,,stalled-cycles-backend,0,100.00
385523,,instructions,203601,100.00
78028,,branches,203601,100.00
3946,,branch-misses,203601,100.00

After:

% perf stat -x, true
.502457,,task-clock,502457,100.00,0.485,CPUs utilized
0,,context-switches,502457,100.00,0.000,K/sec
0,,cpu-migrations,502457,100.00,0.000,K/sec
45,,page-faults,502457,100.00,0.090,M/sec
644692,,cycles,509102,100.00,1.283,GHz
423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
<not supported>,,stalled-cycles-backend,0,100.00,,,,
492701,,instructions,509102,100.00,0.76,insn per cycle
,,,,,0.86,stalled cycles per insn
97767,,branches,509102,100.00,194.578,M/sec
4788,,branch-misses,509102,100.00,4.90,of all branches

or, in a more readable form:

perf stat  -x, -o x.csv true
[ak@tassilo hle]$ column -s, -t x.csv
0.490635                                 task-clock               490635  100.00  0.489    CPUs utilized
0                                        context-switches         490635  100.00  0.000    K/sec
0                                        cpu-migrations           490635  100.00  0.000    K/sec
45                                       page-faults              490635  100.00  0.092    M/sec
629080                                   cycles                   497698  100.00  1.282    GHz
409498                                   stalled-cycles-frontend  497698  100.00  65.09    frontend cycles idle
<not supported>                          stalled-cycles-backend   0       100.00
491424                                   instructions             497698  100.00  0.78     insn per cycle
                                                                                  0.83     stalled cycles per insn
97278                                    branches                 497698  100.00  198.270  M/sec
4569                                     branch-misses            497698  100.00  4.70     of all branches

Two new fields are added: metric value and metric name.
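
How the metric value field ends up as a bare number can be sketched as below. This mirrors the trimming done in print_metric_csv in this patch, but as a standalone helper; the name csv_metric_value and the format strings used in the checks are assumptions made for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format val with fmt, then keep only the leading numeric part:
 * skip leading whitespace and cut at the first character that is
 * neither a digit nor '.', so trailing decoration such as '%' is dropped.
 */
static void csv_metric_value(char *dst, size_t size, const char *fmt, double val)
{
	char buf[64], *vals, *ends;

	assert(dst != NULL && fmt != NULL && size > 0);
	snprintf(buf, sizeof(buf), fmt, val);
	vals = buf;
	while (isspace((unsigned char)*vals))
		vals++;
	ends = vals;
	while (isdigit((unsigned char)*ends) || *ends == '.')
		ends++;
	*ends = '\0';
	snprintf(dst, size, "%s", vals);
}
```

One consequence of this approach, at least in the sketch: a value that formats with a leading minus sign would be cut to an empty string, because '-' is not accepted as part of the number.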

v2: Split out function argument changes
v3: Reenable metrics for real.
v4: Fix wrong hunk from refactoring.
v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
Print empty metrics for not counted.
v6: Avoid outputting metric on empty format.
v7: Print metric at the end
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-stat.c | 76 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 72 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 86289df..6c2c1d2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -739,6 +739,8 @@ struct outstate {
 	FILE *fh;
 	bool newline;
 	const char *prefix;
+	int  nfields;
+	u64  run, ena;
 };
 
 #define METRIC_LEN  35
@@ -789,6 +791,43 @@ static void print_metric_std(void *ctx, const char *color, const char *fmt,
 	fprintf(out, " %-*s", METRIC_LEN - n - 1, unit);
 }
 
+static void new_line_csv(void *ctx)
+{
+	struct outstate *os = ctx;
+	int i;
+
+	fputc('\n', os->fh);
+	if (os->prefix)
+		fprintf(os->fh, "%s%s", os->prefix, csv_sep);
+	for (i = 0; i < os->nfields; i++)
+		fputs(csv_sep, os->fh);
+}
+
+static void print_metric_csv(void *ctx,
+			     const char *color __maybe_unused,
+			     const char *fmt, const char *unit, double val)
+{
+	struct outstate *os = ctx;
+	FILE *out = os->fh;
+	char buf[64], *vals, *ends;
+
+	if (unit == NULL || fmt == NULL) {
+		fprintf(out, "%s%s%s%s", csv_sep, csv_sep, csv_sep, csv_sep);
+		return;
+	}
+	snprintf(buf, sizeof(buf), fmt, val);
+	vals = buf;
+	while (isspace(*vals))
+		vals++;
+	ends = vals;
+	while (isdigit(*ends) || *ends == '.')
+		ends++;
+	*ends = 0;
+	while (isspace(*unit))
+		unit++;
+	fprintf(out, "%s%s%s%s", csv_sep, vals, csv_sep, unit);
+}
+
 static void nsec_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 {
 	FILE *output = stat_config.output;
@@ -860,6 +899,24 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	nl = new_line_std;
 
+	if (csv_output) {
+		static int aggr_fields[] = {
+			[AGGR_GLOBAL] = 0,
+			[AGGR_THREAD] = 1,
+			[AGGR_NONE] = 1,
+			[AGGR_SOCKET] = 2,
+			[AGGR_CORE] = 2,
+		};
+
+		pm = print_metric_csv;
+		nl = new_line_csv;
+		os.nfields = 3;
+		os.nfields += aggr_fields[stat_config.aggr_mode];
+		if (counter->cgrp)
+			os.nfields++;
+		os.run = run;
+		os.ena = ena;
+	}
 	if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
 		aggr_printout(counter, id, nr);
 
@@ -880,7 +937,12 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 			fprintf(stat_config.output, "%s%s",
 				csv_sep, counter->cgrp->name);
 
+		if (!csv_output)
+			pm(&os, NULL, NULL, "", 0);
+		print_noise(counter, noise);
 		print_running(run, ena);
+		if (csv_output)
+			pm(&os, NULL, NULL, "", 0);
 		return;
 	}
 
@@ -893,14 +955,20 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 	out.new_line = nl;
 	out.ctx = &os;
 
-	if (!csv_output)
-		perf_stat__print_shadow_stats(counter, uval,
+	if (csv_output) {
+		print_noise(counter, noise);
+		print_running(run, ena);
+	}
+
+	perf_stat__print_shadow_stats(counter, uval,
 				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
 				cpu_map__id_to_cpu(id),
 				&out);
 
-	print_noise(counter, noise);
-	print_running(run, ena);
+	if (!csv_output) {
+		print_noise(counter, noise);
+		print_running(run, ena);
+	}
 }
 
 static void print_aggr(char *prefix)
-- 
2.5.0


* [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
  2016-02-17 22:44 ` [PATCH 1/6] perf, tools, stat: Handled scaled == -1 case for counters Andi Kleen
  2016-02-17 22:44 ` [PATCH 2/6] perf, tools, stat: Implement CSV metrics output Andi Kleen
@ 2016-02-17 22:44 ` Andi Kleen
  2016-02-21 17:15   ` Jiri Olsa
                     ` (2 more replies)
  2016-02-17 22:44 ` [PATCH 4/6] perf, tools, stat: Document CSV format in manpage Andi Kleen
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:44 UTC
  To: acme; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Enable metrics printing in --per-core / --per-socket mode. We need
to save the shadow metrics in a unique place. Always use the first
CPU in the aggregation. Then use the same CPU to retrieve the
shadow value later.
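
The "first CPU in the aggregation" rule can be illustrated with a small standalone sketch. It is not the perf implementation: the flat arrays, shadow_cycles, first_cpu_in_group, save_group_cycles and load_group_cycles are all hypothetical stand-ins for the cpu map and shadow storage. What it shows is that saving and loading both derive the slot from the same lookup, so they always agree.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 8

/* Hypothetical per-CPU shadow storage for one metric input (e.g. cycles). */
static double shadow_cycles[MAX_CPUS];

/*
 * Return the first CPU in cpus[] whose aggregation id (group_of[i] for
 * cpus[i]) equals id. Falls back to CPU 0 when nothing matches.
 */
static int first_cpu_in_group(const int *cpus, const int *group_of,
			      size_t n, int id)
{
	size_t i;

	assert(cpus != NULL && group_of != NULL);
	for (i = 0; i < n; i++) {
		if (group_of[i] == id)
			return cpus[i];
	}
	return 0;
}

/* Store the aggregated value in the slot of the group's first CPU. */
static void save_group_cycles(const int *cpus, const int *group_of,
			      size_t n, int id, double val)
{
	int cpu = first_cpu_in_group(cpus, group_of, n, id);

	assert(cpu >= 0 && cpu < MAX_CPUS);
	shadow_cycles[cpu] = val;
}

/* Read it back through the same lookup, so both sides use the same slot. */
static double load_group_cycles(const int *cpus, const int *group_of,
				size_t n, int id)
{
	int cpu = first_cpu_in_group(cpus, group_of, n, id);

	assert(cpu >= 0 && cpu < MAX_CPUS);
	return shadow_cycles[cpu];
}
```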

Example output:

% perf stat --per-core -a ./BC1s

 Performance counter stats for 'system wide':

S0-C0           2        2966.020381      task-clock (msec)         #    2.004 CPUs utilized            (100.00%)
S0-C0           2                 49      context-switches          #    0.017 K/sec                    (100.00%)
S0-C0           2                  4      cpu-migrations            #    0.001 K/sec                    (100.00%)
S0-C0           2                467      page-faults               #    0.157 K/sec
S0-C0           2      4,599,061,773      cycles                    #    1.551 GHz                      (100.00%)
S0-C0           2      9,755,886,883      instructions              #    2.12  insn per cycle           (100.00%)
S0-C0           2      1,906,272,125      branches                  #  642.704 M/sec                    (100.00%)
S0-C0           2         81,180,867      branch-misses             #    4.26% of all branches
S0-C1           2        2965.995373      task-clock (msec)         #    2.003 CPUs utilized            (100.00%)
S0-C1           2                 62      context-switches          #    0.021 K/sec                    (100.00%)
S0-C1           2                  8      cpu-migrations            #    0.003 K/sec                    (100.00%)
S0-C1           2                281      page-faults               #    0.095 K/sec
S0-C1           2          6,347,290      cycles                    #    0.002 GHz                      (100.00%)
S0-C1           2          4,654,156      instructions              #    0.73  insn per cycle           (100.00%)
S0-C1           2            947,121      branches                  #    0.319 M/sec                    (100.00%)
S0-C1           2             37,322      branch-misses             #    3.94% of all branches

       1.480409747 seconds time elapsed

v2: Rebase to older patches
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-stat.c | 58 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6c2c1d2..715e5b5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -741,6 +741,8 @@ struct outstate {
 	const char *prefix;
 	int  nfields;
 	u64  run, ena;
+	int  id, nr;
+	struct perf_evsel *evsel;
 };
 
 #define METRIC_LEN  35
@@ -756,12 +758,9 @@ static void do_new_line_std(struct outstate *os)
 {
 	fputc('\n', os->fh);
 	fputs(os->prefix, os->fh);
+	aggr_printout(os->evsel, os->id, os->nr);
 	if (stat_config.aggr_mode == AGGR_NONE)
 		fprintf(os->fh, "        ");
-	if (stat_config.aggr_mode == AGGR_CORE)
-		fprintf(os->fh, "                  ");
-	if (stat_config.aggr_mode == AGGR_SOCKET)
-		fprintf(os->fh, "            ");
 	fprintf(os->fh, "                                                 ");
 }
 
@@ -799,6 +798,7 @@ static void new_line_csv(void *ctx)
 	fputc('\n', os->fh);
 	if (os->prefix)
 		fprintf(os->fh, "%s%s", os->prefix, csv_sep);
+	aggr_printout(os->evsel, os->id, os->nr);
 	for (i = 0; i < os->nfields; i++)
 		fputs(csv_sep, os->fh);
 }
@@ -856,6 +856,22 @@ static void nsec_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 		fprintf(output, "%s%s", csv_sep, evsel->cgrp->name);
 }
 
+static int first_shadow_cpu(struct perf_evsel *evsel, int id)
+{
+	int i;
+
+	if (aggr_get_id == NULL)
+		return 0;
+
+	for (i = 0; i < perf_evsel__nr_cpus(evsel); i++) {
+		int cpu2 = perf_evsel__cpus(evsel)->map[i];
+
+		if (aggr_get_id(evsel_list->cpus, cpu2) == id)
+			return cpu2;
+	}
+	return 0;
+}
+
 static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 {
 	FILE *output = stat_config.output;
@@ -892,7 +908,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 	struct perf_stat_output_ctx out;
 	struct outstate os = {
 		.fh = stat_config.output,
-		.prefix = prefix ? prefix : ""
+		.prefix = prefix ? prefix : "",
+		.id = id,
+		.nr = nr,
+		.evsel = counter,
 	};
 	print_metric_t pm = print_metric_std;
 	void (*nl)(void *);
@@ -962,15 +981,38 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	perf_stat__print_shadow_stats(counter, uval,
 				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
-				cpu_map__id_to_cpu(id),
+				first_shadow_cpu(counter, id),
 				&out);
-
 	if (!csv_output) {
 		print_noise(counter, noise);
 		print_running(run, ena);
 	}
 }
 
+static void aggr_update_shadow(void)
+{
+	int cpu, cpu2, s2, id, s;
+	u64 val;
+	struct perf_evsel *counter;
+
+	for (s = 0; s < aggr_map->nr; s++) {
+		id = aggr_map->map[s];
+		evlist__for_each(evsel_list, counter) {
+			val = 0;
+			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+				cpu2 = perf_evsel__cpus(counter)->map[cpu];
+				s2 = aggr_get_id(evsel_list->cpus, cpu2);
+				if (s2 != id)
+					continue;
+				val += perf_counts(counter->counts, cpu, 0)->val;
+			}
+			val = val * counter->scale;
+			perf_stat__update_shadow_stats(counter, &val,
+						       first_shadow_cpu(counter, id));
+		}
+	}
+}
+
 static void print_aggr(char *prefix)
 {
 	FILE *output = stat_config.output;
@@ -982,6 +1024,8 @@ static void print_aggr(char *prefix)
 	if (!(aggr_map || aggr_get_id))
 		return;
 
+	aggr_update_shadow();
+
 	for (s = 0; s < aggr_map->nr; s++) {
 		id = aggr_map->map[s];
 		evlist__for_each(evsel_list, counter) {
-- 
2.5.0


* [PATCH 4/6] perf, tools, stat: Document CSV format in manpage
  2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
                   ` (2 preceding siblings ...)
  2016-02-17 22:44 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
@ 2016-02-17 22:44 ` Andi Kleen
  2016-02-17 22:44 ` [PATCH 5/6] perf, tools, stat: Implement --metric-only mode Andi Kleen
  2016-02-17 22:44 ` [PATCH 6/6] perf, tools, stat: Add --metric-only support for -A Andi Kleen
  5 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:44 UTC
  To: acme; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

With all the recently added fields in the perf stat CSV output
we should finally document them in the man page. Do this here.

v2: Fix fields in documentation (Jiri)
v3: fix order of fields again (Jiri)
v4: Change order again.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 52ef7a9..3ae7907 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -211,6 +211,27 @@ $ perf stat -- make -j
 
  Wall-clock time elapsed:   719.554352 msecs
 
+CSV FORMAT
+----------
+
+With -x, perf stat is able to output a not-quite-CSV format output
+Commas in the output are not put into "". To make it easy to parse
+it is recommended to use a different character like -x \;
+
+The fields are in this order:
+
+	- optional usec time stamp in fractions of second (with -I xxx)
+	- counter value
+	- unit of the counter value or empty
+	- event name
+	- run time of counter
+	- percentage of measurement time the counter was running
+	- optional variance if multiple values are collected with -r
+	- optional metric value
+	- optional unit of metric
+
+Additional metrics may be printed with all earlier fields being empty.
+
 SEE ALSO
 --------
 linkperf:perf-top[1], linkperf:perf-list[1]
-- 
2.5.0


* [PATCH 5/6] perf, tools, stat: Implement --metric-only mode
  2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
                   ` (3 preceding siblings ...)
  2016-02-17 22:44 ` [PATCH 4/6] perf, tools, stat: Document CSV format in manpage Andi Kleen
@ 2016-02-17 22:44 ` Andi Kleen
  2016-02-17 22:44 ` [PATCH 6/6] perf, tools, stat: Add --metric-only support for -A Andi Kleen
  5 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:44 UTC
  To: acme; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add a new mode to only print metrics. Sometimes we don't care about
the raw values, just want the computed metrics. This allows more
compact printing, so with -I each sample is only a single line.
This also allows easier plotting and processing with other tools.

The main target is use with --topdown, but it also works with
-T and standard perf stat. A few metrics are not supported.

To avoid having to hardcode all the metrics in the code it uses
a two pass approach: first compute dummy metrics and only
print the headers in the print_metric callback. Then use the callback
to print the actual values.
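
The two pass idea can be sketched in isolation like this. It is a simplified illustration under stated assumptions, not the code in this patch: struct line, line_append, print_header, print_value and compute_metrics are hypothetical names, the metrics are reduced to two, and the output is collected in a buffer. The same computation routine is run twice, once with a callback that emits only the unit (the column header) and once with a callback that emits only the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*print_metric_fn)(void *ctx, const char *fmt,
				const char *unit, double val);

/* Hypothetical single output line collected in a fixed buffer. */
struct line {
	char buf[256];
	size_t len;
};

static void line_append(struct line *l, const char *s)
{
	size_t n = strlen(s);

	assert(l->len + n < sizeof(l->buf));
	memcpy(l->buf + l->len, s, n + 1);
	l->len += n;
}

/* Pass 1: ignore the value, print only the column header. */
static void print_header(void *ctx, const char *fmt, const char *unit, double val)
{
	(void)fmt;
	(void)val;
	line_append(ctx, unit);
	line_append(ctx, ",");
}

/* Pass 2: ignore the unit, print only the formatted value. */
static void print_value(void *ctx, const char *fmt, const char *unit, double val)
{
	char num[64];

	(void)unit;
	snprintf(num, sizeof(num), fmt, val);
	line_append(ctx, num);
	line_append(ctx, ",");
}

/* The single place that knows which metrics exist and in which order. */
static void compute_metrics(print_metric_fn pm, void *ctx,
			    double insns, double cycles, double stalled)
{
	pm(ctx, "%.2f", "insn per cycle", cycles > 0 ? insns / cycles : 0.0);
	pm(ctx, "%.2f", "stalled cycles per insn", insns > 0 ? stalled / insns : 0.0);
}
```

Because the header pass and the value pass walk the same code path, the column order cannot drift apart, and no separate table of metric names has to be maintained.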

There are some additional changes
in the stat printout code to handle all metrics being on a single line.

One issue is that the column code doesn't know in advance what events
are not supported by the CPU, and it would be hard to find out
as this could change based on dynamic conditions. That causes
empty columns in some cases.

The output can be fairly wide; you may often need more than 80 columns.

Example:

% perf stat -a -I 1000 --metric-only
     1.000604977 frontend cycles idle     backend cycles idle      insn per cycle           stalled cycles per insn  branch-misses of all branches
     1.000604977                                                    0.76                                             2.35%
     2.000924680                                                    0.72                                             2.34%
     3.001139592                                                    0.76                                             2.57%
     4.001358452                                                    0.73                                             2.44%

v2: Lots of updates.
v3: Use slightly narrower columns
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |   4 +
 tools/perf/builtin-stat.c              | 207 +++++++++++++++++++++++++++++++--
 2 files changed, 201 insertions(+), 10 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 3ae7907..3929ab0 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -139,6 +139,10 @@ Print count deltas every N milliseconds (minimum: 10ms)
 The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals.  Use with caution.
 	example: 'perf stat -I 1000 -e cycles -a sleep 5'
 
+--metric-only::
+Only print computed metrics. Print them in a single line.
+Don't show any raw values. Not supported with -A or --per-thread.
+
 --per-socket::
 Aggregate counts per processor socket for system-wide mode measurements.  This
 is a useful mode to detect imbalance between sockets.  To enable this mode,
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 715e5b5..6140365 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -122,6 +122,7 @@ static bool			sync_run			= false;
 static unsigned int		initial_delay			= 0;
 static unsigned int		unit_width			= 4; /* strlen("unit") */
 static bool			forever				= false;
+static bool			metric_only			= false;
 static struct timespec		ref_time;
 static struct cpu_map		*aggr_map;
 static aggr_get_id_t		aggr_get_id;
@@ -828,6 +829,99 @@ static void print_metric_csv(void *ctx,
 	fprintf(out, "%s%s%s%s", csv_sep, vals, csv_sep, unit);
 }
 
+#define METRIC_ONLY_LEN 20
+
+/* Filter out some columns that don't work well in metrics only mode */
+
+static bool valid_only_metric(const char *unit)
+{
+	if (!unit)
+		return false;
+	if (strstr(unit, "/sec") ||
+	    strstr(unit, "hz") ||
+	    strstr(unit, "Hz") ||
+	    strstr(unit, "CPUs utilized"))
+		return false;
+	return true;
+}
+
+static const char *fixunit(char *buf, struct perf_evsel *evsel,
+			   const char *unit)
+{
+	if (!strncmp(unit, "of all", 6)) {
+		snprintf(buf, 1024, "%s %s", perf_evsel__name(evsel),
+			 unit);
+		return buf;
+	}
+	return unit;
+}
+
+static void print_metric_only(void *ctx, const char *color, const char *fmt,
+			      const char *unit, double val)
+{
+	struct outstate *os = ctx;
+	FILE *out = os->fh;
+	int n;
+	char buf[1024];
+	unsigned mlen = METRIC_ONLY_LEN;
+
+	if (!valid_only_metric(unit))
+		return;
+	unit = fixunit(buf, os->evsel, unit);
+	if (color)
+		n = color_fprintf(out, color, fmt, val);
+	else
+		n = fprintf(out, fmt, val);
+	if (n > METRIC_ONLY_LEN)
+		n = METRIC_ONLY_LEN;
+	if (mlen < strlen(unit))
+		mlen = strlen(unit) + 1;
+	fprintf(out, "%*s", mlen - n, "");
+}
+
+static void print_metric_only_csv(void *ctx, const char *color __maybe_unused,
+				  const char *fmt,
+				  const char *unit, double val)
+{
+	struct outstate *os = ctx;
+	FILE *out = os->fh;
+	char buf[64], *vals, *ends;
+	char tbuf[1024];
+
+	if (!valid_only_metric(unit))
+		return;
+	unit = fixunit(tbuf, os->evsel, unit);
+	snprintf(buf, sizeof buf, fmt, val);
+	vals = buf;
+	while (isspace(*vals))
+		vals++;
+	ends = vals;
+	while (isdigit(*ends) || *ends == '.')
+		ends++;
+	*ends = 0;
+	fprintf(out, "%s%s", vals, csv_sep);
+}
+
+static void new_line_metric(void *ctx __maybe_unused)
+{
+}
+
+static void print_metric_header(void *ctx, const char *color __maybe_unused,
+				const char *fmt __maybe_unused,
+				const char *unit, double val __maybe_unused)
+{
+	struct outstate *os = ctx;
+	char tbuf[1024];
+
+	if (!valid_only_metric(unit))
+		return;
+	unit = fixunit(tbuf, os->evsel, unit);
+	if (csv_output)
+		fprintf(os->fh, "%s%s", unit, csv_sep);
+	else
+		fprintf(os->fh, "%-*s ", METRIC_ONLY_LEN, unit);
+}
+
 static void nsec_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 {
 	FILE *output = stat_config.output;
@@ -916,9 +1010,16 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 	print_metric_t pm = print_metric_std;
 	void (*nl)(void *);
 
-	nl = new_line_std;
+	if (metric_only) {
+		nl = new_line_metric;
+		if (csv_output)
+			pm = print_metric_only_csv;
+		else
+			pm = print_metric_only;
+	} else
+		nl = new_line_std;
 
-	if (csv_output) {
+	if (csv_output && !metric_only) {
 		static int aggr_fields[] = {
 			[AGGR_GLOBAL] = 0,
 			[AGGR_THREAD] = 1,
@@ -937,6 +1038,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 		os.ena = ena;
 	}
 	if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
+		if (metric_only) {
+			pm(&os, NULL, "", "", 0);
+			return;
+		}
 		aggr_printout(counter, id, nr);
 
 		fprintf(stat_config.output, "%*s%s",
@@ -965,7 +1070,9 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 		return;
 	}
 
-	if (nsec_counter(counter))
+	if (metric_only)
+		/* nothing */;
+	else if (nsec_counter(counter))
 		nsec_printout(id, nr, counter, uval);
 	else
 		abs_printout(id, nr, counter, uval);
@@ -974,7 +1081,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 	out.new_line = nl;
 	out.ctx = &os;
 
-	if (csv_output) {
+	if (csv_output && !metric_only) {
 		print_noise(counter, noise);
 		print_running(run, ena);
 	}
@@ -983,7 +1090,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
 				first_shadow_cpu(counter, id),
 				&out);
-	if (!csv_output) {
+	if (!csv_output && !metric_only) {
 		print_noise(counter, noise);
 		print_running(run, ena);
 	}
@@ -1020,6 +1127,7 @@ static void print_aggr(char *prefix)
 	int cpu, s, s2, id, nr;
 	double uval;
 	u64 ena, run, val;
+	bool first;
 
 	if (!(aggr_map || aggr_get_id))
 		return;
@@ -1027,7 +1135,11 @@ static void print_aggr(char *prefix)
 	aggr_update_shadow();
 
 	for (s = 0; s < aggr_map->nr; s++) {
+		if (prefix && metric_only)
+			fprintf(output, "%s", prefix);
+
 		id = aggr_map->map[s];
+		first = true;
 		evlist__for_each(evsel_list, counter) {
 			val = ena = run = 0;
 			nr = 0;
@@ -1040,13 +1152,20 @@ static void print_aggr(char *prefix)
 				run += perf_counts(counter->counts, cpu, 0)->run;
 				nr++;
 			}
-			if (prefix)
+			if (first && metric_only) {
+				first = false;
+				aggr_printout(counter, id, nr);
+			}
+			if (prefix && !metric_only)
 				fprintf(output, "%s", prefix);
 
 			uval = val * counter->scale;
 			printout(id, nr, counter, uval, prefix, run, ena, 1.0);
-			fputc('\n', output);
+			if (!metric_only)
+				fputc('\n', output);
 		}
+		if (metric_only)
+			fputc('\n', output);
 	}
 }
 
@@ -1091,12 +1210,13 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
 	avg_enabled = avg_stats(&ps->res_stats[1]);
 	avg_running = avg_stats(&ps->res_stats[2]);
 
-	if (prefix)
+	if (prefix && !metric_only)
 		fprintf(output, "%s", prefix);
 
 	uval = avg * counter->scale;
 	printout(-1, 0, counter, uval, prefix, avg_running, avg_enabled, avg);
-	fprintf(output, "\n");
+	if (!metric_only)
+		fprintf(output, "\n");
 }
 
 /*
@@ -1125,6 +1245,43 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 	}
 }
 
+static int aggr_header_lens[] = {
+	[AGGR_CORE] = 18,
+	[AGGR_SOCKET] = 12,
+	[AGGR_NONE] = 15,
+	[AGGR_THREAD] = 24,
+	[AGGR_GLOBAL] = 0,
+};
+
+static void print_metric_headers(char *prefix)
+{
+	struct perf_stat_output_ctx out;
+	struct perf_evsel *counter;
+	struct outstate os = {
+		.fh = stat_config.output
+	};
+
+	if (prefix)
+		fprintf(stat_config.output, "%s", prefix);
+
+	if (!csv_output)
+		fprintf(stat_config.output, "%*s",
+			aggr_header_lens[stat_config.aggr_mode], "");
+
+	/* Print metrics headers only */
+	evlist__for_each(evsel_list, counter) {
+		os.evsel = counter;
+		out.ctx = &os;
+		out.print_metric = print_metric_header;
+		out.new_line = new_line_metric;
+		os.evsel = counter;
+		perf_stat__print_shadow_stats(counter, 0,
+					      0,
+					      &out);
+	}
+	fputc('\n', stat_config.output);
+}
+
 static void print_interval(char *prefix, struct timespec *ts)
 {
 	FILE *output = stat_config.output;
@@ -1132,7 +1289,7 @@ static void print_interval(char *prefix, struct timespec *ts)
 
 	sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep);
 
-	if (num_print_interval == 0 && !csv_output) {
+	if (num_print_interval == 0 && !csv_output && !metric_only) {
 		switch (stat_config.aggr_mode) {
 		case AGGR_SOCKET:
 			fprintf(output, "#           time socket cpus             counts %*s events\n", unit_width, "unit");
@@ -1219,6 +1376,17 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 	else
 		print_header(argc, argv);
 
+	if (metric_only) {
+		static int num_print_iv;
+
+		if (num_print_iv == 0)
+			print_metric_headers(prefix);
+		if (num_print_iv++ == 25)
+			num_print_iv = 0;
+		if (stat_config.aggr_mode == AGGR_GLOBAL && prefix)
+			fprintf(stat_config.output, "%s", prefix);
+	}
+
 	switch (stat_config.aggr_mode) {
 	case AGGR_CORE:
 	case AGGR_SOCKET:
@@ -1231,6 +1399,8 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 	case AGGR_GLOBAL:
 		evlist__for_each(evsel_list, counter)
 			print_counter_aggr(counter, prefix);
+		if (metric_only)
+			fputc('\n', stat_config.output);
 		break;
 	case AGGR_NONE:
 		evlist__for_each(evsel_list, counter)
@@ -1355,6 +1525,8 @@ static const struct option stat_options[] = {
 		     "aggregate counts per thread", AGGR_THREAD),
 	OPT_UINTEGER('D', "delay", &initial_delay,
 		     "ms to wait before starting measurement after program start"),
+	OPT_BOOLEAN(0, "metric-only", &metric_only,
+			"Only print computed metrics. No raw values"),
 	OPT_END()
 };
 
@@ -1976,6 +2148,21 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out;
 	}
 
+	if (metric_only && stat_config.aggr_mode == AGGR_THREAD) {
+		fprintf(stderr, "--metric-only is not supported with --per-thread\n");
+		goto out;
+	}
+
+	if (metric_only && stat_config.aggr_mode == AGGR_NONE) {
+		fprintf(stderr, "--metric-only is not supported with -A\n");
+		goto out;
+	}
+
+	if (metric_only && run_count > 1) {
+		fprintf(stderr, "--metric-only is not supported with -r\n");
+		goto out;
+	}
+
 	if (output_fd < 0) {
 		fprintf(stderr, "argument to --log-fd must be a > 0\n");
 		parse_options_usage(stat_usage, stat_options, "log-fd", 0);
-- 
2.5.0

* [PATCH 6/6] perf, tools, stat: Add --metric-only support for -A
  2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
                   ` (4 preceding siblings ...)
  2016-02-17 22:44 ` [PATCH 5/6] perf, tools, stat: Implement --metric-only mode Andi Kleen
@ 2016-02-17 22:44 ` Andi Kleen
  5 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-17 22:44 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add metric only support for -A too. This requires a new print
function that prints the metrics in the right order.
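
The ordering can be sketched outside of perf. In the normal -A path the
events form the outer loop; for one line per CPU the nesting is reversed,
with CPUs outside and events inside. The following is only a toy sketch
under that assumption: the names (toy_append, toy_print_per_cpu) and the
data are hypothetical and are not part of builtin-stat.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define TOY_NR_CPUS   2
#define TOY_NR_EVENTS 3

/* Append formatted text at *off; returns 0 on success, -1 if it did not fit. */
static int toy_append(char *buf, size_t len, size_t *off, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(*off < len);
	va_start(ap, fmt);
	n = vsnprintf(buf + *off, len - *off, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= len - *off)
		return -1;
	*off += (size_t)n;
	return 0;
}

/*
 * Emit one line per CPU holding that CPU's value for every event.
 * vals[event][cpu] is made-up data standing in for computed metrics.
 */
static int toy_print_per_cpu(char *buf, size_t len,
			     double vals[TOY_NR_EVENTS][TOY_NR_CPUS])
{
	size_t off = 0;
	int cpu, ev;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {	/* outer loop: CPUs */
		if (toy_append(buf, len, &off, "CPU%d", cpu))
			return -1;
		for (ev = 0; ev < TOY_NR_EVENTS; ev++)	/* inner loop: events */
			if (toy_append(buf, len, &off, " %6.2f", vals[ev][cpu]))
				return -1;
		if (toy_append(buf, len, &off, "\n"))
			return -1;
	}
	return 0;
}
```

Each CPU's label is printed once at the start of its line, which corresponds
to the "first" flag in the patch below.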

v2: Fix manpage
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |  2 +-
 tools/perf/builtin-stat.c              | 48 ++++++++++++++++++++++++++++------
 2 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 3929ab0..44095cd 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -141,7 +141,7 @@ The overhead percentage could be high in some cases, for instance with small, su
 
 --metric-only::
 Only print computed metrics. Print them in a single line.
-Don't show any raw values. Not supported with -A or --per-thread.
+Don't show any raw values. Not supported with --per-thread.
 
 --per-socket::
 Aggregate counts per processor socket for system-wide mode measurements.  This
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6140365..14794b8 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1245,10 +1245,43 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 	}
 }
 
+static void print_no_aggr_metric(char *prefix)
+{
+	int cpu;
+	int nrcpus = 0;
+	struct perf_evsel *counter;
+	u64 ena, run, val;
+	double uval;
+
+	evlist__for_each(evsel_list, counter) {
+		nrcpus = perf_evsel__nr_cpus(counter);
+		break;
+	}
+	for (cpu = 0; cpu < nrcpus; cpu++) {
+		bool first = true;
+
+		if (prefix)
+			fputs(prefix, stat_config.output);
+		evlist__for_each(evsel_list, counter) {
+			if (first) {
+				aggr_printout(counter, cpu, 0);
+				first = false;
+			}
+			val = perf_counts(counter->counts, cpu, 0)->val;
+			ena = perf_counts(counter->counts, cpu, 0)->ena;
+			run = perf_counts(counter->counts, cpu, 0)->run;
+
+			uval = val * counter->scale;
+			printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
+		}
+		fputc('\n', stat_config.output);
+	}
+}
+
 static int aggr_header_lens[] = {
 	[AGGR_CORE] = 18,
 	[AGGR_SOCKET] = 12,
-	[AGGR_NONE] = 15,
+	[AGGR_NONE] = 6,
 	[AGGR_THREAD] = 24,
 	[AGGR_GLOBAL] = 0,
 };
@@ -1403,8 +1436,12 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 			fputc('\n', stat_config.output);
 		break;
 	case AGGR_NONE:
-		evlist__for_each(evsel_list, counter)
-			print_counter(counter, prefix);
+		if (metric_only)
+			print_no_aggr_metric(prefix);
+		else {
+			evlist__for_each(evsel_list, counter)
+				print_counter(counter, prefix);
+		}
 		break;
 	case AGGR_UNSET:
 	default:
@@ -2153,11 +2190,6 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out;
 	}
 
-	if (metric_only && stat_config.aggr_mode == AGGR_NONE) {
-		fprintf(stderr, "--metric-only is not supported with -A\n");
-		goto out;
-	}
-
 	if (metric_only && run_count > 1) {
 		fprintf(stderr, "--metric-only is not supported with -r\n");
 		goto out;
-- 
2.5.0

* Re: [PATCH 2/6] perf, tools, stat: Implement CSV metrics output
  2016-02-17 22:44 ` [PATCH 2/6] perf, tools, stat: Implement CSV metrics output Andi Kleen
@ 2016-02-18 17:00   ` Arnaldo Carvalho de Melo
  2016-02-18 17:39     ` Andi Kleen
  2016-02-21 16:39   ` Jiri Olsa
  1 sibling, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-18 17:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: jolsa, linux-kernel, eranian, Andi Kleen

Em Wed, Feb 17, 2016 at 02:44:01PM -0800, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
> 
> Now support CSV output for metrics. With the new output callbacks
> this is relatively straight forward by creating new callbacks.
> 
> This allows to easily plot metrics from CSV files.
> 
> The new line callback needs to know the number of fields to skip them
> correctly
> 
> Example output before:
> 
> % perf stat -x, true
> 0.200687,,task-clock,200687,100.00
> 0,,context-switches,200687,100.00
> 0,,cpu-migrations,200687,100.00
> 40,,page-faults,200687,100.00
> 730871,,cycles,203601,100.00
> 551056,,stalled-cycles-frontend,203601,100.00
> <not supported>,,stalled-cycles-backend,0,100.00
> 385523,,instructions,203601,100.00
> 78028,,branches,203601,100.00
> 3946,,branch-misses,203601,100.00
> 
> After:
> 
> % perf stat -x, true
> .502457,,task-clock,502457,100.00,0.485,CPUs utilized
> 0,,context-switches,502457,100.00,0.000,K/sec
> 0,,cpu-migrations,502457,100.00,0.000,K/sec
> 45,,page-faults,502457,100.00,0.090,M/sec
> 644692,,cycles,509102,100.00,1.283,GHz
> 423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
> <not supported>,,stalled-cycles-backend,0,100.00,,,,
> 492701,,instructions,509102,100.00,0.76,insn per cycle
> ,,,,,0.86,stalled cycles per insn
> 97767,,branches,509102,100.00,194.578,M/sec
> 4788,,branch-misses,509102,100.00,4.90,of all branches

Testing here I noticed this new line with just commas:


[root@jouet ~]# perf stat -x, usleep 1 
0.268163,,task-clock,268163,100.00,0.484,CPUs utilized
1,,context-switches,268163,100.00,0.004,M/sec
0,,cpu-migrations,268163,100.00,0.000,K/sec
52,,page-faults,268163,100.00,0.194,M/sec
815922,,cycles,270746,100.00,3.043,GHz
<not supported>,,stalled-cycles-frontend,0,100.00,,,,
<not supported>,,stalled-cycles-backend,0,100.00,,,,
680198,,instructions,270746,100.00,0.83,insn per cycle
,,,,,,,,
136401,,branches,270746,100.00,508.650,M/sec
6995,,branch-misses,270746,100.00,5.13,of all branches
[root@jouet ~]#

Where before it wasn't there:

cat /tmp/before
0.282628,,task-clock,282628,100.00
1,,context-switches,282628,100.00
0,,cpu-migrations,282628,100.00
52,,page-faults,282628,100.00
861213,,cycles,285354,100.00
<not supported>,,stalled-cycles-frontend,0,100.00
<not supported>,,stalled-cycles-backend,0,100.00
686082,,instructions,285354,100.00
137846,,branches,285354,100.00
7142,,branch-misses,285354,100.00

* Re: [PATCH 2/6] perf, tools, stat: Implement CSV metrics output
  2016-02-18 17:00   ` Arnaldo Carvalho de Melo
@ 2016-02-18 17:39     ` Andi Kleen
  2016-02-21 16:39       ` Jiri Olsa
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2016-02-18 17:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andi Kleen, jolsa, linux-kernel, eranian, Andi Kleen

> Where before it wasn't there:
> 
> cat /tmp/before
> 0.282628,,task-clock,282628,100.00
> 1,,context-switches,282628,100.00
> 0,,cpu-migrations,282628,100.00
> 52,,page-faults,282628,100.00
> 861213,,cycles,285354,100.00
> <not supported>,,stalled-cycles-frontend,0,100.00
> <not supported>,,stalled-cycles-backend,0,100.00
> 686082,,instructions,285354,100.00
> 137846,,branches,285354,100.00
> 7142,,branch-misses,285354,100.00

This is intentional. See the standard perf output:


          521,232      instructions              #    0.63  insns per cycle        
                                                 #    1.13  stalled cycles per insn

So this line has multiple metrics. In CSV this is expressed as a mostly empty line.
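
For a script consuming this output, one way to handle such rows is to
treat a row with an empty event field as an extra metric for the preceding
event. A minimal sketch, assuming the field order seen in the examples in
this thread (value, unit, event, run time, percentage, metric value, metric
unit) and ',' as separator; the helper names are hypothetical and not part
of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 16

/* Assumed field positions, taken from the example rows in this thread. */
enum { F_VALUE, F_UNIT, F_EVENT, F_RUN, F_PCT, F_METRIC_VAL, F_METRIC_UNIT };

enum row_kind { ROW_EVENT, ROW_EXTRA_METRIC, ROW_EMPTY };

/*
 * Split 'line' in place on 'sep', keeping empty fields (strtok() would
 * merge adjacent separators). Returns the number of fields stored.
 */
static int split_fields(char *line, char sep, char *fields[], int max)
{
	int n = 0;
	char *p = line;

	assert(max > 0);
	fields[n++] = p;
	while ((p = strchr(p, sep)) != NULL && n < max) {
		*p++ = '\0';
		fields[n++] = p;
	}
	return n;
}

/*
 * A row naming an event starts a new record. A row without an event but
 * with a metric value belongs to the previous record. Anything else
 * carries no data.
 */
static enum row_kind classify_row(char *fields[], int n)
{
	if (n > F_EVENT && fields[F_EVENT][0] != '\0')
		return ROW_EVENT;
	if (n > F_METRIC_VAL && fields[F_METRIC_VAL][0] != '\0')
		return ROW_EXTRA_METRIC;
	return ROW_EMPTY;
}
```

With this classification, a row consisting only of separators is reported
as ROW_EMPTY and can be skipped by the consumer.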

-Andi

* [tip:perf/core] perf stat: Handled scaled == -1 case for counters
  2016-02-17 22:44 ` [PATCH 1/6] perf, tools, stat: Handled scaled == -1 case for counters Andi Kleen
@ 2016-02-20 11:35   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Andi Kleen @ 2016-02-20 11:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, eranian, jolsa, ak, tglx, mingo, linux-kernel, acme

Commit-ID:  b002f3bbd321993c1a6d56b86544065420156ab9
Gitweb:     http://git.kernel.org/tip/b002f3bbd321993c1a6d56b86544065420156ab9
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 17 Feb 2016 14:44:00 -0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:45 -0300

perf stat: Handled scaled == -1 case for counters

Arnaldo pointed out that the earlier cb110f471025 ("perf stat: Move
noise/running printing into printout") change changed behavior for not
counted counters. This patch fixes it again.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: cb110f471025 ("perf stat: Move noise/running printing into printout")
Link: http://lkml.kernel.org/r/1455749045-18098-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 15e4fcf..86289df 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -860,7 +860,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	nl = new_line_std;
 
-	if (run == 0 || ena == 0) {
+	if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
 		aggr_printout(counter, id, nr);
 
 		fprintf(stat_config.output, "%*s%s",

* Re: [PATCH 2/6] perf, tools, stat: Implement CSV metrics output
  2016-02-18 17:39     ` Andi Kleen
@ 2016-02-21 16:39       ` Jiri Olsa
  2016-02-22 16:26         ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2016-02-21 16:39 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Arnaldo Carvalho de Melo, jolsa, linux-kernel, eranian,
	Andi Kleen

On Thu, Feb 18, 2016 at 06:39:21PM +0100, Andi Kleen wrote:
> > Where before it wasn't there:
> > 
> > cat /tmp/before
> > 0.282628,,task-clock,282628,100.00
> > 1,,context-switches,282628,100.00
> > 0,,cpu-migrations,282628,100.00
> > 52,,page-faults,282628,100.00
> > 861213,,cycles,285354,100.00
> > <not supported>,,stalled-cycles-frontend,0,100.00
> > <not supported>,,stalled-cycles-backend,0,100.00
> > 686082,,instructions,285354,100.00
> > 137846,,branches,285354,100.00
> > 7142,,branch-misses,285354,100.00
> 
> This is intentional. See the standard perf output:
> 
> 
>           521,232      instructions              #    0.63  insns per cycle        
>                                                  #    1.13  stalled cycles per insn
> 
> So this line has multiple metrics. In CSV this is expressed as a mostly empty line.

It's intentional when you have data from the stalled-cycles counter.
On a CPU where that counter is not supported, you get a blank line:

0.186177,,task-clock,186177,100.00,0.448,CPUs utilized
0,,context-switches,186177,100.00,0.000,K/sec
0,,cpu-migrations,186177,100.00,0.000,K/sec
43,,page-faults,186177,100.00,0.231,M/sec
567286,,cycles,187628,100.00,3.047,GHz
<not supported>,,stalled-cycles-frontend,0,100.00,,,,
<not supported>,,stalled-cycles-backend,0,100.00,,,,
456664,,instructions,187628,100.00,0.80,insn per cycle
,,,,,,,,
89069,,branches,187628,100.00,478.410,M/sec
3360,,branch-misses,187628,100.00,3.77,of all branches


which I think is wrong, and we should not print it.

jirka

* Re: [PATCH 2/6] perf, tools, stat: Implement CSV metrics output
  2016-02-17 22:44 ` [PATCH 2/6] perf, tools, stat: Implement CSV metrics output Andi Kleen
  2016-02-18 17:00   ` Arnaldo Carvalho de Melo
@ 2016-02-21 16:39   ` Jiri Olsa
  1 sibling, 0 replies; 20+ messages in thread
From: Jiri Olsa @ 2016-02-21 16:39 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, eranian, Andi Kleen

On Wed, Feb 17, 2016 at 02:44:01PM -0800, Andi Kleen wrote:

SNIP

> ---
>  tools/perf/builtin-stat.c | 76 ++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 72 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 86289df..6c2c1d2 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -739,6 +739,8 @@ struct outstate {
>  	FILE *fh;
>  	bool newline;
>  	const char *prefix;
> +	int  nfields;
> +	u64  run, ena;

what are outstate's ena and run being used for?

thanks,
jirka

* Re: [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-17 22:44 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
@ 2016-02-21 17:15   ` Jiri Olsa
  2016-02-22 16:52     ` Andi Kleen
  2016-02-21 17:18   ` Jiri Olsa
  2016-02-21 17:22   ` Jiri Olsa
  2 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2016-02-21 17:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, eranian, Andi Kleen

On Wed, Feb 17, 2016 at 02:44:02PM -0800, Andi Kleen wrote:

SNIP

> @@ -892,7 +908,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
>  	struct perf_stat_output_ctx out;
>  	struct outstate os = {
>  		.fh = stat_config.output,
> -		.prefix = prefix ? prefix : ""
> +		.prefix = prefix ? prefix : "",
> +		.id = id,
> +		.nr = nr,
> +		.evsel = counter,
>  	};
>  	print_metric_t pm = print_metric_std;
>  	void (*nl)(void *);
> @@ -962,15 +981,38 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
>  
>  	perf_stat__print_shadow_stats(counter, uval,
>  				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
> -				cpu_map__id_to_cpu(id),
> +				first_shadow_cpu(counter, id),

hum, IIUC you need to handle AGGR_NONE in here as well?

thanks,
jirka

* Re: [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-17 22:44 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
  2016-02-21 17:15   ` Jiri Olsa
@ 2016-02-21 17:18   ` Jiri Olsa
  2016-02-21 17:22   ` Jiri Olsa
  2 siblings, 0 replies; 20+ messages in thread
From: Jiri Olsa @ 2016-02-21 17:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, eranian, Andi Kleen

On Wed, Feb 17, 2016 at 02:44:02PM -0800, Andi Kleen wrote:

SNIP

>  
>  	perf_stat__print_shadow_stats(counter, uval,
>  				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
> -				cpu_map__id_to_cpu(id),
> +				first_shadow_cpu(counter, id),
>  				&out);
> -
>  	if (!csv_output) {
>  		print_noise(counter, noise);
>  		print_running(run, ena);
>  	}
>  }
>  
> +static void aggr_update_shadow(void)
> +{
> +	int cpu, cpu2, s2, id, s;
> +	u64 val;
> +	struct perf_evsel *counter;
> +
> +	for (s = 0; s < aggr_map->nr; s++) {
> +		id = aggr_map->map[s];
> +		evlist__for_each(evsel_list, counter) {
> +			val = 0;
> +			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
> +				cpu2 = perf_evsel__cpus(counter)->map[cpu];
> +				s2 = aggr_get_id(evsel_list->cpus, cpu2);

I think you need to pass cpu's 'idx' into aggr_get_id,
because it will do evsel_list->cpus[cpu2] for you
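
The difference only shows up when the CPU map is sparse. A small
self-contained sketch with a toy map follows; the struct and helpers are
hypothetical stand-ins, not the perf cpu_map API. A lookup that takes an
index and dereferences the map itself must not be handed a CPU number that
was already translated.

```c
#include <assert.h>

/* Toy stand-in for a CPU map: map[idx] holds a CPU number. */
struct toy_cpu_map {
	int nr;
	int map[8];
};

/* Toy topology: two CPUs per socket, so socket = cpu / 2. */
static int toy_cpu_to_socket(int cpu)
{
	return cpu / 2;
}

/*
 * Takes an index into the map and does the map[idx] lookup itself.
 * Returns -1 for an out-of-range index.
 */
static int toy_get_socket(const struct toy_cpu_map *m, int idx)
{
	if (idx < 0 || idx >= m->nr)
		return -1;
	return toy_cpu_to_socket(m->map[idx]);
}
```

With a map holding only CPUs 4 and 6, index 0 resolves to socket 2, whereas
passing the CPU number 4 as the index falls outside the two-entry map.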

jirka

* Re: [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-17 22:44 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
  2016-02-21 17:15   ` Jiri Olsa
  2016-02-21 17:18   ` Jiri Olsa
@ 2016-02-21 17:22   ` Jiri Olsa
  2016-02-26 23:53     ` Andi Kleen
  2 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2016-02-21 17:22 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, eranian, Andi Kleen

On Wed, Feb 17, 2016 at 02:44:02PM -0800, Andi Kleen wrote:

SNIP

> +static void aggr_update_shadow(void)
> +{
> +	int cpu, cpu2, s2, id, s;
> +	u64 val;
> +	struct perf_evsel *counter;
> +
> +	for (s = 0; s < aggr_map->nr; s++) {
> +		id = aggr_map->map[s];
> +		evlist__for_each(evsel_list, counter) {
> +			val = 0;
> +			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
> +				cpu2 = perf_evsel__cpus(counter)->map[cpu];
> +				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> +				if (s2 != id)
> +					continue;
> +				val += perf_counts(counter->counts, cpu, 0)->val;
> +			}
> +			val = val * counter->scale;
> +			perf_stat__update_shadow_stats(counter, &val,
> +						       first_shadow_cpu(counter, id));
> +		}
> +	}
> +}


> +
>  static void print_aggr(char *prefix)
>  {
>  	FILE *output = stat_config.output;
> @@ -982,6 +1024,8 @@ static void print_aggr(char *prefix)
>  	if (!(aggr_map || aggr_get_id))
>  		return;
>  
> +	aggr_update_shadow();

This should be called from perf_stat_process_counter,
not from the display function.

Also, please document somewhere (ideally next to the shadow stats variables)
which CPUs (array members) are used for each AGGR_* mode.

thanks,
jirka

* Re: [PATCH 2/6] perf, tools, stat: Implement CSV metrics output
  2016-02-21 16:39       ` Jiri Olsa
@ 2016-02-22 16:26         ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-22 16:26 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Arnaldo Carvalho de Melo, jolsa, linux-kernel,
	eranian, Andi Kleen

On Sun, Feb 21, 2016 at 05:39:40PM +0100, Jiri Olsa wrote:
> On Thu, Feb 18, 2016 at 06:39:21PM +0100, Andi Kleen wrote:
> > > Where before it wasn't there:
> > > 
> > > cat /tmp/before
> > > 0.282628,,task-clock,282628,100.00
> > > 1,,context-switches,282628,100.00
> > > 0,,cpu-migrations,282628,100.00
> > > 52,,page-faults,282628,100.00
> > > 861213,,cycles,285354,100.00
> > > <not supported>,,stalled-cycles-frontend,0,100.00
> > > <not supported>,,stalled-cycles-backend,0,100.00
> > > 686082,,instructions,285354,100.00
> > > 137846,,branches,285354,100.00
> > > 7142,,branch-misses,285354,100.00
> > 
> > This is intentional. See the standard perf output:
> > 
> > 
> >           521,232      instructions              #    0.63  insns per cycle        
> >                                                  #    1.13  stalled cycles per insn
> > 
> > So this line has multiple metrics. In CSV this is expressed as a mostly empty line.
> 
> it's intentional if you have data from stalled cycles counter
> on cpu where this one is non supported you get blank line:

I fixed this now by probing for the stalled cycles counters in advance.
That avoids a couple of other issues too, like the empty columns in
--metric-only, and even makes the output of standard perf stat
shorter.

-Andi

* Re: [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-21 17:15   ` Jiri Olsa
@ 2016-02-22 16:52     ` Andi Kleen
  2016-02-23  7:37       ` Jiri Olsa
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2016-02-22 16:52 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, jolsa, linux-kernel, eranian, Andi Kleen

On Sun, Feb 21, 2016 at 06:15:35PM +0100, Jiri Olsa wrote:
> On Wed, Feb 17, 2016 at 02:44:02PM -0800, Andi Kleen wrote:
> 
> SNIP
> 
> > @@ -892,7 +908,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
> >  	struct perf_stat_output_ctx out;
> >  	struct outstate os = {
> >  		.fh = stat_config.output,
> > -		.prefix = prefix ? prefix : ""
> > +		.prefix = prefix ? prefix : "",
> > +		.id = id,
> > +		.nr = nr,
> > +		.evsel = counter,
> >  	};
> >  	print_metric_t pm = print_metric_std;
> >  	void (*nl)(void *);
> > @@ -962,15 +981,38 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
> >  
> >  	perf_stat__print_shadow_stats(counter, uval,
> >  				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
> > -				cpu_map__id_to_cpu(id),
> > +				first_shadow_cpu(counter, id),
> 
> hum, IIUC you need to handle AGGR_NONE in here as well?

AFAIK it works. aggr_get_id in first_shadow_cpu and cpu_map__id_to_cpu
handle this case, right?

-Andi

* Re: [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-22 16:52     ` Andi Kleen
@ 2016-02-23  7:37       ` Jiri Olsa
  0 siblings, 0 replies; 20+ messages in thread
From: Jiri Olsa @ 2016-02-23  7:37 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, eranian, Andi Kleen

On Mon, Feb 22, 2016 at 05:52:02PM +0100, Andi Kleen wrote:
> On Sun, Feb 21, 2016 at 06:15:35PM +0100, Jiri Olsa wrote:
> > On Wed, Feb 17, 2016 at 02:44:02PM -0800, Andi Kleen wrote:
> > 
> > SNIP
> > 
> > > @@ -892,7 +908,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
> > >  	struct perf_stat_output_ctx out;
> > >  	struct outstate os = {
> > >  		.fh = stat_config.output,
> > > -		.prefix = prefix ? prefix : ""
> > > +		.prefix = prefix ? prefix : "",
> > > +		.id = id,
> > > +		.nr = nr,
> > > +		.evsel = counter,
> > >  	};
> > >  	print_metric_t pm = print_metric_std;
> > >  	void (*nl)(void *);
> > > @@ -962,15 +981,38 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
> > >  
> > >  	perf_stat__print_shadow_stats(counter, uval,
> > >  				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
> > > -				cpu_map__id_to_cpu(id),
> > > +				first_shadow_cpu(counter, id),
> > 
> > hum, IIUC you need to handle AGGR_NONE in here as well?
> 
> AFAIK it works. aggr_get_id in first_shadow_cpu and cpu_map__id_to_cpu
> handle this case, right?

It does not look like it. However, it'll be clearer once
there's the doc/comment about which CPUs are used for each aggr mode,
which I asked for here:

  http://marc.info/?l=linux-kernel&m=145607533503803&w=2

thanks,
jirka

* Re: [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-21 17:22   ` Jiri Olsa
@ 2016-02-26 23:53     ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-26 23:53 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, jolsa, linux-kernel, eranian, Andi Kleen

> > +
> >  static void print_aggr(char *prefix)
> >  {
> >  	FILE *output = stat_config.output;
> > @@ -982,6 +1024,8 @@ static void print_aggr(char *prefix)
> >  	if (!(aggr_map || aggr_get_id))
> >  		return;
> >  
> > +	aggr_update_shadow();
> 
> this should be called from perf_stat_process_counter,
> not from display function

I tried it, but the function needs a lot of stuff (aggr_map,
evsel_list) that only exists in builtin-stat. Passing all
that around is quite complicated and intrusive.

I left it alone for now.

-Andi

* [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode
  2016-02-27  0:27 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
@ 2016-02-27  0:27 ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2016-02-27  0:27 UTC (permalink / raw)
  To: acme; +Cc: jolsa, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Enable metrics printing in --per-core / --per-socket mode. We need
to save the shadow metrics in a unique place. Always use the first
CPU in the aggregation. Then use the same CPU to retrieve the
shadow value later.
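
The idea can be illustrated with a toy model: sum the per-CPU values of a
group, store the sum in the slot of the group's first CPU, and read it back
from that same slot. This is only a sketch under simplifying assumptions
(fixed topology, plain sums instead of perf's stats structures); all names
are hypothetical.

```c
#include <assert.h>

#define TOY_NR_CPUS 4

/* Toy topology: CPUs 0,1 belong to group 0 and CPUs 2,3 to group 1. */
static int toy_group_of(int cpu)
{
	return cpu / 2;
}

/* First CPU belonging to 'group', or 0 if the group has no CPU. */
static int toy_first_cpu(int group)
{
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_group_of(cpu) == group)
			return cpu;
	return 0;
}

/* Sum the counts of all CPUs in 'group' into the first CPU's slot. */
static void toy_update_shadow(const unsigned long long counts[TOY_NR_CPUS],
			      unsigned long long shadow[TOY_NR_CPUS],
			      int group)
{
	unsigned long long sum = 0;
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_group_of(cpu) == group)
			sum += counts[cpu];
	shadow[toy_first_cpu(group)] = sum;
}

/* Read back from the same slot that toy_update_shadow() wrote. */
static unsigned long long toy_read_shadow(const unsigned long long shadow[TOY_NR_CPUS],
					   int group)
{
	return shadow[toy_first_cpu(group)];
}
```

Writer and reader agree on the slot because both derive it from the same
helper, which is the role first_shadow_cpu() plays in the patch below.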

Example output:

% perf stat --per-core -a ./BC1s

 Performance counter stats for 'system wide':

S0-C0           2        2966.020381      task-clock (msec)         #    2.004 CPUs utilized            (100.00%)
S0-C0           2                 49      context-switches          #    0.017 K/sec                    (100.00%)
S0-C0           2                  4      cpu-migrations            #    0.001 K/sec                    (100.00%)
S0-C0           2                467      page-faults               #    0.157 K/sec
S0-C0           2      4,599,061,773      cycles                    #    1.551 GHz                      (100.00%)
S0-C0           2      9,755,886,883      instructions              #    2.12  insn per cycle           (100.00%)
S0-C0           2      1,906,272,125      branches                  #  642.704 M/sec                    (100.00%)
S0-C0           2         81,180,867      branch-misses             #    4.26% of all branches
S0-C1           2        2965.995373      task-clock (msec)         #    2.003 CPUs utilized            (100.00%)
S0-C1           2                 62      context-switches          #    0.021 K/sec                    (100.00%)
S0-C1           2                  8      cpu-migrations            #    0.003 K/sec                    (100.00%)
S0-C1           2                281      page-faults               #    0.095 K/sec
S0-C1           2          6,347,290      cycles                    #    0.002 GHz                      (100.00%)
S0-C1           2          4,654,156      instructions              #    0.73  insn per cycle           (100.00%)
S0-C1           2            947,121      branches                  #    0.319 M/sec                    (100.00%)
S0-C1           2             37,322      branch-misses             #    3.94% of all branches

       1.480409747 seconds time elapsed

v2: Rebase to older patches
v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-stat.c     | 61 +++++++++++++++++++++++++++++++++++++------
 tools/perf/util/stat-shadow.c |  7 +++++
 2 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2ffb822..c79e571 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -740,6 +740,8 @@ struct outstate {
 	bool newline;
 	const char *prefix;
 	int  nfields;
+	int  id, nr;
+	struct perf_evsel *evsel;
 };
 
 #define METRIC_LEN  35
@@ -755,12 +757,9 @@ static void do_new_line_std(struct outstate *os)
 {
 	fputc('\n', os->fh);
 	fputs(os->prefix, os->fh);
+	aggr_printout(os->evsel, os->id, os->nr);
 	if (stat_config.aggr_mode == AGGR_NONE)
 		fprintf(os->fh, "        ");
-	if (stat_config.aggr_mode == AGGR_CORE)
-		fprintf(os->fh, "                  ");
-	if (stat_config.aggr_mode == AGGR_SOCKET)
-		fprintf(os->fh, "            ");
 	fprintf(os->fh, "                                                 ");
 }
 
@@ -798,6 +797,7 @@ static void new_line_csv(void *ctx)
 	fputc('\n', os->fh);
 	if (os->prefix)
 		fprintf(os->fh, "%s%s", os->prefix, csv_sep);
+	aggr_printout(os->evsel, os->id, os->nr);
 	for (i = 0; i < os->nfields; i++)
 		fputs(csv_sep, os->fh);
 }
@@ -855,6 +855,25 @@ static void nsec_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 		fprintf(output, "%s%s", csv_sep, evsel->cgrp->name);
 }
 
+static int first_shadow_cpu(struct perf_evsel *evsel, int id)
+{
+	int i;
+
+	if (stat_config.aggr_mode == AGGR_NONE)
+		return id;
+
+	if (stat_config.aggr_mode == AGGR_GLOBAL)
+		return 0;
+
+	for (i = 0; i < perf_evsel__nr_cpus(evsel); i++) {
+		int cpu2 = perf_evsel__cpus(evsel)->map[i];
+
+		if (aggr_get_id(evsel_list->cpus, cpu2) == id)
+			return cpu2;
+	}
+	return 0;
+}
+
 static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 {
 	FILE *output = stat_config.output;
@@ -891,7 +910,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 	struct perf_stat_output_ctx out;
 	struct outstate os = {
 		.fh = stat_config.output,
-		.prefix = prefix ? prefix : ""
+		.prefix = prefix ? prefix : "",
+		.id = id,
+		.nr = nr,
+		.evsel = counter,
 	};
 	print_metric_t pm = print_metric_std;
 	void (*nl)(void *);
@@ -958,16 +980,37 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 	}
 
 	perf_stat__print_shadow_stats(counter, uval,
-				stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
-				cpu_map__id_to_cpu(id),
+				first_shadow_cpu(counter, id),
 				&out);
-
 	if (!csv_output) {
 		print_noise(counter, noise);
 		print_running(run, ena);
 	}
 }
 
+static void aggr_update_shadow(void)
+{
+	int cpu, s2, id, s;
+	u64 val;
+	struct perf_evsel *counter;
+
+	for (s = 0; s < aggr_map->nr; s++) {
+		id = aggr_map->map[s];
+		evlist__for_each(evsel_list, counter) {
+			val = 0;
+			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+				s2 = aggr_get_id(evsel_list->cpus, cpu);
+				if (s2 != id)
+					continue;
+				val += perf_counts(counter->counts, cpu, 0)->val;
+			}
+			val = val * counter->scale;
+			perf_stat__update_shadow_stats(counter, &val,
+						       first_shadow_cpu(counter, id));
+		}
+	}
+}
+
 static void print_aggr(char *prefix)
 {
 	FILE *output = stat_config.output;
@@ -979,6 +1022,8 @@ static void print_aggr(char *prefix)
 	if (!(aggr_map || aggr_get_id))
 		return;
 
+	aggr_update_shadow();
+
 	for (s = 0; s < aggr_map->nr; s++) {
 		id = aggr_map->map[s];
 		evlist__for_each(evsel_list, counter) {
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 4d8f185..78d7347 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -14,6 +14,13 @@ enum {
 
 #define NUM_CTX CTX_BIT_MAX
 
+/*
+ * AGGR_GLOBAL: Use CPU 0
+ * AGGR_SOCKET: Use first CPU of socket
+ * AGGR_CORE: Use first CPU of core
+ * AGGR_NONE: Use matching CPU
+ * AGGR_THREAD: Not supported?
+ */
 static struct stats runtime_nsecs_stats[MAX_NR_CPUS];
 static struct stats runtime_cycles_stats[NUM_CTX][MAX_NR_CPUS];
 static struct stats runtime_stalled_cycles_front_stats[NUM_CTX][MAX_NR_CPUS];
-- 
2.5.0
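
For readers following the comment added to stat-shadow.c above, here is a
minimal standalone sketch of how a shadow-stats slot could be picked per
aggregation mode. It is not perf code: the enum, struct cpu_topo,
topo_aggr_id() and shadow_cpu_for() are hypothetical names made up for
illustration, and the flat topology table is an assumption. Only the
per-mode rule is meant to mirror first_shadow_cpu() in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for perf's aggregation modes. */
enum aggr_mode { AGGR_NONE, AGGR_GLOBAL, AGGR_SOCKET, AGGR_CORE };

/* Hypothetical topology table: one entry per monitored CPU. */
struct cpu_topo {
	int cpu;	/* CPU number */
	int socket;	/* socket id */
	int core;	/* core id, assumed unique across sockets here */
};

/* Aggregation id of one CPU; plays the role of aggr_get_id() in the patch. */
static int topo_aggr_id(enum aggr_mode mode, const struct cpu_topo *t)
{
	return mode == AGGR_SOCKET ? t->socket : t->core;
}

/*
 * Pick the CPU whose shadow-stats slot is used for aggregation id 'id':
 *   AGGR_NONE:   the id is already the CPU
 *   AGGR_GLOBAL: always CPU 0
 *   AGGR_SOCKET: first CPU in the table that belongs to the socket
 *   AGGR_CORE:   first CPU in the table that belongs to the core
 * Falls back to CPU 0 when no CPU matches, as the patch does.
 */
static int shadow_cpu_for(enum aggr_mode mode, const struct cpu_topo *topo,
			  size_t nr, int id)
{
	size_t i;

	assert(topo != NULL || nr == 0);

	if (mode == AGGR_NONE)
		return id;
	if (mode == AGGR_GLOBAL)
		return 0;

	for (i = 0; i < nr; i++) {
		if (topo_aggr_id(mode, &topo[i]) == id)
			return topo[i].cpu;
	}
	return 0;
}
```

The point of the rule is that the update side and the print side must agree
on one slot per socket or core; using the first matching CPU for both keeps
them consistent without adding a separate per-socket array.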


end of thread, other threads:[~2016-02-27  0:28 UTC | newest]

Thread overview: 20+ messages
2016-02-17 22:43 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
2016-02-17 22:44 ` [PATCH 1/6] perf, tools, stat: Handled scaled == -1 case for counters Andi Kleen
2016-02-20 11:35   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2016-02-17 22:44 ` [PATCH 2/6] perf, tools, stat: Implement CSV metrics output Andi Kleen
2016-02-18 17:00   ` Arnaldo Carvalho de Melo
2016-02-18 17:39     ` Andi Kleen
2016-02-21 16:39       ` Jiri Olsa
2016-02-22 16:26         ` Andi Kleen
2016-02-21 16:39   ` Jiri Olsa
2016-02-17 22:44 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
2016-02-21 17:15   ` Jiri Olsa
2016-02-22 16:52     ` Andi Kleen
2016-02-23  7:37       ` Jiri Olsa
2016-02-21 17:18   ` Jiri Olsa
2016-02-21 17:22   ` Jiri Olsa
2016-02-26 23:53     ` Andi Kleen
2016-02-17 22:44 ` [PATCH 4/6] perf, tools, stat: Document CSV format in manpage Andi Kleen
2016-02-17 22:44 ` [PATCH 5/6] perf, tools, stat: Implement --metric-only mode Andi Kleen
2016-02-17 22:44 ` [PATCH 6/6] perf, tools, stat: Add --metric-only support for -A Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2016-02-27  0:27 perf, tools: Refactor and support interval and CSV metrics Andi Kleen
2016-02-27  0:27 ` [PATCH 3/6] perf, tools, stat: Support metrics in --per-core/socket mode Andi Kleen
