linux-perf-users.vger.kernel.org archive mirror
* [PATCH V4 0/5] New metricgroup output in perf stat default mode
@ 2023-06-16  3:14 kan.liang
  2023-06-16  3:14 ` [PATCH V4 1/5] perf metrics: Sort the Default metricgroup kan.liang
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: kan.liang @ 2023-06-16  3:14 UTC (permalink / raw)
  To: acme, mingo, peterz, irogers, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Changes since V3:
- Move the full name (PMU + metricgroup name) generation from the metric
  code to the output code. (Ian)
- Add default tags for Hisi hip08 L1 metrics (John)
- Some patches have been merged. Drop them from the V4.

Changes since V2:
- Fix memory leak (Ian)
  (Ian, I cannot reproduce the memory leak on all my machines. Please
   check whether the fix works on your side. Thanks.)
- Add Reviewed-by tags for several patches.

Changes since V1:
- Remove EVSEL_EVENT_MASK and use the __evsel__match which is suggested
  by Ian.
- Support TopdownL1 on both e-core and p-core of ADL in the default
  mode. (Ian)
- Have separate patches for the modifications of metricgroup and output.
  (Ian)
- Do a second sort for the Default metricgroup. Remove the logic of
  changing the associated metric event. (Ian)
- Move all the metric related code to stat-shadow (Ian)
- Move the common functions between stat+csv_output and stat+std_output
  to the lib directory (Ian)

In the default mode, the current output of the metricgroup includes both
events and metrics, which is unnecessary and makes the output hard to
read. Also, different ARCHs (even different generations of the same ARCH)
may have a different output format because of the different events in a
metric.

This series proposes a new output format which outputs only the value
of each metric and the metricgroup name. It brings a clean and
consistent output format across ARCHs and generations.

Patches 1-2 introduce the new metricgroup output.

Patches 3-4 improve the tests to cover the default mode.

Patch 5 updates the event list for Hisi hip08.

Here are some examples of the new output.

STD output:

On SPR

perf stat -a sleep 1

 Performance counter stats for 'system wide':

        226,054.13 msec cpu-clock                        #  224.588 CPUs utilized
               932      context-switches                 #    4.123 /sec
               224      cpu-migrations                   #    0.991 /sec
                76      page-faults                      #    0.336 /sec
        45,940,682      cycles                           #    0.000 GHz
        36,676,047      instructions                     #    0.80  insn per cycle
         7,044,516      branches                         #   31.163 K/sec
            62,169      branch-misses                    #    0.88% of all branches
                        TopdownL1                 #     68.7 %  tma_backend_bound
                                                  #      3.1 %  tma_bad_speculation
                                                  #     13.0 %  tma_frontend_bound
                                                  #     15.2 %  tma_retiring
                        TopdownL2                 #      2.7 %  tma_branch_mispredicts
                                                  #     19.6 %  tma_core_bound
                                                  #      4.8 %  tma_fetch_bandwidth
                                                  #      8.3 %  tma_fetch_latency
                                                  #      2.9 %  tma_heavy_operations
                                                  #     12.3 %  tma_light_operations
                                                  #      0.4 %  tma_machine_clears
                                                  #     49.1 %  tma_memory_bound

       1.006529767 seconds time elapsed

On hybrid

perf stat -a sleep 1

 Performance counter stats for 'system wide':

         32,127.99 msec cpu-clock                        #   31.992 CPUs utilized
               240      context-switches                 #    7.470 /sec
                32      cpu-migrations                   #    0.996 /sec
                74      page-faults                      #    2.303 /sec
         6,313,960      cpu_core/cycles/                 #    0.000 GHz
       257,711,907      cpu_atom/cycles/                 #    0.008 GHz                         (54.18%)
         4,477,162      cpu_core/instructions/           #    0.71  insn per cycle
        37,721,481      cpu_atom/instructions/           #    5.97  insn per cycle              (63.33%)
           809,747      cpu_core/branches/               #   25.204 K/sec
         6,621,226      cpu_atom/branches/               #  206.089 K/sec                       (63.32%)
            39,667      cpu_core/branch-misses/          #    4.90% of all branches
         1,032,146      cpu_atom/branch-misses/          #  127.47% of all branches             (63.33%)
             TopdownL1 (cpu_core)                 #      nan %  tma_backend_bound
                                                  #      0.0 %  tma_bad_speculation
                                                  #      nan %  tma_frontend_bound
                                                  #      nan %  tma_retiring
             TopdownL1 (cpu_atom)                 #     13.6 %  tma_bad_speculation      (63.36%)
                                                  #     41.1 %  tma_frontend_bound       (63.54%)
                                                  #     39.2 %  tma_backend_bound
                                                  #     39.2 %  tma_backend_bound_aux    (63.93%)
                                                  #      5.4 %  tma_retiring             (64.15%)

       1.004244114 seconds time elapsed
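
The `TopdownL1 (cpu_core)` and `TopdownL1 (cpu_atom)` headers above combine the metricgroup name with the PMU name, which the series does only when there is more than one core PMU. A minimal standalone sketch of that naming rule follows. The function name `metricgroup_display_name` and the `num_core_pmus` parameter are hypothetical, and plain `snprintf()` stands in for the `scnprintf()` call used in the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of perf: build the displayed
 * metricgroup name. With more than one core PMU (hybrid), the PMU
 * name is appended in parentheses; otherwise the bare metricgroup
 * name is used.
 */
int metricgroup_display_name(char *buf, size_t size,
			     const char *group, const char *pmu,
			     int num_core_pmus)
{
	if (num_core_pmus > 1)
		return snprintf(buf, size, "%s (%s)", group, pmu);

	return snprintf(buf, size, "%s", group);
}
```

Calling it with `"TopdownL1"`, `"cpu_atom"` and a PMU count of 2 fills the buffer with `TopdownL1 (cpu_atom)`; with a PMU count of 1 the buffer holds just `TopdownL1`.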

JSON output

On SPR

perf stat --json -a sleep 1
{"counter-value" : "225904.823297", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 225904323425, "pcnt-running" : 100.00, "metric-value" : "224.456872", "metric-unit" : "CPUs utilized"}
{"counter-value" : "986.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 225904108985, "pcnt-running" : 100.00, "metric-value" : "4.364670", "metric-unit" : "/sec"}
{"counter-value" : "224.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 225904016141, "pcnt-running" : 100.00, "metric-value" : "0.991568", "metric-unit" : "/sec"}
{"counter-value" : "76.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 225903913270, "pcnt-running" : 100.00, "metric-value" : "0.336425", "metric-unit" : "/sec"}
{"counter-value" : "48433482.000000", "unit" : "", "event" : "cycles", "event-runtime" : 225903792732, "pcnt-running" : 100.00, "metric-value" : "0.000214", "metric-unit" : "GHz"}
{"counter-value" : "38620409.000000", "unit" : "", "event" : "instructions", "event-runtime" : 225903657830, "pcnt-running" : 100.00, "metric-value" : "0.797391", "metric-unit" : "insn per cycle"}
{"counter-value" : "7369473.000000", "unit" : "", "event" : "branches", "event-runtime" : 225903464328, "pcnt-running" : 100.00, "metric-value" : "32.622026", "metric-unit" : "K/sec"}
{"counter-value" : "54747.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 225903234523, "pcnt-running" : 100.00, "metric-value" : "0.742889", "metric-unit" : "of all branches"}
{"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1"}
{"metric-value" : "69.950631", "metric-unit" : "%  tma_backend_bound"}
{"metric-value" : "2.771783", "metric-unit" : "%  tma_bad_speculation"}
{"metric-value" : "12.026074", "metric-unit" : "%  tma_frontend_bound"}
{"metric-value" : "15.251513", "metric-unit" : "%  tma_retiring"}
{"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL2"}
{"metric-value" : "2.351757", "metric-unit" : "%  tma_branch_mispredicts"}
{"metric-value" : "19.729771", "metric-unit" : "%  tma_core_bound"}
{"metric-value" : "4.555207", "metric-unit" : "%  tma_fetch_bandwidth"}
{"metric-value" : "7.470867", "metric-unit" : "%  tma_fetch_latency"}
{"metric-value" : "2.938808", "metric-unit" : "%  tma_heavy_operations"}
{"metric-value" : "12.312705", "metric-unit" : "%  tma_light_operations"}
{"metric-value" : "0.420026", "metric-unit" : "%  tma_machine_clears"}
{"metric-value" : "50.220860", "metric-unit" : "%  tma_memory_bound"}

On hybrid

perf stat --json -a sleep 1
{"counter-value" : "32131.530625", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 32131536951, "pcnt-running" : 100.00, "metric-value" : "31.992642", "metric-unit" : "CPUs utilized"}
{"counter-value" : "328.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 32131525778, "pcnt-running" : 100.00, "metric-value" : "10.208042", "metric-unit" : "/sec"}
{"counter-value" : "32.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 32131515104, "pcnt-running" : 100.00, "metric-value" : "0.995906", "metric-unit" : "/sec"}
{"counter-value" : "353.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 32131501396, "pcnt-running" : 100.00, "metric-value" : "10.986094", "metric-unit" : "/sec"}
{"counter-value" : "18685492.000000", "unit" : "", "event" : "cpu_core/cycles/", "event-runtime" : 16061585292, "pcnt-running" : 100.00, "metric-value" : "0.000582", "metric-unit" : "GHz"}
{"counter-value" : "255620352.000000", "unit" : "", "event" : "cpu_atom/cycles/", "event-runtime" : 8690268422, "pcnt-running" : 54.00, "metric-value" : "0.007955", "metric-unit" : "GHz"}
{"counter-value" : "15489913.000000", "unit" : "", "event" : "cpu_core/instructions/", "event-runtime" : 16061582200, "pcnt-running" : 100.00, "metric-value" : "0.828981", "metric-unit" : "insn per cycle"}
{"counter-value" : "38790161.000000", "unit" : "", "event" : "cpu_atom/instructions/", "event-runtime" : 10163133324, "pcnt-running" : 63.00, "metric-value" : "2.075951", "metric-unit" : "insn per cycle"}
{"counter-value" : "2908031.000000", "unit" : "", "event" : "cpu_core/branches/", "event-runtime" : 16061563416, "pcnt-running" : 100.00, "metric-value" : "90.503967", "metric-unit" : "K/sec"}
{"counter-value" : "6814948.000000", "unit" : "", "event" : "cpu_atom/branches/", "event-runtime" : 10161711336, "pcnt-running" : 63.00, "metric-value" : "212.095343", "metric-unit" : "K/sec"}
{"counter-value" : "97638.000000", "unit" : "", "event" : "cpu_core/branch-misses/", "event-runtime" : 16061535261, "pcnt-running" : 100.00, "metric-value" : "3.357530", "metric-unit" : "of all branches"}
{"counter-value" : "1017066.000000", "unit" : "", "event" : "cpu_atom/branch-misses/", "event-runtime" : 10159971797, "pcnt-running" : 63.00, "metric-value" : "34.974386", "metric-unit" : "of all branches"}
{"event-runtime" : 16061513607, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1 (cpu_core)"}
{"metric-value" : "nan", "metric-unit" : "%  tma_backend_bound"}
{"metric-value" : "0.000000", "metric-unit" : "%  tma_bad_speculation"}
{"metric-value" : "nan", "metric-unit" : "%  tma_frontend_bound"}
{"metric-value" : "nan", "metric-unit" : "%  tma_retiring"}
{"event-runtime" : 10157398501, "pcnt-running" : 63.00, "metricgroup" : "TopdownL1 (cpu_atom)"}
{"metric-value" : "13.719821", "metric-unit" : "%  tma_bad_speculation"}
{"event-runtime" : 10178698656, "pcnt-running" : 63.00, "metric-value" : "41.016738", "metric-unit" : "%  tma_frontend_bound"}
{"event-runtime" : 10240582902, "pcnt-running" : 63.00, "metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound"}
{"metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound_aux"}
{"event-runtime" : 10284284920, "pcnt-running" : 64.00, "metric-value" : "5.374638", "metric-unit" : "%  tma_retiring"}

CSV output

On SPR

perf stat -x, -a sleep 1
225851.20,msec,cpu-clock,225850700108,100.00,224.431,CPUs utilized
976,,context-switches,225850504803,100.00,4.321,/sec
224,,cpu-migrations,225850410336,100.00,0.992,/sec
76,,page-faults,225850304155,100.00,0.337,/sec
52288305,,cycles,225850188531,100.00,0.000,GHz
37977214,,instructions,225850071251,100.00,0.73,insn per cycle
7299859,,branches,225849890722,100.00,32.322,K/sec
51102,,branch-misses,225849672536,100.00,0.70,of all branches
,225849327050,100.00,,,,TopdownL1
,,,,,70.1,%  tma_backend_bound
,,,,,2.7,%  tma_bad_speculation
,,,,,12.5,%  tma_frontend_bound
,,,,,14.6,%  tma_retiring
,225849327050,100.00,,,,TopdownL2
,,,,,2.3,%  tma_branch_mispredicts
,,,,,19.6,%  tma_core_bound
,,,,,4.6,%  tma_fetch_bandwidth
,,,,,7.9,%  tma_fetch_latency
,,,,,2.9,%  tma_heavy_operations
,,,,,11.7,%  tma_light_operations
,,,,,0.5,%  tma_machine_clears
,,,,,50.5,%  tma_memory_bound

On Hybrid

perf stat -x, -a sleep 1
32139.34,msec,cpu-clock,32139351409,100.00,32.001,CPUs utilized
225,,context-switches,32139342672,100.00,7.001,/sec
32,,cpu-migrations,32139337772,100.00,0.996,/sec
72,,page-faults,32139328384,100.00,2.240,/sec
6766433,,cpu_core/cycles/,16067551558,100.00,0.000,GHz
256500230,,cpu_atom/cycles/,8695757391,54.00,0.008,GHz
4688595,,cpu_core/instructions/,16067558976,100.00,0.69,insn per cycle
37487490,,cpu_atom/instructions/,10165193856,63.00,5.54,insn per cycle
845211,,cpu_core/branches/,16067540225,100.00,26.298,K/sec
6571193,,cpu_atom/branches/,10155940853,63.00,204.459,K/sec
41359,,cpu_core/branch-misses/,16067516493,100.00,4.89,of all branches
1020231,,cpu_atom/branch-misses/,10159363620,63.00,120.71,of all branches
,16067494476,100.00,,,,TopdownL1 (cpu_core)
,,,,,,%  tma_backend_bound
,,,,,0.0,%  tma_bad_speculation
,,,,,,%  tma_frontend_bound
,,,,,,%  tma_retiring
,10160989992,63.00,,,,TopdownL1 (cpu_atom)
,,,,,13.8,%  tma_bad_speculation
,10188319019,63.00,,,41.3,%  tma_frontend_bound
,10258326591,63.00,,,38.6,%  tma_backend_bound
,,,,,38.6,%  tma_backend_bound_aux
,10282689488,64.00,,,5.4,%  tma_retiring

Kan Liang (5):
  perf metrics: Sort the Default metricgroup
  perf stat: New metricgroup output for the default mode
  perf test: Move all the check functions of stat csv output to lib
  perf test: Add test case for the standard perf stat output
  perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics

 tools/perf/builtin-stat.c                     |   1 +
 .../arch/arm64/hisilicon/hip08/metrics.json   |  12 +-
 tools/perf/tests/shell/lib/stat_output.sh     | 169 ++++++++++++++++
 tools/perf/tests/shell/stat+csv_output.sh     | 188 ++----------------
 tools/perf/tests/shell/stat+std_output.sh     | 108 ++++++++++
 tools/perf/util/evsel.h                       |   1 +
 tools/perf/util/metricgroup.c                 |  26 +++
 tools/perf/util/metricgroup.h                 |   3 +
 tools/perf/util/stat-display.c                | 108 +++++++++-
 tools/perf/util/stat-shadow.c                 | 131 ++++++++++--
 tools/perf/util/stat.h                        |  15 ++
 11 files changed, 563 insertions(+), 199 deletions(-)
 create mode 100755 tools/perf/tests/shell/lib/stat_output.sh
 create mode 100755 tools/perf/tests/shell/stat+std_output.sh

-- 
2.35.1



* [PATCH V4 1/5] perf metrics: Sort the Default metricgroup
  2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
@ 2023-06-16  3:14 ` kan.liang
  2023-06-16  5:48   ` Ian Rogers
  2023-06-16  3:14 ` [PATCH V4 2/5] perf stat: New metricgroup output for the default mode kan.liang
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: kan.liang @ 2023-06-16  3:14 UTC (permalink / raw)
  To: acme, mingo, peterz, irogers, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The new default mode will print the metrics as a metric group. The
metrics from the same metric group must be adjacent to each other in the
metric list. However, metric_list_cmp() sorts metrics by the number of
events, which does not keep them adjacent.

Add a new sort for the Default metricgroup, which sorts by
default_metricgroup_name and metric_name.
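
The two-key ordering can be illustrated outside of perf with a small sketch. It is not the patch's code: `struct metric_key` is a hypothetical stand-in for `struct metric`, `qsort()` on an array replaces `list_sort()` on a linked list, and the comparison is written in plain ascending order, whereas the patch passes the right-hand entry to `strcmp()` first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's struct metric. */
struct metric_key {
	const char *default_metricgroup_name;
	const char *metric_name;
};

/* Compare by metricgroup name first, then by metric name (ascending). */
int metric_key_cmp(const void *l, const void *r)
{
	const struct metric_key *left = l;
	const struct metric_key *right = r;
	int diff = strcmp(left->default_metricgroup_name,
			  right->default_metricgroup_name);

	if (diff)
		return diff;

	return strcmp(left->metric_name, right->metric_name);
}

/* Sort an array so that metrics of the same group become adjacent. */
void sort_default_metrics(struct metric_key *metrics, size_t nr)
{
	qsort(metrics, nr, sizeof(*metrics), metric_key_cmp);
}
```

Whatever the direction, the point of the compound key is that entries sharing a `default_metricgroup_name` end up next to each other, so the output code can print one header per group.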

Add is_default in the struct metric_event to indicate that it's from
the Default metricgroup.

Store the displayed metricgroup name of the Default metricgroup into
the metric expr for output.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 tools/perf/util/metricgroup.c | 26 ++++++++++++++++++++++++++
 tools/perf/util/metricgroup.h |  3 +++
 2 files changed, 29 insertions(+)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 8b19644ade7d..a6a5ed44a679 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -79,6 +79,7 @@ static struct rb_node *metric_event_new(struct rblist *rblist __maybe_unused,
 		return NULL;
 	memcpy(me, entry, sizeof(struct metric_event));
 	me->evsel = ((struct metric_event *)entry)->evsel;
+	me->is_default = false;
 	INIT_LIST_HEAD(&me->head);
 	return &me->nd;
 }
@@ -1160,6 +1161,25 @@ static int metric_list_cmp(void *priv __maybe_unused, const struct list_head *l,
 	return right_count - left_count;
 }
 
+/**
+ * default_metricgroup_cmp - Implements complex key for the Default metricgroup
+ *			     that first sorts by default_metricgroup_name, then
+ *			     metric_name.
+ */
+static int default_metricgroup_cmp(void *priv __maybe_unused,
+				   const struct list_head *l,
+				   const struct list_head *r)
+{
+	const struct metric *left = container_of(l, struct metric, nd);
+	const struct metric *right = container_of(r, struct metric, nd);
+	int diff = strcmp(right->default_metricgroup_name, left->default_metricgroup_name);
+
+	if (diff)
+		return diff;
+
+	return strcmp(right->metric_name, left->metric_name);
+}
+
 struct metricgroup__add_metric_data {
 	struct list_head *list;
 	const char *pmu;
@@ -1515,6 +1535,7 @@ static int parse_groups(struct evlist *perf_evlist,
 	LIST_HEAD(metric_list);
 	struct metric *m;
 	bool tool_events[PERF_TOOL_MAX] = {false};
+	bool is_default = !strcmp(str, "Default");
 	int ret;
 
 	if (metric_events_list->nr_entries == 0)
@@ -1549,6 +1570,9 @@ static int parse_groups(struct evlist *perf_evlist,
 			goto out;
 	}
 
+	if (is_default)
+		list_sort(NULL, &metric_list, default_metricgroup_cmp);
+
 	list_for_each_entry(m, &metric_list, nd) {
 		struct metric_event *me;
 		struct evsel **metric_events;
@@ -1637,6 +1661,8 @@ static int parse_groups(struct evlist *perf_evlist,
 		expr->metric_unit = m->metric_unit;
 		expr->metric_events = metric_events;
 		expr->runtime = m->pctx->sctx.runtime;
+		expr->default_metricgroup_name = m->default_metricgroup_name;
+		me->is_default = is_default;
 		list_add(&expr->nd, &me->head);
 	}
 
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index bf18274c15df..d5325c6ec8e1 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -22,6 +22,7 @@ struct cgroup;
 struct metric_event {
 	struct rb_node nd;
 	struct evsel *evsel;
+	bool is_default; /* the metric evsel from the Default metricgroup */
 	struct list_head head; /* list of metric_expr */
 };
 
@@ -55,6 +56,8 @@ struct metric_expr {
 	 * more human intelligible) and then add "MiB" afterward when displayed.
 	 */
 	const char *metric_unit;
+	/** Displayed metricgroup name of the Default metricgroup */
+	const char *default_metricgroup_name;
 	/** Null terminated array of events used by the metric. */
 	struct evsel **metric_events;
 	/** Null terminated array of referenced metrics. */
-- 
2.35.1



* [PATCH V4 2/5] perf stat: New metricgroup output for the default mode
  2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
  2023-06-16  3:14 ` [PATCH V4 1/5] perf metrics: Sort the Default metricgroup kan.liang
@ 2023-06-16  3:14 ` kan.liang
  2023-06-16  5:56   ` Ian Rogers
  2023-06-16  3:14 ` [PATCH V4 3/5] perf test: Move all the check functions of stat csv output to lib kan.liang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: kan.liang @ 2023-06-16  3:14 UTC (permalink / raw)
  To: acme, mingo, peterz, irogers, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

In the default mode, the current output of the metricgroup includes both
events and metrics, which is unnecessary and just makes the output
hard to read. Since different ARCHs (even different generations of the
same ARCH) may use different events, the output also varies across
platforms.

For a metricgroup, only outputting the value of each metric is good
enough.

Add a new field default_metricgroup in evsel to indicate an event of
the default metricgroup. For those events, printout() should print
the metricgroup name rather than each event.

Add perf_stat__skip_metric_event() to skip the evsel in the Default
metricgroup, if it's not running or not the metric event.

Add print_metricgroup_header_t to pass the functions which print the
display name of each metricgroup in the Default metricgroup. Support
all three output methods.
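
The idea of one header callback per output method can be sketched as follows. This is a simplified illustration, not the perf implementation: `enum out_mode`, `header_fn` and `pick_header_fn()` are hypothetical names, the functions write into a buffer instead of a `FILE *`, and the CSV variant assumes four leading separators and omits the running/enabled fields.

```c
#include <stdio.h>

/* Hypothetical output modes; perf derives the mode from its config. */
enum out_mode { OUT_STD, OUT_CSV, OUT_JSON };

/* Hypothetical callback type: format a metricgroup header into buf. */
typedef int (*header_fn)(char *buf, size_t size, const char *name);

int header_std(char *buf, size_t size, const char *name)
{
	/* Right-align the name in a 32-character column after one space. */
	return snprintf(buf, size, " %32s", name);
}

int header_csv(char *buf, size_t size, const char *name)
{
	/* Four empty leading fields are an assumption of this sketch. */
	return snprintf(buf, size, ",,,,%s", name);
}

int header_json(char *buf, size_t size, const char *name)
{
	/* Emits the tail of a JSON object carrying the group name. */
	return snprintf(buf, size, "\"metricgroup\" : \"%s\"}", name);
}

/* Select the header formatter once, based on the output mode. */
header_fn pick_header_fn(enum out_mode mode)
{
	switch (mode) {
	case OUT_CSV:
		return header_csv;
	case OUT_JSON:
		return header_json;
	default:
		return header_std;
	}
}
```

Selecting the callback once keeps the metric-printing code free of per-format branches; the caller only invokes whatever `pick_header_fn()` returned.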

Factor out perf_stat__print_shadow_stats_metricgroup() to print out
each metric.

On SPR
Before:

 ./perf_old stat sleep 1

 Performance counter stats for 'sleep 1':

              0.54 msec task-clock:u                     #    0.001 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
                68      page-faults:u                    #  125.445 K/sec
           540,970      cycles:u                         #    0.998 GHz
           556,325      instructions:u                   #    1.03  insn per cycle
           123,602      branches:u                       #  228.018 M/sec
             6,889      branch-misses:u                  #    5.57% of all branches
         3,245,820      TOPDOWN.SLOTS:u                  #     18.4 %  tma_backend_bound
                                                  #     17.2 %  tma_retiring
                                                  #     23.1 %  tma_bad_speculation
                                                  #     41.4 %  tma_frontend_bound
           564,859      topdown-retiring:u
         1,370,999      topdown-fe-bound:u
           603,271      topdown-be-bound:u
           744,874      topdown-bad-spec:u
            12,661      INT_MISC.UOP_DROPPING:u          #   23.357 M/sec

       1.001798215 seconds time elapsed

       0.000193000 seconds user
       0.001700000 seconds sys

After:

$ ./perf stat sleep 1

 Performance counter stats for 'sleep 1':

              0.51 msec task-clock:u                     #    0.001 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
                68      page-faults:u                    #  132.683 K/sec
           545,228      cycles:u                         #    1.064 GHz
           555,509      instructions:u                   #    1.02  insn per cycle
           123,574      branches:u                       #  241.120 M/sec
             6,957      branch-misses:u                  #    5.63% of all branches
                        TopdownL1                 #     17.5 %  tma_backend_bound
                                                  #     22.6 %  tma_bad_speculation
                                                  #     42.7 %  tma_frontend_bound
                                                  #     17.1 %  tma_retiring
                        TopdownL2                 #     21.8 %  tma_branch_mispredicts
                                                  #     11.5 %  tma_core_bound
                                                  #     13.4 %  tma_fetch_bandwidth
                                                  #     29.3 %  tma_fetch_latency
                                                  #      2.7 %  tma_heavy_operations
                                                  #     14.5 %  tma_light_operations
                                                  #      0.8 %  tma_machine_clears
                                                  #      6.1 %  tma_memory_bound

       1.001712086 seconds time elapsed

       0.000151000 seconds user
       0.001618000 seconds sys

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 tools/perf/builtin-stat.c      |   1 +
 tools/perf/util/evsel.h        |   1 +
 tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---
 tools/perf/util/stat-shadow.c  | 131 ++++++++++++++++++++++++++++++---
 tools/perf/util/stat.h         |  15 ++++
 5 files changed, 234 insertions(+), 22 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 55601b4b5c34..3f4e76f76f94 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2172,6 +2172,7 @@ static int add_default_attributes(void)
 
 			evlist__for_each_entry(metric_evlist, metric_evsel) {
 				metric_evsel->skippable = true;
+				metric_evsel->default_metricgroup = true;
 			}
 			evlist__splice_list_tail(evsel_list, &metric_evlist->core.entries);
 			evlist__delete(metric_evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cc6fb3049b99..9f06d6cd5379 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -131,6 +131,7 @@ struct evsel {
 	bool			reset_group;
 	bool			errored;
 	bool			needs_auxtrace_mmap;
+	bool			default_metricgroup; /* A member of the Default metricgroup */
 	struct hashmap		*per_pkg_mask;
 	int			err;
 	struct {
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index a2bbdc25d979..7329b3340f88 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -25,6 +25,7 @@
 #define CNTR_NOT_SUPPORTED	"<not supported>"
 #define CNTR_NOT_COUNTED	"<not counted>"
 
+#define MGROUP_LEN   50
 #define METRIC_LEN   38
 #define EVNAME_LEN   32
 #define COUNTS_LEN   18
@@ -364,16 +365,27 @@ static void new_line_std(struct perf_stat_config *config __maybe_unused,
 	os->newline = true;
 }
 
-static void do_new_line_std(struct perf_stat_config *config,
-			    struct outstate *os)
+static inline void __new_line_std_csv(struct perf_stat_config *config,
+				      struct outstate *os)
 {
 	fputc('\n', os->fh);
 	if (os->prefix)
 		fputs(os->prefix, os->fh);
 	aggr_printout(config, os->evsel, os->id, os->aggr_nr);
+}
+
+static inline void __new_line_std(struct outstate *os)
+{
+	fprintf(os->fh, "                                                 ");
+}
+
+static void do_new_line_std(struct perf_stat_config *config,
+			    struct outstate *os)
+{
+	__new_line_std_csv(config, os);
 	if (config->aggr_mode == AGGR_NONE)
 		fprintf(os->fh, "        ");
-	fprintf(os->fh, "                                                 ");
+	__new_line_std(os);
 }
 
 static void print_metric_std(struct perf_stat_config *config,
@@ -408,10 +420,7 @@ static void new_line_csv(struct perf_stat_config *config, void *ctx)
 	struct outstate *os = ctx;
 	int i;
 
-	fputc('\n', os->fh);
-	if (os->prefix)
-		fprintf(os->fh, "%s", os->prefix);
-	aggr_printout(config, os->evsel, os->id, os->aggr_nr);
+	__new_line_std_csv(config, os);
 	for (i = 0; i < os->nfields; i++)
 		fputs(config->csv_sep, os->fh);
 }
@@ -462,6 +471,54 @@ static void new_line_json(struct perf_stat_config *config, void *ctx)
 	aggr_printout(config, os->evsel, os->id, os->aggr_nr);
 }
 
+static void print_metricgroup_header_json(struct perf_stat_config *config,
+					  void *ctx,
+					  const char *metricgroup_name)
+{
+	if (!metricgroup_name)
+		return;
+
+	fprintf(config->output, "\"metricgroup\" : \"%s\"}", metricgroup_name);
+	new_line_json(config, ctx);
+}
+
+static void print_metricgroup_header_csv(struct perf_stat_config *config,
+					 void *ctx,
+					 const char *metricgroup_name)
+{
+	struct outstate *os = ctx;
+	int i;
+
+	if (!metricgroup_name) {
+		/* Leave space for running and enabling */
+		for (i = 0; i < os->nfields - 2; i++)
+			fputs(config->csv_sep, os->fh);
+		return;
+	}
+
+	for (i = 0; i < os->nfields; i++)
+		fputs(config->csv_sep, os->fh);
+	fprintf(config->output, "%s", metricgroup_name);
+	new_line_csv(config, ctx);
+}
+
+static void print_metricgroup_header_std(struct perf_stat_config *config,
+					 void *ctx,
+					 const char *metricgroup_name)
+{
+	struct outstate *os = ctx;
+	int n;
+
+	if (!metricgroup_name) {
+		__new_line_std(os);
+		return;
+	}
+
+	n = fprintf(config->output, " %*s", EVNAME_LEN, metricgroup_name);
+
+	fprintf(config->output, "%*s", MGROUP_LEN - n - 1, "");
+}
+
 /* Filter out some columns that don't work well in metrics only mode */
 
 static bool valid_only_metric(const char *unit)
@@ -713,19 +770,23 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
 	struct perf_stat_output_ctx out;
 	print_metric_t pm;
 	new_line_t nl;
+	print_metricgroup_header_t pmh;
 	bool ok = true;
 	struct evsel *counter = os->evsel;
 
 	if (config->csv_output) {
 		pm = config->metric_only ? print_metric_only_csv : print_metric_csv;
 		nl = config->metric_only ? new_line_metric : new_line_csv;
+		pmh = print_metricgroup_header_csv;
 		os->nfields = 4 + (counter->cgrp ? 1 : 0);
 	} else if (config->json_output) {
 		pm = config->metric_only ? print_metric_only_json : print_metric_json;
 		nl = config->metric_only ? new_line_metric : new_line_json;
+		pmh = print_metricgroup_header_json;
 	} else {
 		pm = config->metric_only ? print_metric_only : print_metric_std;
 		nl = config->metric_only ? new_line_metric : new_line_std;
+		pmh = print_metricgroup_header_std;
 	}
 
 	if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
@@ -747,10 +808,11 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
 
 	out.print_metric = pm;
 	out.new_line = nl;
+	out.print_metricgroup_header = pmh;
 	out.ctx = os;
 	out.force_header = false;
 
-	if (!config->metric_only) {
+	if (!config->metric_only && !counter->default_metricgroup) {
 		abs_printout(config, os->id, os->aggr_nr, counter, uval, ok);
 
 		print_noise(config, counter, noise, /*before_metric=*/true);
@@ -758,8 +820,31 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
 	}
 
 	if (ok) {
-		perf_stat__print_shadow_stats(config, counter, uval, aggr_idx,
-					      &out, &config->metric_events);
+		if (!config->metric_only && counter->default_metricgroup) {
+			void *from = NULL;
+
+			aggr_printout(config, os->evsel, os->id, os->aggr_nr);
+			/* Print out all the metricgroup with the same metric event. */
+			do {
+				int num = 0;
+
+				/* Print out the new line for the next new metricgroup. */
+				if (from) {
+					if (config->json_output)
+						new_line_json(config, (void *)os);
+					else
+						__new_line_std_csv(config, os);
+				}
+
+				print_noise(config, counter, noise, /*before_metric=*/true);
+				print_running(config, run, ena, /*before_metric=*/true);
+				from = perf_stat__print_shadow_stats_metricgroup(config, counter, aggr_idx,
+										 &num, from, &out,
+										 &config->metric_events);
+			} while (from != NULL);
+		} else
+			perf_stat__print_shadow_stats(config, counter, uval, aggr_idx,
+						      &out, &config->metric_events);
 	} else {
 		pm(config, os, /*color=*/NULL, /*format=*/NULL, /*unit=*/"", /*val=*/0);
 	}
@@ -889,6 +974,9 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
 	ena = aggr->counts.ena;
 	run = aggr->counts.run;
 
+	if (perf_stat__skip_metric_event(counter, &config->metric_events, ena, run))
+		return;
+
 	if (val == 0 && should_skip_zero_counter(config, counter, &id))
 		return;
 
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 1566a206ba42..1c5c3eeba4cf 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -539,6 +539,106 @@ double test_generic_metric(struct metric_expr *mexp, int aggr_idx)
 	return ratio;
 }
 
+static void perf_stat__print_metricgroup_header(struct perf_stat_config *config,
+						struct evsel *evsel,
+						void *ctxp,
+						const char *name,
+						struct perf_stat_output_ctx *out)
+{
+	bool need_full_name = perf_pmus__num_core_pmus() > 1;
+	static const char *last_name;
+	static const char *last_pmu;
+	char full_name[64];
+
+	/*
+	 * A metricgroup may have several metric events,
+	 * e.g., TopdownL1 on e-core of ADL.
+	 * The name has been output by the first metric
+	 * event. Only align with other metrics from
+	 * different metric events.
+	 */
+	if (last_name && !strcmp(last_name, name)) {
+		if (!need_full_name || !strcmp(last_pmu, evsel->pmu_name)) {
+			out->print_metricgroup_header(config, ctxp, NULL);
+			return;
+		}
+	}
+
+	if (need_full_name)
+		scnprintf(full_name, sizeof(full_name), "%s (%s)", name, evsel->pmu_name);
+	else
+		scnprintf(full_name, sizeof(full_name), "%s", name);
+
+	out->print_metricgroup_header(config, ctxp, full_name);
+
+	last_name = name;
+	last_pmu = evsel->pmu_name;
+}
+
+/**
+ * perf_stat__print_shadow_stats_metricgroup - Print out metrics associated with the evsel
+ *					       For the non-default, all metrics associated
+ *					       with the evsel are printed.
+ *					       For the default mode, only the metrics from
+ *					       the same metricgroup and the name of the
+ *					       metricgroup are printed. To print the metrics
+ *					       from the next metricgroup (if available),
+ *					       invoke the function with the corresponding
+ *					       metric_expr.
+ */
+void *perf_stat__print_shadow_stats_metricgroup(struct perf_stat_config *config,
+						struct evsel *evsel,
+						int aggr_idx,
+						int *num,
+						void *from,
+						struct perf_stat_output_ctx *out,
+						struct rblist *metric_events)
+{
+	struct metric_event *me;
+	struct metric_expr *mexp = from;
+	void *ctxp = out->ctx;
+	bool header_printed = false;
+	const char *name = NULL;
+
+	me = metricgroup__lookup(metric_events, evsel, false);
+	if (me == NULL)
+		return NULL;
+
+	if (!mexp)
+		mexp = list_first_entry(&me->head, typeof(*mexp), nd);
+
+	list_for_each_entry_from(mexp, &me->head, nd) {
+		/* Print the display name of the Default metricgroup */
+		if (!config->metric_only && me->is_default) {
+			if (!name)
+				name = mexp->default_metricgroup_name;
+			/*
+			 * Two or more metricgroups may share the same metric
+			 * event, e.g., TopdownL1 and TopdownL2 on SPR.
+			 * Return and print the prefix, e.g., noise, running
+			 * for the next metricgroup.
+			 */
+			if (strcmp(name, mexp->default_metricgroup_name))
+				return (void *)mexp;
+			/* Only print the name of the metricgroup once */
+			if (!header_printed) {
+				header_printed = true;
+				perf_stat__print_metricgroup_header(config, evsel, ctxp,
+								    name, out);
+			}
+		}
+
+		if ((*num)++ > 0)
+			out->new_line(config, ctxp);
+		generic_metric(config, mexp->metric_expr, mexp->metric_threshold,
+			       mexp->metric_events, mexp->metric_refs, evsel->name,
+			       mexp->metric_name, mexp->metric_unit, mexp->runtime,
+			       aggr_idx, out);
+	}
+
+	return NULL;
+}
+
 void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				   struct evsel *evsel,
 				   double avg, int aggr_idx,
@@ -565,7 +665,6 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 	};
 	print_metric_t print_metric = out->print_metric;
 	void *ctxp = out->ctx;
-	struct metric_event *me;
 	int num = 1;
 
 	if (config->iostat_run) {
@@ -592,18 +691,26 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		}
 	}
 
-	if ((me = metricgroup__lookup(metric_events, evsel, false)) != NULL) {
-		struct metric_expr *mexp;
+	perf_stat__print_shadow_stats_metricgroup(config, evsel, aggr_idx,
+						  &num, NULL, out, metric_events);
 
-		list_for_each_entry (mexp, &me->head, nd) {
-			if (num++ > 0)
-				out->new_line(config, ctxp);
-			generic_metric(config, mexp->metric_expr, mexp->metric_threshold,
-				       mexp->metric_events, mexp->metric_refs, evsel->name,
-				       mexp->metric_name, mexp->metric_unit, mexp->runtime,
-				       aggr_idx, out);
-		}
-	}
 	if (num == 0)
 		print_metric(config, ctxp, NULL, NULL, NULL, 0);
 }
+
+/**
+ * perf_stat__skip_metric_event - Skip the evsel in the Default metricgroup,
+ *				  if it's not running or not the metric event.
+ */
+bool perf_stat__skip_metric_event(struct evsel *evsel,
+				  struct rblist *metric_events,
+				  u64 ena, u64 run)
+{
+	if (!evsel->default_metricgroup)
+		return false;
+
+	if (!ena || !run)
+		return true;
+
+	return !metricgroup__lookup(metric_events, evsel, false);
+}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 7abff7cbb5a1..934f79778cea 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -158,11 +158,16 @@ typedef void (*print_metric_t)(struct perf_stat_config *config,
 			       const char *fmt, double val);
 typedef void (*new_line_t)(struct perf_stat_config *config, void *ctx);
 
+/* Used to print the display name of the Default metricgroup for now. */
+typedef void (*print_metricgroup_header_t)(struct perf_stat_config *config,
+					   void *ctx, const char *metricgroup_name);
+
 void perf_stat__reset_shadow_stats(void);
 struct perf_stat_output_ctx {
 	void *ctx;
 	print_metric_t print_metric;
 	new_line_t new_line;
+	print_metricgroup_header_t print_metricgroup_header;
 	bool force_header;
 };
 
@@ -171,6 +176,16 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				   double avg, int aggr_idx,
 				   struct perf_stat_output_ctx *out,
 				   struct rblist *metric_events);
+bool perf_stat__skip_metric_event(struct evsel *evsel,
+				  struct rblist *metric_events,
+				  u64 ena, u64 run);
+void *perf_stat__print_shadow_stats_metricgroup(struct perf_stat_config *config,
+						struct evsel *evsel,
+						int aggr_idx,
+						int *num,
+						void *from,
+						struct perf_stat_output_ctx *out,
+						struct rblist *metric_events);
 
 int evlist__alloc_stats(struct perf_stat_config *config,
 			struct evlist *evlist, bool alloc_raw);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread
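
Note on the control flow in patch 2/5: printout() calls
perf_stat__print_shadow_stats_metricgroup() in a do/while loop, and the
callee returns the first metric_expr of the next metricgroup (or NULL when
the list is exhausted) so the caller can emit the noise/running prefix
before resuming. The following is a minimal standalone sketch of that
"resume from a cursor" pattern, not the perf code itself. The names
struct metric, print_one_group() and print_all_groups() are hypothetical,
and an array index stands in for the list cursor.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct metric_expr; not the perf definition. */
struct metric {
	const char *group;	/* plays the role of default_metricgroup_name */
	const char *name;	/* plays the role of metric_name */
};

/*
 * Print the metrics of a single group, starting at index 'from'.
 * Returns the index of the first metric of the next group, or -1 when
 * the array is exhausted (the patch returns a pointer or NULL instead).
 */
static int print_one_group(const struct metric *m, int n, int from)
{
	const char *group = NULL;
	int i;

	assert(from >= 0 && from <= n);

	for (i = from; i < n; i++) {
		if (!group) {
			/* Only print the group header once. */
			group = m[i].group;
			printf("%s\n", group);
		}
		/* A different group name: hand the cursor back to the caller. */
		if (strcmp(group, m[i].group))
			return i;
		printf("    %s\n", m[i].name);
	}
	return -1;
}

/* Caller side: keep resuming until no group is left. Returns the group count. */
static int print_all_groups(const struct metric *m, int n)
{
	int from = 0, groups = 0;

	if (n == 0)
		return 0;

	do {
		/* The patch prints the noise/running prefix at this point. */
		from = print_one_group(m, n, from);
		groups++;
	} while (from != -1);

	return groups;
}
```

The sketch only works if metrics of the same group are adjacent in the
input, which is what the sort added in patch 1/5 is there to guarantee.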

* [PATCH V4 3/5] perf test: Move all the check functions of stat csv output to lib
  2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
  2023-06-16  3:14 ` [PATCH V4 1/5] perf metrics: Sort the Default metricgroup kan.liang
  2023-06-16  3:14 ` [PATCH V4 2/5] perf stat: New metricgroup output for the default mode kan.liang
@ 2023-06-16  3:14 ` kan.liang
  2023-06-16  3:14 ` [PATCH V4 4/5] perf test: Add test case for the standard perf stat output kan.liang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: kan.liang @ 2023-06-16  3:14 UTC (permalink / raw)
  To: acme, mingo, peterz, irogers, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

These functions can be shared with the stat std output test.

There is no functional change.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 tools/perf/tests/shell/lib/stat_output.sh | 169 +++++++++++++++++++
 tools/perf/tests/shell/stat+csv_output.sh | 188 ++--------------------
 2 files changed, 184 insertions(+), 173 deletions(-)
 create mode 100755 tools/perf/tests/shell/lib/stat_output.sh

diff --git a/tools/perf/tests/shell/lib/stat_output.sh b/tools/perf/tests/shell/lib/stat_output.sh
new file mode 100755
index 000000000000..363979b1123d
--- /dev/null
+++ b/tools/perf/tests/shell/lib/stat_output.sh
@@ -0,0 +1,169 @@
+# SPDX-License-Identifier: GPL-2.0
+
+# Return true if perf_event_paranoid is > $1 and not running as root.
+function ParanoidAndNotRoot()
+{
+	 [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ]
+}
+
+# $1 name $2 extra_opt
+check_no_args()
+{
+        echo -n "Checking $1 output: no args "
+        perf stat $2 true
+        commachecker --no-args
+        echo "[Success]"
+}
+
+check_system_wide()
+{
+	echo -n "Checking $1 output: system wide "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat -a $2 true
+	commachecker --system-wide
+	echo "[Success]"
+}
+
+check_system_wide_no_aggr()
+{
+	echo -n "Checking $1 output: system wide no aggregation "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat -A -a --no-merge $2 true
+	commachecker --system-wide-no-aggr
+	echo "[Success]"
+}
+
+check_interval()
+{
+	echo -n "Checking $1 output: interval "
+	perf stat -I 1000 $2 true
+	commachecker --interval
+	echo "[Success]"
+}
+
+check_event()
+{
+	echo -n "Checking $1 output: event "
+	perf stat -e cpu-clock $2 true
+	commachecker --event
+	echo "[Success]"
+}
+
+check_per_core()
+{
+	echo -n "Checking $1 output: per core "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat --per-core -a $2 true
+	commachecker --per-core
+	echo "[Success]"
+}
+
+check_per_thread()
+{
+	echo -n "Checking $1 output: per thread "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat --per-thread -a $2 true
+	commachecker --per-thread
+	echo "[Success]"
+}
+
+check_per_cache_instance()
+{
+	echo -n "Checking $1 output: per cache instance "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat --per-cache -a $2 true
+	commachecker --per-cache
+	echo "[Success]"
+}
+
+check_per_die()
+{
+	echo -n "Checking $1 output: per die "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat --per-die -a $2 true
+	commachecker --per-die
+	echo "[Success]"
+}
+
+check_per_node()
+{
+	echo -n "Checking $1 output: per node "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat --per-node -a $2 true
+	commachecker --per-node
+	echo "[Success]"
+}
+
+check_per_socket()
+{
+	echo -n "Checking $1 output: per socket "
+	if ParanoidAndNotRoot 0
+	then
+		echo "[Skip] paranoid and not root"
+		return
+	fi
+	perf stat --per-socket -a $2 true
+	commachecker --per-socket
+	echo "[Success]"
+}
+
+# The perf stat options for per-socket, per-core, per-die
+# and -A (no_aggr mode) use the info fetched from this
+# directory: "/sys/devices/system/cpu/cpu*/topology". For
+# example, the socket value is fetched from the "physical_package_id"
+# file in the topology directory.
+# Reference: cpu__get_topology_int in util/cpumap.c
+# If the platform doesn't expose topology information, values
+# will be set to -1. For example, in case of the pSeries platform
+# of powerpc, the value for "physical_package_id" is restricted
+# and set to -1. The check here validates the socket-id read from
+# the topology file before proceeding further.
+
+FILE_LOC="/sys/devices/system/cpu/cpu*/topology/"
+FILE_NAME="physical_package_id"
+
+function check_for_topology()
+{
+	if ! ParanoidAndNotRoot 0
+	then
+		socket_file=`ls $FILE_LOC/$FILE_NAME | head -n 1`
+		[ -z $socket_file ] && {
+			echo 0
+			return
+		}
+		socket_id=`cat $socket_file`
+		[ $socket_id == -1 ] && {
+			echo 1
+			return
+		}
+	fi
+	echo 0
+}
diff --git a/tools/perf/tests/shell/stat+csv_output.sh b/tools/perf/tests/shell/stat+csv_output.sh
index ed082daf839c..34a0701fee05 100755
--- a/tools/perf/tests/shell/stat+csv_output.sh
+++ b/tools/perf/tests/shell/stat+csv_output.sh
@@ -6,7 +6,8 @@
 
 set -e
 
-skip_test=0
+. $(dirname $0)/lib/stat_output.sh
+
 csv_sep=@
 
 stat_output=$(mktemp /tmp/__perf_test.stat_output.csv.XXXXX)
@@ -63,181 +64,22 @@ function commachecker()
 	return 0
 }
 
-# Return true if perf_event_paranoid is > $1 and not running as root.
-function ParanoidAndNotRoot()
-{
-	 [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ]
-}
-
-check_no_args()
-{
-	echo -n "Checking CSV output: no args "
-	perf stat -x$csv_sep -o "${stat_output}" true
-        commachecker --no-args
-	echo "[Success]"
-}
-
-check_system_wide()
-{
-	echo -n "Checking CSV output: system wide "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep -a -o "${stat_output}" true
-        commachecker --system-wide
-	echo "[Success]"
-}
-
-check_system_wide_no_aggr()
-{
-	echo -n "Checking CSV output: system wide no aggregation "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep -A -a --no-merge -o "${stat_output}" true
-        commachecker --system-wide-no-aggr
-	echo "[Success]"
-}
-
-check_interval()
-{
-	echo -n "Checking CSV output: interval "
-	perf stat -x$csv_sep -I 1000 -o "${stat_output}" true
-        commachecker --interval
-	echo "[Success]"
-}
-
-
-check_event()
-{
-	echo -n "Checking CSV output: event "
-	perf stat -x$csv_sep -e cpu-clock -o "${stat_output}" true
-        commachecker --event
-	echo "[Success]"
-}
-
-check_per_core()
-{
-	echo -n "Checking CSV output: per core "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep --per-core -a -o "${stat_output}" true
-        commachecker --per-core
-	echo "[Success]"
-}
-
-check_per_thread()
-{
-	echo -n "Checking CSV output: per thread "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep --per-thread -a -o "${stat_output}" true
-        commachecker --per-thread
-	echo "[Success]"
-}
-
-check_per_cache_instance()
-{
-	echo -n "Checking CSV output: per cache instance "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep --per-cache -a true 2>&1 | commachecker --per-cache
-	echo "[Success]"
-}
-
-check_per_die()
-{
-	echo -n "Checking CSV output: per die "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep --per-die -a -o "${stat_output}" true
-        commachecker --per-die
-	echo "[Success]"
-}
-
-check_per_node()
-{
-	echo -n "Checking CSV output: per node "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep --per-node -a -o "${stat_output}" true
-        commachecker --per-node
-	echo "[Success]"
-}
-
-check_per_socket()
-{
-	echo -n "Checking CSV output: per socket "
-	if ParanoidAndNotRoot 0
-	then
-		echo "[Skip] paranoid and not root"
-		return
-	fi
-	perf stat -x$csv_sep --per-socket -a -o "${stat_output}" true
-        commachecker --per-socket
-	echo "[Success]"
-}
-
-# The perf stat options for per-socket, per-core, per-die
-# and -A ( no_aggr mode ) uses the info fetched from this
-# directory: "/sys/devices/system/cpu/cpu*/topology". For
-# example, socket value is fetched from "physical_package_id"
-# file in topology directory.
-# Reference: cpu__get_topology_int in util/cpumap.c
-# If the platform doesn't expose topology information, values
-# will be set to -1. For example, incase of pSeries platform
-# of powerpc, value for  "physical_package_id" is restricted
-# and set to -1. Check here validates the socket-id read from
-# topology file before proceeding further
-
-FILE_LOC="/sys/devices/system/cpu/cpu*/topology/"
-FILE_NAME="physical_package_id"
-
-check_for_topology()
-{
-	if ! ParanoidAndNotRoot 0
-	then
-		socket_file=`ls $FILE_LOC/$FILE_NAME | head -n 1`
-		[ -z $socket_file ] && return 0
-		socket_id=`cat $socket_file`
-		[ $socket_id == -1 ] && skip_test=1
-		return 0
-	fi
-}
+perf_cmd="-x$csv_sep -o ${stat_output}"
 
-check_for_topology
-check_no_args
-check_system_wide
-check_interval
-check_event
-check_per_thread
-check_per_node
+skip_test=$(check_for_topology)
+check_no_args "CSV" "$perf_cmd"
+check_system_wide "CSV" "$perf_cmd"
+check_interval "CSV" "$perf_cmd"
+check_event "CSV" "$perf_cmd"
+check_per_thread "CSV" "$perf_cmd"
+check_per_node "CSV" "$perf_cmd"
 if [ $skip_test -ne 1 ]
 then
-	check_system_wide_no_aggr
-	check_per_core
-	check_per_cache_instance
-	check_per_die
-	check_per_socket
+	check_system_wide_no_aggr "CSV" "$perf_cmd"
+	check_per_core "CSV" "$perf_cmd"
+	check_per_cache_instance "CSV" "$perf_cmd"
+	check_per_die "CSV" "$perf_cmd"
+	check_per_socket "CSV" "$perf_cmd"
 else
 	echo "[Skip] Skipping tests for system_wide_no_aggr, per_core, per_die and per_socket since socket id exposed via topology is invalid"
 fi
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V4 4/5] perf test: Add test case for the standard perf stat output
  2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
                   ` (2 preceding siblings ...)
  2023-06-16  3:14 ` [PATCH V4 3/5] perf test: Move all the check functions of stat csv output to lib kan.liang
@ 2023-06-16  3:14 ` kan.liang
  2023-06-16  3:14 ` [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics kan.liang
  2023-06-16  5:59 ` [PATCH V4 0/5] New metricgroup output in perf stat default mode Ian Rogers
  5 siblings, 0 replies; 14+ messages in thread
From: kan.liang @ 2023-06-16  3:14 UTC (permalink / raw)
  To: acme, mingo, peterz, irogers, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Add a new test case to verify the standard perf stat output with
different options.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 tools/perf/tests/shell/stat+std_output.sh | 108 ++++++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100755 tools/perf/tests/shell/stat+std_output.sh

diff --git a/tools/perf/tests/shell/stat+std_output.sh b/tools/perf/tests/shell/stat+std_output.sh
new file mode 100755
index 000000000000..98cc3356a04a
--- /dev/null
+++ b/tools/perf/tests/shell/stat+std_output.sh
@@ -0,0 +1,108 @@
+#!/bin/bash
+# perf stat STD output linter
+# SPDX-License-Identifier: GPL-2.0
+# Tests various perf stat STD output commands for
+# default event and metricgroup
+
+set -e
+
+. $(dirname $0)/lib/stat_output.sh
+
+stat_output=$(mktemp /tmp/__perf_test.stat_output.std.XXXXX)
+
+event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults cycles instructions branches branch-misses stalled-cycles-frontend stalled-cycles-backend)
+event_metric=("CPUs utilized" "CPUs utilized" "/sec" "/sec" "/sec" "GHz" "insn per cycle" "/sec" "of all branches" "frontend cycles idle" "backend cycles idle")
+
+metricgroup_name=(TopdownL1 TopdownL2)
+
+cleanup() {
+  rm -f "${stat_output}"
+
+  trap - EXIT TERM INT
+}
+
+trap_cleanup() {
+  cleanup
+  exit 1
+}
+trap trap_cleanup EXIT TERM INT
+
+function commachecker()
+{
+	local -i cnt=0
+	local prefix=1
+
+	case "$1"
+	in "--interval")	prefix=2
+	;; "--per-thread")	prefix=2
+	;; "--system-wide-no-aggr")	prefix=2
+	;; "--per-core")	prefix=3
+	;; "--per-socket")	prefix=3
+	;; "--per-node")	prefix=3
+	;; "--per-die")		prefix=3
+	;; "--per-cache")	prefix=3
+	esac
+
+	while read line
+	do
+		# Ignore initial "started on" comment.
+		x=${line:0:1}
+		[ "$x" = "#" ] && continue
+		# Ignore initial blank line.
+		[ "$line" = "" ] && continue
+		# Ignore "Performance counter stats"
+		x=${line:0:25}
+		[ "$x" = "Performance counter stats" ] && continue
+		# Ignore "seconds time elapsed" and break
+		[[ "$line" == *"time elapsed"* ]] && break
+
+		main_body=$(echo $line | cut -d' ' -f$prefix-)
+		x=${main_body%#*}
+		# Check default metricgroup
+		y=$(echo $x | tr -d ' ')
+		[ "$y" = "" ] && continue
+		for i in "${!metricgroup_name[@]}"; do
+			[[ "$y" == *"${metricgroup_name[$i]}"* ]] && break
+		done
+		[[ "$y" == *"${metricgroup_name[$i]}"* ]] && continue
+
+		# Check default event
+		for i in "${!event_name[@]}"; do
+			[[ "$x" == *"${event_name[$i]}"* ]] && break
+		done
+
+		[[ ! "$x" == *"${event_name[$i]}"* ]] && {
+			echo "Unknown event name in $line" 1>&2
+			exit 1;
+		}
+
+		# Check event metric if it exists
+		[[ ! "$main_body" == *"#"* ]] && continue
+		[[ ! "$main_body" == *"${event_metric[$i]}"* ]] && {
+			echo "wrong event metric. expected ${event_metric[$i]} in $line" 1>&2
+			exit 1;
+		}
+	done < "${stat_output}"
+	return 0
+}
+
+perf_cmd="-o ${stat_output}"
+
+skip_test=$(check_for_topology)
+check_no_args "STD" "$perf_cmd"
+check_system_wide "STD" "$perf_cmd"
+check_interval "STD" "$perf_cmd"
+check_per_thread "STD" "$perf_cmd"
+check_per_node "STD" "$perf_cmd"
+if [ $skip_test -ne 1 ]
+then
+	check_system_wide_no_aggr "STD" "$perf_cmd"
+	check_per_core "STD" "$perf_cmd"
+	check_per_cache_instance "STD" "$perf_cmd"
+	check_per_die "STD" "$perf_cmd"
+	check_per_socket "STD" "$perf_cmd"
+else
+	echo "[Skip] Skipping tests for system_wide_no_aggr, per_core, per_die and per_socket since socket id exposed via topology is invalid"
+fi
+cleanup
+exit 0
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics
  2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
                   ` (3 preceding siblings ...)
  2023-06-16  3:14 ` [PATCH V4 4/5] perf test: Add test case for the standard perf stat output kan.liang
@ 2023-06-16  3:14 ` kan.liang
  2023-06-16  5:57   ` Ian Rogers
  2023-06-16 13:48   ` John Garry
  2023-06-16  5:59 ` [PATCH V4 0/5] New metricgroup output in perf stat default mode Ian Rogers
  5 siblings, 2 replies; 14+ messages in thread
From: kan.liang @ 2023-06-16  3:14 UTC (permalink / raw)
  To: acme, mingo, peterz, irogers, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin, Kan Liang, John Garry

From: Kan Liang <kan.liang@linux.intel.com>

Add the default tags for Hisi hip08 as well.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: John Garry <john.g.garry@oracle.com>
---
 .../arch/arm64/hisilicon/hip08/metrics.json          | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json b/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
index 6443a061e22a..6463531b9941 100644
--- a/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
@@ -3,28 +3,32 @@
         "MetricExpr": "FETCH_BUBBLE / (4 * CPU_CYCLES)",
         "PublicDescription": "Frontend bound L1 topdown metric",
         "BriefDescription": "Frontend bound L1 topdown metric",
-        "MetricGroup": "TopDownL1",
+        "DefaultMetricgroupName": "TopDownL1",
+        "MetricGroup": "Default;TopDownL1",
         "MetricName": "frontend_bound"
     },
     {
         "MetricExpr": "(INST_SPEC - INST_RETIRED) / (4 * CPU_CYCLES)",
         "PublicDescription": "Bad Speculation L1 topdown metric",
         "BriefDescription": "Bad Speculation L1 topdown metric",
-        "MetricGroup": "TopDownL1",
+        "DefaultMetricgroupName": "TopDownL1",
+        "MetricGroup": "Default;TopDownL1",
         "MetricName": "bad_speculation"
     },
     {
         "MetricExpr": "INST_RETIRED / (CPU_CYCLES * 4)",
         "PublicDescription": "Retiring L1 topdown metric",
         "BriefDescription": "Retiring L1 topdown metric",
-        "MetricGroup": "TopDownL1",
+        "DefaultMetricgroupName": "TopDownL1",
+        "MetricGroup": "Default;TopDownL1",
         "MetricName": "retiring"
     },
     {
         "MetricExpr": "1 - (frontend_bound + bad_speculation + retiring)",
         "PublicDescription": "Backend Bound L1 topdown metric",
         "BriefDescription": "Backend Bound L1 topdown metric",
-        "MetricGroup": "TopDownL1",
+        "DefaultMetricgroupName": "TopDownL1",
+        "MetricGroup": "Default;TopDownL1",
         "MetricName": "backend_bound"
     },
     {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread
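
Note on the metrics tagged in patch 5/5: the four hip08 TopDownL1
expressions share the denominator 4 * CPU_CYCLES, and backend_bound is
defined as the remainder after the other three. The sketch below evaluates
the MetricExpr strings from the JSON for one set of counter values. The
struct and function names are hypothetical and perf does not compute the
metrics this way (it parses the expressions at runtime); the code only
illustrates the arithmetic.

```c
#include <assert.h>

/* Raw event counts named after the events used in the MetricExpr strings. */
struct hip08_counts {
	double fetch_bubble;	/* FETCH_BUBBLE */
	double inst_spec;	/* INST_SPEC */
	double inst_retired;	/* INST_RETIRED */
	double cpu_cycles;	/* CPU_CYCLES */
};

struct topdown_l1 {
	double frontend_bound;
	double bad_speculation;
	double retiring;
	double backend_bound;
};

static struct topdown_l1 topdown_l1_compute(const struct hip08_counts *c)
{
	struct topdown_l1 t;
	double slots;

	assert(c && c->cpu_cycles > 0);

	/* Common denominator of the three directly measured metrics. */
	slots = 4.0 * c->cpu_cycles;

	t.frontend_bound = c->fetch_bubble / slots;
	t.bad_speculation = (c->inst_spec - c->inst_retired) / slots;
	t.retiring = c->inst_retired / slots;
	/* backend_bound is whatever the other three do not account for. */
	t.backend_bound = 1.0 - (t.frontend_bound + t.bad_speculation + t.retiring);

	return t;
}
```

Because backend_bound is derived from the other three, the four values
always sum to 1, so they read naturally as one group in the new output.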

* Re: [PATCH V4 1/5] perf metrics: Sort the Default metricgroup
  2023-06-16  3:14 ` [PATCH V4 1/5] perf metrics: Sort the Default metricgroup kan.liang
@ 2023-06-16  5:48   ` Ian Rogers
  0 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2023-06-16  5:48 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin

On Thu, Jun 15, 2023 at 8:14 PM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> The new default mode will print the metrics as a metric group. The
> metrics from the same metric group must be adjacent to each other in the
> metric list. But the metric_list_cmp() sorts metrics by the number of
> events.
>
> Add a new sort for the Default metricgroup, which sorts by
> default_metricgroup_name and metric_name.
>
> Add is_default in the struct metric_event to indicate that it's from
> the Default metricgroup.
>
> Store the displayed metricgroup name of the Default metricgroup into
> the metric expr for output.
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>

Reviewed-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  tools/perf/util/metricgroup.c | 26 ++++++++++++++++++++++++++
>  tools/perf/util/metricgroup.h |  3 +++
>  2 files changed, 29 insertions(+)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 8b19644ade7d..a6a5ed44a679 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -79,6 +79,7 @@ static struct rb_node *metric_event_new(struct rblist *rblist __maybe_unused,
>                 return NULL;
>         memcpy(me, entry, sizeof(struct metric_event));
>         me->evsel = ((struct metric_event *)entry)->evsel;
> +       me->is_default = false;
>         INIT_LIST_HEAD(&me->head);
>         return &me->nd;
>  }
> @@ -1160,6 +1161,25 @@ static int metric_list_cmp(void *priv __maybe_unused, const struct list_head *l,
>         return right_count - left_count;
>  }
>
> +/**
> + * default_metricgroup_cmp - Implements complex key for the Default metricgroup

nit: what is the meaning of complex key here?

> + *                          that first sorts by default_metricgroup_name, then
> + *                          metric_name.
> + */
> +static int default_metricgroup_cmp(void *priv __maybe_unused,
> +                                  const struct list_head *l,
> +                                  const struct list_head *r)
> +{
> +       const struct metric *left = container_of(l, struct metric, nd);
> +       const struct metric *right = container_of(r, struct metric, nd);
> +       int diff = strcmp(right->default_metricgroup_name, left->default_metricgroup_name);
> +
> +       if (diff)
> +               return diff;
> +
> +       return strcmp(right->metric_name, left->metric_name);
> +}
> +
>  struct metricgroup__add_metric_data {
>         struct list_head *list;
>         const char *pmu;
> @@ -1515,6 +1535,7 @@ static int parse_groups(struct evlist *perf_evlist,
>         LIST_HEAD(metric_list);
>         struct metric *m;
>         bool tool_events[PERF_TOOL_MAX] = {false};
> +       bool is_default = !strcmp(str, "Default");
>         int ret;
>
>         if (metric_events_list->nr_entries == 0)
> @@ -1549,6 +1570,9 @@ static int parse_groups(struct evlist *perf_evlist,
>                         goto out;
>         }
>
> +       if (is_default)
> +               list_sort(NULL, &metric_list, default_metricgroup_cmp);
> +
>         list_for_each_entry(m, &metric_list, nd) {
>                 struct metric_event *me;
>                 struct evsel **metric_events;
> @@ -1637,6 +1661,8 @@ static int parse_groups(struct evlist *perf_evlist,
>                 expr->metric_unit = m->metric_unit;
>                 expr->metric_events = metric_events;
>                 expr->runtime = m->pctx->sctx.runtime;
> +               expr->default_metricgroup_name = m->default_metricgroup_name;
> +               me->is_default = is_default;
>                 list_add(&expr->nd, &me->head);
>         }
>
> diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
> index bf18274c15df..d5325c6ec8e1 100644
> --- a/tools/perf/util/metricgroup.h
> +++ b/tools/perf/util/metricgroup.h
> @@ -22,6 +22,7 @@ struct cgroup;
>  struct metric_event {
>         struct rb_node nd;
>         struct evsel *evsel;
> +       bool is_default; /* the metric evsel from the Default metricgroup */
>         struct list_head head; /* list of metric_expr */
>  };
>
> @@ -55,6 +56,8 @@ struct metric_expr {
>          * more human intelligible) and then add "MiB" afterward when displayed.
>          */
>         const char *metric_unit;
> +       /** Displayed metricgroup name of the Default metricgroup */
> +       const char *default_metricgroup_name;
>         /** Null terminated array of events used by the metric. */
>         struct evsel **metric_events;
>         /** Null terminated array of referenced metrics. */
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 14+ messages in thread
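
Note on the "complex key" question above: the comparator reads as a
composite (two-part) key, comparing default_metricgroup_name first and
falling back to metric_name only on a tie. Below is a minimal userspace
sketch of the same idea using qsort() instead of the kernel-style
list_sort(). The names struct metric_key, metric_key_cmp() and
sort_metrics() are hypothetical. It also sorts in ascending order, whereas
the patch compares right against left, presumably because the entries are
later prepended with list_add(), which reverses them again.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-part sort key; not the perf struct metric. */
struct metric_key {
	const char *group;	/* primary key */
	const char *name;	/* secondary key, used only on a tie */
};

static int metric_key_cmp(const void *l, const void *r)
{
	const struct metric_key *left = l;
	const struct metric_key *right = r;
	int diff = strcmp(left->group, right->group);

	if (diff)
		return diff;

	return strcmp(left->name, right->name);
}

/* Sort so that metrics of the same group end up adjacent. */
static void sort_metrics(struct metric_key *m, size_t n)
{
	assert(m != NULL || n == 0);
	if (n > 1)
		qsort(m, n, sizeof(*m), metric_key_cmp);
}
```

Whichever direction is used, the property the output code relies on is
that metrics with the same group name are adjacent after the sort.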

* Re: [PATCH V4 2/5] perf stat: New metricgroup output for the default mode
  2023-06-16  3:14 ` [PATCH V4 2/5] perf stat: New metricgroup output for the default mode kan.liang
@ 2023-06-16  5:56   ` Ian Rogers
  2023-06-16 13:23     ` Liang, Kan
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2023-06-16  5:56 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin

On Thu, Jun 15, 2023 at 8:15 PM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> In the default mode, the current output of the metricgroup includes both
> events and metrics, which is not necessary and just makes the output
> hard to read. Since different ARCHs (even different generations in the
> same ARCH) may use different events, the output also varies on different
> platforms.
>
> For a metricgroup, only outputting the value of each metric is good
> enough.
>
> Add a new field default_metricgroup in evsel to indicate an event of
> the default metricgroup. For those events, printout() should print
> the metricgroup name rather than each event.
>
> Add perf_stat__skip_metric_event() to skip the evsel in the Default
> metricgroup, if it's not running or not the metric event.
>
> Add print_metricgroup_header_t to pass the functions which print the
> display name of each metricgroup in the Default metricgroup. Support
> all three output methods.
>
> Factor out perf_stat__print_shadow_stats_metricgroup() to print out
> each metric.
>
> On SPR
> Before:
>
>  ./perf_old stat sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>               0.54 msec task-clock:u                     #    0.001 CPUs utilized
>                  0      context-switches:u               #    0.000 /sec
>                  0      cpu-migrations:u                 #    0.000 /sec
>                 68      page-faults:u                    #  125.445 K/sec
>            540,970      cycles:u                         #    0.998 GHz
>            556,325      instructions:u                   #    1.03  insn per cycle
>            123,602      branches:u                       #  228.018 M/sec
>              6,889      branch-misses:u                  #    5.57% of all branches
>          3,245,820      TOPDOWN.SLOTS:u                  #     18.4 %  tma_backend_bound
>                                                   #     17.2 %  tma_retiring
>                                                   #     23.1 %  tma_bad_speculation
>                                                   #     41.4 %  tma_frontend_bound
>            564,859      topdown-retiring:u
>          1,370,999      topdown-fe-bound:u
>            603,271      topdown-be-bound:u
>            744,874      topdown-bad-spec:u
>             12,661      INT_MISC.UOP_DROPPING:u          #   23.357 M/sec
>
>        1.001798215 seconds time elapsed
>
>        0.000193000 seconds user
>        0.001700000 seconds sys
>
> After:
>
> $ ./perf stat sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>               0.51 msec task-clock:u                     #    0.001 CPUs utilized
>                  0      context-switches:u               #    0.000 /sec
>                  0      cpu-migrations:u                 #    0.000 /sec
>                 68      page-faults:u                    #  132.683 K/sec
>            545,228      cycles:u                         #    1.064 GHz
>            555,509      instructions:u                   #    1.02  insn per cycle
>            123,574      branches:u                       #  241.120 M/sec
>              6,957      branch-misses:u                  #    5.63% of all branches
>                         TopdownL1                 #     17.5 %  tma_backend_bound
>                                                   #     22.6 %  tma_bad_speculation
>                                                   #     42.7 %  tma_frontend_bound
>                                                   #     17.1 %  tma_retiring
>                         TopdownL2                 #     21.8 %  tma_branch_mispredicts
>                                                   #     11.5 %  tma_core_bound
>                                                   #     13.4 %  tma_fetch_bandwidth
>                                                   #     29.3 %  tma_fetch_latency
>                                                   #      2.7 %  tma_heavy_operations
>                                                   #     14.5 %  tma_light_operations
>                                                   #      0.8 %  tma_machine_clears
>                                                   #      6.1 %  tma_memory_bound
>
>        1.001712086 seconds time elapsed
>
>        0.000151000 seconds user
>        0.001618000 seconds sys
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>

Reviewed-by: Ian Rogers <irogers@google.com>

Just an observation: I think a lot of the "default" terminology is
confusing, as default reads more like "automatically selected when no
event or metric is given". Reading default in that plain sense makes
the comments on perf_stat__print_shadow_stats_metricgroup somewhat
counterintuitive.

Thanks,
Ian

> ---
>  tools/perf/builtin-stat.c      |   1 +
>  tools/perf/util/evsel.h        |   1 +
>  tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---
>  tools/perf/util/stat-shadow.c  | 131 ++++++++++++++++++++++++++++++---
>  tools/perf/util/stat.h         |  15 ++++
>  5 files changed, 234 insertions(+), 22 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 55601b4b5c34..3f4e76f76f94 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -2172,6 +2172,7 @@ static int add_default_attributes(void)
>
>                         evlist__for_each_entry(metric_evlist, metric_evsel) {
>                                 metric_evsel->skippable = true;
> +                               metric_evsel->default_metricgroup = true;
>                         }
>                         evlist__splice_list_tail(evsel_list, &metric_evlist->core.entries);
>                         evlist__delete(metric_evlist);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index cc6fb3049b99..9f06d6cd5379 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -131,6 +131,7 @@ struct evsel {
>         bool                    reset_group;
>         bool                    errored;
>         bool                    needs_auxtrace_mmap;
> +       bool                    default_metricgroup; /* A member of the Default metricgroup */
>         struct hashmap          *per_pkg_mask;
>         int                     err;
>         struct {
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index a2bbdc25d979..7329b3340f88 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -25,6 +25,7 @@
>  #define CNTR_NOT_SUPPORTED     "<not supported>"
>  #define CNTR_NOT_COUNTED       "<not counted>"
>
> +#define MGROUP_LEN   50
>  #define METRIC_LEN   38
>  #define EVNAME_LEN   32
>  #define COUNTS_LEN   18
> @@ -364,16 +365,27 @@ static void new_line_std(struct perf_stat_config *config __maybe_unused,
>         os->newline = true;
>  }
>
> -static void do_new_line_std(struct perf_stat_config *config,
> -                           struct outstate *os)
> +static inline void __new_line_std_csv(struct perf_stat_config *config,
> +                                     struct outstate *os)
>  {
>         fputc('\n', os->fh);
>         if (os->prefix)
>                 fputs(os->prefix, os->fh);
>         aggr_printout(config, os->evsel, os->id, os->aggr_nr);
> +}
> +
> +static inline void __new_line_std(struct outstate *os)
> +{
> +       fprintf(os->fh, "                                                 ");
> +}
> +
> +static void do_new_line_std(struct perf_stat_config *config,
> +                           struct outstate *os)
> +{
> +       __new_line_std_csv(config, os);
>         if (config->aggr_mode == AGGR_NONE)
>                 fprintf(os->fh, "        ");
> -       fprintf(os->fh, "                                                 ");
> +       __new_line_std(os);
>  }
>
>  static void print_metric_std(struct perf_stat_config *config,
> @@ -408,10 +420,7 @@ static void new_line_csv(struct perf_stat_config *config, void *ctx)
>         struct outstate *os = ctx;
>         int i;
>
> -       fputc('\n', os->fh);
> -       if (os->prefix)
> -               fprintf(os->fh, "%s", os->prefix);
> -       aggr_printout(config, os->evsel, os->id, os->aggr_nr);
> +       __new_line_std_csv(config, os);
>         for (i = 0; i < os->nfields; i++)
>                 fputs(config->csv_sep, os->fh);
>  }
> @@ -462,6 +471,54 @@ static void new_line_json(struct perf_stat_config *config, void *ctx)
>         aggr_printout(config, os->evsel, os->id, os->aggr_nr);
>  }
>
> +static void print_metricgroup_header_json(struct perf_stat_config *config,
> +                                         void *ctx,
> +                                         const char *metricgroup_name)
> +{
> +       if (!metricgroup_name)
> +               return;
> +
> +       fprintf(config->output, "\"metricgroup\" : \"%s\"}", metricgroup_name);
> +       new_line_json(config, ctx);
> +}
> +
> +static void print_metricgroup_header_csv(struct perf_stat_config *config,
> +                                        void *ctx,
> +                                        const char *metricgroup_name)
> +{
> +       struct outstate *os = ctx;
> +       int i;
> +
> +       if (!metricgroup_name) {
> +               /* Leave space for running and enabling */
> +               for (i = 0; i < os->nfields - 2; i++)
> +                       fputs(config->csv_sep, os->fh);
> +               return;
> +       }
> +
> +       for (i = 0; i < os->nfields; i++)
> +               fputs(config->csv_sep, os->fh);
> +       fprintf(config->output, "%s", metricgroup_name);
> +       new_line_csv(config, ctx);
> +}
> +
> +static void print_metricgroup_header_std(struct perf_stat_config *config,
> +                                        void *ctx,
> +                                        const char *metricgroup_name)
> +{
> +       struct outstate *os = ctx;
> +       int n;
> +
> +       if (!metricgroup_name) {
> +               __new_line_std(os);
> +               return;
> +       }
> +
> +       n = fprintf(config->output, " %*s", EVNAME_LEN, metricgroup_name);
> +
> +       fprintf(config->output, "%*s", MGROUP_LEN - n - 1, "");
> +}
> +
>  /* Filter out some columns that don't work well in metrics only mode */
>
>  static bool valid_only_metric(const char *unit)
> @@ -713,19 +770,23 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
>         struct perf_stat_output_ctx out;
>         print_metric_t pm;
>         new_line_t nl;
> +       print_metricgroup_header_t pmh;
>         bool ok = true;
>         struct evsel *counter = os->evsel;
>
>         if (config->csv_output) {
>                 pm = config->metric_only ? print_metric_only_csv : print_metric_csv;
>                 nl = config->metric_only ? new_line_metric : new_line_csv;
> +               pmh = print_metricgroup_header_csv;
>                 os->nfields = 4 + (counter->cgrp ? 1 : 0);
>         } else if (config->json_output) {
>                 pm = config->metric_only ? print_metric_only_json : print_metric_json;
>                 nl = config->metric_only ? new_line_metric : new_line_json;
> +               pmh = print_metricgroup_header_json;
>         } else {
>                 pm = config->metric_only ? print_metric_only : print_metric_std;
>                 nl = config->metric_only ? new_line_metric : new_line_std;
> +               pmh = print_metricgroup_header_std;
>         }
>
>         if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
> @@ -747,10 +808,11 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
>
>         out.print_metric = pm;
>         out.new_line = nl;
> +       out.print_metricgroup_header = pmh;
>         out.ctx = os;
>         out.force_header = false;
>
> -       if (!config->metric_only) {
> +       if (!config->metric_only && !counter->default_metricgroup) {
>                 abs_printout(config, os->id, os->aggr_nr, counter, uval, ok);
>
>                 print_noise(config, counter, noise, /*before_metric=*/true);
> @@ -758,8 +820,31 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
>         }
>
>         if (ok) {
> -               perf_stat__print_shadow_stats(config, counter, uval, aggr_idx,
> -                                             &out, &config->metric_events);
> +               if (!config->metric_only && counter->default_metricgroup) {
> +                       void *from = NULL;
> +
> +                       aggr_printout(config, os->evsel, os->id, os->aggr_nr);
> +                       /* Print out all the metricgroup with the same metric event. */
> +                       do {
> +                               int num = 0;
> +
> +                               /* Print out the new line for the next new metricgroup. */
> +                               if (from) {
> +                                       if (config->json_output)
> +                                               new_line_json(config, (void *)os);
> +                                       else
> +                                               __new_line_std_csv(config, os);
> +                               }
> +
> +                               print_noise(config, counter, noise, /*before_metric=*/true);
> +                               print_running(config, run, ena, /*before_metric=*/true);
> +                               from = perf_stat__print_shadow_stats_metricgroup(config, counter, aggr_idx,
> +                                                                                &num, from, &out,
> +                                                                                &config->metric_events);
> +                       } while (from != NULL);
> +               } else
> +                       perf_stat__print_shadow_stats(config, counter, uval, aggr_idx,
> +                                                     &out, &config->metric_events);
>         } else {
>                 pm(config, os, /*color=*/NULL, /*format=*/NULL, /*unit=*/"", /*val=*/0);
>         }
> @@ -889,6 +974,9 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
>         ena = aggr->counts.ena;
>         run = aggr->counts.run;
>
> +       if (perf_stat__skip_metric_event(counter, &config->metric_events, ena, run))
> +               return;
> +
>         if (val == 0 && should_skip_zero_counter(config, counter, &id))
>                 return;
>
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 1566a206ba42..1c5c3eeba4cf 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -539,6 +539,106 @@ double test_generic_metric(struct metric_expr *mexp, int aggr_idx)
>         return ratio;
>  }
>
> +static void perf_stat__print_metricgroup_header(struct perf_stat_config *config,
> +                                               struct evsel *evsel,
> +                                               void *ctxp,
> +                                               const char *name,
> +                                               struct perf_stat_output_ctx *out)
> +{
> +       bool need_full_name = perf_pmus__num_core_pmus() > 1;
> +       static const char *last_name;
> +       static const char *last_pmu;
> +       char full_name[64];
> +
> +       /*
> +        * A metricgroup may have several metric events,
> +        * e.g., TopdownL1 on e-core of ADL.
> +        * The name has been output by the first metric
> +        * event. Only align with other metrics from
> +        * different metric events.
> +        */
> +       if (last_name && !strcmp(last_name, name)) {
> +               if (!need_full_name || !strcmp(last_pmu, evsel->pmu_name)) {
> +                       out->print_metricgroup_header(config, ctxp, NULL);
> +                       return;
> +               }
> +       }
> +
> +       if (need_full_name)
> +               scnprintf(full_name, sizeof(full_name), "%s (%s)", name, evsel->pmu_name);
> +       else
> +               scnprintf(full_name, sizeof(full_name), "%s", name);
> +
> +       out->print_metricgroup_header(config, ctxp, full_name);
> +
> +       last_name = name;
> +       last_pmu = evsel->pmu_name;
> +}
> +
> +/**
> + * perf_stat__print_shadow_stats_metricgroup - Print out metrics associated with the evsel
> + *                                            For the non-default, all metrics associated
> + *                                            with the evsel are printed.
> + *                                            For the default mode, only the metrics from
> + *                                            the same metricgroup and the name of the
> + *                                            metricgroup are printed. To print the metrics
> + *                                            from the next metricgroup (if available),
> + *                                            invoke the function with the corresponding
> + *                                            metric_expr.
> + */
> +void *perf_stat__print_shadow_stats_metricgroup(struct perf_stat_config *config,
> +                                               struct evsel *evsel,
> +                                               int aggr_idx,
> +                                               int *num,
> +                                               void *from,
> +                                               struct perf_stat_output_ctx *out,
> +                                               struct rblist *metric_events)
> +{
> +       struct metric_event *me;
> +       struct metric_expr *mexp = from;
> +       void *ctxp = out->ctx;
> +       bool header_printed = false;
> +       const char *name = NULL;
> +
> +       me = metricgroup__lookup(metric_events, evsel, false);
> +       if (me == NULL)
> +               return NULL;
> +
> +       if (!mexp)
> +               mexp = list_first_entry(&me->head, typeof(*mexp), nd);
> +
> +       list_for_each_entry_from(mexp, &me->head, nd) {
> +               /* Print the display name of the Default metricgroup */
> +               if (!config->metric_only && me->is_default) {
> +                       if (!name)
> +                               name = mexp->default_metricgroup_name;
> +                       /*
> +                        * Two or more metricgroup may share the same metric
> +                        * event, e.g., TopdownL1 and TopdownL2 on SPR.
> +                        * Return and print the prefix, e.g., noise, running
> +                        * for the next metricgroup.
> +                        */
> +                       if (strcmp(name, mexp->default_metricgroup_name))
> +                               return (void *)mexp;
> +                       /* Only print the name of the metricgroup once */
> +                       if (!header_printed) {
> +                               header_printed = true;
> +                               perf_stat__print_metricgroup_header(config, evsel, ctxp,
> +                                                                   name, out);
> +                       }
> +               }
> +
> +               if ((*num)++ > 0)
> +                       out->new_line(config, ctxp);
> +               generic_metric(config, mexp->metric_expr, mexp->metric_threshold,
> +                              mexp->metric_events, mexp->metric_refs, evsel->name,
> +                              mexp->metric_name, mexp->metric_unit, mexp->runtime,
> +                              aggr_idx, out);
> +       }
> +
> +       return NULL;
> +}
> +
>  void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>                                    struct evsel *evsel,
>                                    double avg, int aggr_idx,
> @@ -565,7 +665,6 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>         };
>         print_metric_t print_metric = out->print_metric;
>         void *ctxp = out->ctx;
> -       struct metric_event *me;
>         int num = 1;
>
>         if (config->iostat_run) {
> @@ -592,18 +691,26 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>                 }
>         }
>
> -       if ((me = metricgroup__lookup(metric_events, evsel, false)) != NULL) {
> -               struct metric_expr *mexp;
> +       perf_stat__print_shadow_stats_metricgroup(config, evsel, aggr_idx,
> +                                                 &num, NULL, out, metric_events);
>
> -               list_for_each_entry (mexp, &me->head, nd) {
> -                       if (num++ > 0)
> -                               out->new_line(config, ctxp);
> -                       generic_metric(config, mexp->metric_expr, mexp->metric_threshold,
> -                                      mexp->metric_events, mexp->metric_refs, evsel->name,
> -                                      mexp->metric_name, mexp->metric_unit, mexp->runtime,
> -                                      aggr_idx, out);
> -               }
> -       }
>         if (num == 0)
>                 print_metric(config, ctxp, NULL, NULL, NULL, 0);
>  }
> +
> +/**
> + * perf_stat__skip_metric_event - Skip the evsel in the Default metricgroup,
> + *                               if it's not running or not the metric event.
> + */
> +bool perf_stat__skip_metric_event(struct evsel *evsel,
> +                                 struct rblist *metric_events,
> +                                 u64 ena, u64 run)
> +{
> +       if (!evsel->default_metricgroup)
> +               return false;
> +
> +       if (!ena || !run)
> +               return true;
> +
> +       return !metricgroup__lookup(metric_events, evsel, false);
> +}
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 7abff7cbb5a1..934f79778cea 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -158,11 +158,16 @@ typedef void (*print_metric_t)(struct perf_stat_config *config,
>                                const char *fmt, double val);
>  typedef void (*new_line_t)(struct perf_stat_config *config, void *ctx);
>
> +/* Used to print the display name of the Default metricgroup for now. */
> +typedef void (*print_metricgroup_header_t)(struct perf_stat_config *config,
> +                                          void *ctx, const char *metricgroup_name);
> +
>  void perf_stat__reset_shadow_stats(void);
>  struct perf_stat_output_ctx {
>         void *ctx;
>         print_metric_t print_metric;
>         new_line_t new_line;
> +       print_metricgroup_header_t print_metricgroup_header;
>         bool force_header;
>  };
>
> @@ -171,6 +176,16 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>                                    double avg, int aggr_idx,
>                                    struct perf_stat_output_ctx *out,
>                                    struct rblist *metric_events);
> +bool perf_stat__skip_metric_event(struct evsel *evsel,
> +                                 struct rblist *metric_events,
> +                                 u64 ena, u64 run);
> +void *perf_stat__print_shadow_stats_metricgroup(struct perf_stat_config *config,
> +                                               struct evsel *evsel,
> +                                               int aggr_idx,
> +                                               int *num,
> +                                               void *from,
> +                                               struct perf_stat_output_ctx *out,
> +                                               struct rblist *metric_events);
>
>  int evlist__alloc_stats(struct perf_stat_config *config,
>                         struct evlist *evlist, bool alloc_raw);
> --
> 2.35.1
>


* Re: [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics
  2023-06-16  3:14 ` [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics kan.liang
@ 2023-06-16  5:57   ` Ian Rogers
  2023-06-16 13:48   ` John Garry
  1 sibling, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2023-06-16  5:57 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin,
	John Garry

On Thu, Jun 15, 2023 at 8:15 PM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> Add the default tags for Hisi hip08 as well.
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: John Garry <john.g.garry@oracle.com>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  .../arch/arm64/hisilicon/hip08/metrics.json          | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json b/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
> index 6443a061e22a..6463531b9941 100644
> --- a/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
> +++ b/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
> @@ -3,28 +3,32 @@
>          "MetricExpr": "FETCH_BUBBLE / (4 * CPU_CYCLES)",
>          "PublicDescription": "Frontend bound L1 topdown metric",
>          "BriefDescription": "Frontend bound L1 topdown metric",
> -        "MetricGroup": "TopDownL1",
> +        "DefaultMetricgroupName": "TopDownL1",
> +        "MetricGroup": "Default;TopDownL1",
>          "MetricName": "frontend_bound"
>      },
>      {
>          "MetricExpr": "(INST_SPEC - INST_RETIRED) / (4 * CPU_CYCLES)",
>          "PublicDescription": "Bad Speculation L1 topdown metric",
>          "BriefDescription": "Bad Speculation L1 topdown metric",
> -        "MetricGroup": "TopDownL1",
> +        "DefaultMetricgroupName": "TopDownL1",
> +        "MetricGroup": "Default;TopDownL1",
>          "MetricName": "bad_speculation"
>      },
>      {
>          "MetricExpr": "INST_RETIRED / (CPU_CYCLES * 4)",
>          "PublicDescription": "Retiring L1 topdown metric",
>          "BriefDescription": "Retiring L1 topdown metric",
> -        "MetricGroup": "TopDownL1",
> +        "DefaultMetricgroupName": "TopDownL1",
> +        "MetricGroup": "Default;TopDownL1",
>          "MetricName": "retiring"
>      },
>      {
>          "MetricExpr": "1 - (frontend_bound + bad_speculation + retiring)",
>          "PublicDescription": "Backend Bound L1 topdown metric",
>          "BriefDescription": "Backend Bound L1 topdown metric",
> -        "MetricGroup": "TopDownL1",
> +        "DefaultMetricgroupName": "TopDownL1",
> +        "MetricGroup": "Default;TopDownL1",
>          "MetricName": "backend_bound"
>      },
>      {
> --
> 2.35.1
>


* Re: [PATCH V4 0/5] New metricgroup output in perf stat default mode
  2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
                   ` (4 preceding siblings ...)
  2023-06-16  3:14 ` [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics kan.liang
@ 2023-06-16  5:59 ` Ian Rogers
  2023-06-16 13:26   ` Liang, Kan
  5 siblings, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2023-06-16  5:59 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin

On Thu, Jun 15, 2023 at 8:14 PM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> Changes since V3:
> - Move the full name (PMU + metricgroup name) generation from the metric
>   code to the output code. (Ian)
> - Add default tags for Hisi hip08 L1 metrics (John)
> - Some patches have been merged. Drop them from the V4.
>
> Changes since V2:
> - Fixes memory leak (Ian)
>   (Ian, I cannot reproduce the memory leak on all my machines. Please
>    check whether the fix works on your side. Thanks.)
> - Add Reviewed-by tags for several patches.
>
> Changes since V1:
> - Remove EVSEL_EVENT_MASK and use the __evsel__match which is suggested
>   by Ian.
> - Support TopdownL1 on both e-core and p-core of ADL in the default
>   mode. (Ian)
> - Have separate patches for the modifications of metricgroup and output.
>   (Ian)
> - Does 2nd sort for the Default metricgroup. Remove the logic of
>   changing the associated metric event. (Ian)
> - Move all the metric related code to stat-shadow (Ian)
> - Move the common functions between stat+csv_output and stat+std_output
>   to the lib directory (Ian)
>
> In the default mode, the current output of the metricgroup includes both
> events and metrics, which is not necessary and makes the output hard to
> read. Also, different ARCHs (even different generations of the ARCH) may
> have a different output format because of the different events in a
> metric.
>
> The patch proposes a new output format which only outputs the value
> of each metric and the metricgroup name. It can bring a clean and
> consistent output format among ARCHs and generations.
>
> Patches 1-2 introduce the new metricgroup output.
>
> Patches 3-4 improve the tests to cover the default mode.
>
> Patch 5 updates the event list for Hisi hip08.
>
> Here are some examples for the new output.
>
> STD output:
>
> On SPR
>
> perf stat -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>         226,054.13 msec cpu-clock                        #  224.588 CPUs utilized
>                932      context-switches                 #    4.123 /sec
>                224      cpu-migrations                   #    0.991 /sec
>                 76      page-faults                      #    0.336 /sec
>         45,940,682      cycles                           #    0.000 GHz
>         36,676,047      instructions                     #    0.80  insn per cycle
>          7,044,516      branches                         #   31.163 K/sec
>             62,169      branch-misses                    #    0.88% of all branches
>                         TopdownL1                 #     68.7 %  tma_backend_bound
>                                                   #      3.1 %  tma_bad_speculation
>                                                   #     13.0 %  tma_frontend_bound
>                                                   #     15.2 %  tma_retiring
>                         TopdownL2                 #      2.7 %  tma_branch_mispredicts
>                                                   #     19.6 %  tma_core_bound
>                                                   #      4.8 %  tma_fetch_bandwidth
>                                                   #      8.3 %  tma_fetch_latency
>                                                   #      2.9 %  tma_heavy_operations
>                                                   #     12.3 %  tma_light_operations
>                                                   #      0.4 %  tma_machine_clears
>                                                   #     49.1 %  tma_memory_bound
>
>        1.006529767 seconds time elapsed
>
> perf stat -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>          32,127.99 msec cpu-clock                        #   31.992 CPUs utilized
>                240      context-switches                 #    7.470 /sec
>                 32      cpu-migrations                   #    0.996 /sec
>                 74      page-faults                      #    2.303 /sec
>          6,313,960      cpu_core/cycles/                 #    0.000 GHz
>        257,711,907      cpu_atom/cycles/                 #    0.008 GHz                         (54.18%)
>          4,477,162      cpu_core/instructions/           #    0.71  insn per cycle
>         37,721,481      cpu_atom/instructions/           #    5.97  insn per cycle              (63.33%)
>            809,747      cpu_core/branches/               #   25.204 K/sec
>          6,621,226      cpu_atom/branches/               #  206.089 K/sec                       (63.32%)
>             39,667      cpu_core/branch-misses/          #    4.90% of all branches
>          1,032,146      cpu_atom/branch-misses/          #  127.47% of all branches             (63.33%)
>              TopdownL1 (cpu_core)                 #      nan %  tma_backend_bound
>                                                   #      0.0 %  tma_bad_speculation
>                                                   #      nan %  tma_frontend_bound
>                                                   #      nan %  tma_retiring
>              TopdownL1 (cpu_atom)                 #     13.6 %  tma_bad_speculation      (63.36%)
>                                                   #     41.1 %  tma_frontend_bound       (63.54%)
>                                                   #     39.2 %  tma_backend_bound
>                                                   #     39.2 %  tma_backend_bound_aux    (63.93%)
>                                                   #      5.4 %  tma_retiring             (64.15%)
>
>        1.004244114 seconds time elapsed
>
> JSON output
>
> On SPR
>
> perf stat --json -a sleep 1
> {"counter-value" : "225904.823297", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 225904323425, "pcnt-running" : 100.00, "metric-value" : "224.456872", "metric-unit" : "CPUs utilized"}
> {"counter-value" : "986.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 225904108985, "pcnt-running" : 100.00, "metric-value" : "4.364670", "metric-unit" : "/sec"}
> {"counter-value" : "224.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 225904016141, "pcnt-running" : 100.00, "metric-value" : "0.991568", "metric-unit" : "/sec"}
> {"counter-value" : "76.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 225903913270, "pcnt-running" : 100.00, "metric-value" : "0.336425", "metric-unit" : "/sec"}
> {"counter-value" : "48433482.000000", "unit" : "", "event" : "cycles", "event-runtime" : 225903792732, "pcnt-running" : 100.00, "metric-value" : "0.000214", "metric-unit" : "GHz"}
> {"counter-value" : "38620409.000000", "unit" : "", "event" : "instructions", "event-runtime" : 225903657830, "pcnt-running" : 100.00, "metric-value" : "0.797391", "metric-unit" : "insn per cycle"}
> {"counter-value" : "7369473.000000", "unit" : "", "event" : "branches", "event-runtime" : 225903464328, "pcnt-running" : 100.00, "metric-value" : "32.622026", "metric-unit" : "K/sec"}
> {"counter-value" : "54747.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 225903234523, "pcnt-running" : 100.00, "metric-value" : "0.742889", "metric-unit" : "of all branches"}
> {"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1"}
> {"metric-value" : "69.950631", "metric-unit" : "%  tma_backend_bound"}
> {"metric-value" : "2.771783", "metric-unit" : "%  tma_bad_speculation"}
> {"metric-value" : "12.026074", "metric-unit" : "%  tma_frontend_bound"}
> {"metric-value" : "15.251513", "metric-unit" : "%  tma_retiring"}
> {"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL2"}
> {"metric-value" : "2.351757", "metric-unit" : "%  tma_branch_mispredicts"}
> {"metric-value" : "19.729771", "metric-unit" : "%  tma_core_bound"}
> {"metric-value" : "4.555207", "metric-unit" : "%  tma_fetch_bandwidth"}
> {"metric-value" : "7.470867", "metric-unit" : "%  tma_fetch_latency"}
> {"metric-value" : "2.938808", "metric-unit" : "%  tma_heavy_operations"}
> {"metric-value" : "12.312705", "metric-unit" : "%  tma_light_operations"}
> {"metric-value" : "0.420026", "metric-unit" : "%  tma_machine_clears"}
> {"metric-value" : "50.220860", "metric-unit" : "%  tma_memory_bound"}
>
> On hybrid
>
> perf stat --json -a sleep 1
> {"counter-value" : "32131.530625", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 32131536951, "pcnt-running" : 100.00, "metric-value" : "31.992642", "metric-unit" : "CPUs utilized"}
> {"counter-value" : "328.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 32131525778, "pcnt-running" : 100.00, "metric-value" : "10.208042", "metric-unit" : "/sec"}
> {"counter-value" : "32.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 32131515104, "pcnt-running" : 100.00, "metric-value" : "0.995906", "metric-unit" : "/sec"}
> {"counter-value" : "353.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 32131501396, "pcnt-running" : 100.00, "metric-value" : "10.986094", "metric-unit" : "/sec"}
> {"counter-value" : "18685492.000000", "unit" : "", "event" : "cpu_core/cycles/", "event-runtime" : 16061585292, "pcnt-running" : 100.00, "metric-value" : "0.000582", "metric-unit" : "GHz"}
> {"counter-value" : "255620352.000000", "unit" : "", "event" : "cpu_atom/cycles/", "event-runtime" : 8690268422, "pcnt-running" : 54.00, "metric-value" : "0.007955", "metric-unit" : "GHz"}
> {"counter-value" : "15489913.000000", "unit" : "", "event" : "cpu_core/instructions/", "event-runtime" : 16061582200, "pcnt-running" : 100.00, "metric-value" : "0.828981", "metric-unit" : "insn per cycle"}
> {"counter-value" : "38790161.000000", "unit" : "", "event" : "cpu_atom/instructions/", "event-runtime" : 10163133324, "pcnt-running" : 63.00, "metric-value" : "2.075951", "metric-unit" : "insn per cycle"}
> {"counter-value" : "2908031.000000", "unit" : "", "event" : "cpu_core/branches/", "event-runtime" : 16061563416, "pcnt-running" : 100.00, "metric-value" : "90.503967", "metric-unit" : "K/sec"}
> {"counter-value" : "6814948.000000", "unit" : "", "event" : "cpu_atom/branches/", "event-runtime" : 10161711336, "pcnt-running" : 63.00, "metric-value" : "212.095343", "metric-unit" : "K/sec"}
> {"counter-value" : "97638.000000", "unit" : "", "event" : "cpu_core/branch-misses/", "event-runtime" : 16061535261, "pcnt-running" : 100.00, "metric-value" : "3.357530", "metric-unit" : "of all branches"}
> {"counter-value" : "1017066.000000", "unit" : "", "event" : "cpu_atom/branch-misses/", "event-runtime" : 10159971797, "pcnt-running" : 63.00, "metric-value" : "34.974386", "metric-unit" : "of all branches"}
> {"event-runtime" : 16061513607, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1 (cpu_core)"}
> {"metric-value" : "nan", "metric-unit" : "%  tma_backend_bound"}
> {"metric-value" : "0.000000", "metric-unit" : "%  tma_bad_speculation"}
> {"metric-value" : "nan", "metric-unit" : "%  tma_frontend_bound"}
> {"metric-value" : "nan", "metric-unit" : "%  tma_retiring"}
> {"event-runtime" : 10157398501, "pcnt-running" : 63.00, "metricgroup" : "TopdownL1 (cpu_atom)"}
> {"metric-value" : "13.719821", "metric-unit" : "%  tma_bad_speculation"}
> {"event-runtime" : 10178698656, "pcnt-running" : 63.00, "metric-value" : "41.016738", "metric-unit" : "%  tma_frontend_bound"}
> {"event-runtime" : 10240582902, "pcnt-running" : 63.00, "metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound"}
> {"metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound_aux"}
> {"event-runtime" : 10284284920, "pcnt-running" : 64.00, "metric-value" : "5.374638", "metric-unit" : "%  tma_retiring"}
>
> CSV output
>
> On SPR
>
> perf stat -x, -a sleep 1
> 225851.20,msec,cpu-clock,225850700108,100.00,224.431,CPUs utilized
> 976,,context-switches,225850504803,100.00,4.321,/sec
> 224,,cpu-migrations,225850410336,100.00,0.992,/sec
> 76,,page-faults,225850304155,100.00,0.337,/sec
> 52288305,,cycles,225850188531,100.00,0.000,GHz
> 37977214,,instructions,225850071251,100.00,0.73,insn per cycle
> 7299859,,branches,225849890722,100.00,32.322,K/sec
> 51102,,branch-misses,225849672536,100.00,0.70,of all branches
> ,225849327050,100.00,,,,TopdownL1
> ,,,,,70.1,%  tma_backend_bound
> ,,,,,2.7,%  tma_bad_speculation
> ,,,,,12.5,%  tma_frontend_bound
> ,,,,,14.6,%  tma_retiring
> ,225849327050,100.00,,,,TopdownL2
> ,,,,,2.3,%  tma_branch_mispredicts
> ,,,,,19.6,%  tma_core_bound
> ,,,,,4.6,%  tma_fetch_bandwidth
> ,,,,,7.9,%  tma_fetch_latency
> ,,,,,2.9,%  tma_heavy_operations
> ,,,,,11.7,%  tma_light_operations
> ,,,,,0.5,%  tma_machine_clears
> ,,,,,50.5,%  tma_memory_bound
>
> On Hybrid
>
> perf stat -x, -a sleep 1
> 32139.34,msec,cpu-clock,32139351409,100.00,32.001,CPUs utilized
> 225,,context-switches,32139342672,100.00,7.001,/sec
> 32,,cpu-migrations,32139337772,100.00,0.996,/sec
> 72,,page-faults,32139328384,100.00,2.240,/sec
> 6766433,,cpu_core/cycles/,16067551558,100.00,0.000,GHz
> 256500230,,cpu_atom/cycles/,8695757391,54.00,0.008,GHz
> 4688595,,cpu_core/instructions/,16067558976,100.00,0.69,insn per cycle
> 37487490,,cpu_atom/instructions/,10165193856,63.00,5.54,insn per cycle
> 845211,,cpu_core/branches/,16067540225,100.00,26.298,K/sec
> 6571193,,cpu_atom/branches/,10155940853,63.00,204.459,K/sec
> 41359,,cpu_core/branch-misses/,16067516493,100.00,4.89,of all branches
> 1020231,,cpu_atom/branch-misses/,10159363620,63.00,120.71,of all branches
> ,16067494476,100.00,,,,TopdownL1 (cpu_core)
> ,,,,,,%  tma_backend_bound
> ,,,,,0.0,%  tma_bad_speculation
> ,,,,,,%  tma_frontend_bound
> ,,,,,,%  tma_retiring
> ,10160989992,63.00,,,,TopdownL1 (cpu_atom)
> ,,,,,13.8,%  tma_bad_speculation
> ,10188319019,63.00,,,41.3,%  tma_frontend_bound
> ,10258326591,63.00,,,38.6,%  tma_backend_bound
> ,,,,,38.6,%  tma_backend_bound_aux
> ,10282689488,64.00,,,5.4,%  tma_retiring
>
> Kan Liang (5):
>   perf metrics: Sort the Default metricgroup
>   perf stat: New metricgroup output for the default mode
>   perf test: Move all the check functions of stat csv output to lib
>   perf test: Add test case for the standard perf stat output
>   perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics

Just to be clear, I'm happy for this to be submitted; I have put my
Reviewed-by/Acked-by tags on it.

Thanks,
Ian

>  tools/perf/builtin-stat.c                     |   1 +
>  .../arch/arm64/hisilicon/hip08/metrics.json   |  12 +-
>  tools/perf/tests/shell/lib/stat_output.sh     | 169 ++++++++++++++++
>  tools/perf/tests/shell/stat+csv_output.sh     | 188 ++----------------
>  tools/perf/tests/shell/stat+std_output.sh     | 108 ++++++++++
>  tools/perf/util/evsel.h                       |   1 +
>  tools/perf/util/metricgroup.c                 |  26 +++
>  tools/perf/util/metricgroup.h                 |   3 +
>  tools/perf/util/stat-display.c                | 108 +++++++++-
>  tools/perf/util/stat-shadow.c                 | 131 ++++++++++--
>  tools/perf/util/stat.h                        |  15 ++
>  11 files changed, 563 insertions(+), 199 deletions(-)
>  create mode 100755 tools/perf/tests/shell/lib/stat_output.sh
>  create mode 100755 tools/perf/tests/shell/stat+std_output.sh
>
> --
> 2.35.1
>

* Re: [PATCH V4 2/5] perf stat: New metricgroup output for the default mode
  2023-06-16  5:56   ` Ian Rogers
@ 2023-06-16 13:23     ` Liang, Kan
  0 siblings, 0 replies; 14+ messages in thread
From: Liang, Kan @ 2023-06-16 13:23 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin



On 2023-06-16 1:56 a.m., Ian Rogers wrote:
> On Thu, Jun 15, 2023 at 8:15 PM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> In the default mode, the current output of the metricgroup include both
>> events and metrics, which is not necessary and just makes the output
>> hard to read. Since different ARCHs (even different generations in the
>> same ARCH) may use different events. The output also vary on different
>> platforms.
>>
>> For a metricgroup, only outputting the value of each metric is good
>> enough.
>>
>> Add a new field default_metricgroup in evsel to indicate an event of
>> the default metricgroup. For those events, printout() should print
>> the metricgroup name rather than each event.
>>
>> Add perf_stat__skip_metric_event() to skip the evsel in the Default
>> metricgroup, if it's not running or not the metric event.
>>
>> Add print_metricgroup_header_t to pass the functions which print the
>> display name of each metricgroup in the Default metricgroup. Support
>> all three output methods.
>>
>> Factor out perf_stat__print_shadow_stats_metricgroup() to print out
>> each metrics.
>>
>> On SPR
>> Before:
>>
>>  ./perf_old stat sleep 1
>>
>>  Performance counter stats for 'sleep 1':
>>
>>               0.54 msec task-clock:u                     #    0.001 CPUs utilized
>>                  0      context-switches:u               #    0.000 /sec
>>                  0      cpu-migrations:u                 #    0.000 /sec
>>                 68      page-faults:u                    #  125.445 K/sec
>>            540,970      cycles:u                         #    0.998 GHz
>>            556,325      instructions:u                   #    1.03  insn per cycle
>>            123,602      branches:u                       #  228.018 M/sec
>>              6,889      branch-misses:u                  #    5.57% of all branches
>>          3,245,820      TOPDOWN.SLOTS:u                  #     18.4 %  tma_backend_bound
>>                                                   #     17.2 %  tma_retiring
>>                                                   #     23.1 %  tma_bad_speculation
>>                                                   #     41.4 %  tma_frontend_bound
>>            564,859      topdown-retiring:u
>>          1,370,999      topdown-fe-bound:u
>>            603,271      topdown-be-bound:u
>>            744,874      topdown-bad-spec:u
>>             12,661      INT_MISC.UOP_DROPPING:u          #   23.357 M/sec
>>
>>        1.001798215 seconds time elapsed
>>
>>        0.000193000 seconds user
>>        0.001700000 seconds sys
>>
>> After:
>>
>> $ ./perf stat sleep 1
>>
>>  Performance counter stats for 'sleep 1':
>>
>>               0.51 msec task-clock:u                     #    0.001 CPUs utilized
>>                  0      context-switches:u               #    0.000 /sec
>>                  0      cpu-migrations:u                 #    0.000 /sec
>>                 68      page-faults:u                    #  132.683 K/sec
>>            545,228      cycles:u                         #    1.064 GHz
>>            555,509      instructions:u                   #    1.02  insn per cycle
>>            123,574      branches:u                       #  241.120 M/sec
>>              6,957      branch-misses:u                  #    5.63% of all branches
>>                         TopdownL1                 #     17.5 %  tma_backend_bound
>>                                                   #     22.6 %  tma_bad_speculation
>>                                                   #     42.7 %  tma_frontend_bound
>>                                                   #     17.1 %  tma_retiring
>>                         TopdownL2                 #     21.8 %  tma_branch_mispredicts
>>                                                   #     11.5 %  tma_core_bound
>>                                                   #     13.4 %  tma_fetch_bandwidth
>>                                                   #     29.3 %  tma_fetch_latency
>>                                                   #      2.7 %  tma_heavy_operations
>>                                                   #     14.5 %  tma_light_operations
>>                                                   #      0.8 %  tma_machine_clears
>>                                                   #      6.1 %  tma_memory_bound
>>
>>        1.001712086 seconds time elapsed
>>
>>        0.000151000 seconds user
>>        0.001618000 seconds sys
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> 
> Reviewed-by: Ian Rogers <irogers@google.com>
> 
> Just an observation: I think a lot of the "default" terminology is
> confusing as default means more like automatically selected when no
> event or metric are given. Reading default as meaning default makes
> the comments on perf_stat__print_shadow_stats_metricgroup somewhat
> counter intuitive.
>

I agree. I once wanted to call the proposed display mode the
metricgroup-only mode, or even the metric-only mode, which I think would
be the proper name. But we already have a metric-only mode, so reusing
the name would only cause confusion. Since the new display is only used
in the default mode, I use "default" everywhere in the comments for now.

In the long term, though, I think we may want to build a new metric-only
mode on top of this display mode and use it to replace the current
metric-only mode.

Personally, I think the current metric-only mode has some drawbacks,
e.g.,
- All the metrics are printed on one line. That may not be a problem for
the JSON or CSV output, but it's hard to read in the STD output.
- No metricgroup name. If two or more metrics are collected, they are
all mixed together, and it's hard to tell which metric belongs to which
metricgroup.
- No parent-child relation. We have never had such a feature, but it
would be a useful improvement.
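
To illustrate the first two points, here is a minimal standalone sketch
of the "one metric per line, group name printed once" layout. It is not
the perf code: format_group_column() and print_metric_row() are
hypothetical helpers, and only the two column widths (EVNAME_LEN and
MGROUP_LEN) and the padding arithmetic are taken from the
stat-display.c hunk in this patch.

```c
#include <stdio.h>
#include <string.h>

/* Column widths as defined in stat-display.c by this patch. */
#define EVNAME_LEN 32
#define MGROUP_LEN 50

/*
 * Hypothetical helper: build the left-hand column of one output row.
 * With a group name, right-align it the way
 * print_metricgroup_header_std() does; with NULL, emit blank padding
 * of the same width, as __new_line_std() does.
 * Returns the number of characters written, or a negative value or a
 * value >= size if the buffer was too small.
 */
int format_group_column(char *buf, size_t size, const char *group)
{
	int n, pad;

	if (!group)
		return snprintf(buf, size, "%*s", MGROUP_LEN - 1, "");

	n = snprintf(buf, size, " %*s", EVNAME_LEN, group);
	if (n < 0 || (size_t)n >= size)
		return n;

	/* Clamp so an over-long name never yields a negative width. */
	pad = MGROUP_LEN - n - 1;
	if (pad < 0)
		pad = 0;

	return n + snprintf(buf + n, size - n, "%*s", pad, "");
}

/*
 * Hypothetical helper: print one metric per line. Pass the group name
 * for the first metric of a group and NULL for the following ones.
 */
void print_metric_row(FILE *fp, const char *group, double val,
		      const char *metric)
{
	char col[128];

	format_group_column(col, sizeof(col), group);
	fprintf(fp, "%s # %8.1f %%  %s\n", col, val, metric);
}
```

Calling print_metric_row() with "TopdownL1" for the first metric and
NULL for the rest keeps every metric on its own line under a single
group header, which is roughly what the STD output above looks like.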

Thanks,
Kan

> Thanks,
> Ian
> 
>> ---
>>  tools/perf/builtin-stat.c      |   1 +
>>  tools/perf/util/evsel.h        |   1 +
>>  tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---
>>  tools/perf/util/stat-shadow.c  | 131 ++++++++++++++++++++++++++++++---
>>  tools/perf/util/stat.h         |  15 ++++
>>  5 files changed, 234 insertions(+), 22 deletions(-)
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 55601b4b5c34..3f4e76f76f94 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -2172,6 +2172,7 @@ static int add_default_attributes(void)
>>
>>                         evlist__for_each_entry(metric_evlist, metric_evsel) {
>>                                 metric_evsel->skippable = true;
>> +                               metric_evsel->default_metricgroup = true;
>>                         }
>>                         evlist__splice_list_tail(evsel_list, &metric_evlist->core.entries);
>>                         evlist__delete(metric_evlist);
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index cc6fb3049b99..9f06d6cd5379 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -131,6 +131,7 @@ struct evsel {
>>         bool                    reset_group;
>>         bool                    errored;
>>         bool                    needs_auxtrace_mmap;
>> +       bool                    default_metricgroup; /* A member of the Default metricgroup */
>>         struct hashmap          *per_pkg_mask;
>>         int                     err;
>>         struct {
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index a2bbdc25d979..7329b3340f88 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -25,6 +25,7 @@
>>  #define CNTR_NOT_SUPPORTED     "<not supported>"
>>  #define CNTR_NOT_COUNTED       "<not counted>"
>>
>> +#define MGROUP_LEN   50
>>  #define METRIC_LEN   38
>>  #define EVNAME_LEN   32
>>  #define COUNTS_LEN   18
>> @@ -364,16 +365,27 @@ static void new_line_std(struct perf_stat_config *config __maybe_unused,
>>         os->newline = true;
>>  }
>>
>> -static void do_new_line_std(struct perf_stat_config *config,
>> -                           struct outstate *os)
>> +static inline void __new_line_std_csv(struct perf_stat_config *config,
>> +                                     struct outstate *os)
>>  {
>>         fputc('\n', os->fh);
>>         if (os->prefix)
>>                 fputs(os->prefix, os->fh);
>>         aggr_printout(config, os->evsel, os->id, os->aggr_nr);
>> +}
>> +
>> +static inline void __new_line_std(struct outstate *os)
>> +{
>> +       fprintf(os->fh, "                                                 ");
>> +}
>> +
>> +static void do_new_line_std(struct perf_stat_config *config,
>> +                           struct outstate *os)
>> +{
>> +       __new_line_std_csv(config, os);
>>         if (config->aggr_mode == AGGR_NONE)
>>                 fprintf(os->fh, "        ");
>> -       fprintf(os->fh, "                                                 ");
>> +       __new_line_std(os);
>>  }
>>
>>  static void print_metric_std(struct perf_stat_config *config,
>> @@ -408,10 +420,7 @@ static void new_line_csv(struct perf_stat_config *config, void *ctx)
>>         struct outstate *os = ctx;
>>         int i;
>>
>> -       fputc('\n', os->fh);
>> -       if (os->prefix)
>> -               fprintf(os->fh, "%s", os->prefix);
>> -       aggr_printout(config, os->evsel, os->id, os->aggr_nr);
>> +       __new_line_std_csv(config, os);
>>         for (i = 0; i < os->nfields; i++)
>>                 fputs(config->csv_sep, os->fh);
>>  }
>> @@ -462,6 +471,54 @@ static void new_line_json(struct perf_stat_config *config, void *ctx)
>>         aggr_printout(config, os->evsel, os->id, os->aggr_nr);
>>  }
>>
>> +static void print_metricgroup_header_json(struct perf_stat_config *config,
>> +                                         void *ctx,
>> +                                         const char *metricgroup_name)
>> +{
>> +       if (!metricgroup_name)
>> +               return;
>> +
>> +       fprintf(config->output, "\"metricgroup\" : \"%s\"}", metricgroup_name);
>> +       new_line_json(config, ctx);
>> +}
>> +
>> +static void print_metricgroup_header_csv(struct perf_stat_config *config,
>> +                                        void *ctx,
>> +                                        const char *metricgroup_name)
>> +{
>> +       struct outstate *os = ctx;
>> +       int i;
>> +
>> +       if (!metricgroup_name) {
>> +               /* Leave space for running and enabling */
>> +               for (i = 0; i < os->nfields - 2; i++)
>> +                       fputs(config->csv_sep, os->fh);
>> +               return;
>> +       }
>> +
>> +       for (i = 0; i < os->nfields; i++)
>> +               fputs(config->csv_sep, os->fh);
>> +       fprintf(config->output, "%s", metricgroup_name);
>> +       new_line_csv(config, ctx);
>> +}
>> +
>> +static void print_metricgroup_header_std(struct perf_stat_config *config,
>> +                                        void *ctx,
>> +                                        const char *metricgroup_name)
>> +{
>> +       struct outstate *os = ctx;
>> +       int n;
>> +
>> +       if (!metricgroup_name) {
>> +               __new_line_std(os);
>> +               return;
>> +       }
>> +
>> +       n = fprintf(config->output, " %*s", EVNAME_LEN, metricgroup_name);
>> +
>> +       fprintf(config->output, "%*s", MGROUP_LEN - n - 1, "");
>> +}
>> +
>>  /* Filter out some columns that don't work well in metrics only mode */
>>
>>  static bool valid_only_metric(const char *unit)
>> @@ -713,19 +770,23 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
>>         struct perf_stat_output_ctx out;
>>         print_metric_t pm;
>>         new_line_t nl;
>> +       print_metricgroup_header_t pmh;
>>         bool ok = true;
>>         struct evsel *counter = os->evsel;
>>
>>         if (config->csv_output) {
>>                 pm = config->metric_only ? print_metric_only_csv : print_metric_csv;
>>                 nl = config->metric_only ? new_line_metric : new_line_csv;
>> +               pmh = print_metricgroup_header_csv;
>>                 os->nfields = 4 + (counter->cgrp ? 1 : 0);
>>         } else if (config->json_output) {
>>                 pm = config->metric_only ? print_metric_only_json : print_metric_json;
>>                 nl = config->metric_only ? new_line_metric : new_line_json;
>> +               pmh = print_metricgroup_header_json;
>>         } else {
>>                 pm = config->metric_only ? print_metric_only : print_metric_std;
>>                 nl = config->metric_only ? new_line_metric : new_line_std;
>> +               pmh = print_metricgroup_header_std;
>>         }
>>
>>         if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
>> @@ -747,10 +808,11 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
>>
>>         out.print_metric = pm;
>>         out.new_line = nl;
>> +       out.print_metricgroup_header = pmh;
>>         out.ctx = os;
>>         out.force_header = false;
>>
>> -       if (!config->metric_only) {
>> +       if (!config->metric_only && !counter->default_metricgroup) {
>>                 abs_printout(config, os->id, os->aggr_nr, counter, uval, ok);
>>
>>                 print_noise(config, counter, noise, /*before_metric=*/true);
>> @@ -758,8 +820,31 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
>>         }
>>
>>         if (ok) {
>> -               perf_stat__print_shadow_stats(config, counter, uval, aggr_idx,
>> -                                             &out, &config->metric_events);
>> +               if (!config->metric_only && counter->default_metricgroup) {
>> +                       void *from = NULL;
>> +
>> +                       aggr_printout(config, os->evsel, os->id, os->aggr_nr);
>> +                       /* Print out all the metricgroup with the same metric event. */
>> +                       do {
>> +                               int num = 0;
>> +
>> +                               /* Print out the new line for the next new metricgroup. */
>> +                               if (from) {
>> +                                       if (config->json_output)
>> +                                               new_line_json(config, (void *)os);
>> +                                       else
>> +                                               __new_line_std_csv(config, os);
>> +                               }
>> +
>> +                               print_noise(config, counter, noise, /*before_metric=*/true);
>> +                               print_running(config, run, ena, /*before_metric=*/true);
>> +                               from = perf_stat__print_shadow_stats_metricgroup(config, counter, aggr_idx,
>> +                                                                                &num, from, &out,
>> +                                                                                &config->metric_events);
>> +                       } while (from != NULL);
>> +               } else
>> +                       perf_stat__print_shadow_stats(config, counter, uval, aggr_idx,
>> +                                                     &out, &config->metric_events);
>>         } else {
>>                 pm(config, os, /*color=*/NULL, /*format=*/NULL, /*unit=*/"", /*val=*/0);
>>         }
>> @@ -889,6 +974,9 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
>>         ena = aggr->counts.ena;
>>         run = aggr->counts.run;
>>
>> +       if (perf_stat__skip_metric_event(counter, &config->metric_events, ena, run))
>> +               return;
>> +
>>         if (val == 0 && should_skip_zero_counter(config, counter, &id))
>>                 return;
>>
>> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
>> index 1566a206ba42..1c5c3eeba4cf 100644
>> --- a/tools/perf/util/stat-shadow.c
>> +++ b/tools/perf/util/stat-shadow.c
>> @@ -539,6 +539,106 @@ double test_generic_metric(struct metric_expr *mexp, int aggr_idx)
>>         return ratio;
>>  }
>>
>> +static void perf_stat__print_metricgroup_header(struct perf_stat_config *config,
>> +                                               struct evsel *evsel,
>> +                                               void *ctxp,
>> +                                               const char *name,
>> +                                               struct perf_stat_output_ctx *out)
>> +{
>> +       bool need_full_name = perf_pmus__num_core_pmus() > 1;
>> +       static const char *last_name;
>> +       static const char *last_pmu;
>> +       char full_name[64];
>> +
>> +       /*
>> +        * A metricgroup may have several metric events,
>> +        * e.g.,TopdownL1 on e-core of ADL.
>> +        * The name has been output by the first metric
>> +        * event. Only align with other metics from
>> +        * different metric events.
>> +        */
>> +       if (last_name && !strcmp(last_name, name)) {
>> +               if (!need_full_name || !strcmp(last_pmu, evsel->pmu_name)) {
>> +                       out->print_metricgroup_header(config, ctxp, NULL);
>> +                       return;
>> +               }
>> +       }
>> +
>> +       if (need_full_name)
>> +               scnprintf(full_name, sizeof(full_name), "%s (%s)", name, evsel->pmu_name);
>> +       else
>> +               scnprintf(full_name, sizeof(full_name), "%s", name);
>> +
>> +       out->print_metricgroup_header(config, ctxp, full_name);
>> +
>> +       last_name = name;
>> +       last_pmu = evsel->pmu_name;
>> +}
>> +
>> +/**
>> + * perf_stat__print_shadow_stats_metricgroup - Print out metrics associated with the evsel
>> + *                                            For the non-default, all metrics associated
>> + *                                            with the evsel are printed.
>> + *                                            For the default mode, only the metrics from
>> + *                                            the same metricgroup and the name of the
>> + *                                            metricgroup are printed. To print the metrics
>> + *                                            from the next metricgroup (if available),
>> + *                                            invoke the function with correspoinding
>> + *                                            metric_expr.
>> + */
>> +void *perf_stat__print_shadow_stats_metricgroup(struct perf_stat_config *config,
>> +                                               struct evsel *evsel,
>> +                                               int aggr_idx,
>> +                                               int *num,
>> +                                               void *from,
>> +                                               struct perf_stat_output_ctx *out,
>> +                                               struct rblist *metric_events)
>> +{
>> +       struct metric_event *me;
>> +       struct metric_expr *mexp = from;
>> +       void *ctxp = out->ctx;
>> +       bool header_printed = false;
>> +       const char *name = NULL;
>> +
>> +       me = metricgroup__lookup(metric_events, evsel, false);
>> +       if (me == NULL)
>> +               return NULL;
>> +
>> +       if (!mexp)
>> +               mexp = list_first_entry(&me->head, typeof(*mexp), nd);
>> +
>> +       list_for_each_entry_from(mexp, &me->head, nd) {
>> +               /* Print the display name of the Default metricgroup */
>> +               if (!config->metric_only && me->is_default) {
>> +                       if (!name)
>> +                               name = mexp->default_metricgroup_name;
>> +                       /*
>> +                        * Two or more metricgroup may share the same metric
>> +                        * event, e.g., TopdownL1 and TopdownL2 on SPR.
>> +                        * Return and print the prefix, e.g., noise, running
>> +                        * for the next metricgroup.
>> +                        */
>> +                       if (strcmp(name, mexp->default_metricgroup_name))
>> +                               return (void *)mexp;
>> +                       /* Only print the name of the metricgroup once */
>> +                       if (!header_printed) {
>> +                               header_printed = true;
>> +                               perf_stat__print_metricgroup_header(config, evsel, ctxp,
>> +                                                                   name, out);
>> +                       }
>> +               }
>> +
>> +               if ((*num)++ > 0)
>> +                       out->new_line(config, ctxp);
>> +               generic_metric(config, mexp->metric_expr, mexp->metric_threshold,
>> +                              mexp->metric_events, mexp->metric_refs, evsel->name,
>> +                              mexp->metric_name, mexp->metric_unit, mexp->runtime,
>> +                              aggr_idx, out);
>> +       }
>> +
>> +       return NULL;
>> +}
>> +
>>  void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>                                    struct evsel *evsel,
>>                                    double avg, int aggr_idx,
>> @@ -565,7 +665,6 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>         };
>>         print_metric_t print_metric = out->print_metric;
>>         void *ctxp = out->ctx;
>> -       struct metric_event *me;
>>         int num = 1;
>>
>>         if (config->iostat_run) {
>> @@ -592,18 +691,26 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>                 }
>>         }
>>
>> -       if ((me = metricgroup__lookup(metric_events, evsel, false)) != NULL) {
>> -               struct metric_expr *mexp;
>> +       perf_stat__print_shadow_stats_metricgroup(config, evsel, aggr_idx,
>> +                                                 &num, NULL, out, metric_events);
>>
>> -               list_for_each_entry (mexp, &me->head, nd) {
>> -                       if (num++ > 0)
>> -                               out->new_line(config, ctxp);
>> -                       generic_metric(config, mexp->metric_expr, mexp->metric_threshold,
>> -                                      mexp->metric_events, mexp->metric_refs, evsel->name,
>> -                                      mexp->metric_name, mexp->metric_unit, mexp->runtime,
>> -                                      aggr_idx, out);
>> -               }
>> -       }
>>         if (num == 0)
>>                 print_metric(config, ctxp, NULL, NULL, NULL, 0);
>>  }
>> +
>> +/**
>> + * perf_stat__skip_metric_event - Skip the evsel in the Default metricgroup,
>> + *                               if it's not running or not the metric event.
>> + */
>> +bool perf_stat__skip_metric_event(struct evsel *evsel,
>> +                                 struct rblist *metric_events,
>> +                                 u64 ena, u64 run)
>> +{
>> +       if (!evsel->default_metricgroup)
>> +               return false;
>> +
>> +       if (!ena || !run)
>> +               return true;
>> +
>> +       return !metricgroup__lookup(metric_events, evsel, false);
>> +}
>> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
>> index 7abff7cbb5a1..934f79778cea 100644
>> --- a/tools/perf/util/stat.h
>> +++ b/tools/perf/util/stat.h
>> @@ -158,11 +158,16 @@ typedef void (*print_metric_t)(struct perf_stat_config *config,
>>                                const char *fmt, double val);
>>  typedef void (*new_line_t)(struct perf_stat_config *config, void *ctx);
>>
>> +/* Used to print the display name of the Default metricgroup for now. */
>> +typedef void (*print_metricgroup_header_t)(struct perf_stat_config *config,
>> +                                          void *ctx, const char *metricgroup_name);
>> +
>>  void perf_stat__reset_shadow_stats(void);
>>  struct perf_stat_output_ctx {
>>         void *ctx;
>>         print_metric_t print_metric;
>>         new_line_t new_line;
>> +       print_metricgroup_header_t print_metricgroup_header;
>>         bool force_header;
>>  };
>>
>> @@ -171,6 +176,16 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>                                    double avg, int aggr_idx,
>>                                    struct perf_stat_output_ctx *out,
>>                                    struct rblist *metric_events);
>> +bool perf_stat__skip_metric_event(struct evsel *evsel,
>> +                                 struct rblist *metric_events,
>> +                                 u64 ena, u64 run);
>> +void *perf_stat__print_shadow_stats_metricgroup(struct perf_stat_config *config,
>> +                                               struct evsel *evsel,
>> +                                               int aggr_idx,
>> +                                               int *num,
>> +                                               void *from,
>> +                                               struct perf_stat_output_ctx *out,
>> +                                               struct rblist *metric_events);
>>
>>  int evlist__alloc_stats(struct perf_stat_config *config,
>>                         struct evlist *evlist, bool alloc_raw);
>> --
>> 2.35.1
>>
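
For readers following the quoted hunk: the decision made by perf_stat__skip_metric_event() can be sketched in isolation as below. This is only an illustration, not the perf code. struct demo_evsel and its has_metric flag are hypothetical stand-ins for struct evsel and for a non-NULL result from metricgroup__lookup().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of struct evsel that matter here. */
struct demo_evsel {
	bool default_metricgroup; /* event belongs to the Default metricgroup */
	bool has_metric;          /* stand-in for metricgroup__lookup() != NULL */
};

/* Mirrors the three checks of perf_stat__skip_metric_event() in the patch. */
static bool demo_skip_metric_event(const struct demo_evsel *evsel,
				   uint64_t ena, uint64_t run)
{
	if (!evsel->default_metricgroup)
		return false;      /* events outside the Default group are never skipped */

	if (!ena || !run)
		return true;       /* not enabled or not running: skip it */

	return !evsel->has_metric; /* skip unless a metric is attached to this event */
}
```

With this sketch, an event outside the Default metricgroup is never skipped, while an event inside it is kept only when it was counted and has a metric attached.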

* Re: [PATCH V4 0/5] New metricgroup output in perf stat default mode
  2023-06-16  5:59 ` [PATCH V4 0/5] New metricgroup output in perf stat default mode Ian Rogers
@ 2023-06-16 13:26   ` Liang, Kan
  2023-06-16 13:39     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 14+ messages in thread
From: Liang, Kan @ 2023-06-16 13:26 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin



On 2023-06-16 1:59 a.m., Ian Rogers wrote:
> On Thu, Jun 15, 2023 at 8:14 PM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> Changes since V3:
>> - Move the full name (PMU + metricgroup name) generation from the metric
>>   code to the output code. (Ian)
>> - Add default tags for Hisi hip08 L1 metrics (John)
>> - Some patches have been merged. Drop them from the V4.
>>
>> Changes since V2:
>> - Fixes memory leak (Ian)
>>   (Ian, I cannot reproduce the memory leak on all my machines. Please
>>    check whether the fix works on your side. Thanks.)
>> - Add Reviewed-by tags for several patches.
>>
>> Changes since V1:
>> - Remove EVSEL_EVENT_MASK and use the __evsel__match which is suggested
>>   by Ian.
>> - Support TopdownL1 on both e-core and p-core of ADL in the default
>>   mode. (Ian)
>> - Have separate patches for the modifications of metricgroup and output.
>>   (Ian)
>> - Do a 2nd sort for the Default metricgroup. Remove the logic of
>>   changing the associated metric event. (Ian)
>> - Move all the metric related code to stat-shadow (Ian)
>> - Move the common functions between stat+csv_output and stat+std_output
>>   to the lib directory (Ian)
>>
>> In the default mode, the current output of the metricgroup includes both
>> events and metrics, which is not necessary and makes the output hard to
>> read. Also, different ARCHs (even different generations of the same ARCH)
>> may have a different output format because of the different events in a
>> metric.
>>
>> The patch set proposes a new output format which only outputs the value
>> of each metric and the metricgroup name. It brings a clean and
>> consistent output format among ARCHs and generations.
>>
>> Patches 1-2 introduce the new metricgroup output.
>>
>> Patches 3-4 improve the tests to cover the default mode.
>>
>> Patch 5 updates the event list for Hisi hip08.
>>
>> Here are some examples of the new output.
>>
>> STD output:
>>
>> On SPR
>>
>> perf stat -a sleep 1
>>
>>  Performance counter stats for 'system wide':
>>
>>         226,054.13 msec cpu-clock                        #  224.588 CPUs utilized
>>                932      context-switches                 #    4.123 /sec
>>                224      cpu-migrations                   #    0.991 /sec
>>                 76      page-faults                      #    0.336 /sec
>>         45,940,682      cycles                           #    0.000 GHz
>>         36,676,047      instructions                     #    0.80  insn per cycle
>>          7,044,516      branches                         #   31.163 K/sec
>>             62,169      branch-misses                    #    0.88% of all branches
>>                         TopdownL1                 #     68.7 %  tma_backend_bound
>>                                                   #      3.1 %  tma_bad_speculation
>>                                                   #     13.0 %  tma_frontend_bound
>>                                                   #     15.2 %  tma_retiring
>>                         TopdownL2                 #      2.7 %  tma_branch_mispredicts
>>                                                   #     19.6 %  tma_core_bound
>>                                                   #      4.8 %  tma_fetch_bandwidth
>>                                                   #      8.3 %  tma_fetch_latency
>>                                                   #      2.9 %  tma_heavy_operations
>>                                                   #     12.3 %  tma_light_operations
>>                                                   #      0.4 %  tma_machine_clears
>>                                                   #     49.1 %  tma_memory_bound
>>
>>        1.006529767 seconds time elapsed
>>
>> perf stat -a sleep 1
>>
>>  Performance counter stats for 'system wide':
>>
>>          32,127.99 msec cpu-clock                        #   31.992 CPUs utilized
>>                240      context-switches                 #    7.470 /sec
>>                 32      cpu-migrations                   #    0.996 /sec
>>                 74      page-faults                      #    2.303 /sec
>>          6,313,960      cpu_core/cycles/                 #    0.000 GHz
>>        257,711,907      cpu_atom/cycles/                 #    0.008 GHz                         (54.18%)
>>          4,477,162      cpu_core/instructions/           #    0.71  insn per cycle
>>         37,721,481      cpu_atom/instructions/           #    5.97  insn per cycle              (63.33%)
>>            809,747      cpu_core/branches/               #   25.204 K/sec
>>          6,621,226      cpu_atom/branches/               #  206.089 K/sec                       (63.32%)
>>             39,667      cpu_core/branch-misses/          #    4.90% of all branches
>>          1,032,146      cpu_atom/branch-misses/          #  127.47% of all branches             (63.33%)
>>              TopdownL1 (cpu_core)                 #      nan %  tma_backend_bound
>>                                                   #      0.0 %  tma_bad_speculation
>>                                                   #      nan %  tma_frontend_bound
>>                                                   #      nan %  tma_retiring
>>              TopdownL1 (cpu_atom)                 #     13.6 %  tma_bad_speculation      (63.36%)
>>                                                   #     41.1 %  tma_frontend_bound       (63.54%)
>>                                                   #     39.2 %  tma_backend_bound
>>                                                   #     39.2 %  tma_backend_bound_aux    (63.93%)
>>                                                   #      5.4 %  tma_retiring             (64.15%)
>>
>>        1.004244114 seconds time elapsed
>>
>> JSON output
>>
>> on SPR
>>
>> perf stat --json -a sleep 1
>> {"counter-value" : "225904.823297", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 225904323425, "pcnt-running" : 100.00, "metric-value" : "224.456872", "metric-unit" : "CPUs utilized"}
>> {"counter-value" : "986.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 225904108985, "pcnt-running" : 100.00, "metric-value" : "4.364670", "metric-unit" : "/sec"}
>> {"counter-value" : "224.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 225904016141, "pcnt-running" : 100.00, "metric-value" : "0.991568", "metric-unit" : "/sec"}
>> {"counter-value" : "76.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 225903913270, "pcnt-running" : 100.00, "metric-value" : "0.336425", "metric-unit" : "/sec"}
>> {"counter-value" : "48433482.000000", "unit" : "", "event" : "cycles", "event-runtime" : 225903792732, "pcnt-running" : 100.00, "metric-value" : "0.000214", "metric-unit" : "GHz"}
>> {"counter-value" : "38620409.000000", "unit" : "", "event" : "instructions", "event-runtime" : 225903657830, "pcnt-running" : 100.00, "metric-value" : "0.797391", "metric-unit" : "insn per cycle"}
>> {"counter-value" : "7369473.000000", "unit" : "", "event" : "branches", "event-runtime" : 225903464328, "pcnt-running" : 100.00, "metric-value" : "32.622026", "metric-unit" : "K/sec"}
>> {"counter-value" : "54747.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 225903234523, "pcnt-running" : 100.00, "metric-value" : "0.742889", "metric-unit" : "of all branches"}
>> {"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1"}
>> {"metric-value" : "69.950631", "metric-unit" : "%  tma_backend_bound"}
>> {"metric-value" : "2.771783", "metric-unit" : "%  tma_bad_speculation"}
>> {"metric-value" : "12.026074", "metric-unit" : "%  tma_frontend_bound"}
>> {"metric-value" : "15.251513", "metric-unit" : "%  tma_retiring"}
>> {"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL2"}
>> {"metric-value" : "2.351757", "metric-unit" : "%  tma_branch_mispredicts"}
>> {"metric-value" : "19.729771", "metric-unit" : "%  tma_core_bound"}
>> {"metric-value" : "4.555207", "metric-unit" : "%  tma_fetch_bandwidth"}
>> {"metric-value" : "7.470867", "metric-unit" : "%  tma_fetch_latency"}
>> {"metric-value" : "2.938808", "metric-unit" : "%  tma_heavy_operations"}
>> {"metric-value" : "12.312705", "metric-unit" : "%  tma_light_operations"}
>> {"metric-value" : "0.420026", "metric-unit" : "%  tma_machine_clears"}
>> {"metric-value" : "50.220860", "metric-unit" : "%  tma_memory_bound"}
>>
>> On hybrid
>>
>> perf stat --json -a sleep 1
>> {"counter-value" : "32131.530625", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 32131536951, "pcnt-running" : 100.00, "metric-value" : "31.992642", "metric-unit" : "CPUs utilized"}
>> {"counter-value" : "328.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 32131525778, "pcnt-running" : 100.00, "metric-value" : "10.208042", "metric-unit" : "/sec"}
>> {"counter-value" : "32.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 32131515104, "pcnt-running" : 100.00, "metric-value" : "0.995906", "metric-unit" : "/sec"}
>> {"counter-value" : "353.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 32131501396, "pcnt-running" : 100.00, "metric-value" : "10.986094", "metric-unit" : "/sec"}
>> {"counter-value" : "18685492.000000", "unit" : "", "event" : "cpu_core/cycles/", "event-runtime" : 16061585292, "pcnt-running" : 100.00, "metric-value" : "0.000582", "metric-unit" : "GHz"}
>> {"counter-value" : "255620352.000000", "unit" : "", "event" : "cpu_atom/cycles/", "event-runtime" : 8690268422, "pcnt-running" : 54.00, "metric-value" : "0.007955", "metric-unit" : "GHz"}
>> {"counter-value" : "15489913.000000", "unit" : "", "event" : "cpu_core/instructions/", "event-runtime" : 16061582200, "pcnt-running" : 100.00, "metric-value" : "0.828981", "metric-unit" : "insn per cycle"}
>> {"counter-value" : "38790161.000000", "unit" : "", "event" : "cpu_atom/instructions/", "event-runtime" : 10163133324, "pcnt-running" : 63.00, "metric-value" : "2.075951", "metric-unit" : "insn per cycle"}
>> {"counter-value" : "2908031.000000", "unit" : "", "event" : "cpu_core/branches/", "event-runtime" : 16061563416, "pcnt-running" : 100.00, "metric-value" : "90.503967", "metric-unit" : "K/sec"}
>> {"counter-value" : "6814948.000000", "unit" : "", "event" : "cpu_atom/branches/", "event-runtime" : 10161711336, "pcnt-running" : 63.00, "metric-value" : "212.095343", "metric-unit" : "K/sec"}
>> {"counter-value" : "97638.000000", "unit" : "", "event" : "cpu_core/branch-misses/", "event-runtime" : 16061535261, "pcnt-running" : 100.00, "metric-value" : "3.357530", "metric-unit" : "of all branches"}
>> {"counter-value" : "1017066.000000", "unit" : "", "event" : "cpu_atom/branch-misses/", "event-runtime" : 10159971797, "pcnt-running" : 63.00, "metric-value" : "34.974386", "metric-unit" : "of all branches"}
>> {"event-runtime" : 16061513607, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1 (cpu_core)"}
>> {"metric-value" : "nan", "metric-unit" : "%  tma_backend_bound"}
>> {"metric-value" : "0.000000", "metric-unit" : "%  tma_bad_speculation"}
>> {"metric-value" : "nan", "metric-unit" : "%  tma_frontend_bound"}
>> {"metric-value" : "nan", "metric-unit" : "%  tma_retiring"}
>> {"event-runtime" : 10157398501, "pcnt-running" : 63.00, "metricgroup" : "TopdownL1 (cpu_atom)"}
>> {"metric-value" : "13.719821", "metric-unit" : "%  tma_bad_speculation"}
>> {"event-runtime" : 10178698656, "pcnt-running" : 63.00, "metric-value" : "41.016738", "metric-unit" : "%  tma_frontend_bound"}
>> {"event-runtime" : 10240582902, "pcnt-running" : 63.00, "metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound"}
>> {"metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound_aux"}
>> {"event-runtime" : 10284284920, "pcnt-running" : 64.00, "metric-value" : "5.374638", "metric-unit" : "%  tma_retiring"}
>>
>> CSV output
>>
>> On SPR
>>
>> perf stat -x, -a sleep 1
>> 225851.20,msec,cpu-clock,225850700108,100.00,224.431,CPUs utilized
>> 976,,context-switches,225850504803,100.00,4.321,/sec
>> 224,,cpu-migrations,225850410336,100.00,0.992,/sec
>> 76,,page-faults,225850304155,100.00,0.337,/sec
>> 52288305,,cycles,225850188531,100.00,0.000,GHz
>> 37977214,,instructions,225850071251,100.00,0.73,insn per cycle
>> 7299859,,branches,225849890722,100.00,32.322,K/sec
>> 51102,,branch-misses,225849672536,100.00,0.70,of all branches
>> ,225849327050,100.00,,,,TopdownL1
>> ,,,,,70.1,%  tma_backend_bound
>> ,,,,,2.7,%  tma_bad_speculation
>> ,,,,,12.5,%  tma_frontend_bound
>> ,,,,,14.6,%  tma_retiring
>> ,225849327050,100.00,,,,TopdownL2
>> ,,,,,2.3,%  tma_branch_mispredicts
>> ,,,,,19.6,%  tma_core_bound
>> ,,,,,4.6,%  tma_fetch_bandwidth
>> ,,,,,7.9,%  tma_fetch_latency
>> ,,,,,2.9,%  tma_heavy_operations
>> ,,,,,11.7,%  tma_light_operations
>> ,,,,,0.5,%  tma_machine_clears
>> ,,,,,50.5,%  tma_memory_bound
>>
>> On Hybrid
>>
>> perf stat -x, -a sleep 1
>> 32139.34,msec,cpu-clock,32139351409,100.00,32.001,CPUs utilized
>> 225,,context-switches,32139342672,100.00,7.001,/sec
>> 32,,cpu-migrations,32139337772,100.00,0.996,/sec
>> 72,,page-faults,32139328384,100.00,2.240,/sec
>> 6766433,,cpu_core/cycles/,16067551558,100.00,0.000,GHz
>> 256500230,,cpu_atom/cycles/,8695757391,54.00,0.008,GHz
>> 4688595,,cpu_core/instructions/,16067558976,100.00,0.69,insn per cycle
>> 37487490,,cpu_atom/instructions/,10165193856,63.00,5.54,insn per cycle
>> 845211,,cpu_core/branches/,16067540225,100.00,26.298,K/sec
>> 6571193,,cpu_atom/branches/,10155940853,63.00,204.459,K/sec
>> 41359,,cpu_core/branch-misses/,16067516493,100.00,4.89,of all branches
>> 1020231,,cpu_atom/branch-misses/,10159363620,63.00,120.71,of all branches
>> ,16067494476,100.00,,,,TopdownL1 (cpu_core)
>> ,,,,,,%  tma_backend_bound
>> ,,,,,0.0,%  tma_bad_speculation
>> ,,,,,,%  tma_frontend_bound
>> ,,,,,,%  tma_retiring
>> ,10160989992,63.00,,,,TopdownL1 (cpu_atom)
>> ,,,,,13.8,%  tma_bad_speculation
>> ,10188319019,63.00,,,41.3,%  tma_frontend_bound
>> ,10258326591,63.00,,,38.6,%  tma_backend_bound
>> ,,,,,38.6,%  tma_backend_bound_aux
>> ,10282689488,64.00,,,5.4,%  tma_retiring
>>
>> Kan Liang (5):
>>   perf metrics: Sort the Default metricgroup
>>   perf stat: New metricgroup output for the default mode
>>   perf test: Move all the check functions of stat csv output to lib
>>   perf test: Add test case for the standard perf stat output
>>   perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics
> 
> Just to be clear, I'm happy with this to be submitted having put
> reviewed/acked-by on it.
> 

Thanks Ian. Appreciate all your feedback and comments.

Thanks,
Kan

> Thanks,
> Ian
> 
>>  tools/perf/builtin-stat.c                     |   1 +
>>  .../arch/arm64/hisilicon/hip08/metrics.json   |  12 +-
>>  tools/perf/tests/shell/lib/stat_output.sh     | 169 ++++++++++++++++
>>  tools/perf/tests/shell/stat+csv_output.sh     | 188 ++----------------
>>  tools/perf/tests/shell/stat+std_output.sh     | 108 ++++++++++
>>  tools/perf/util/evsel.h                       |   1 +
>>  tools/perf/util/metricgroup.c                 |  26 +++
>>  tools/perf/util/metricgroup.h                 |   3 +
>>  tools/perf/util/stat-display.c                | 108 +++++++++-
>>  tools/perf/util/stat-shadow.c                 | 131 ++++++++++--
>>  tools/perf/util/stat.h                        |  15 ++
>>  11 files changed, 563 insertions(+), 199 deletions(-)
>>  create mode 100755 tools/perf/tests/shell/lib/stat_output.sh
>>  create mode 100755 tools/perf/tests/shell/stat+std_output.sh
>>
>> --
>> 2.35.1
>>
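
A note for anyone consuming the CSV samples quoted above: the metricgroup header rows and the metric rows leave the leading columns empty, so a splitter that collapses consecutive separators (strtok(), for example) would shift the columns. Below is a minimal sketch of a splitter that keeps empty fields. The demo_* names and DEMO_MAX_FIELDS are hypothetical, and the "empty first column" test is only a heuristic read off the samples in this thread, not something perf documents.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_FIELDS 8

/*
 * Split a CSV line in place on ',' and keep empty fields.
 * Returns the number of fields stored in fields[]; anything beyond
 * max fields is dropped.
 */
static size_t demo_split_csv(char *line, char *fields[], size_t max)
{
	size_t n = 0;
	char *p = line;

	while (n < max) {
		fields[n++] = p;
		p = strchr(p, ',');
		if (!p)
			break;
		*p++ = '\0'; /* terminate this field, continue after the comma */
	}
	return n;
}

/* Heuristic from the samples: metricgroup rows have an empty first column. */
static int demo_is_metricgroup_row(char *const fields[], size_t n)
{
	return n > 0 && fields[0][0] == '\0';
}
```

Applied to the sample line ",225849327050,100.00,,,,TopdownL1" this yields seven fields with the group name in the last one, while an ordinary event row such as the context-switches line keeps its counter value in the first field.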

* Re: [PATCH V4 0/5] New metricgroup output in perf stat default mode
  2023-06-16 13:26   ` Liang, Kan
@ 2023-06-16 13:39     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-16 13:39 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Ian Rogers, mingo, peterz, namhyung, jolsa, adrian.hunter,
	linux-perf-users, linux-kernel, ak, eranian, ahmad.yasin

On Fri, Jun 16, 2023 at 09:26:26AM -0400, Liang, Kan wrote:
> 
> 
> On 2023-06-16 1:59 a.m., Ian Rogers wrote:
> > On Thu, Jun 15, 2023 at 8:14 PM <kan.liang@linux.intel.com> wrote:
> >>
> >> From: Kan Liang <kan.liang@linux.intel.com>
> >>
> >> Changes since V3:
> >> - Move the full name (PMU + metricgroup name) generation from the metric
> >>   code to the output code. (Ian)
> >> - Add default tags for Hisi hip08 L1 metrics (John)
> >> - Some patches have been merged. Drop them from the V4.
> >>
> >> Changes since V2:
> >> - Fixes memory leak (Ian)
> >>   (Ian, I cannot reproduce the memory leak on all my machines. Please
> >>    check whether the fix works on your side. Thanks.)
> >> - Add Reviewed-by tags for several patches.
> >>
> >> Changes since V1:
> >> - Remove EVSEL_EVENT_MASK and use the __evsel__match which is suggested
> >>   by Ian.
> >> - Support TopdownL1 on both e-core and p-core of ADL in the default
> >>   mode. (Ian)
> >> - Have separate patches for the modifications of metricgroup and output.
> >>   (Ian)
> >> - Do a 2nd sort for the Default metricgroup. Remove the logic of
> >>   changing the associated metric event. (Ian)
> >> - Move all the metric related code to stat-shadow (Ian)
> >> - Move the common functions between stat+csv_output and stat+std_output
> >>   to the lib directory (Ian)
> >>
> >> In the default mode, the current output of the metricgroup includes both
> >> events and metrics, which is not necessary and makes the output hard to
> >> read. Also, different ARCHs (even different generations of the same ARCH)
> >> may have a different output format because of the different events in a
> >> metric.
> >>
> >> The patch set proposes a new output format which only outputs the value
> >> of each metric and the metricgroup name. It brings a clean and
> >> consistent output format among ARCHs and generations.
> >>
> >> Patches 1-2 introduce the new metricgroup output.
> >>
> >> Patches 3-4 improve the tests to cover the default mode.
> >>
> >> Patch 5 updates the event list for Hisi hip08.
> >>
> >> Here are some examples of the new output.
> >>
> >> STD output:
> >>
> >> On SPR
> >>
> >> perf stat -a sleep 1
> >>
> >>  Performance counter stats for 'system wide':
> >>
> >>         226,054.13 msec cpu-clock                        #  224.588 CPUs utilized
> >>                932      context-switches                 #    4.123 /sec
> >>                224      cpu-migrations                   #    0.991 /sec
> >>                 76      page-faults                      #    0.336 /sec
> >>         45,940,682      cycles                           #    0.000 GHz
> >>         36,676,047      instructions                     #    0.80  insn per cycle
> >>          7,044,516      branches                         #   31.163 K/sec
> >>             62,169      branch-misses                    #    0.88% of all branches
> >>                         TopdownL1                 #     68.7 %  tma_backend_bound
> >>                                                   #      3.1 %  tma_bad_speculation
> >>                                                   #     13.0 %  tma_frontend_bound
> >>                                                   #     15.2 %  tma_retiring
> >>                         TopdownL2                 #      2.7 %  tma_branch_mispredicts
> >>                                                   #     19.6 %  tma_core_bound
> >>                                                   #      4.8 %  tma_fetch_bandwidth
> >>                                                   #      8.3 %  tma_fetch_latency
> >>                                                   #      2.9 %  tma_heavy_operations
> >>                                                   #     12.3 %  tma_light_operations
> >>                                                   #      0.4 %  tma_machine_clears
> >>                                                   #     49.1 %  tma_memory_bound
> >>
> >>        1.006529767 seconds time elapsed
> >>
> >> perf stat -a sleep 1
> >>
> >>  Performance counter stats for 'system wide':
> >>
> >>          32,127.99 msec cpu-clock                        #   31.992 CPUs utilized
> >>                240      context-switches                 #    7.470 /sec
> >>                 32      cpu-migrations                   #    0.996 /sec
> >>                 74      page-faults                      #    2.303 /sec
> >>          6,313,960      cpu_core/cycles/                 #    0.000 GHz
> >>        257,711,907      cpu_atom/cycles/                 #    0.008 GHz                         (54.18%)
> >>          4,477,162      cpu_core/instructions/           #    0.71  insn per cycle
> >>         37,721,481      cpu_atom/instructions/           #    5.97  insn per cycle              (63.33%)
> >>            809,747      cpu_core/branches/               #   25.204 K/sec
> >>          6,621,226      cpu_atom/branches/               #  206.089 K/sec                       (63.32%)
> >>             39,667      cpu_core/branch-misses/          #    4.90% of all branches
> >>          1,032,146      cpu_atom/branch-misses/          #  127.47% of all branches             (63.33%)
> >>              TopdownL1 (cpu_core)                 #      nan %  tma_backend_bound
> >>                                                   #      0.0 %  tma_bad_speculation
> >>                                                   #      nan %  tma_frontend_bound
> >>                                                   #      nan %  tma_retiring
> >>              TopdownL1 (cpu_atom)                 #     13.6 %  tma_bad_speculation      (63.36%)
> >>                                                   #     41.1 %  tma_frontend_bound       (63.54%)
> >>                                                   #     39.2 %  tma_backend_bound
> >>                                                   #     39.2 %  tma_backend_bound_aux    (63.93%)
> >>                                                   #      5.4 %  tma_retiring             (64.15%)
> >>
> >>        1.004244114 seconds time elapsed
> >>
> >> JSON output
> >>
> >> on SPR
> >>
> >> perf stat --json -a sleep 1
> >> {"counter-value" : "225904.823297", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 225904323425, "pcnt-running" : 100.00, "metric-value" : "224.456872", "metric-unit" : "CPUs utilized"}
> >> {"counter-value" : "986.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 225904108985, "pcnt-running" : 100.00, "metric-value" : "4.364670", "metric-unit" : "/sec"}
> >> {"counter-value" : "224.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 225904016141, "pcnt-running" : 100.00, "metric-value" : "0.991568", "metric-unit" : "/sec"}
> >> {"counter-value" : "76.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 225903913270, "pcnt-running" : 100.00, "metric-value" : "0.336425", "metric-unit" : "/sec"}
> >> {"counter-value" : "48433482.000000", "unit" : "", "event" : "cycles", "event-runtime" : 225903792732, "pcnt-running" : 100.00, "metric-value" : "0.000214", "metric-unit" : "GHz"}
> >> {"counter-value" : "38620409.000000", "unit" : "", "event" : "instructions", "event-runtime" : 225903657830, "pcnt-running" : 100.00, "metric-value" : "0.797391", "metric-unit" : "insn per cycle"}
> >> {"counter-value" : "7369473.000000", "unit" : "", "event" : "branches", "event-runtime" : 225903464328, "pcnt-running" : 100.00, "metric-value" : "32.622026", "metric-unit" : "K/sec"}
> >> {"counter-value" : "54747.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 225903234523, "pcnt-running" : 100.00, "metric-value" : "0.742889", "metric-unit" : "of all branches"}
> >> {"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1"}
> >> {"metric-value" : "69.950631", "metric-unit" : "%  tma_backend_bound"}
> >> {"metric-value" : "2.771783", "metric-unit" : "%  tma_bad_speculation"}
> >> {"metric-value" : "12.026074", "metric-unit" : "%  tma_frontend_bound"}
> >> {"metric-value" : "15.251513", "metric-unit" : "%  tma_retiring"}
> >> {"event-runtime" : 225902840555, "pcnt-running" : 100.00, "metricgroup" : "TopdownL2"}
> >> {"metric-value" : "2.351757", "metric-unit" : "%  tma_branch_mispredicts"}
> >> {"metric-value" : "19.729771", "metric-unit" : "%  tma_core_bound"}
> >> {"metric-value" : "4.555207", "metric-unit" : "%  tma_fetch_bandwidth"}
> >> {"metric-value" : "7.470867", "metric-unit" : "%  tma_fetch_latency"}
> >> {"metric-value" : "2.938808", "metric-unit" : "%  tma_heavy_operations"}
> >> {"metric-value" : "12.312705", "metric-unit" : "%  tma_light_operations"}
> >> {"metric-value" : "0.420026", "metric-unit" : "%  tma_machine_clears"}
> >> {"metric-value" : "50.220860", "metric-unit" : "%  tma_memory_bound"}
> >>
> >> On hybrid
> >>
> >> perf stat --json -a sleep 1
> >> {"counter-value" : "32131.530625", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 32131536951, "pcnt-running" : 100.00, "metric-value" : "31.992642", "metric-unit" : "CPUs utilized"}
> >> {"counter-value" : "328.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 32131525778, "pcnt-running" : 100.00, "metric-value" : "10.208042", "metric-unit" : "/sec"}
> >> {"counter-value" : "32.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 32131515104, "pcnt-running" : 100.00, "metric-value" : "0.995906", "metric-unit" : "/sec"}
> >> {"counter-value" : "353.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 32131501396, "pcnt-running" : 100.00, "metric-value" : "10.986094", "metric-unit" : "/sec"}
> >> {"counter-value" : "18685492.000000", "unit" : "", "event" : "cpu_core/cycles/", "event-runtime" : 16061585292, "pcnt-running" : 100.00, "metric-value" : "0.000582", "metric-unit" : "GHz"}
> >> {"counter-value" : "255620352.000000", "unit" : "", "event" : "cpu_atom/cycles/", "event-runtime" : 8690268422, "pcnt-running" : 54.00, "metric-value" : "0.007955", "metric-unit" : "GHz"}
> >> {"counter-value" : "15489913.000000", "unit" : "", "event" : "cpu_core/instructions/", "event-runtime" : 16061582200, "pcnt-running" : 100.00, "metric-value" : "0.828981", "metric-unit" : "insn per cycle"}
> >> {"counter-value" : "38790161.000000", "unit" : "", "event" : "cpu_atom/instructions/", "event-runtime" : 10163133324, "pcnt-running" : 63.00, "metric-value" : "2.075951", "metric-unit" : "insn per cycle"}
> >> {"counter-value" : "2908031.000000", "unit" : "", "event" : "cpu_core/branches/", "event-runtime" : 16061563416, "pcnt-running" : 100.00, "metric-value" : "90.503967", "metric-unit" : "K/sec"}
> >> {"counter-value" : "6814948.000000", "unit" : "", "event" : "cpu_atom/branches/", "event-runtime" : 10161711336, "pcnt-running" : 63.00, "metric-value" : "212.095343", "metric-unit" : "K/sec"}
> >> {"counter-value" : "97638.000000", "unit" : "", "event" : "cpu_core/branch-misses/", "event-runtime" : 16061535261, "pcnt-running" : 100.00, "metric-value" : "3.357530", "metric-unit" : "of all branches"}
> >> {"counter-value" : "1017066.000000", "unit" : "", "event" : "cpu_atom/branch-misses/", "event-runtime" : 10159971797, "pcnt-running" : 63.00, "metric-value" : "34.974386", "metric-unit" : "of all branches"}
> >> {"event-runtime" : 16061513607, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1 (cpu_core)"}
> >> {"metric-value" : "nan", "metric-unit" : "%  tma_backend_bound"}
> >> {"metric-value" : "0.000000", "metric-unit" : "%  tma_bad_speculation"}
> >> {"metric-value" : "nan", "metric-unit" : "%  tma_frontend_bound"}
> >> {"metric-value" : "nan", "metric-unit" : "%  tma_retiring"}
> >> {"event-runtime" : 10157398501, "pcnt-running" : 63.00, "metricgroup" : "TopdownL1 (cpu_atom)"}
> >> {"metric-value" : "13.719821", "metric-unit" : "%  tma_bad_speculation"}
> >> {"event-runtime" : 10178698656, "pcnt-running" : 63.00, "metric-value" : "41.016738", "metric-unit" : "%  tma_frontend_bound"}
> >> {"event-runtime" : 10240582902, "pcnt-running" : 63.00, "metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound"}
> >> {"metric-value" : "39.327764", "metric-unit" : "%  tma_backend_bound_aux"}
> >> {"event-runtime" : 10284284920, "pcnt-running" : 64.00, "metric-value" : "5.374638", "metric-unit" : "%  tma_retiring"}
> >>
> >> CSV output
> >>
> >> On SPR
> >>
> >> perf stat -x, -a sleep 1
> >> 225851.20,msec,cpu-clock,225850700108,100.00,224.431,CPUs utilized
> >> 976,,context-switches,225850504803,100.00,4.321,/sec
> >> 224,,cpu-migrations,225850410336,100.00,0.992,/sec
> >> 76,,page-faults,225850304155,100.00,0.337,/sec
> >> 52288305,,cycles,225850188531,100.00,0.000,GHz
> >> 37977214,,instructions,225850071251,100.00,0.73,insn per cycle
> >> 7299859,,branches,225849890722,100.00,32.322,K/sec
> >> 51102,,branch-misses,225849672536,100.00,0.70,of all branches
> >> ,225849327050,100.00,,,,TopdownL1
> >> ,,,,,70.1,%  tma_backend_bound
> >> ,,,,,2.7,%  tma_bad_speculation
> >> ,,,,,12.5,%  tma_frontend_bound
> >> ,,,,,14.6,%  tma_retiring
> >> ,225849327050,100.00,,,,TopdownL2
> >> ,,,,,2.3,%  tma_branch_mispredicts
> >> ,,,,,19.6,%  tma_core_bound
> >> ,,,,,4.6,%  tma_fetch_bandwidth
> >> ,,,,,7.9,%  tma_fetch_latency
> >> ,,,,,2.9,%  tma_heavy_operations
> >> ,,,,,11.7,%  tma_light_operations
> >> ,,,,,0.5,%  tma_machine_clears
> >> ,,,,,50.5,%  tma_memory_bound
> >>
> >> On Hybrid
> >>
> >> perf stat -x, -a sleep 1
> >> 32139.34,msec,cpu-clock,32139351409,100.00,32.001,CPUs utilized
> >> 225,,context-switches,32139342672,100.00,7.001,/sec
> >> 32,,cpu-migrations,32139337772,100.00,0.996,/sec
> >> 72,,page-faults,32139328384,100.00,2.240,/sec
> >> 6766433,,cpu_core/cycles/,16067551558,100.00,0.000,GHz
> >> 256500230,,cpu_atom/cycles/,8695757391,54.00,0.008,GHz
> >> 4688595,,cpu_core/instructions/,16067558976,100.00,0.69,insn per cycle
> >> 37487490,,cpu_atom/instructions/,10165193856,63.00,5.54,insn per cycle
> >> 845211,,cpu_core/branches/,16067540225,100.00,26.298,K/sec
> >> 6571193,,cpu_atom/branches/,10155940853,63.00,204.459,K/sec
> >> 41359,,cpu_core/branch-misses/,16067516493,100.00,4.89,of all branches
> >> 1020231,,cpu_atom/branch-misses/,10159363620,63.00,120.71,of all branches
> >> ,16067494476,100.00,,,,TopdownL1 (cpu_core)
> >> ,,,,,,%  tma_backend_bound
> >> ,,,,,0.0,%  tma_bad_speculation
> >> ,,,,,,%  tma_frontend_bound
> >> ,,,,,,%  tma_retiring
> >> ,10160989992,63.00,,,,TopdownL1 (cpu_atom)
> >> ,,,,,13.8,%  tma_bad_speculation
> >> ,10188319019,63.00,,,41.3,%  tma_frontend_bound
> >> ,10258326591,63.00,,,38.6,%  tma_backend_bound
> >> ,,,,,38.6,%  tma_backend_bound_aux
> >> ,10282689488,64.00,,,5.4,%  tma_retiring
> >>
> >> Kan Liang (5):
> >>   perf metrics: Sort the Default metricgroup
> >>   perf stat: New metricgroup output for the default mode
> >>   perf test: Move all the check functions of stat csv output to lib
> >>   perf test: Add test case for the standard perf stat output
> >>   perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics
> > 
> > Just to be clear, I'm happy with this to be submitted having put
> > reviewed/acked-by on it.
> > 
> 
> Thanks Ian. Appreciate all your feedback and comments.

Applied,

- Arnaldo
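
For anyone consuming the new CSV layout quoted above from a script, here is a minimal Python sketch that regroups the metric rows under their metricgroup header. It is not part of the patch set. The function name parse_metricgroups and the row-classification heuristic are assumptions inferred only from the sample output in this thread: a row with an empty first field is treated as a metricgroup header unless its last field starts with "%", in which case it is a metric row. An empty metric value, as seen for cpu_core on the hybrid machine, is mapped to None.

```python
import csv
import io

# Sample rows copied from the hybrid "perf stat -x, -a sleep 1" output above.
SAMPLE = """\
6766433,,cpu_core/cycles/,16067551558,100.00,0.000,GHz
,16067494476,100.00,,,,TopdownL1 (cpu_core)
,,,,,,%  tma_backend_bound
,,,,,0.0,%  tma_bad_speculation
,10160989992,63.00,,,,TopdownL1 (cpu_atom)
,,,,,13.8,%  tma_bad_speculation
,10188319019,63.00,,,41.3,%  tma_frontend_bound
"""


def parse_metricgroups(text):
    """Return {metricgroup name: {metric name: value or None}}.

    Hypothetical helper: the field positions are inferred from the sample
    output in this thread, not from a documented perf CSV schema.
    """
    groups = {}
    current = None
    for row in csv.reader(io.StringIO(text)):
        # Plain event rows carry a count in the first field; skip them,
        # along with blank or short lines.
        if len(row) < 7 or row[0] != "":
            continue
        value, label = row[5], row[6]
        if label.startswith("%"):
            if current is None:
                continue  # metric row seen before any group header
            name = label.lstrip("%").strip()
            groups[current][name] = float(value) if value else None
        else:
            # Group header, e.g. "TopdownL1 (cpu_atom)".
            current = label
            groups[current] = {}
    return groups


if __name__ == "__main__":
    for group, metrics in parse_metricgroups(SAMPLE).items():
        print(group, metrics)
```

Keying on the header row rather than on the event name reflects what the new format is meant to guarantee: the set of underlying events may differ between architectures, but the metricgroup name and metric names stay stable.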


* Re: [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics
  2023-06-16  3:14 ` [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics kan.liang
  2023-06-16  5:57   ` Ian Rogers
@ 2023-06-16 13:48   ` John Garry
  1 sibling, 0 replies; 14+ messages in thread
From: John Garry @ 2023-06-16 13:48 UTC (permalink / raw)
  To: kan.liang, acme, mingo, peterz, irogers, namhyung, jolsa,
	adrian.hunter, linux-perf-users, linux-kernel
  Cc: ak, eranian, ahmad.yasin

On 16/06/2023 04:14, kan.liang@linux.intel.com wrote:
> From: Kan Liang<kan.liang@linux.intel.com>
> 
> Add the default tags for Hisi hip08 as well.
> 
> Signed-off-by: Kan Liang<kan.liang@linux.intel.com>
> Cc: John Garry<john.g.garry@oracle.com>

Reviewed-by: John Garry <john.g.garry@oracle.com>


Thread overview: 14+ messages
2023-06-16  3:14 [PATCH V4 0/5] New metricgroup output in perf stat default mode kan.liang
2023-06-16  3:14 ` [PATCH V4 1/5] perf metrics: Sort the Default metricgroup kan.liang
2023-06-16  5:48   ` Ian Rogers
2023-06-16  3:14 ` [PATCH V4 2/5] perf stat: New metricgroup output for the default mode kan.liang
2023-06-16  5:56   ` Ian Rogers
2023-06-16 13:23     ` Liang, Kan
2023-06-16  3:14 ` [PATCH V4 3/5] perf test: Move all the check functions of stat csv output to lib kan.liang
2023-06-16  3:14 ` [PATCH V4 4/5] perf test: Add test case for the standard perf stat output kan.liang
2023-06-16  3:14 ` [PATCH V4 5/5] perf vendor events arm64: Add default tags for Hisi hip08 L1 metrics kan.liang
2023-06-16  5:57   ` Ian Rogers
2023-06-16 13:48   ` John Garry
2023-06-16  5:59 ` [PATCH V4 0/5] New metricgroup output in perf stat default mode Ian Rogers
2023-06-16 13:26   ` Liang, Kan
2023-06-16 13:39     ` Arnaldo Carvalho de Melo
