From: Ian Rogers <irogers@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Maxime Coquelin <mcoquelin.stm32@gmail.com>,
Alexandre Torgue <alexandre.torgue@foss.st.com>,
Zhengjun Xing <zhengjun.xing@linux.intel.com>,
Sandipan Das <sandipan.das@amd.com>,
James Clark <james.clark@arm.com>,
Kajol Jain <kjain@linux.ibm.com>,
John Garry <john.g.garry@oracle.com>,
Kan Liang <kan.liang@linux.intel.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Suzuki Poulouse <suzuki.poulose@arm.com>,
Leo Yan <leo.yan@linaro.org>,
Florian Fischer <florian.fischer@muhq.space>,
Ravi Bangoria <ravi.bangoria@amd.com>,
Jing Zhang <renyu.zj@linux.alibaba.com>,
Sean Christopherson <seanjc@google.com>,
Athira Rajeev <atrajeev@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
linux-stm32@st-md-mailman.stormreply.com,
linux-arm-kernel@lists.infradead.org,
Perry Taylor <perry.taylor@intel.com>,
Caleb Biggers <caleb.biggers@intel.com>
Cc: Stephane Eranian <eranian@google.com>, Ian Rogers <irogers@google.com>
Subject: [PATCH v1 42/51] perf doc: Refresh topdown documentation
Date: Sun, 19 Feb 2023 01:28:39 -0800 [thread overview]
Message-ID: <20230219092848.639226-43-irogers@google.com> (raw)
In-Reply-To: <20230219092848.639226-1-irogers@google.com>
perf stat now supports --topdown for any platform with the TopdownL1
metric group including Intel before Icelake. Tweak the documentation
to reflect this.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/Documentation/perf-stat.txt | 27 +++++-----
tools/perf/Documentation/topdown.txt | 70 +++++++++++---------------
2 files changed, 44 insertions(+), 53 deletions(-)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 18abdc1dce05..29bdcfa93f04 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -394,10 +394,10 @@ See perf list output for the possible metrics and metricgroups.
Do not aggregate counts across all monitored CPUs.
--topdown::
-Print complete top-down metrics supported by the CPU. This allows to
-determine bottle necks in the CPU pipeline for CPU bound workloads,
-by breaking the cycles consumed down into frontend bound, backend bound,
-bad speculation and retiring.
+Print top-down metrics supported by the CPU. This allows to determine
+bottle necks in the CPU pipeline for CPU bound workloads, by breaking
+the cycles consumed down into frontend bound, backend bound, bad
+speculation and retiring.
Frontend bound means that the CPU cannot fetch and decode instructions fast
enough. Backend bound means that computation or memory access is the bottle
@@ -430,15 +430,18 @@ CPUs the workload runs on. If needed the CPUs can be forced using
taskset.
--td-level::
-Print the top-down statistics that equal to or lower than the input level.
-It allows users to print the interested top-down metrics level instead of
-the complete top-down metrics.
+Print the top-down statistics that equal the input level. It allows
+users to print the interested top-down metrics level instead of the
+level 1 top-down metrics.
+
+As the higher levels gather more metrics and use more counters they
+will be less accurate. By convention a metric can be examined by
+appending '_group' to it and this will increase accuracy compared to
+gathering all metrics for a level. For example, level 1 analysis may
+highlight 'tma_frontend_bound'. This metric may be drilled into with
+'tma_frontend_bound_group' with
+'perf stat -M tma_frontend_bound_group...'.
-The availability of the top-down metrics level depends on the hardware. For
-example, Ice Lake only supports L1 top-down metrics. The Sapphire Rapids
-supports both L1 and L2 top-down metrics.
-
-Default: 0 means the max level that the current hardware support.
Error out if the input is higher than the supported max level.
--no-merge::
diff --git a/tools/perf/Documentation/topdown.txt b/tools/perf/Documentation/topdown.txt
index a15b93fdcf50..ae0aee86844f 100644
--- a/tools/perf/Documentation/topdown.txt
+++ b/tools/perf/Documentation/topdown.txt
@@ -1,46 +1,35 @@
-Using TopDown metrics in user space
------------------------------------
+Using TopDown metrics
+---------------------
-Intel CPUs (since Sandy Bridge and Silvermont) support a TopDown
-methodology to break down CPU pipeline execution into 4 bottlenecks:
-frontend bound, backend bound, bad speculation, retiring.
+TopDown metrics break apart performance bottlenecks. Starting at level
+1 it is typical to get metrics on retiring, bad speculation, frontend
+bound, and backend bound. Higher levels provide more detail in to the
+level 1 bottlenecks, such as at level 2: core bound, memory bound,
+heavy operations, light operations, branch mispredicts, machine
+clears, fetch latency and fetch bandwidth. For more details see [1][2][3].
-For more details on Topdown see [1][5]
+perf stat --topdown implements this using available metrics that vary
+per architecture.
-Traditionally this was implemented by events in generic counters
-and specific formulas to compute the bottlenecks.
-
-perf stat --topdown implements this.
-
-Full Top Down includes more levels that can break down the
-bottlenecks further. This is not directly implemented in perf,
-but available in other tools that can run on top of perf,
-such as toplev[2] or vtune[3]
+% perf stat -a --topdown -I1000
+# time % tma_retiring % tma_backend_bound % tma_frontend_bound % tma_bad_speculation
+ 1.001141351 11.5 34.9 46.9 6.7
+ 2.006141972 13.4 28.1 50.4 8.1
+ 3.010162040 12.9 28.1 51.1 8.0
+ 4.014009311 12.5 28.6 51.8 7.2
+ 5.017838554 11.8 33.0 48.0 7.2
+ 5.704818971 14.0 27.5 51.3 7.3
+...
-New Topdown features in Ice Lake
-===============================
+New Topdown features in Intel Ice Lake
+======================================
With Ice Lake CPUs the TopDown metrics are directly available as
fixed counters and do not require generic counters. This allows
to collect TopDown always in addition to other events.
-% perf stat -a --topdown -I1000
-# time retiring bad speculation frontend bound backend bound
- 1.001281330 23.0% 15.3% 29.6% 32.1%
- 2.003009005 5.0% 6.8% 46.6% 41.6%
- 3.004646182 6.7% 6.7% 46.0% 40.6%
- 4.006326375 5.0% 6.4% 47.6% 41.0%
- 5.007991804 5.1% 6.3% 46.3% 42.3%
- 6.009626773 6.2% 7.1% 47.3% 39.3%
- 7.011296356 4.7% 6.7% 46.2% 42.4%
- 8.012951831 4.7% 6.7% 47.5% 41.1%
-...
-
-This also enables measuring TopDown per thread/process instead
-of only per core.
-
-Using TopDown through RDPMC in applications on Ice Lake
-======================================================
+Using TopDown through RDPMC in applications on Intel Ice Lake
+=============================================================
For more fine grained measurements it can be useful to
access the new directly from user space. This is more complicated,
@@ -301,8 +290,8 @@ This "opens" a new measurement period.
A program using RDPMC for TopDown should schedule such a reset
regularly, as in every few seconds.
-Limits on Ice Lake
-==================
+Limits on Intel Ice Lake
+========================
Four pseudo TopDown metric events are exposed for the end-users,
topdown-retiring, topdown-bad-spec, topdown-fe-bound and topdown-be-bound.
@@ -318,8 +307,8 @@ a sampling read group. Since the SLOTS event must be the leader of a TopDown
group, the second event of the group is the sampling event.
For example, perf record -e '{slots, $sampling_event, topdown-retiring}:S'
-Extension on Sapphire Rapids Server
-===================================
+Extension on Intel Sapphire Rapids Server
+=========================================
The metrics counter is extended to support TMA method level 2 metrics.
The lower half of the register is the TMA level 1 metrics (legacy).
The upper half is also divided into four 8-bit fields for the new level 2
@@ -338,7 +327,6 @@ other four level 2 metrics by subtracting corresponding metrics as below.
[1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
-[2] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
-[3] https://software.intel.com/en-us/intel-vtune-amplifier-xe
+[2] https://sites.google.com/site/analysismethods/yasin-pubs
+[3] https://perf.wiki.kernel.org/index.php/Top-Down_Analysis
[4] https://github.com/andikleen/pmu-tools/tree/master/jevents
-[5] https://sites.google.com/site/analysismethods/yasin-pubs
--
2.39.2.637.g21b0678d19-goog
next prev parent reply other threads:[~2023-02-19 9:45 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-19 9:27 [PATCH v1 00/51] shadow metric clean up and improvements Ian Rogers
2023-02-19 9:27 ` [PATCH v1 01/51] perf tools: Ensure evsel name is initialized Ian Rogers
2023-02-28 12:06 ` kajoljain
2023-02-19 9:27 ` [PATCH v1 02/51] perf metrics: Improve variable names Ian Rogers
2023-02-19 9:28 ` [PATCH v1 03/51] perf pmu-events: Remove aggr_mode from pmu_event Ian Rogers
2023-02-19 9:28 ` [PATCH v1 04/51] perf pmu-events: Change aggr_mode to be an enum Ian Rogers
2023-02-19 9:28 ` [PATCH v1 05/51] perf pmu-events: Change deprecated to be a bool Ian Rogers
2023-02-19 9:28 ` [PATCH v1 06/51] perf pmu-events: Change perpkg " Ian Rogers
2023-02-19 9:28 ` [PATCH v1 07/51] perf expr: Make the online topology accessible globally Ian Rogers
2023-02-19 9:28 ` [PATCH v1 08/51] perf pmu-events: Make the metric_constraint an enum Ian Rogers
2023-02-19 9:28 ` [PATCH v1 09/51] perf pmu-events: Don't '\0' terminate enum values Ian Rogers
2023-02-19 9:28 ` [PATCH v1 11/51] perf vendor events intel: Refresh alderlake-n metrics Ian Rogers
2023-02-19 9:28 ` [PATCH v1 16/51] perf vendor events intel: Add graniterapids events Ian Rogers
2023-02-19 9:28 ` [PATCH v1 24/51] perf vendor events intel: Refresh knightslanding events Ian Rogers
2023-02-19 9:28 ` [PATCH v1 25/51] perf vendor events intel: Refresh sandybridge events Ian Rogers
2023-02-19 9:28 ` [PATCH v1 27/51] perf vendor events intel: Refresh silvermont events Ian Rogers
2023-02-19 9:28 ` [PATCH v1 31/51] perf vendor events intel: Refresh westmereep-dp events Ian Rogers
2023-02-19 9:28 ` [PATCH v1 32/51] perf jevents: Add rand support to metrics Ian Rogers
2023-02-19 9:28 ` [PATCH v1 33/51] perf jevent: Parse metric thresholds Ian Rogers
2023-02-19 9:28 ` [PATCH v1 34/51] perf pmu-events: Test parsing metric thresholds with the fake PMU Ian Rogers
2023-02-19 9:28 ` [PATCH v1 35/51] perf list: Support for printing metric thresholds Ian Rogers
2023-02-19 9:28 ` [PATCH v1 36/51] perf metric: Compute and print threshold values Ian Rogers
2023-02-19 9:28 ` [PATCH v1 37/51] perf expr: More explicit NAN handling Ian Rogers
2023-02-19 9:28 ` [PATCH v1 38/51] perf metric: Add --metric-no-threshold option Ian Rogers
2023-02-19 9:28 ` [PATCH v1 39/51] perf stat: Add TopdownL1 metric as a default if present Ian Rogers
2023-02-27 19:12 ` Liang, Kan
2023-02-27 19:33 ` Ian Rogers
2023-02-27 20:12 ` Liang, Kan
2023-02-28 6:27 ` Ian Rogers
2023-02-28 14:15 ` Liang, Kan
2023-02-19 9:28 ` [PATCH v1 40/51] perf stat: Implement --topdown using json metrics Ian Rogers
2023-02-19 9:28 ` [PATCH v1 41/51] perf stat: Remove topdown event special handling Ian Rogers
2023-02-19 9:28 ` Ian Rogers [this message]
2023-02-19 9:28 ` [PATCH v1 43/51] perf stat: Remove hard coded transaction events Ian Rogers
2023-02-19 9:28 ` [PATCH v1 44/51] perf stat: Use metrics for --smi-cost Ian Rogers
2023-02-19 9:28 ` [PATCH v1 45/51] perf stat: Remove perf_stat_evsel_id Ian Rogers
2023-02-19 9:28 ` [PATCH v1 46/51] perf stat: Move enums from header Ian Rogers
2023-02-19 9:28 ` [PATCH v1 47/51] perf stat: Hide runtime_stat Ian Rogers
2023-02-19 9:28 ` [PATCH v1 48/51] perf stat: Add cpu_aggr_map for loop Ian Rogers
2023-02-19 9:28 ` [PATCH v1 49/51] perf metric: Directly use counts rather than saved_value Ian Rogers
2023-02-19 9:28 ` [PATCH v1 50/51] perf stat: Use " Ian Rogers
2023-02-24 22:48 ` Namhyung Kim
2023-02-25 5:47 ` Ian Rogers
2023-02-19 9:28 ` [PATCH v1 51/51] perf stat: Remove saved_value/runtime_stat Ian Rogers
2023-02-19 11:17 ` [PATCH v1 00/51] shadow metric clean up and improvements Arnaldo Carvalho de Melo
2023-02-19 15:43 ` Ian Rogers
2023-02-21 17:44 ` Ian Rogers
2023-02-22 13:47 ` Arnaldo Carvalho de Melo
2023-02-27 22:04 ` Liang, Kan
2023-02-28 6:21 ` Ian Rogers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230219092848.639226-43-irogers@google.com \
--to=irogers@google.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=alexandre.torgue@foss.st.com \
--cc=andrii@kernel.org \
--cc=atrajeev@linux.vnet.ibm.com \
--cc=caleb.biggers@intel.com \
--cc=eddyz87@gmail.com \
--cc=eranian@google.com \
--cc=florian.fischer@muhq.space \
--cc=james.clark@arm.com \
--cc=john.g.garry@oracle.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=kjain@linux.ibm.com \
--cc=leo.yan@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-stm32@st-md-mailman.stormreply.com \
--cc=mark.rutland@arm.com \
--cc=mcoquelin.stm32@gmail.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=perry.taylor@intel.com \
--cc=peterz@infradead.org \
--cc=ravi.bangoria@amd.com \
--cc=renyu.zj@linux.alibaba.com \
--cc=sandipan.das@amd.com \
--cc=seanjc@google.com \
--cc=suzuki.poulose@arm.com \
--cc=zhengjun.xing@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).