linux-perf-users.vger.kernel.org archive mirror
From: weilin.wang@intel.com
To: weilin.wang@intel.com, Ian Rogers <irogers@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Kan Liang <kan.liang@linux.intel.com>
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Perry Taylor <perry.taylor@intel.com>,
	Samantha Alt <samantha.alt@intel.com>,
	Caleb Biggers <caleb.biggers@intel.com>,
	Mark Rutland <mark.rutland@arm.com>
Subject: [RFC PATCH 00/25] Perf stat metric grouping with hardware information
Date: Sun, 24 Sep 2023 23:17:59 -0700
Message-ID: <20230925061824.3818631-1-weilin.wang@intel.com>

From: Weilin Wang <weilin.wang@intel.com>

Perf stat metric grouping generates event groups that are provided to the
kernel for data collection using the hardware counters. Sometimes the grouping
fails and the kernel has to retry the groups because the generated groups do
not fit in the hardware counters. In other cases the groups are collected
correctly but leave some hardware counters unused.

To address these inefficiencies, we propose a hardware-aware grouping method
that groups metrics/events based on event counter restriction rules and the
availability of hardware counters in the system. This method is generic as long
as all the restriction rules can be provided by the pmu-event JSON files.
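
To make the idea concrete, here is a minimal sketch of grouping under counter
restrictions. It is not the code in this series: the Event and Group types, the
"counters" field, and the greedy first-fit strategy are all assumptions chosen
for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    counters: frozenset  # hypothetical: counter indices this event may use


@dataclass
class Group:
    num_counters: int
    assigned: dict = field(default_factory=dict)  # counter index -> event name

    def try_add(self, event):
        """Place the event on the lowest free counter it is allowed to use."""
        for idx in sorted(event.counters):
            if idx < self.num_counters and idx not in self.assigned:
                self.assigned[idx] = event.name
                return True
        return False


def group_events(events, num_counters):
    """Greedy first-fit: most restricted events first, new group if none fits."""
    groups = []
    for event in sorted(events, key=lambda e: len(e.counters)):
        if not any(g.try_add(event) for g in groups):
            group = Group(num_counters)
            if not group.try_add(event):
                raise ValueError(f"{event.name} fits no available counter")
            groups.append(group)
    return groups
```

Placing the most restricted events first keeps the flexible events available to
fill whatever counters remain, which is one simple way to avoid leaving counters
unused.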

This patch set includes code that does hardware-aware grouping and updated
pmu-event JSON files for four platforms (SapphireRapids, Icelakex, Cascadelakex,
and Tigerlake) for testing and experimentation. We've successfully tested
these patches on three platforms (SapphireRapids, Icelakex, and Cascadelakex)
with topdown metrics from TopdownL1 to TopdownL6.

There are some optimization opportunities that we might implement in the future:
1) Better NMI handling: when the NMI watchdog is enabled, we reduce the total
number of default_core counters by one. This could be improved to make better
use of that counter.
2) Fill important events into unused counters for better counter utilization:
there might be some unused counters scattered across the groups. We could
consider adding important events to these slots if necessary. This could
increase the percentage of time those events are counted under multiplexing and
improve accuracy if the event is critical.
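
The two items above could be sketched roughly as follows. This is an
illustration only: the function names are hypothetical, the fill step ignores
per-event counter restrictions, and treating an unreadable sysctl file as
"watchdog disabled" is an assumption.

```python
def nmi_watchdog_enabled(path="/proc/sys/kernel/nmi_watchdog"):
    """Return True if the sysctl file reports an enabled NMI watchdog."""
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        return False  # assumption: unreadable file means "disabled"


def usable_core_counters(total_counters, watchdog_enabled):
    """Counters left for grouping after reserving one for the NMI watchdog."""
    reserved = 1 if watchdog_enabled else 0
    return max(total_counters - reserved, 0)


def fill_unused(groups, usable, extras):
    """Append extra events to groups that still have spare counters.

    groups is a list of lists of event names; counter restrictions are
    ignored here for brevity. Returns the extras that did not fit.
    """
    pending = list(extras)
    for group in groups:
        while pending and len(group) < usable:
            group.append(pending.pop(0))
    return pending
```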

Remaining questions for discussion:
3) Where to start grouping from? The current implementation starts grouping by
combining all the events into a single list. This step deduplicates events, but
it does not preserve the relationship between events and metrics, i.e. events
required by one metric may not be collected at the same time. Another possible
starting point would be to group each individual metric and then try to merge
the groups.
4) Any comments, suggestions, new ideas?
5) If you are interested in testing the patches and the pmu-event JSON files for
your test platform are not provided here, please let me know so that I can
provide the files.
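
For question 3, the two starting points can be contrasted with a small sketch.
The names and data layout here are hypothetical, and both functions ignore
per-event counter restrictions; they only show how the flat list loses the
metric-to-event relationship while per-metric grouping keeps it.

```python
def flat_start(metrics):
    """Combine all metrics' events into one deduplicated list (order kept)."""
    seen = {}
    for events in metrics.values():
        for ev in events:
            seen.setdefault(ev, None)
    return list(seen)


def per_metric_start(metrics, num_counters):
    """Start with one group per metric, then merge groups that fit together.

    A metric needing more events than num_counters still gets its own
    (oversized) group; splitting it is left out of this sketch.
    """
    groups = []
    for name, events in metrics.items():
        evs = set(events)
        for g in groups:
            if len(g["events"] | evs) <= num_counters:
                g["events"] |= evs
                g["metrics"].append(name)
                break
        else:
            groups.append({"metrics": [name], "events": evs})
    return groups
```

In the per-metric variant every group records which metrics it serves, so all
events of a merged metric are known to be collected at the same time.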


Weilin Wang (25):
  perf stat: Add hardware-grouping cmd option to perf stat
  perf stat: Add basic functions for the hardware-grouping stat cmd
    option
  perf pmu-events: Add functions in jevent.py
  perf pmu-events: Add counter info into JSON files for SapphireRapids
  perf pmu-events: Add event counter data for Cascadelakex
  perf pmu-events: Add event counter data for Icelakex
  perf stat: Add helper functions for hardware-grouping method
  perf stat: Add functions to get counter info
  perf stat: Add helper functions for hardware-grouping method
  perf stat: Add helper functions to hardware-grouping method
  perf stat: Add utility functions to hardware-grouping method
  perf stat: Add more functions for hardware-grouping method
  perf stat: Add functions to hardware-grouping method
  perf stat: Add build string function and topdown events handling in
    hardware-grouping
  perf stat: Add function to combine metrics for hardware-grouping
  perf stat: Update keyword core to default_core to adjust to the
    changes for events with no unit
  perf stat: Handle taken alone in hardware-grouping
  perf stat: Handle NMI in hardware-grouping
  perf stat: Handle grouping method fall back in hardware-grouping
  perf stat: Code refactoring in hardware-grouping
  perf stat: Add tool events support in hardware-grouping
  perf stat: Add TSC support in hardware-grouping
  perf stat: Fix a return error issue in hardware-grouping
  perf stat: Add check to ensure correctness in platform that does not
    support hardware-grouping
  perf pmu-events: Add event counter data for Tigerlake

 tools/lib/bitmap.c                            |   20 +
 tools/perf/builtin-stat.c                     |    7 +
 .../arch/x86/cascadelakex/cache.json          | 1237 ++++++++++++
 .../arch/x86/cascadelakex/counter.json        |   17 +
 .../arch/x86/cascadelakex/floating-point.json |   16 +
 .../arch/x86/cascadelakex/frontend.json       |   68 +
 .../arch/x86/cascadelakex/memory.json         |  751 ++++++++
 .../arch/x86/cascadelakex/other.json          |  168 ++
 .../arch/x86/cascadelakex/pipeline.json       |  102 +
 .../arch/x86/cascadelakex/uncore-cache.json   | 1138 +++++++++++
 .../x86/cascadelakex/uncore-interconnect.json | 1272 +++++++++++++
 .../arch/x86/cascadelakex/uncore-io.json      |  394 ++++
 .../arch/x86/cascadelakex/uncore-memory.json  |  509 +++++
 .../arch/x86/cascadelakex/uncore-power.json   |   25 +
 .../arch/x86/cascadelakex/virtual-memory.json |   28 +
 .../pmu-events/arch/x86/icelakex/cache.json   |   98 +
 .../pmu-events/arch/x86/icelakex/counter.json |   17 +
 .../arch/x86/icelakex/floating-point.json     |   13 +
 .../arch/x86/icelakex/frontend.json           |   55 +
 .../pmu-events/arch/x86/icelakex/memory.json  |   53 +
 .../pmu-events/arch/x86/icelakex/other.json   |   52 +
 .../arch/x86/icelakex/pipeline.json           |   92 +
 .../arch/x86/icelakex/uncore-cache.json       |  965 ++++++++++
 .../x86/icelakex/uncore-interconnect.json     | 1667 +++++++++++++++++
 .../arch/x86/icelakex/uncore-io.json          |  966 ++++++++++
 .../arch/x86/icelakex/uncore-memory.json      |  186 ++
 .../arch/x86/icelakex/uncore-power.json       |   26 +
 .../arch/x86/icelakex/virtual-memory.json     |   22 +
 .../arch/x86/sapphirerapids/cache.json        |  104 +
 .../arch/x86/sapphirerapids/counter.json      |   17 +
 .../x86/sapphirerapids/floating-point.json    |   25 +
 .../arch/x86/sapphirerapids/frontend.json     |   98 +-
 .../arch/x86/sapphirerapids/memory.json       |   44 +
 .../arch/x86/sapphirerapids/other.json        |   40 +
 .../arch/x86/sapphirerapids/pipeline.json     |  118 ++
 .../arch/x86/sapphirerapids/uncore-cache.json |  534 +++++-
 .../arch/x86/sapphirerapids/uncore-cxl.json   |   56 +
 .../sapphirerapids/uncore-interconnect.json   |  476 +++++
 .../arch/x86/sapphirerapids/uncore-io.json    |  373 ++++
 .../x86/sapphirerapids/uncore-memory.json     |  391 ++++
 .../arch/x86/sapphirerapids/uncore-power.json |   24 +
 .../x86/sapphirerapids/virtual-memory.json    |   20 +
 .../pmu-events/arch/x86/tigerlake/cache.json  |   65 +
 .../arch/x86/tigerlake/counter.json           |    7 +
 .../arch/x86/tigerlake/floating-point.json    |   13 +
 .../arch/x86/tigerlake/frontend.json          |   56 +
 .../pmu-events/arch/x86/tigerlake/memory.json |   31 +
 .../pmu-events/arch/x86/tigerlake/other.json  |    4 +
 .../arch/x86/tigerlake/pipeline.json          |   96 +
 .../x86/tigerlake/uncore-interconnect.json    |   11 +
 .../arch/x86/tigerlake/uncore-memory.json     |    6 +
 .../arch/x86/tigerlake/uncore-other.json      |    1 +
 .../arch/x86/tigerlake/virtual-memory.json    |   20 +
 tools/perf/pmu-events/jevents.py              |  179 +-
 tools/perf/pmu-events/pmu-events.h            |   26 +-
 tools/perf/util/metricgroup.c                 |  927 +++++++++
 tools/perf/util/metricgroup.h                 |   82 +
 tools/perf/util/pmu.c                         |    5 +
 tools/perf/util/pmu.h                         |    1 +
 tools/perf/util/stat.h                        |    1 +
 60 files changed, 13790 insertions(+), 25 deletions(-)
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/counter.json
 create mode 100644 tools/perf/pmu-events/arch/x86/icelakex/counter.json
 create mode 100644 tools/perf/pmu-events/arch/x86/sapphirerapids/counter.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tigerlake/counter.json

--
2.39.3



Thread overview: 49+ messages
2023-09-25  6:17 weilin.wang [this message]
2023-09-25  6:18 ` [RFC PATCH 01/25] perf stat: Add hardware-grouping cmd option to perf stat weilin.wang
2023-09-26 14:50   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 02/25] perf stat: Add basic functions for the hardware-grouping stat cmd option weilin.wang
2023-09-26 15:10   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 03/25] perf pmu-events: Add functions in jevent.py weilin.wang
2023-09-26 15:17   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 04/25] perf pmu-events: Add counter info into JSON files for SapphireRapids weilin.wang
2023-09-26 15:20   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 05/25] perf pmu-events: Add event counter data for Cascadelakex weilin.wang
2023-09-25  6:18 ` [RFC PATCH 06/25] perf pmu-events: Add event counter data for Icelakex weilin.wang
2023-09-25  6:18 ` [RFC PATCH 07/25] perf stat: Add helper functions for hardware-grouping method weilin.wang
2023-09-26 15:28   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 08/25] perf stat: Add functions to get counter info weilin.wang
2023-09-26 15:37   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 09/25] perf stat: Add helper functions for hardware-grouping method weilin.wang
2023-09-26  3:37   ` Yang Jihong
2023-09-26 20:51     ` Wang, Weilin
2023-09-25  6:18 ` [RFC PATCH 10/25] perf stat: Add helper functions to " weilin.wang
2023-09-26  3:44   ` Yang Jihong
2023-09-26 15:55   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 11/25] perf stat: Add utility " weilin.wang
2023-09-26 16:02   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 12/25] perf stat: Add more functions for " weilin.wang
2023-09-25  6:18 ` [RFC PATCH 13/25] perf stat: Add functions to " weilin.wang
2023-09-26 16:18   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 14/25] perf stat: Add build string function and topdown events handling in hardware-grouping weilin.wang
2023-09-26 16:21   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 15/25] perf stat: Add function to combine metrics for hardware-grouping weilin.wang
2023-09-25  6:18 ` [RFC PATCH 16/25] perf stat: Update keyword core to default_core to adjust to the changes for events with no unit weilin.wang
2023-09-26 16:25   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 17/25] perf stat: Handle taken alone in hardware-grouping weilin.wang
2023-09-25  6:18 ` [RFC PATCH 18/25] perf stat: Handle NMI " weilin.wang
2023-09-25  6:18 ` [RFC PATCH 19/25] perf stat: Handle grouping method fall back " weilin.wang
2023-09-25  6:18 ` [RFC PATCH 20/25] perf stat: Code refactoring " weilin.wang
2023-09-25  6:18 ` [RFC PATCH 21/25] perf stat: Add tool events support " weilin.wang
2023-09-25  6:18 ` [RFC PATCH 22/25] perf stat: Add TSC " weilin.wang
2023-09-26 16:35   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 23/25] perf stat: Fix a return error issue " weilin.wang
2023-09-26 16:36   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 24/25] perf stat: Add check to ensure correctness in platform that does not support hardware-grouping weilin.wang
2023-09-26 16:38   ` Liang, Kan
2023-09-25  6:18 ` [RFC PATCH 25/25] perf pmu-events: Add event counter data for Tigerlake weilin.wang
2023-09-26 16:41   ` Liang, Kan
2023-09-25 18:29 ` [RFC PATCH 00/25] Perf stat metric grouping with hardware information Ian Rogers
2023-09-26 20:40   ` Wang, Weilin
2023-09-26 14:43 ` Liang, Kan
2023-09-26 16:48   ` Liang, Kan
2023-09-26 20:40     ` Wang, Weilin
