Re: [PATCH v3 21/30] perf vendor events: Update Intel sandybridge

From: Sedat Dilek <sedat.dilek@gmail.com>
To: Ian Rogers <irogers@google.com>
Cc: perry.taylor@intel.com, caleb.biggers@intel.com,
	kshipra.bopardikar@intel.com,
	Kan Liang <kan.liang@linux.intel.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Maxime Coquelin <mcoquelin.stm32@gmail.com>,
	Alexandre Torgue <alexandre.torgue@foss.st.com>,
	Andi Kleen <ak@linux.intel.com>,
	James Clark <james.clark@arm.com>,
	John Garry <john.garry@huawei.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH v3 21/30] perf vendor events: Update Intel sandybridge
Date: Fri, 29 Jul 2022 10:41:47 +0200
Message-ID: <CA+icZUU-AmzdkWqBCWw=izbWJfpw4GP+UUaOE6SRs3tiAtmKng@mail.gmail.com>
In-Reply-To: <20220727220832.2865794-22-irogers@google.com>

On Thu, Jul 28, 2022 at 12:09 AM Ian Rogers <irogers@google.com> wrote:
>
> Update to v17, the metrics are based on TMA 4.4 full.
>
> Use script at:
> https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
>
> to download and generate the latest events and metrics. Manually copy
> the sandybridge files into perf and update mapfile.csv.
>
> Tested on a non-sandybridge with 'perf test':
>  10: PMU events                                                      :
>  10.1: PMU event table sanity                                        : Ok
>  10.2: PMU event map aliases                                         : Ok
>  10.3: Parsing of PMU event table metrics                            : Ok
>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>

Hi Ian,

Thanks for the v3 patchset.

I used the latest perf/core Git branch from Arnaldo's tree plus some
custom patches (to fix binutils v2.38.90 and openssl-v3 issues, plus
gnu11 tools patches) and built with LLVM-14.

When I run it on my Intel SandyBridge CPU...

$ ~/bin/perf test
...
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
...

This is the first time I have run perf with the test option.

Does that look good to you?

Regards,
-Sedat-

> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/pmu-events/arch/x86/mapfile.csv            |  2 +-
>  tools/perf/pmu-events/arch/x86/sandybridge/cache.json |  2 +-
>  .../arch/x86/sandybridge/floating-point.json          |  2 +-
>  .../pmu-events/arch/x86/sandybridge/frontend.json     |  4 ++--
>  .../perf/pmu-events/arch/x86/sandybridge/memory.json  |  2 +-
>  tools/perf/pmu-events/arch/x86/sandybridge/other.json |  2 +-
>  .../pmu-events/arch/x86/sandybridge/pipeline.json     | 10 +++++-----
>  .../pmu-events/arch/x86/sandybridge/snb-metrics.json  | 11 +++++++++--
>  .../pmu-events/arch/x86/sandybridge/uncore-other.json |  2 +-
>  .../arch/x86/sandybridge/virtual-memory.json          |  2 +-
>  10 files changed, 23 insertions(+), 16 deletions(-)
>
> diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
> index 2f9419ee2d29..0b56c4a8a3a8 100644
> --- a/tools/perf/pmu-events/arch/x86/mapfile.csv
> +++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
> @@ -19,12 +19,12 @@ GenuineIntel-6-(57|85),v9,knightslanding,core
>  GenuineIntel-6-AA,v1.00,meteorlake,core
>  GenuineIntel-6-1[AEF],v3,nehalemep,core
>  GenuineIntel-6-2E,v3,nehalemex,core
> +GenuineIntel-6-2A,v17,sandybridge,core
>  GenuineIntel-6-[4589]E,v24,skylake,core
>  GenuineIntel-6-A[56],v24,skylake,core
>  GenuineIntel-6-37,v13,silvermont,core
>  GenuineIntel-6-4D,v13,silvermont,core
>  GenuineIntel-6-4C,v13,silvermont,core
> -GenuineIntel-6-2A,v15,sandybridge,core
>  GenuineIntel-6-2C,v2,westmereep-dp,core
>  GenuineIntel-6-25,v2,westmereep-sp,core
>  GenuineIntel-6-2F,v2,westmereex,core
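
For readers unfamiliar with mapfile.csv: each row pairs a regular
expression over the CPU identification string with an event-file
version and a directory under arch/x86. A minimal Python sketch of that
lookup, using rows copied from the hunk above, might look like this. The
function name lookup_event_dir and the exact matching rule (a full match
against "vendor-family-model" with no stepping suffix) are assumptions
made for illustration; perf's own matcher is written in C and may differ
in detail.

```python
import re

# Rows copied from the mapfile.csv hunk above:
# (cpuid regex, event-file version, directory, type).
MAPFILE_ROWS = [
    ("GenuineIntel-6-1[AEF]", "v3", "nehalemep", "core"),
    ("GenuineIntel-6-2E", "v3", "nehalemex", "core"),
    ("GenuineIntel-6-2A", "v17", "sandybridge", "core"),
    ("GenuineIntel-6-[4589]E", "v24", "skylake", "core"),
    ("GenuineIntel-6-A[56]", "v24", "skylake", "core"),
]


def lookup_event_dir(cpuid):
    """Return (version, directory) for the first row whose regex matches.

    Assumption: cpuid is "vendor-family-model" with the model in
    upper-case hex and no stepping suffix.
    """
    for pattern, version, directory, _type in MAPFILE_ROWS:
        if re.fullmatch(pattern, cpuid):
            return version, directory
    return None


print(lookup_event_dir("GenuineIntel-6-2A"))  # matches the sandybridge row
print(lookup_event_dir("GenuineIntel-6-5E"))  # matches the first skylake row
print(lookup_event_dir("GenuineIntel-6-FF"))  # no row matches
```

With this patch the sandybridge row resolves to v17 instead of v15; the
row also moves within the file, but its regex is unchanged.
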
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/cache.json b/tools/perf/pmu-events/arch/x86/sandybridge/cache.json
> index 92a7269eb444..a1d622352131 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/cache.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/cache.json
> @@ -1876,4 +1876,4 @@
>          "SampleAfterValue": "100003",
>          "UMask": "0x10"
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/floating-point.json b/tools/perf/pmu-events/arch/x86/sandybridge/floating-point.json
> index 713878fd062b..eb2ff2cfdf6b 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/floating-point.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/floating-point.json
> @@ -135,4 +135,4 @@
>          "SampleAfterValue": "2000003",
>          "UMask": "0x1"
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/frontend.json b/tools/perf/pmu-events/arch/x86/sandybridge/frontend.json
> index fa22f9463b66..e2c82e43a2de 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/frontend.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/frontend.json
> @@ -176,7 +176,7 @@
>          "CounterMask": "1",
>          "EventCode": "0x79",
>          "EventName": "IDQ.MS_CYCLES",
> -        "PublicDescription": "This event counts cycles during which the microcode sequencer assisted the front-end in delivering uops.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.  See the Intel 64 and IA-32 Architectures Optimization Reference Manual for more information.",
> +        "PublicDescription": "This event counts cycles during which the microcode sequencer assisted the front-end in delivering uops.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.  See the Intel(R) 64 and IA-32 Architectures Optimization Reference Manual for more information.",
>          "SampleAfterValue": "2000003",
>          "UMask": "0x30"
>      },
> @@ -311,4 +311,4 @@
>          "SampleAfterValue": "2000003",
>          "UMask": "0x1"
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/memory.json b/tools/perf/pmu-events/arch/x86/sandybridge/memory.json
> index 931892d34076..3c283ca309f3 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/memory.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/memory.json
> @@ -442,4 +442,4 @@
>          "SampleAfterValue": "100003",
>          "UMask": "0x1"
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/other.json b/tools/perf/pmu-events/arch/x86/sandybridge/other.json
> index e251f535ec09..2f873ab14156 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/other.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/other.json
> @@ -55,4 +55,4 @@
>          "SampleAfterValue": "2000003",
>          "UMask": "0x1"
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json b/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
> index b9a3f194a00a..2c3b6c92aa6b 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
> @@ -609,7 +609,7 @@
>          "UMask": "0x3"
>      },
>      {
> -        "BriefDescription": "Number of occurences waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...).",
> +        "BriefDescription": "Number of occurrences waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...).",
>          "Counter": "0,1,2,3",
>          "CounterHTOff": "0,1,2,3,4,5,6,7",
>          "CounterMask": "1",
> @@ -652,7 +652,7 @@
>          "CounterHTOff": "0,1,2,3,4,5,6,7",
>          "EventCode": "0x03",
>          "EventName": "LD_BLOCKS.STORE_FORWARD",
> -        "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load.  The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceeding smaller uncompleted store.  See the table of not supported store forwards in the Intel 64 and IA-32 Architectures Optimization Reference Manual.  The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.",
> +        "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load.  The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceeding smaller uncompleted store.  See the table of not supported store forwards in the Intel(R) 64 and IA-32 Architectures Optimization Reference Manual.  The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.",
>          "SampleAfterValue": "100003",
>          "UMask": "0x2"
>      },
> @@ -778,7 +778,7 @@
>          "CounterMask": "1",
>          "EventCode": "0x59",
>          "EventName": "PARTIAL_RAT_STALLS.FLAGS_MERGE_UOP_CYCLES",
> -        "PublicDescription": "This event counts the number of cycles spent executing performance-sensitive flags-merging uops. For example, shift CL (merge_arith_flags). For more details, See the Intel 64 and IA-32 Architectures Optimization Reference Manual.",
> +        "PublicDescription": "This event counts the number of cycles spent executing performance-sensitive flags-merging uops. For example, shift CL (merge_arith_flags). For more details, See the Intel(R) 64 and IA-32 Architectures Optimization Reference Manual.",
>          "SampleAfterValue": "2000003",
>          "UMask": "0x20"
>      },
> @@ -797,7 +797,7 @@
>          "CounterHTOff": "0,1,2,3,4,5,6,7",
>          "EventCode": "0x59",
>          "EventName": "PARTIAL_RAT_STALLS.SLOW_LEA_WINDOW",
> -        "PublicDescription": "This event counts the number of cycles with at least one slow LEA uop being allocated. A uop is generally considered as slow LEA if it has three sources (for example, two sources and immediate) regardless of whether it is a result of LEA instruction or not. Examples of the slow LEA uop are or uops with base, index, and offset source operands using base and index reqisters, where base is EBR/RBP/R13, using RIP relative or 16-bit addressing modes. See the Intel 64 and IA-32 Architectures Optimization Reference Manual for more details about slow LEA instructions.",
> +        "PublicDescription": "This event counts the number of cycles with at least one slow LEA uop being allocated. A uop is generally considered as slow LEA if it has three sources (for example, two sources and immediate) regardless of whether it is a result of LEA instruction or not. Examples of the slow LEA uop are or uops with base, index, and offset source operands using base and index reqisters, where base is EBR/RBP/R13, using RIP relative or 16-bit addressing modes. See the Intel(R) 64 and IA-32 Architectures Optimization Reference Manual for more details about slow LEA instructions.",
>          "SampleAfterValue": "2000003",
>          "UMask": "0x40"
>      },
> @@ -1209,4 +1209,4 @@
>          "SampleAfterValue": "2000003",
>          "UMask": "0x1"
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json b/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
> index c8e7050d9c26..ae7ed267b2a2 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
> @@ -124,7 +124,7 @@
>          "MetricName": "FLOPc_SMT"
>      },
>      {
> -        "BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is at least 1 uop executed)",
> +        "BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is execution) per-core",
>          "MetricExpr": "UOPS_DISPATCHED.THREAD / (( cpu@UOPS_DISPATCHED.CORE\\,cmask\\=1@ / 2 ) if #SMT_on else cpu@UOPS_DISPATCHED.CORE\\,cmask\\=1@)",
>          "MetricGroup": "Backend;Cor;Pipeline;PortsUtil",
>          "MetricName": "ILP"
> @@ -141,6 +141,12 @@
>          "MetricGroup": "Summary;TmaL1",
>          "MetricName": "Instructions"
>      },
> +    {
> +        "BriefDescription": "Average number of Uops retired in cycles where at least one uop has retired.",
> +        "MetricExpr": "UOPS_RETIRED.RETIRE_SLOTS / cpu@UOPS_RETIRED.RETIRE_SLOTS\\,cmask\\=1@",
> +        "MetricGroup": "Pipeline;Ret",
> +        "MetricName": "Retire"
> +    },
>      {
>          "BriefDescription": "Fraction of Uops delivered by the DSB (aka Decoded ICache; or Uop Cache)",
>          "MetricExpr": "IDQ.DSB_UOPS / (( IDQ.DSB_UOPS + LSD.UOPS + IDQ.MITE_UOPS + IDQ.MS_UOPS ) )",
> @@ -163,7 +169,8 @@
>          "BriefDescription": "Giga Floating Point Operations Per Second",
>          "MetricExpr": "( ( 1 * ( FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE + FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE ) + 2 * FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE + 4 * ( FP_COMP_OPS_EXE.SSE_PACKED_SINGLE + SIMD_FP_256.PACKED_DOUBLE ) + 8 * SIMD_FP_256.PACKED_SINGLE ) / 1000000000 ) / duration_time",
>          "MetricGroup": "Cor;Flops;HPC",
> -        "MetricName": "GFLOPs"
> +        "MetricName": "GFLOPs",
> +        "PublicDescription": "Giga Floating Point Operations Per Second. Aggregate across all supported options of: FP precisions, scalar and vector instructions, vector-width and AMX engine."
>      },
>      {
>          "BriefDescription": "Average Frequency Utilization relative nominal frequency",
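
To make the two metric expressions in this file easier to read, here is
a small Python sketch that evaluates them by hand. The weights in GFLOPs
are the number of floating-point elements per operation (1 for scalar,
2 for 128-bit packed double, 4 for 128-bit packed single or 256-bit
packed double, 8 for 256-bit packed single). The new Retire metric
divides retired uop slots by the count of the same event with cmask=1,
that is, the number of cycles in which at least one slot retired. The
function names and all counter values below are made up for
illustration; they are not measurements.

```python
def gflops(counts, duration_s):
    """Evaluate the GFLOPs MetricExpr from the hunk above."""
    flops = (
        1 * (counts["FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE"]
             + counts["FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE"])
        + 2 * counts["FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE"]
        + 4 * (counts["FP_COMP_OPS_EXE.SSE_PACKED_SINGLE"]
               + counts["SIMD_FP_256.PACKED_DOUBLE"])
        + 8 * counts["SIMD_FP_256.PACKED_SINGLE"]
    )
    return (flops / 1_000_000_000) / duration_s


def retire(retire_slots, cycles_with_retirement):
    """Retire metric: UOPS_RETIRED.RETIRE_SLOTS / the same event with cmask=1."""
    return retire_slots / cycles_with_retirement


# Hypothetical counter values, chosen only to make the arithmetic easy.
sample = {
    "FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE": 1_000_000_000,
    "FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE": 0,
    "FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE": 500_000_000,
    "FP_COMP_OPS_EXE.SSE_PACKED_SINGLE": 250_000_000,
    "SIMD_FP_256.PACKED_DOUBLE": 0,
    "SIMD_FP_256.PACKED_SINGLE": 125_000_000,
}

print(gflops(sample, duration_s=2.0))
print(retire(retire_slots=3_000_000, cycles_with_retirement=1_500_000))
```
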
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/uncore-other.json b/tools/perf/pmu-events/arch/x86/sandybridge/uncore-other.json
> index 6278068908cf..88f1e326205f 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/uncore-other.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/uncore-other.json
> @@ -82,10 +82,10 @@
>      {
>          "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",
>          "Counter": "Fixed",
> +        "EventCode": "0xff",
>          "EventName": "UNC_CLOCK.SOCKET",
>          "PerPkg": "1",
>          "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
> -        "UMask": "0x01",
>          "Unit": "ARB"
>      }
>  ]
> diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/virtual-memory.json b/tools/perf/pmu-events/arch/x86/sandybridge/virtual-memory.json
> index 4dd136d00a10..98362abba1a7 100644
> --- a/tools/perf/pmu-events/arch/x86/sandybridge/virtual-memory.json
> +++ b/tools/perf/pmu-events/arch/x86/sandybridge/virtual-memory.json
> @@ -146,4 +146,4 @@
>          "SampleAfterValue": "100007",
>          "UMask": "0x20"
>      }
> -]
> \ No newline at end of file
> +]
> --
> 2.37.1.359.gd136c6c3e2-goog
>
