linux-perf-users.vger.kernel.org archive mirror
* [Patch v2 00/24] Arch-PEBS and PMU supports for Clearwater Forest and Panther Lake
From: Dapeng Mi @ 2025-02-18 15:27 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Kan Liang, Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi

This v2 patch series is based on the latest perf/core tree, "1623ced247f7
(x86/events/amd/iommu: Increase IOMMU_NAME_SIZE)", plus the first two
patches of the patch set "Cleanup for Intel PMU initialization"[1].

Changes:

  v1 -> v2:
    * Add Panther Lake PMU support (patch 02/24)
    * Add PEBS static calls to avoid introducing too many
      x86_pmu.arch_pebs checks (patch 07~08/24)
    * Optimize PEBS constraints based on Kan's dynamic constraint patch
      (patch 13/24)
    * Split the perf tools patch that supports more vector registers into
      several smaller patches (patch 20~22/24)

Tests:

  * The tests below were run on Clearwater Forest and no issues were
    found. Note that nmi_watchdog was disabled while running the tests.

  a. Basic perf counting case.
    perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1

  b. Basic PMI based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1

  c. Basic PEBS based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}:p' sleep 1

  d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 sleep 1

  e. PEBS sampling case with auxiliary (memory info) group
    perf mem record sleep 1

  f. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1

  g. Perf stat and record test
    perf test 95; perf test 119

  h. perf-fuzzer test
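
  The event groups in cases a-c repeat the same general-purpose event
  several times and then append a fixed list of events. A minimal sketch
  of building such a group string is shown here; build_group is a
  hypothetical helper for illustration and is not part of this series or
  of perf. It assumes at least one trailing event is given.

```shell
# Hypothetical helper (not part of perf or this patch series).
# Usage: build_group N EVENT TRAILING_EVENT...
# Prints "{EVENT,...(N times),TRAILING_EVENT,...}".
build_group() {
    n=$1
    event=$2
    shift 2
    group=""
    i=0
    # Repeat the general-purpose event N times, each followed by a comma.
    while [ "$i" -lt "$n" ]; do
        group="${group}${event},"
        i=$((i + 1))
    done
    # Join the remaining arguments with commas (IFS change stays in the subshell).
    rest=$(IFS=,; printf '%s' "$*")
    printf '{%s%s}' "$group" "$rest"
}
```

  For example, the group used in case a could be produced with
  perf stat -e "$(build_group 8 branches cycles instructions ref-cycles topdown-bad-spec topdown-fe-bound topdown-retiring)" sleep 1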


  * Similar tests were run on Panther Lake P-cores and E-cores and no
    issues were found. CPU 0 is a P-core and CPU 9 is an E-core.
    nmi_watchdog was disabled as well.

  P-core:

  a. Basic perf counting case.
    perf stat -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}' taskset -c 0 sleep 1

  b. Basic PMI based perf sampling case.
    perf record -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}' taskset -c 0 sleep 1

  c. Basic PEBS based perf sampling case.
    perf record -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}:p' taskset -c 0 sleep 1

  d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 taskset -c 0 sleep 1

  e. PEBS sampling case for user space registers
    perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 taskset -c 0 sleep 1

  f. PEBS sampling case with auxiliary (memory info) group
    perf mem record taskset -c 0 sleep 1

  g. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 taskset -c 0 sleep 1

  h. Perf stat and record test
    perf test 95; perf test 119

  E-core:

  a. Basic perf counting case.
    perf stat -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}' taskset -c 9 sleep 1

  b. Basic PMI based perf sampling case.
    perf record -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}' taskset -c 9 sleep 1

  c. Basic PEBS based perf sampling case.
    perf record -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}:p' taskset -c 9 sleep 1

  d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 taskset -c 9 sleep 1

  e. PEBS sampling case for user space registers
    perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 taskset -c 9 sleep 1

  f. PEBS sampling case with auxiliary (memory info) group
    perf mem record taskset -c 9 sleep 1

  g. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 taskset -c 9 sleep 1
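
  All of the runs above were done with nmi_watchdog disabled. The NMI
  watchdog occupies a hardware counter while it is enabled, so it is
  worth confirming its state before scheduling full counter groups. A
  minimal sketch of such a check follows; nmi_watchdog_state is a
  hypothetical helper, and the path argument exists only so that the
  check can be tried against a scratch file.

```shell
# Hypothetical helper (not part of perf or this patch series).
# Prints "disabled", "enabled" or "unknown" for the NMI watchdog.
nmi_watchdog_state() {
    # Default to the kernel's procfs knob; allow an override for testing.
    path=${1:-/proc/sys/kernel/nmi_watchdog}
    if [ ! -r "$path" ]; then
        echo unknown
    elif [ "$(cat "$path")" = "0" ]; then
        echo disabled
    else
        echo enabled
    fi
}
```

  The watchdog can be turned off as root with
  echo 0 > /proc/sys/kernel/nmi_watchdog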

History:
  v1: https://lore.kernel.org/all/20250123140721.2496639-1-dapeng1.mi@linux.intel.com/

Ref:
  [1]: https://lore.kernel.org/all/20250129154820.3755948-1-kan.liang@linux.intel.com/


Dapeng Mi (22):
  perf/x86/intel: Add PMU support for Clearwater Forest
  perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
  perf/x86/intel: Decouple BTS initialization from PEBS initialization
  perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs
  perf/x86/intel: Introduce pairs of PEBS static calls
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel/ds: Factor out common PEBS processing code to functions
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel: Factor out common functions to process PEBS groups
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Add SSP register support for arch-PEBS
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/core: Support to capture higher width vector registers
  perf/x86/intel: Support arch-PEBS vector registers group capturing
  perf tools: Support to show SSP register
  perf tools: Enhance arch__intr/user_reg_mask() helpers
  perf tools: Enhance sample_regs_user/intr to capture more registers
  perf tools: Support to capture more vector registers (x86/Intel)
  perf tools/tests: Add vector registers PEBS sampling test
  perf tools: Fix incorrect --user-regs comments

Kan Liang (2):
  perf/x86: Add dynamic constraint
  perf/x86/intel: Add Panther Lake support

 arch/arm/kernel/perf_regs.c                   |   6 +
 arch/arm64/kernel/perf_regs.c                 |   6 +
 arch/csky/kernel/perf_regs.c                  |   5 +
 arch/loongarch/kernel/perf_regs.c             |   5 +
 arch/mips/kernel/perf_regs.c                  |   5 +
 arch/powerpc/perf/perf_regs.c                 |   5 +
 arch/riscv/kernel/perf_regs.c                 |   5 +
 arch/s390/kernel/perf_regs.c                  |   5 +
 arch/x86/events/core.c                        | 105 ++-
 arch/x86/events/intel/bts.c                   |   6 +-
 arch/x86/events/intel/core.c                  | 330 +++++++-
 arch/x86/events/intel/ds.c                    | 722 ++++++++++++++----
 arch/x86/events/intel/lbr.c                   |   2 +-
 arch/x86/events/perf_event.h                  |  69 +-
 arch/x86/include/asm/intel_ds.h               |  10 +-
 arch/x86/include/asm/msr-index.h              |  28 +
 arch/x86/include/asm/perf_event.h             | 145 +++-
 arch/x86/include/uapi/asm/perf_regs.h         |  87 ++-
 arch/x86/kernel/perf_regs.c                   |  55 +-
 include/linux/perf_event.h                    |   3 +
 include/linux/perf_regs.h                     |  10 +
 include/uapi/linux/perf_event.h               |  11 +
 kernel/events/core.c                          |  53 +-
 tools/arch/x86/include/uapi/asm/perf_regs.h   |  90 ++-
 tools/include/uapi/linux/perf_event.h         |  14 +
 tools/perf/arch/arm/util/perf_regs.c          |   8 +-
 tools/perf/arch/arm64/util/perf_regs.c        |  11 +-
 tools/perf/arch/csky/util/perf_regs.c         |   8 +-
 tools/perf/arch/loongarch/util/perf_regs.c    |   8 +-
 tools/perf/arch/mips/util/perf_regs.c         |   8 +-
 tools/perf/arch/powerpc/util/perf_regs.c      |  17 +-
 tools/perf/arch/riscv/util/perf_regs.c        |   8 +-
 tools/perf/arch/s390/util/perf_regs.c         |   8 +-
 tools/perf/arch/x86/util/perf_regs.c          | 112 ++-
 tools/perf/builtin-record.c                   |   2 +-
 tools/perf/builtin-script.c                   |  23 +-
 tools/perf/tests/shell/record.sh              |  55 ++
 tools/perf/util/evsel.c                       |  36 +-
 tools/perf/util/intel-pt.c                    |   2 +-
 tools/perf/util/parse-regs-options.c          |  23 +-
 .../perf/util/perf-regs-arch/perf_regs_x86.c  |  90 +++
 tools/perf/util/perf_regs.c                   |   8 +-
 tools/perf/util/perf_regs.h                   |  20 +-
 tools/perf/util/record.h                      |   4 +-
 tools/perf/util/sample.h                      |   6 +-
 tools/perf/util/session.c                     |  29 +-
 tools/perf/util/synthetic-events.c            |   6 +-
 47 files changed, 1966 insertions(+), 308 deletions(-)

-- 
2.40.1




Thread overview: 58+ messages
2025-02-18 15:27 [Patch v2 00/24] Arch-PEBS and PMU supports for Clearwater Forest and Panther Lake Dapeng Mi
2025-02-18 15:27 ` [Patch v2 01/24] perf/x86: Add dynamic constraint Dapeng Mi
2025-02-18 15:27 ` [Patch v2 02/24] perf/x86/intel: Add Panther Lake support Dapeng Mi
2025-02-18 15:27 ` [Patch v2 03/24] perf/x86/intel: Add PMU support for Clearwater Forest Dapeng Mi
2025-02-18 15:27 ` [Patch v2 04/24] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs Dapeng Mi
2025-02-18 15:27 ` [Patch v2 05/24] perf/x86/intel: Decouple BTS initialization from PEBS initialization Dapeng Mi
2025-02-18 15:28 ` [Patch v2 06/24] perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs Dapeng Mi
2025-02-18 15:28 ` [Patch v2 07/24] perf/x86/intel: Introduce pairs of PEBS static calls Dapeng Mi
2025-02-18 15:28 ` [Patch v2 08/24] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
2025-02-18 15:28 ` [Patch v2 09/24] perf/x86/intel/ds: Factor out common PEBS processing code to functions Dapeng Mi
2025-02-18 15:28 ` [Patch v2 10/24] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
2025-02-25 10:39   ` Peter Zijlstra
2025-02-25 11:00     ` Peter Zijlstra
2025-02-26  5:20       ` Mi, Dapeng
2025-02-26  9:35         ` Peter Zijlstra
2025-02-26 15:45           ` Liang, Kan
2025-02-27  2:04             ` Mi, Dapeng
2025-02-25 20:42     ` Andi Kleen
2025-02-26  2:54     ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 11/24] perf/x86/intel: Factor out common functions to process PEBS groups Dapeng Mi
2025-02-25 11:02   ` Peter Zijlstra
2025-02-26  5:24     ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 12/24] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
2025-02-25 11:18   ` Peter Zijlstra
2025-02-26  5:48     ` Mi, Dapeng
2025-02-26  9:46       ` Peter Zijlstra
2025-02-27  2:05         ` Mi, Dapeng
2025-02-25 11:25   ` Peter Zijlstra
2025-02-26  6:19     ` Mi, Dapeng
2025-02-26  9:48       ` Peter Zijlstra
2025-02-27  2:09         ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 13/24] perf/x86/intel: Update dyn_constranit base on PEBS event precise level Dapeng Mi
2025-02-27 14:06   ` Liang, Kan
2025-03-05  1:41     ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 14/24] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
2025-02-18 15:28 ` [Patch v2 15/24] perf/x86/intel: Add SSP register support for arch-PEBS Dapeng Mi
2025-02-25 11:52   ` Peter Zijlstra
2025-02-26  6:56     ` Mi, Dapeng
2025-02-25 11:54   ` Peter Zijlstra
2025-02-25 20:44     ` Andi Kleen
2025-02-27  6:29       ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 16/24] perf/x86/intel: Add counter group " Dapeng Mi
2025-02-18 15:28 ` [Patch v2 17/24] perf/core: Support to capture higher width vector registers Dapeng Mi
2025-02-25 20:32   ` Peter Zijlstra
2025-02-26  7:55     ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 18/24] perf/x86/intel: Support arch-PEBS vector registers group capturing Dapeng Mi
2025-02-25 15:32   ` Peter Zijlstra
2025-02-26  8:08     ` Mi, Dapeng
2025-02-27  6:40       ` Mi, Dapeng
2025-03-04  3:08         ` Mi, Dapeng
2025-03-04 16:26           ` Liang, Kan
2025-03-05  1:34             ` Mi, Dapeng
2025-02-18 15:28 ` [Patch v2 19/24] perf tools: Support to show SSP register Dapeng Mi
2025-02-18 15:28 ` [Patch v2 20/24] perf tools: Enhance arch__intr/user_reg_mask() helpers Dapeng Mi
2025-02-18 15:28 ` [Patch v2 21/24] perf tools: Enhance sample_regs_user/intr to capture more registers Dapeng Mi
2025-02-18 15:28 ` [Patch v2 22/24] perf tools: Support to capture more vector registers (x86/Intel) Dapeng Mi
2025-02-18 15:28 ` [Patch v2 23/24] perf tools/tests: Add vector registers PEBS sampling test Dapeng Mi
2025-02-18 15:28 ` [Patch v2 24/24] perf tools: Fix incorrect --user-regs comments Dapeng Mi
