* [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest
@ 2025-01-23 14:07 Dapeng Mi
2025-01-23 14:07 ` [PATCH 01/20] perf/x86/intel: Add PMU support " Dapeng Mi
` (19 more replies)
0 siblings, 20 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
This patch series enables PMU and architectural PEBS (arch-PEBS) for
Clearwater Forest (CWF).
Compared with the previous generation, Sierra Forest (SRF), CWF has two key
differences from a PMU perspective.
a. It adds 3 fixed counters (fixed counters 4, 5 and 6), which are used
to profile the topdown-bad-spec, topdown-fe-bound and topdown-retiring
events.
b. It introduces architectural PEBS, which replaces the previous
DS area based PEBS.
The general fixed counter bitmap (CPUID.23H.1H.EBX) support has already been
upstreamed along with the ARL/LNL PMU enabling patches. Only CWF-specific
event attributes need to be supported in this patch series.
Compared with the legacy DS area based PEBS, especially adaptive
PEBS, arch-PEBS basically inherits all currently supported PEBS
groups, such as the basic-info, memory-info, GPRs,
XMMs (vector registers) and LBRs groups, but adds some new fields
to these groups.
The key differences between legacy PEBS and arch-PEBS are:
a. Arch-PEBS leverages the CPUID.23H.4/5H sub-leaves to enumerate the
supported capabilities. These two CPUID sub-leaves indicate which PEBS
groups are supported and which GP and fixed counters support arch-PEBS.
The IA32_PERF_CAPABILITIES MSR is no longer used to indicate PEBS
capabilities.
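As a quick illustration of the enumeration in (a), the stand-alone user-space
sketch below queries these sub-leaves with GCC's <cpuid.h>; the register
layout assumed here (sub-leaf 0 EAX as the valid sub-leaf bitmap, sub-leaf 4
EBX as the capability bits, sub-leaf 5 ECX:EAX/EDX:EBX as the counter/pdist
bitmaps) follows the parsing code added in patch 06/20:

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Sub-leaf 0: EAX is the bitmap of valid sub-leaves. */
	if (!__get_cpuid_count(0x23, 0, &eax, &ebx, &ecx, &edx) ||
	    (eax & 0x30) != 0x30) {
		printf("arch-PEBS sub-leaves 4/5 not enumerated\n");
		return 0;
	}

	/* Sub-leaf 4: arch-PEBS capability bits in EBX. */
	__get_cpuid_count(0x23, 4, &eax, &ebx, &ecx, &edx);
	printf("arch-PEBS caps (EBX): 0x%x\n", ebx);

	/* Sub-leaf 5: PEBS counter bitmap (ECX:EAX), pdist bitmap (EDX:EBX). */
	__get_cpuid_count(0x23, 5, &eax, &ebx, &ecx, &edx);
	printf("PEBS counters: 0x%llx, pdist counters: 0x%llx\n",
	       ((unsigned long long)ecx << 32) | eax,
	       ((unsigned long long)edx << 32) | ebx);
	return 0;
}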
b. Arch-PEBS adds several new MSRs: IA32_PEBS_BASE, IA32_PEBS_INDEX and
the per-counter IA32_PMC_GPn/FXm_CFG_C MSRs. The IA32_PEBS_BASE MSR tells
HW the PEBS buffer's starting address and size; HW writes the captured
records into this buffer. The IA32_PEBS_INDEX MSR provides several fields,
such as WR_OFFSET and THRESH_OFFSET. WR_OFFSET tells SW the offset of the
latest PEBS record written into the buffer by HW, and THRESH_OFFSET tells
HW that a PMI should be generated once a written PEBS record crosses this
offset. IA32_PMC_GPn/FXm_CFG_C is a per-counter MSR; each GP or fixed
counter has its own CFG_C MSR. It configures which PEBS groups are
captured for the corresponding counter when that counter overflows. Since
each counter has its own CFG_C MSR, arch-PEBS can capture different PEBS
groups for different counters, whereas the legacy DS-based PEBS only
supports a single globally shared PEBS configuration that all counters
must use. This is a significant improvement of arch-PEBS: it provides more
flexibility and higher efficiency. The legacy IA32_PEBS_ENABLE and
MSR_PEBS_DATA_CFG MSRs are deprecated by arch-PEBS.
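As a quick illustration of the IA32_PEBS_INDEX usage described in (b), the
stand-alone sketch below decodes a raw MSR value using the bit layout of
union arch_pebs_index introduced in patch 08/20 (WR_OFFSET in bits [26:4],
THRESH_OFFSET in bits [58:36]). Treating WR_OFFSET as a count of 16-byte
units is an assumption derived from the ARCH_PEBS_INDEX_WR_SHIFT usage in
the drain handler, and the raw value is a made-up example:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Mirrors union arch_pebs_index from patch 08/20. */
struct arch_pebs_index {
	uint64_t rsvd:4,
		 wr:23,
		 rsvd2:4,
		 full:1,
		 en:1,
		 rsvd3:3,
		 thresh:23,
		 rsvd4:5;
};

int main(void)
{
	uint64_t raw = 0x0000004000000420ULL;	/* made-up example value */
	struct arch_pebs_index idx;

	memcpy(&idx, &raw, sizeof(idx));
	/* wr << 4: byte offset where HW will write the next record. */
	printf("wr offset: %#llx bytes, thresh: %#llx, full=%u, en=%u\n",
	       (unsigned long long)idx.wr << 4,
	       (unsigned long long)idx.thresh,
	       (unsigned int)idx.full, (unsigned int)idx.en);
	return 0;
}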
c. Arch-PEBS adds the capability to capture more registers, such as the
SSP register and wider xsave-enabled vector registers such as the
OPMASK/YMM/ZMM registers. Not all platforms support capturing all of
these vector registers; HW enumerates which vector registers are
supported in CPUID.23H.4H.EBX. CWF, for example, only supports capturing
XMM/YMM registers. The newly added SSP register is placed in the GPRs
group, and all vector registers, including the XMM registers, are placed
in the xsave-enabled registers group.
d. Arch-PEBS makes some changes to the PEBS record layout, although some
groups still have the same format as the previous legacy adaptive PEBS.
Arch-PEBS also supports PEBS record fragments: if the continued bit
in the record header is set, more fragments follow.
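For item d, the sketch below shows how a consumer conceptually walks a
record and its fragments. It is a simplified, stand-alone rendering of
arch_pebs_record_continued() and the fragment-skipping loop in
intel_pmu_drain_arch_pebs() from patch 08/20 (the continue bit is bit 31 of
the header and the fragment size lives in bits [15:0]);
arch_pebs_skip_record() is just an illustrative name:

#include <stdint.h>

/* First 64-bit word of each fragment, per struct arch_pebs_header. */
struct arch_pebs_hdr {
	uint64_t format;
	uint64_t rsvd;
};

/*
 * More fragments follow if the continue bit (bit 31) is set, or if no
 * group bit above the size field is set (a "null" record).
 */
int arch_pebs_record_continued(const struct arch_pebs_hdr *h)
{
	return (h->format & (1ULL << 31)) || !(h->format & ~0xffffULL);
}

/* Walk one logical record starting at @at and return the pointer past it. */
const void *arch_pebs_skip_record(const void *at)
{
	const struct arch_pebs_hdr *h = at;

	while (arch_pebs_record_continued(h)) {
		uint16_t size = h->format & 0xffff;	/* fragment size */

		if (!size)
			break;
		at = (const uint8_t *)at + size;
		h = at;
	}
	return (const uint8_t *)at + (h->format & 0xffff);
}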
Details of arch-PEBS can be found in chapter 11 "Architectural
PEBS" of the "Intel architecture instruction set extensions programming
reference" [1].
Patch 01/20 provides basic PMU support for CWF, patch 02/20 fixes
an error in parsing the archPerfmonExt (0x23) CPUID leaf, and the remaining
patches (03/20 ~ 20/20) provide arch-PEBS support for the kernel and perf
tools. This patch series is based on Peter's perf/core tree plus Kan's PEBS
counter snapshot patchset (v10)[2].
Tests:
The following tests were run on CWF and no issues were found. Please note
that nmi_watchdog was disabled while running the tests.
a. Basic perf counting case.
perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1
b. Basic PMI based perf sampling case.
perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1
c. Basic PEBS based perf sampling case.
perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}:p' sleep 1
d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 sleep 1
e. PEBS sampling case with auxiliary (memory info) group
perf mem record sleep 1
f. PEBS sampling case with counter group
perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1
g. Perf stat and record test
perf test 95; perf test 119
h. perf-fuzzer test
Ref:
[1] https://www.intel.com/content/www/us/en/content-details/843860/intel-architecture-instruction-set-extensions-programming-reference.html
[2] https://lore.kernel.org/all/20250121152303.3128733-1-kan.liang@linux.intel.com/
Dapeng Mi (19):
perf/x86/intel: Add PMU support for Clearwater Forest
perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
perf/x86/intel: Decouple BTS initialization from PEBS initialization
perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs
perf/x86/intel: Initialize architectural PEBS
perf/x86/intel/ds: Factor out common PEBS processing code to functions
perf/x86/intel: Process arch-PEBS records or record fragments
perf/x86/intel: Factor out common functions to process PEBS groups
perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
perf/x86/intel: Setup PEBS constraints base on counter & pdist map
perf/x86/intel: Setup PEBS data configuration and enable legacy groups
perf/x86/intel: Add SSP register support for arch-PEBS
perf/x86/intel: Add counter group support for arch-PEBS
perf/core: Support to capture higher width vector registers
perf/x86/intel: Support arch-PEBS vector registers group capturing
perf tools: Support to show SSP register
perf tools: Support to capture more vector registers (common part)
perf tools: Support to capture more vector registers (x86/Intel part)
perf tools/tests: Add vector registers PEBS sampling test
Kan Liang (1):
perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
arch/arm/kernel/perf_regs.c | 6 +
arch/arm64/kernel/perf_regs.c | 6 +
arch/csky/kernel/perf_regs.c | 5 +
arch/loongarch/kernel/perf_regs.c | 5 +
arch/mips/kernel/perf_regs.c | 5 +
arch/powerpc/perf/perf_regs.c | 5 +
arch/riscv/kernel/perf_regs.c | 5 +
arch/s390/kernel/perf_regs.c | 5 +
arch/x86/events/core.c | 94 ++-
arch/x86/events/intel/bts.c | 6 +-
arch/x86/events/intel/core.c | 292 +++++++-
arch/x86/events/intel/ds.c | 697 ++++++++++++++----
arch/x86/events/perf_event.h | 62 +-
arch/x86/include/asm/intel_ds.h | 10 +-
arch/x86/include/asm/msr-index.h | 28 +
arch/x86/include/asm/perf_event.h | 147 +++-
arch/x86/include/uapi/asm/perf_regs.h | 86 ++-
arch/x86/kernel/perf_regs.c | 55 +-
include/linux/perf_event.h | 2 +
include/linux/perf_regs.h | 10 +
include/uapi/linux/perf_event.h | 10 +
kernel/events/core.c | 53 +-
tools/arch/x86/include/uapi/asm/perf_regs.h | 87 ++-
tools/include/uapi/linux/perf_event.h | 13 +
tools/perf/arch/arm/util/perf_regs.c | 5 +-
tools/perf/arch/arm64/util/perf_regs.c | 5 +-
tools/perf/arch/csky/util/perf_regs.c | 5 +-
tools/perf/arch/loongarch/util/perf_regs.c | 5 +-
tools/perf/arch/mips/util/perf_regs.c | 5 +-
tools/perf/arch/powerpc/util/perf_regs.c | 9 +-
tools/perf/arch/riscv/util/perf_regs.c | 5 +-
tools/perf/arch/s390/util/perf_regs.c | 5 +-
tools/perf/arch/x86/util/perf_regs.c | 108 ++-
tools/perf/builtin-script.c | 19 +-
tools/perf/tests/shell/record.sh | 55 ++
tools/perf/util/evsel.c | 14 +-
tools/perf/util/intel-pt.c | 2 +-
tools/perf/util/parse-regs-options.c | 23 +-
.../perf/util/perf-regs-arch/perf_regs_x86.c | 90 +++
tools/perf/util/perf_regs.c | 5 -
tools/perf/util/perf_regs.h | 18 +-
tools/perf/util/record.h | 2 +-
tools/perf/util/sample.h | 6 +-
tools/perf/util/session.c | 31 +-
tools/perf/util/synthetic-events.c | 7 +-
45 files changed, 1851 insertions(+), 267 deletions(-)
--
2.40.1
* [PATCH 01/20] perf/x86/intel: Add PMU support for Clearwater Forest
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-27 16:26 ` Peter Zijlstra
2025-01-23 14:07 ` [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF Dapeng Mi
` (18 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
From the PMU's perspective, Clearwater Forest is similar to the previous
generation Sierra Forest.
The key differences are the arch-PEBS feature and the 3 newly added fixed
counters for the topdown L1 metrics events.
Arch-PEBS support is added by the following patches. This patch provides
support for the basic perfmon features and the 3 newly added fixed counters.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index b140c1473a9d..5e8521a54474 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2220,6 +2220,18 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};
+EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01");
+EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02");
+EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02");
+
+static struct attribute *skt_events_attrs[] = {
+ EVENT_PTR(td_fe_bound_skt),
+ EVENT_PTR(td_retiring_skt),
+ EVENT_PTR(td_bad_spec_cmt),
+ EVENT_PTR(td_be_bound_skt),
+ NULL,
+};
+
#define KNL_OT_L2_HITE BIT_ULL(19) /* Other Tile L2 Hit */
#define KNL_OT_L2_HITF BIT_ULL(20) /* Other Tile L2 Hit */
#define KNL_MCDRAM_LOCAL BIT_ULL(21)
@@ -6801,6 +6813,18 @@ __init int intel_pmu_init(void)
name = "crestmont";
break;
+ case INTEL_ATOM_DARKMONT_X:
+ intel_pmu_init_skt(NULL);
+ intel_pmu_pebs_data_source_cmt();
+ x86_pmu.pebs_latency_data = cmt_latency_data;
+ x86_pmu.get_event_constraints = cmt_get_event_constraints;
+ td_attr = skt_events_attrs;
+ mem_attr = grt_mem_attrs;
+ extra_attr = cmt_format_attr;
+ pr_cont("Darkmont events, ");
+ name = "darkmont";
+ break;
+
case INTEL_WESTMERE:
case INTEL_WESTMERE_EP:
case INTEL_WESTMERE_EX:
--
2.40.1
* [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
2025-01-23 14:07 ` [PATCH 01/20] perf/x86/intel: Add PMU support " Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-27 16:29 ` Peter Zijlstra
2025-01-23 14:07 ` [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs Dapeng Mi
` (17 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, stable
From: Kan Liang <kan.liang@linux.intel.com>
The EAX of the CPUID Leaf 023H enumerates the mask of valid sub-leaves.
To tell the availability of sub-leaf 1 (which enumerates the counter mask),
perf should check bit 1 (0x2) of EAX, rather than bit 0 (0x1).
The error is not user-visible on bare metal, because sub-leaf 0 and
sub-leaf 1 are always available. However, it may cause issues in a
virtualization environment when a VMM only enumerates sub-leaf 0.
Fixes: eb467aaac21e ("perf/x86/intel: Support Architectural PerfMon Extension leaf")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
arch/x86/events/intel/core.c | 4 ++--
arch/x86/include/asm/perf_event.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5e8521a54474..12eb96219740 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4966,8 +4966,8 @@ static void update_pmu_cap(struct x86_hybrid_pmu *pmu)
if (ebx & ARCH_PERFMON_EXT_EQ)
pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;
- if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF_BIT) {
- cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
+ if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF) {
+ cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF_BIT,
&eax, &ebx, &ecx, &edx);
pmu->cntr_mask64 = eax;
pmu->fixed_cntr_mask64 = ebx;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index adaeb8ca3a8a..71e2ae021374 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -197,7 +197,7 @@ union cpuid10_edx {
#define ARCH_PERFMON_EXT_UMASK2 0x1
#define ARCH_PERFMON_EXT_EQ 0x2
#define ARCH_PERFMON_NUM_COUNTER_LEAF_BIT 0x1
-#define ARCH_PERFMON_NUM_COUNTER_LEAF 0x1
+#define ARCH_PERFMON_NUM_COUNTER_LEAF BIT(ARCH_PERFMON_NUM_COUNTER_LEAF_BIT)
/*
* Intel Architectural LBR CPUID detection/enumeration details:
--
2.40.1
* [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
2025-01-23 14:07 ` [PATCH 01/20] perf/x86/intel: Add PMU support " Dapeng Mi
2025-01-23 14:07 ` [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 18:58 ` Andi Kleen
2025-01-23 14:07 ` [PATCH 04/20] perf/x86/intel: Decouple BTS initialization from PEBS initialization Dapeng Mi
` (16 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
The CPUID archPerfmonExt (0x23) leaves can enumerate the CPU-level
PMU capabilities on non-hybrid processors as well.
This patch parses the archPerfmonExt leaves on non-hybrid
processors. Architectural PEBS leverages archPerfmonExt sub-leaves 0x4
and 0x5 to enumerate its PEBS capabilities as well. This patch is a
precursor to the subsequent arch-PEBS enabling patches.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 12eb96219740..d29e7ada96aa 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4955,27 +4955,27 @@ static inline bool intel_pmu_broken_perf_cap(void)
return false;
}
-static void update_pmu_cap(struct x86_hybrid_pmu *pmu)
+static void update_pmu_cap(struct pmu *pmu)
{
unsigned int sub_bitmaps, eax, ebx, ecx, edx;
cpuid(ARCH_PERFMON_EXT_LEAF, &sub_bitmaps, &ebx, &ecx, &edx);
if (ebx & ARCH_PERFMON_EXT_UMASK2)
- pmu->config_mask |= ARCH_PERFMON_EVENTSEL_UMASK2;
+ hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
if (ebx & ARCH_PERFMON_EXT_EQ)
- pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;
+ hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF) {
cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF_BIT,
&eax, &ebx, &ecx, &edx);
- pmu->cntr_mask64 = eax;
- pmu->fixed_cntr_mask64 = ebx;
+ hybrid(pmu, cntr_mask64) = eax;
+ hybrid(pmu, fixed_cntr_mask64) = ebx;
}
if (!intel_pmu_broken_perf_cap()) {
/* Perf Metric (Bit 15) and PEBS via PT (Bit 16) are hybrid enumeration */
- rdmsrl(MSR_IA32_PERF_CAPABILITIES, pmu->intel_cap.capabilities);
+ rdmsrl(MSR_IA32_PERF_CAPABILITIES, hybrid(pmu, intel_cap).capabilities);
}
}
@@ -5066,7 +5066,7 @@ static bool init_hybrid_pmu(int cpu)
goto end;
if (this_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
- update_pmu_cap(pmu);
+ update_pmu_cap(&pmu->pmu);
intel_pmu_check_hybrid_pmus(pmu);
@@ -6564,6 +6564,7 @@ __init int intel_pmu_init(void)
x86_pmu.pebs_events_mask = intel_pmu_pebs_mask(x86_pmu.cntr_mask64);
x86_pmu.pebs_capable = PEBS_COUNTER_MASK;
+ x86_pmu.config_mask = X86_RAW_EVENT_MASK;
/*
* Quirk: v2 perfmon does not report fixed-purpose events, so
@@ -7374,6 +7375,18 @@ __init int intel_pmu_init(void)
x86_pmu.attr_update = hybrid_attr_update;
}
+ /*
+ * The archPerfmonExt (0x23) includes an enhanced enumeration of
+ * PMU architectural features with a per-core view. For non-hybrid,
+ * each core has the same PMU capabilities. It's good enough to
+ * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
+ * is used to keep the common capabilities. Still keep the values
+ * from the leaf 0xa. The core specific update will be done later
+ * when a new type is online.
+ */
+ if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
+ update_pmu_cap(NULL);
+
intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
&x86_pmu.fixed_cntr_mask64,
&x86_pmu.intel_ctrl);
--
2.40.1
* [PATCH 04/20] perf/x86/intel: Decouple BTS initialization from PEBS initialization
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (2 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 05/20] perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs Dapeng Mi
` (15 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Move the x86_pmu.bts flag initialization from intel_ds_init() into
bts_init() and rename intel_ds_init() to intel_pebs_init(), since it now
only initializes PEBS after the x86_pmu.bts initialization is removed.
It's safe to move the x86_pmu.bts initialization into bts_init() since all
x86_pmu.bts flag checks happen after bts_init() has executed.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/bts.c | 6 +++++-
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 5 ++---
arch/x86/events/perf_event.h | 2 +-
4 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 8f78b0c900ef..a205d1fb37b1 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -584,7 +584,11 @@ static void bts_event_read(struct perf_event *event)
static __init int bts_init(void)
{
- if (!boot_cpu_has(X86_FEATURE_DTES64) || !x86_pmu.bts)
+ if (!boot_cpu_has(X86_FEATURE_DTES64))
+ return -ENODEV;
+
+ x86_pmu.bts = boot_cpu_has(X86_FEATURE_BTS);
+ if (!x86_pmu.bts)
return -ENODEV;
if (boot_cpu_has(X86_FEATURE_PTI)) {
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d29e7ada96aa..91afba51038f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6593,7 +6593,7 @@ __init int intel_pmu_init(void)
if (boot_cpu_has(X86_FEATURE_ARCH_LBR))
intel_pmu_arch_lbr_init();
- intel_ds_init();
+ intel_pebs_init();
x86_add_quirk(intel_arch_events_quirk); /* Install first, so it runs last */
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 13a78a8a2780..86fa6d8c45cf 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2650,10 +2650,10 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
}
/*
- * BTS, PEBS probe and setup
+ * PEBS probe and setup
*/
-void __init intel_ds_init(void)
+void __init intel_pebs_init(void)
{
/*
* No support for 32bit formats
@@ -2661,7 +2661,6 @@ void __init intel_ds_init(void)
if (!boot_cpu_has(X86_FEATURE_DTES64))
return;
- x86_pmu.bts = boot_cpu_has(X86_FEATURE_BTS);
x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
if (x86_pmu.version <= 4)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index a698e6484b3b..e15c2d0dbb27 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1661,7 +1661,7 @@ void intel_pmu_drain_pebs_buffer(void);
void intel_pmu_store_pebs_lbrs(struct lbr_entry *lbr);
-void intel_ds_init(void);
+void intel_pebs_init(void);
void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
struct cpu_hw_events *cpuc,
--
2.40.1
* [PATCH 05/20] perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (3 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 04/20] perf/x86/intel: Decouple BTS initialization from PEBS initialization Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
` (14 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Since architectural PEBS will be introduced in subsequent patches,
rename x86_pmu.pebs to x86_pmu.ds_pebs to distinguish it from the
upcoming architectural PEBS.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 6 +++---
arch/x86/events/intel/ds.c | 20 ++++++++++----------
arch/x86/events/perf_event.h | 2 +-
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 91afba51038f..0063afa0ddac 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4268,7 +4268,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
.guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
};
- if (!x86_pmu.pebs)
+ if (!x86_pmu.ds_pebs)
return arr;
/*
@@ -5447,7 +5447,7 @@ static __init void intel_clovertown_quirk(void)
* these chips.
*/
pr_warn("PEBS disabled due to CPU errata\n");
- x86_pmu.pebs = 0;
+ x86_pmu.ds_pebs = 0;
x86_pmu.pebs_constraints = NULL;
}
@@ -5945,7 +5945,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
static umode_t
pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
- return x86_pmu.pebs ? attr->mode : 0;
+ return x86_pmu.ds_pebs ? attr->mode : 0;
}
static umode_t
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 86fa6d8c45cf..e8a06c8486af 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -624,7 +624,7 @@ static int alloc_pebs_buffer(int cpu)
int max, node = cpu_to_node(cpu);
void *buffer, *insn_buff, *cea;
- if (!x86_pmu.pebs)
+ if (!x86_pmu.ds_pebs)
return 0;
buffer = dsalloc_pages(bsiz, GFP_KERNEL, cpu);
@@ -659,7 +659,7 @@ static void release_pebs_buffer(int cpu)
struct cpu_hw_events *hwev = per_cpu_ptr(&cpu_hw_events, cpu);
void *cea;
- if (!x86_pmu.pebs)
+ if (!x86_pmu.ds_pebs)
return;
kfree(per_cpu(insn_buffer, cpu));
@@ -734,7 +734,7 @@ void release_ds_buffers(void)
{
int cpu;
- if (!x86_pmu.bts && !x86_pmu.pebs)
+ if (!x86_pmu.bts && !x86_pmu.ds_pebs)
return;
for_each_possible_cpu(cpu)
@@ -763,13 +763,13 @@ void reserve_ds_buffers(void)
x86_pmu.bts_active = 0;
x86_pmu.pebs_active = 0;
- if (!x86_pmu.bts && !x86_pmu.pebs)
+ if (!x86_pmu.bts && !x86_pmu.ds_pebs)
return;
if (!x86_pmu.bts)
bts_err = 1;
- if (!x86_pmu.pebs)
+ if (!x86_pmu.ds_pebs)
pebs_err = 1;
for_each_possible_cpu(cpu) {
@@ -805,7 +805,7 @@ void reserve_ds_buffers(void)
if (x86_pmu.bts && !bts_err)
x86_pmu.bts_active = 1;
- if (x86_pmu.pebs && !pebs_err)
+ if (x86_pmu.ds_pebs && !pebs_err)
x86_pmu.pebs_active = 1;
for_each_possible_cpu(cpu) {
@@ -2661,12 +2661,12 @@ void __init intel_pebs_init(void)
if (!boot_cpu_has(X86_FEATURE_DTES64))
return;
- x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
+ x86_pmu.ds_pebs = boot_cpu_has(X86_FEATURE_PEBS);
x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
if (x86_pmu.version <= 4)
x86_pmu.pebs_no_isolation = 1;
- if (x86_pmu.pebs) {
+ if (x86_pmu.ds_pebs) {
char pebs_type = x86_pmu.intel_cap.pebs_trap ? '+' : '-';
char *pebs_qual = "";
int format = x86_pmu.intel_cap.pebs_format;
@@ -2750,7 +2750,7 @@ void __init intel_pebs_init(void)
default:
pr_cont("no PEBS fmt%d%c, ", format, pebs_type);
- x86_pmu.pebs = 0;
+ x86_pmu.ds_pebs = 0;
}
}
}
@@ -2759,7 +2759,7 @@ void perf_restore_debug_store(void)
{
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
- if (!x86_pmu.bts && !x86_pmu.pebs)
+ if (!x86_pmu.bts && !x86_pmu.ds_pebs)
return;
wrmsrl(MSR_IA32_DS_AREA, (unsigned long)ds);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index e15c2d0dbb27..d5b7f5605e1e 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -888,7 +888,7 @@ struct x86_pmu {
*/
unsigned int bts :1,
bts_active :1,
- pebs :1,
+ ds_pebs :1,
pebs_active :1,
pebs_broken :1,
pebs_prec_dist :1,
--
2.40.1
* [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (4 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 05/20] perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-28 11:22 ` Peter Zijlstra
2025-01-23 14:07 ` [PATCH 07/20] perf/x86/intel/ds: Factor out common PEBS processing code to functions Dapeng Mi
` (13 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
arch-PEBS leverages the CPUID.23H.4/5 sub-leaves to enumerate the supported
arch-PEBS capabilities and counter bitmaps. This patch parses these 2
sub-leaves and initializes the arch-PEBS capabilities and corresponding
structures.
Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist
for arch-PEBS, avoid accessing these MSRs as well when arch-PEBS is
supported.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/core.c | 21 +++++++++++++-----
arch/x86/events/intel/core.c | 20 ++++++++++++++++-
arch/x86/events/intel/ds.c | 36 ++++++++++++++++++++++++++-----
arch/x86/events/perf_event.h | 25 ++++++++++++++++++---
arch/x86/include/asm/perf_event.h | 7 ++++++
5 files changed, 95 insertions(+), 14 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7b6430e5a77b..c36cc606bd19 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -549,14 +549,22 @@ static inline int precise_br_compat(struct perf_event *event)
return m == b;
}
-int x86_pmu_max_precise(void)
+int x86_pmu_max_precise(struct pmu *pmu)
{
int precise = 0;
- /* Support for constant skid */
if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
- precise++;
+ /* arch PEBS */
+ if (x86_pmu.arch_pebs) {
+ precise = 2;
+ if (hybrid(pmu, arch_pebs_cap).pdists)
+ precise++;
+
+ return precise;
+ }
+ /* legacy PEBS - support for constant skid */
+ precise++;
/* Support for IP fixup */
if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
precise++;
@@ -564,13 +572,14 @@ int x86_pmu_max_precise(void)
if (x86_pmu.pebs_prec_dist)
precise++;
}
+
return precise;
}
int x86_pmu_hw_config(struct perf_event *event)
{
if (event->attr.precise_ip) {
- int precise = x86_pmu_max_precise();
+ int precise = x86_pmu_max_precise(event->pmu);
if (event->attr.precise_ip > precise)
return -EOPNOTSUPP;
@@ -2615,7 +2624,9 @@ static ssize_t max_precise_show(struct device *cdev,
struct device_attribute *attr,
char *buf)
{
- return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise());
+ struct pmu *pmu = dev_get_drvdata(cdev);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise(pmu));
}
static DEVICE_ATTR_RO(max_precise);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 0063afa0ddac..dc49dcf9b705 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4973,6 +4973,21 @@ static void update_pmu_cap(struct pmu *pmu)
hybrid(pmu, fixed_cntr_mask64) = ebx;
}
+ /* Bits[5:4] should be set simultaneously if arch-PEBS is supported */
+ if ((sub_bitmaps & ARCH_PERFMON_PEBS_LEAVES) == ARCH_PERFMON_PEBS_LEAVES) {
+ cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_CAP_LEAF_BIT,
+ &eax, &ebx, &ecx, &edx);
+ hybrid(pmu, arch_pebs_cap).caps = (u64)ebx << 32;
+
+ cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_COUNTER_LEAF_BIT,
+ &eax, &ebx, &ecx, &edx);
+ hybrid(pmu, arch_pebs_cap).counters = ((u64)ecx << 32) | eax;
+ hybrid(pmu, arch_pebs_cap).pdists = ((u64)edx << 32) | ebx;
+ } else {
+ WARN_ON(x86_pmu.arch_pebs == 1);
+ x86_pmu.arch_pebs = 0;
+ }
+
if (!intel_pmu_broken_perf_cap()) {
/* Perf Metric (Bit 15) and PEBS via PT (Bit 16) are hybrid enumeration */
rdmsrl(MSR_IA32_PERF_CAPABILITIES, hybrid(pmu, intel_cap).capabilities);
@@ -5945,7 +5960,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
static umode_t
pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
- return x86_pmu.ds_pebs ? attr->mode : 0;
+ return intel_pmu_has_pebs() ? attr->mode : 0;
}
static umode_t
@@ -7387,6 +7402,9 @@ __init int intel_pmu_init(void)
if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
update_pmu_cap(NULL);
+ if (x86_pmu.arch_pebs)
+ pr_cont("Architectural PEBS, ");
+
intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
&x86_pmu.fixed_cntr_mask64,
&x86_pmu.intel_ctrl);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index e8a06c8486af..1b33a6a60584 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1537,6 +1537,9 @@ void intel_pmu_pebs_enable(struct perf_event *event)
cpuc->pebs_enabled |= 1ULL << hwc->idx;
+ if (x86_pmu.arch_pebs)
+ return;
+
if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
@@ -1606,6 +1609,11 @@ void intel_pmu_pebs_disable(struct perf_event *event)
cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
+ hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
+
+ if (x86_pmu.arch_pebs)
+ return;
+
if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
(x86_pmu.version < 5))
cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
@@ -1616,15 +1624,13 @@ void intel_pmu_pebs_disable(struct perf_event *event)
if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
-
- hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
}
void intel_pmu_pebs_enable_all(void)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
- if (cpuc->pebs_enabled)
+ if (!x86_pmu.arch_pebs && cpuc->pebs_enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
}
@@ -1632,7 +1638,7 @@ void intel_pmu_pebs_disable_all(void)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
- if (cpuc->pebs_enabled)
+ if (!x86_pmu.arch_pebs && cpuc->pebs_enabled)
__intel_pmu_pebs_disable_all();
}
@@ -2649,11 +2655,23 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
}
}
+static void __init intel_arch_pebs_init(void)
+{
+ /*
+ * Current hybrid platforms always both support arch-PEBS or not
+ * on all kinds of cores. So directly set x86_pmu.arch_pebs flag
+ * if boot cpu supports arch-PEBS.
+ */
+ x86_pmu.arch_pebs = 1;
+ x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+ x86_pmu.pebs_capable = ~0ULL;
+}
+
/*
* PEBS probe and setup
*/
-void __init intel_pebs_init(void)
+static void __init intel_ds_pebs_init(void)
{
/*
* No support for 32bit formats
@@ -2755,6 +2773,14 @@ void __init intel_pebs_init(void)
}
}
+void __init intel_pebs_init(void)
+{
+ if (x86_pmu.intel_cap.pebs_format == 0xf)
+ intel_arch_pebs_init();
+ else
+ intel_ds_pebs_init();
+}
+
void perf_restore_debug_store(void)
{
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index d5b7f5605e1e..85cb36ad5520 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -707,6 +707,12 @@ enum atom_native_id {
skt_native_id = 0x3, /* Skymont */
};
+struct arch_pebs_cap {
+ u64 caps;
+ u64 counters;
+ u64 pdists;
+};
+
struct x86_hybrid_pmu {
struct pmu pmu;
const char *name;
@@ -742,6 +748,8 @@ struct x86_hybrid_pmu {
mid_ack :1,
enabled_ack :1;
+ struct arch_pebs_cap arch_pebs_cap;
+
u64 pebs_data_source[PERF_PEBS_DATA_SOURCE_MAX];
};
@@ -884,7 +892,7 @@ struct x86_pmu {
union perf_capabilities intel_cap;
/*
- * Intel DebugStore bits
+ * Intel DebugStore and PEBS bits
*/
unsigned int bts :1,
bts_active :1,
@@ -895,7 +903,8 @@ struct x86_pmu {
pebs_no_tlb :1,
pebs_no_isolation :1,
pebs_block :1,
- pebs_ept :1;
+ pebs_ept :1,
+ arch_pebs :1;
int pebs_record_size;
int pebs_buffer_size;
u64 pebs_events_mask;
@@ -907,6 +916,11 @@ struct x86_pmu {
u64 rtm_abort_event;
u64 pebs_capable;
+ /*
+ * Intel Architectural PEBS
+ */
+ struct arch_pebs_cap arch_pebs_cap;
+
/*
* Intel LBR
*/
@@ -1196,7 +1210,7 @@ int x86_reserve_hardware(void);
void x86_release_hardware(void);
-int x86_pmu_max_precise(void);
+int x86_pmu_max_precise(struct pmu *pmu);
void hw_perf_lbr_event_destroy(struct perf_event *event);
@@ -1766,6 +1780,11 @@ static inline int intel_pmu_max_num_pebs(struct pmu *pmu)
return fls((u32)hybrid(pmu, pebs_events_mask));
}
+static inline bool intel_pmu_has_pebs(void)
+{
+ return x86_pmu.ds_pebs || x86_pmu.arch_pebs;
+}
+
#else /* CONFIG_CPU_SUP_INTEL */
static inline void reserve_ds_buffers(void)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 71e2ae021374..00ffb9933aba 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -198,6 +198,13 @@ union cpuid10_edx {
#define ARCH_PERFMON_EXT_EQ 0x2
#define ARCH_PERFMON_NUM_COUNTER_LEAF_BIT 0x1
#define ARCH_PERFMON_NUM_COUNTER_LEAF BIT(ARCH_PERFMON_NUM_COUNTER_LEAF_BIT)
+#define ARCH_PERFMON_PEBS_CAP_LEAF_BIT 0x4
+#define ARCH_PERFMON_PEBS_CAP_LEAF BIT(ARCH_PERFMON_PEBS_CAP_LEAF_BIT)
+#define ARCH_PERFMON_PEBS_COUNTER_LEAF_BIT 0x5
+#define ARCH_PERFMON_PEBS_COUNTER_LEAF BIT(ARCH_PERFMON_PEBS_COUNTER_LEAF_BIT)
+
+#define ARCH_PERFMON_PEBS_LEAVES (ARCH_PERFMON_PEBS_CAP_LEAF | \
+ ARCH_PERFMON_PEBS_COUNTER_LEAF)
/*
* Intel Architectural LBR CPUID detection/enumeration details:
--
2.40.1
* [PATCH 07/20] perf/x86/intel/ds: Factor out common PEBS processing code to functions
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (5 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 08/20] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
` (12 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Besides some PEBS record layout differences, arch-PEBS can share most of
the PEBS record processing code with adaptive PEBS. Thus, factor out the
common processing code into independent inline functions, so they can be
reused by the subsequent arch-PEBS handler.
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/ds.c | 80 ++++++++++++++++++++++++++------------
1 file changed, 55 insertions(+), 25 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 1b33a6a60584..be190cb03ef8 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2587,6 +2587,54 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
}
}
+static inline void __intel_pmu_handle_pebs_record(struct pt_regs *iregs,
+ struct pt_regs *regs,
+ struct perf_sample_data *data,
+ void *at, u64 pebs_status,
+ short *counts, void **last,
+ setup_fn setup_sample)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ struct perf_event *event;
+ int bit;
+
+ for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
+ event = cpuc->events[bit];
+
+ if (WARN_ON_ONCE(!event) ||
+ WARN_ON_ONCE(!event->attr.precise_ip))
+ continue;
+
+ if (counts[bit]++)
+ __intel_pmu_pebs_event(event, iregs, regs, data,
+ last[bit], setup_sample);
+
+ last[bit] = at;
+ }
+}
+
+static inline void
+__intel_pmu_handle_last_pebs_record(struct pt_regs *iregs, struct pt_regs *regs,
+ struct perf_sample_data *data, u64 mask,
+ short *counts, void **last,
+ setup_fn setup_sample)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ struct perf_event *event;
+ int bit;
+
+ for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
+ if (!counts[bit])
+ continue;
+
+ event = cpuc->events[bit];
+
+ __intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
+ counts[bit], setup_sample);
+ }
+
+}
+
static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_data *data)
{
short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
@@ -2596,9 +2644,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
struct x86_perf_regs perf_regs;
struct pt_regs *regs = &perf_regs.regs;
struct pebs_basic *basic;
- struct perf_event *event;
void *base, *at, *top;
- int bit;
u64 mask;
if (!x86_pmu.pebs_active)
@@ -2611,6 +2657,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
mask = hybrid(cpuc->pmu, pebs_events_mask) |
(hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED);
+ mask &= cpuc->pebs_enabled;
if (unlikely(base >= top)) {
intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
@@ -2628,31 +2675,14 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
if (basic->format_size != cpuc->pebs_record_size)
continue;
- pebs_status = basic->applicable_counters & cpuc->pebs_enabled & mask;
- for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
- event = cpuc->events[bit];
-
- if (WARN_ON_ONCE(!event) ||
- WARN_ON_ONCE(!event->attr.precise_ip))
- continue;
-
- if (counts[bit]++) {
- __intel_pmu_pebs_event(event, iregs, regs, data, last[bit],
- setup_pebs_adaptive_sample_data);
- }
- last[bit] = at;
- }
+ pebs_status = mask & basic->applicable_counters;
+ __intel_pmu_handle_pebs_record(iregs, regs, data, at,
+ pebs_status, counts, last,
+ setup_pebs_adaptive_sample_data);
}
- for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
- if (!counts[bit])
- continue;
-
- event = cpuc->events[bit];
-
- __intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
- counts[bit], setup_pebs_adaptive_sample_data);
- }
+ __intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, counts, last,
+ setup_pebs_adaptive_sample_data);
}
static void __init intel_arch_pebs_init(void)
--
2.40.1
* [PATCH 08/20] perf/x86/intel: Process arch-PEBS records or record fragments
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (6 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 07/20] perf/x86/intel/ds: Factor out common PEBS processing code to functions Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 09/20] perf/x86/intel: Factor out common functions to process PEBS groups Dapeng Mi
` (11 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
A significant difference from adaptive PEBS is that arch-PEBS supports
record fragments, which means an arch-PEBS record can be split into
several independent fragments, each with its own arch-PEBS header.
This patch defines the architectural PEBS record layout structures and adds
helpers to process arch-PEBS records or fragments. Only the legacy PEBS
groups, such as the basic, GPR, XMM and LBR groups, are supported in this
patch; capturing the newly added YMM/ZMM/OPMASK vector registers will be
supported in subsequent patches.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 9 ++
arch/x86/events/intel/ds.c | 219 ++++++++++++++++++++++++++++++
arch/x86/include/asm/msr-index.h | 6 +
arch/x86/include/asm/perf_event.h | 100 ++++++++++++++
4 files changed, 334 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index dc49dcf9b705..d73d899d6b02 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3114,6 +3114,15 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
}
+ /*
+ * Arch PEBS sets bit 54 in the global status register
+ */
+ if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
+ (unsigned long *)&status)) {
+ handled++;
+ x86_pmu.drain_pebs(regs, &data);
+ }
+
/*
* Intel PT
*/
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index be190cb03ef8..680637d63679 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2222,6 +2222,153 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
format_group);
}
+static inline bool arch_pebs_record_continued(struct arch_pebs_header *header)
+{
+ /* Continue bit or null PEBS record indicates fragment follows. */
+ return header->cont || !(header->format & GENMASK_ULL(63, 16));
+}
+
+static void setup_arch_pebs_sample_data(struct perf_event *event,
+ struct pt_regs *iregs, void *__pebs,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ struct arch_pebs_header *header = NULL;
+ struct arch_pebs_aux *meminfo = NULL;
+ struct arch_pebs_gprs *gprs = NULL;
+ struct x86_perf_regs *perf_regs;
+ void *next_record;
+ void *at = __pebs;
+ u64 sample_type;
+
+ if (at == NULL)
+ return;
+
+ perf_regs = container_of(regs, struct x86_perf_regs, regs);
+ perf_regs->xmm_regs = NULL;
+
+ sample_type = event->attr.sample_type;
+ perf_sample_data_init(data, 0, event->hw.last_period);
+ data->period = event->hw.last_period;
+
+ /*
+ * We must however always use iregs for the unwinder to stay sane; the
+ * record BP,SP,IP can point into thin air when the record is from a
+ * previous PMI context or an (I)RET happened between the record and
+ * PMI.
+ */
+ if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, iregs);
+
+ *regs = *iregs;
+
+again:
+ header = at;
+ next_record = at + sizeof(struct arch_pebs_header);
+ if (header->basic) {
+ struct arch_pebs_basic *basic = next_record;
+
+ /* The ip in basic is EventingIP */
+ set_linear_ip(regs, basic->ip);
+ regs->flags = PERF_EFLAGS_EXACT;
+ setup_pebs_time(event, data, basic->tsc);
+
+ if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+ data->weight.var3_w = basic->valid ? basic->retire : 0;
+
+ next_record = basic + 1;
+ }
+
+ /*
+ * The record for MEMINFO is in front of GP
+ * But PERF_SAMPLE_TRANSACTION needs gprs->ax.
+ * Save the pointer here but process later.
+ */
+ if (header->aux) {
+ meminfo = next_record;
+ next_record = meminfo + 1;
+ }
+
+ if (header->gpr) {
+ gprs = next_record;
+ next_record = gprs + 1;
+
+ if (event->attr.precise_ip < 2) {
+ set_linear_ip(regs, gprs->ip);
+ regs->flags &= ~PERF_EFLAGS_EXACT;
+ }
+
+ if (sample_type & PERF_SAMPLE_REGS_INTR)
+ adaptive_pebs_save_regs(regs, (struct pebs_gprs *)gprs);
+ }
+
+ if (header->aux) {
+ if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
+ u16 latency = meminfo->cache_latency;
+ u64 tsx_latency = intel_get_tsx_weight(meminfo->tsx_tuning);
+
+ data->weight.var2_w = meminfo->instr_latency;
+
+ if (sample_type & PERF_SAMPLE_WEIGHT)
+ data->weight.full = latency ?: tsx_latency;
+ else
+ data->weight.var1_dw = latency ?: (u32)tsx_latency;
+ data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+ }
+
+ if (sample_type & PERF_SAMPLE_DATA_SRC) {
+ data->data_src.val = get_data_src(event, meminfo->aux);
+ data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+ }
+
+ if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
+ data->addr = meminfo->address;
+ data->sample_flags |= PERF_SAMPLE_ADDR;
+ }
+
+ if (sample_type & PERF_SAMPLE_TRANSACTION) {
+ data->txn = intel_get_tsx_transaction(meminfo->tsx_tuning,
+ gprs ? gprs->ax : 0);
+ data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+ }
+ }
+
+ if (header->xmm) {
+ struct arch_pebs_xmm *xmm;
+
+ next_record += sizeof(struct arch_pebs_xer_header);
+
+ xmm = next_record;
+ perf_regs->xmm_regs = xmm->xmm;
+ next_record = xmm + 1;
+ }
+
+ if (header->lbr) {
+ struct arch_pebs_lbr_header *lbr_header = next_record;
+ struct lbr_entry *lbr;
+ int num_lbr;
+
+ next_record = lbr_header + 1;
+ lbr = next_record;
+
+ num_lbr = header->lbr == ARCH_PEBS_LBR_NUM_VAR ? lbr_header->depth :
+ header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES;
+ next_record += num_lbr * sizeof(struct lbr_entry);
+
+ if (has_branch_stack(event)) {
+ intel_pmu_store_pebs_lbrs(lbr);
+ intel_pmu_lbr_save_brstack(data, cpuc, event);
+ }
+ }
+
+ /* Parse followed fragments if there are. */
+ if (arch_pebs_record_continued(header)) {
+ at = at + header->size;
+ goto again;
+ }
+}
+
static inline void *
get_next_pebs_record_by_bit(void *base, void *top, int bit)
{
@@ -2685,6 +2832,77 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
setup_pebs_adaptive_sample_data);
}
+static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
+ struct perf_sample_data *data)
+{
+ short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
+ void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS];
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ union arch_pebs_index index;
+ struct x86_perf_regs perf_regs;
+ struct pt_regs *regs = &perf_regs.regs;
+ void *base, *at, *top;
+ u64 mask;
+
+ rdmsrl(MSR_IA32_PEBS_INDEX, index.full);
+
+ if (unlikely(!index.split.wr)) {
+ intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
+ return;
+ }
+
+ base = cpuc->ds_pebs_vaddr;
+ top = (void *)((u64)cpuc->ds_pebs_vaddr +
+ (index.split.wr << ARCH_PEBS_INDEX_WR_SHIFT));
+
+ mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
+
+ if (!iregs)
+ iregs = &dummy_iregs;
+
+ /* Process all but the last event for each counter. */
+ for (at = base; at < top;) {
+ struct arch_pebs_header *header;
+ struct arch_pebs_basic *basic;
+ u64 pebs_status;
+
+ header = at;
+
+ if (WARN_ON_ONCE(!header->size))
+ break;
+
+ /* 1st fragment or single record must have basic group */
+ if (!header->basic) {
+ at += header->size;
+ continue;
+ }
+
+ basic = at + sizeof(struct arch_pebs_header);
+ pebs_status = mask & basic->applicable_counters;
+ __intel_pmu_handle_pebs_record(iregs, regs, data, at,
+ pebs_status, counts, last,
+ setup_arch_pebs_sample_data);
+
+ /* Skip non-last fragments */
+ while (arch_pebs_record_continued(header)) {
+ if (!header->size)
+ break;
+ at += header->size;
+ header = at;
+ }
+
+ /* Skip last fragment or the single record */
+ at += header->size;
+ }
+
+ __intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, counts,
+ last, setup_arch_pebs_sample_data);
+
+ index.split.wr = 0;
+ index.split.full = 0;
+ wrmsrl(MSR_IA32_PEBS_INDEX, index.full);
+}
+
static void __init intel_arch_pebs_init(void)
{
/*
@@ -2694,6 +2912,7 @@ static void __init intel_arch_pebs_init(void)
*/
x86_pmu.arch_pebs = 1;
x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+ x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
x86_pmu.pebs_capable = ~0ULL;
}
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..59d3a050985e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -312,6 +312,12 @@
#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+/* Arch PEBS */
+#define MSR_IA32_PEBS_BASE 0x000003f4
+#define MSR_IA32_PEBS_INDEX 0x000003f5
+#define ARCH_PEBS_OFFSET_MASK 0x7fffff
+#define ARCH_PEBS_INDEX_WR_SHIFT 4
+
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
#define RTIT_CTL_CYCLEACC BIT(1)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 00ffb9933aba..d0a3a13b8dae 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -412,6 +412,8 @@ static inline bool is_topdown_idx(int idx)
#define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT)
#define GLOBAL_STATUS_TRACE_TOPAPMI_BIT 55
#define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
+#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT 54
+#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT)
#define GLOBAL_STATUS_PERF_METRICS_OVF_BIT 48
#define GLOBAL_CTRL_EN_PERF_METRICS 48
@@ -473,6 +475,104 @@ struct pebs_xmm {
u64 xmm[16*2]; /* two entries for each register */
};
+/*
+ * Arch PEBS
+ */
+union arch_pebs_index {
+ struct {
+ u64 rsvd:4,
+ wr:23,
+ rsvd2:4,
+ full:1,
+ en:1,
+ rsvd3:3,
+ thresh:23,
+ rsvd4:5;
+ } split;
+ u64 full;
+};
+
+struct arch_pebs_header {
+ union {
+ u64 format;
+ struct {
+ u64 size:16, /* Record size */
+ rsvd:14,
+ mode:1, /* 64BIT_MODE */
+ cont:1,
+ rsvd2:3,
+ cntr:5,
+ lbr:2,
+ rsvd3:7,
+ xmm:1,
+ ymmh:1,
+ rsvd4:2,
+ opmask:1,
+ zmmh:1,
+ h16zmm:1,
+ rsvd5:5,
+ gpr:1,
+ aux:1,
+ basic:1;
+ };
+ };
+ u64 rsvd6;
+};
+
+struct arch_pebs_basic {
+ u64 ip;
+ u64 applicable_counters;
+ u64 tsc;
+ u64 retire :16, /* Retire Latency */
+ valid :1,
+ rsvd :47;
+ u64 rsvd2;
+ u64 rsvd3;
+};
+
+struct arch_pebs_aux {
+ u64 address;
+ u64 rsvd;
+ u64 rsvd2;
+ u64 rsvd3;
+ u64 rsvd4;
+ u64 aux;
+ u64 instr_latency :16,
+ pad2 :16,
+ cache_latency :16,
+ pad3 :16;
+ u64 tsx_tuning;
+};
+
+struct arch_pebs_gprs {
+ u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
+ u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp;
+ u64 rsvd;
+};
+
+struct arch_pebs_xer_header {
+ u64 xstate;
+ u64 rsvd;
+};
+
+struct arch_pebs_xmm {
+ u64 xmm[16*2]; /* two entries for each register */
+};
+
+#define ARCH_PEBS_LBR_NAN 0x0
+#define ARCH_PEBS_LBR_NUM_8 0x1
+#define ARCH_PEBS_LBR_NUM_16 0x2
+#define ARCH_PEBS_LBR_NUM_VAR 0x3
+#define ARCH_PEBS_BASE_LBR_ENTRIES 8
+struct arch_pebs_lbr_header {
+ u64 rsvd;
+ u64 ctl;
+ u64 depth;
+ u64 ler_from;
+ u64 ler_to;
+ u64 ler_info;
+};
+
/*
* AMD Extended Performance Monitoring and Debug cpuid feature detection
*/
--
2.40.1
* [PATCH 09/20] perf/x86/intel: Factor out common functions to process PEBS groups
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (7 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 08/20] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 10/20] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
` (10 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Adaptive PEBS and arch-PEBS share a lot of the same code for processing
the PEBS groups, such as the basic, GPR and meminfo groups. Extract this
shared code into common functions to avoid duplication.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/ds.c | 239 ++++++++++++++++++-------------------
1 file changed, 119 insertions(+), 120 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 680637d63679..dce2b6ee8bd1 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2061,6 +2061,91 @@ static inline void __setup_pebs_counter_group(struct cpu_hw_events *cpuc,
#define PEBS_LATENCY_MASK 0xffff
+static inline void __setup_perf_sample_data(struct perf_event *event,
+ struct pt_regs *iregs,
+ struct perf_sample_data *data)
+{
+ perf_sample_data_init(data, 0, event->hw.last_period);
+ data->period = event->hw.last_period;
+
+ /*
+ * We must however always use iregs for the unwinder to stay sane; the
+ * record BP,SP,IP can point into thin air when the record is from a
+ * previous PMI context or an (I)RET happened between the record and
+ * PMI.
+ */
+ perf_sample_save_callchain(data, event, iregs);
+}
+
+static inline void __setup_pebs_basic_group(struct perf_event *event,
+ struct pt_regs *regs,
+ struct perf_sample_data *data,
+ u64 sample_type, u64 ip,
+ u64 tsc, u16 retire)
+{
+ /* The ip in basic is EventingIP */
+ set_linear_ip(regs, ip);
+ regs->flags = PERF_EFLAGS_EXACT;
+ setup_pebs_time(event, data, tsc);
+
+ if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+ data->weight.var3_w = retire;
+}
+
+static inline void __setup_pebs_gpr_group(struct perf_event *event,
+ struct pt_regs *regs,
+ struct pebs_gprs *gprs,
+ u64 sample_type)
+{
+ if (event->attr.precise_ip < 2) {
+ set_linear_ip(regs, gprs->ip);
+ regs->flags &= ~PERF_EFLAGS_EXACT;
+ }
+
+ if (sample_type & PERF_SAMPLE_REGS_INTR)
+ adaptive_pebs_save_regs(regs, gprs);
+}
+
+static inline void __setup_pebs_meminfo_group(struct perf_event *event,
+ struct perf_sample_data *data,
+ u64 sample_type, u64 latency,
+ u16 instr_latency, u64 address,
+ u64 aux, u64 tsx_tuning, u64 ax)
+{
+ if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
+ u64 tsx_latency = intel_get_tsx_weight(tsx_tuning);
+
+ data->weight.var2_w = instr_latency;
+
+ /*
+ * Although meminfo::latency is defined as a u64,
+ * only the lower 32 bits include the valid data
+ * in practice on Ice Lake and earlier platforms.
+ */
+ if (sample_type & PERF_SAMPLE_WEIGHT)
+ data->weight.full = latency ?: tsx_latency;
+ else
+ data->weight.var1_dw = (u32)latency ?: tsx_latency;
+
+ data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+ }
+
+ if (sample_type & PERF_SAMPLE_DATA_SRC) {
+ data->data_src.val = get_data_src(event, aux);
+ data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+ }
+
+ if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
+ data->addr = address;
+ data->sample_flags |= PERF_SAMPLE_ADDR;
+ }
+
+ if (sample_type & PERF_SAMPLE_TRANSACTION) {
+ data->txn = intel_get_tsx_transaction(tsx_tuning, ax);
+ data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+ }
+}
+
/*
* With adaptive PEBS the layout depends on what fields are configured.
*/
@@ -2070,12 +2155,14 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
struct pt_regs *regs)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ u64 sample_type = event->attr.sample_type;
struct pebs_basic *basic = __pebs;
void *next_record = basic + 1;
- u64 sample_type, format_group;
struct pebs_meminfo *meminfo = NULL;
struct pebs_gprs *gprs = NULL;
struct x86_perf_regs *perf_regs;
+ u64 format_group;
+ u16 retire;
if (basic == NULL)
return;
@@ -2083,32 +2170,17 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
perf_regs = container_of(regs, struct x86_perf_regs, regs);
perf_regs->xmm_regs = NULL;
- sample_type = event->attr.sample_type;
format_group = basic->format_group;
- perf_sample_data_init(data, 0, event->hw.last_period);
- data->period = event->hw.last_period;
- setup_pebs_time(event, data, basic->tsc);
-
- /*
- * We must however always use iregs for the unwinder to stay sane; the
- * record BP,SP,IP can point into thin air when the record is from a
- * previous PMI context or an (I)RET happened between the record and
- * PMI.
- */
- perf_sample_save_callchain(data, event, iregs);
+ __setup_perf_sample_data(event, iregs, data);
*regs = *iregs;
- /* The ip in basic is EventingIP */
- set_linear_ip(regs, basic->ip);
- regs->flags = PERF_EFLAGS_EXACT;
- if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
- if (x86_pmu.flags & PMU_FL_RETIRE_LATENCY)
- data->weight.var3_w = basic->retire_latency;
- else
- data->weight.var3_w = 0;
- }
+ /* basic group */
+ retire = x86_pmu.flags & PMU_FL_RETIRE_LATENCY ?
+ basic->retire_latency : 0;
+ __setup_pebs_basic_group(event, regs, data, sample_type,
+ basic->ip, basic->tsc, retire);
/*
* The record for MEMINFO is in front of GP
@@ -2124,54 +2196,20 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
gprs = next_record;
next_record = gprs + 1;
- if (event->attr.precise_ip < 2) {
- set_linear_ip(regs, gprs->ip);
- regs->flags &= ~PERF_EFLAGS_EXACT;
- }
-
- if (sample_type & PERF_SAMPLE_REGS_INTR)
- adaptive_pebs_save_regs(regs, gprs);
+ __setup_pebs_gpr_group(event, regs, gprs, sample_type);
}
if (format_group & PEBS_DATACFG_MEMINFO) {
- if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
- u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
- meminfo->cache_latency : meminfo->mem_latency;
-
- if (x86_pmu.flags & PMU_FL_INSTR_LATENCY)
- data->weight.var2_w = meminfo->instr_latency;
-
- /*
- * Although meminfo::latency is defined as a u64,
- * only the lower 32 bits include the valid data
- * in practice on Ice Lake and earlier platforms.
- */
- if (sample_type & PERF_SAMPLE_WEIGHT) {
- data->weight.full = latency ?:
- intel_get_tsx_weight(meminfo->tsx_tuning);
- } else {
- data->weight.var1_dw = (u32)latency ?:
- intel_get_tsx_weight(meminfo->tsx_tuning);
- }
-
- data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
- }
-
- if (sample_type & PERF_SAMPLE_DATA_SRC) {
- data->data_src.val = get_data_src(event, meminfo->aux);
- data->sample_flags |= PERF_SAMPLE_DATA_SRC;
- }
+ u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+ meminfo->cache_latency : meminfo->mem_latency;
+ u64 instr_latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+ meminfo->instr_latency : 0;
+ u64 ax = gprs ? gprs->ax : 0;
- if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
- data->addr = meminfo->address;
- data->sample_flags |= PERF_SAMPLE_ADDR;
- }
-
- if (sample_type & PERF_SAMPLE_TRANSACTION) {
- data->txn = intel_get_tsx_transaction(meminfo->tsx_tuning,
- gprs ? gprs->ax : 0);
- data->sample_flags |= PERF_SAMPLE_TRANSACTION;
- }
+ __setup_pebs_meminfo_group(event, data, sample_type, latency,
+ instr_latency, meminfo->address,
+ meminfo->aux, meminfo->tsx_tuning,
+ ax);
}
if (format_group & PEBS_DATACFG_XMMS) {
@@ -2234,13 +2272,13 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
struct pt_regs *regs)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ u64 sample_type = event->attr.sample_type;
struct arch_pebs_header *header = NULL;
struct arch_pebs_aux *meminfo = NULL;
struct arch_pebs_gprs *gprs = NULL;
struct x86_perf_regs *perf_regs;
void *next_record;
void *at = __pebs;
- u64 sample_type;
if (at == NULL)
return;
@@ -2248,18 +2286,7 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
perf_regs = container_of(regs, struct x86_perf_regs, regs);
perf_regs->xmm_regs = NULL;
- sample_type = event->attr.sample_type;
- perf_sample_data_init(data, 0, event->hw.last_period);
- data->period = event->hw.last_period;
-
- /*
- * We must however always use iregs for the unwinder to stay sane; the
- * record BP,SP,IP can point into thin air when the record is from a
- * previous PMI context or an (I)RET happened between the record and
- * PMI.
- */
- if (sample_type & PERF_SAMPLE_CALLCHAIN)
- perf_sample_save_callchain(data, event, iregs);
+ __setup_perf_sample_data(event, iregs, data);
*regs = *iregs;
@@ -2268,16 +2295,14 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
next_record = at + sizeof(struct arch_pebs_header);
if (header->basic) {
struct arch_pebs_basic *basic = next_record;
+ u16 retire = 0;
- /* The ip in basic is EventingIP */
- set_linear_ip(regs, basic->ip);
- regs->flags = PERF_EFLAGS_EXACT;
- setup_pebs_time(event, data, basic->tsc);
+ next_record = basic + 1;
if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
- data->weight.var3_w = basic->valid ? basic->retire : 0;
-
- next_record = basic + 1;
+ retire = basic->valid ? basic->retire : 0;
+ __setup_pebs_basic_group(event, regs, data, sample_type,
+ basic->ip, basic->tsc, retire);
}
/*
@@ -2294,44 +2319,18 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
gprs = next_record;
next_record = gprs + 1;
- if (event->attr.precise_ip < 2) {
- set_linear_ip(regs, gprs->ip);
- regs->flags &= ~PERF_EFLAGS_EXACT;
- }
-
- if (sample_type & PERF_SAMPLE_REGS_INTR)
- adaptive_pebs_save_regs(regs, (struct pebs_gprs *)gprs);
+ __setup_pebs_gpr_group(event, regs, (struct pebs_gprs *)gprs,
+ sample_type);
}
if (header->aux) {
- if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
- u16 latency = meminfo->cache_latency;
- u64 tsx_latency = intel_get_tsx_weight(meminfo->tsx_tuning);
+ u64 ax = gprs ? gprs->ax : 0;
- data->weight.var2_w = meminfo->instr_latency;
-
- if (sample_type & PERF_SAMPLE_WEIGHT)
- data->weight.full = latency ?: tsx_latency;
- else
- data->weight.var1_dw = latency ?: (u32)tsx_latency;
- data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
- }
-
- if (sample_type & PERF_SAMPLE_DATA_SRC) {
- data->data_src.val = get_data_src(event, meminfo->aux);
- data->sample_flags |= PERF_SAMPLE_DATA_SRC;
- }
-
- if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
- data->addr = meminfo->address;
- data->sample_flags |= PERF_SAMPLE_ADDR;
- }
-
- if (sample_type & PERF_SAMPLE_TRANSACTION) {
- data->txn = intel_get_tsx_transaction(meminfo->tsx_tuning,
- gprs ? gprs->ax : 0);
- data->sample_flags |= PERF_SAMPLE_TRANSACTION;
- }
+ __setup_pebs_meminfo_group(event, data, sample_type,
+ meminfo->cache_latency,
+ meminfo->instr_latency,
+ meminfo->address, meminfo->aux,
+ meminfo->tsx_tuning, ax);
}
if (header->xmm) {
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 10/20] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (8 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 09/20] perf/x86/intel: Factor out common functions to process PEBS groups Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map Dapeng Mi
` (9 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Arch-PEBS introduces a new MSR, IA32_PEBS_BASE, to store the physical
address of the arch-PEBS buffer. This patch allocates the arch-PEBS
buffer and then initializes the IA32_PEBS_BASE MSR with the buffer's
physical address.
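For illustration only (not part of the patch), a minimal standalone
sketch of how the IA32_PEBS_BASE value is composed, assuming a
4KB-aligned contiguous buffer; the actual code below uses
virt_to_phys() on the per-CPU buffer together with PEBS_BUFFER_SHIFT:
#include <stdint.h>
#define PEBS_BUFFER_SHIFT	4	/* buffer size = 4KB << 4 = 64KB */
static uint64_t arch_pebs_base_val(uint64_t buf_phys_addr)
{
	/* 4KB-aligned physical address, size exponent in the low bits */
	return buf_phys_addr | PEBS_BUFFER_SHIFT;
}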
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/core.c | 4 +-
arch/x86/events/intel/core.c | 4 +-
arch/x86/events/intel/ds.c | 112 ++++++++++++++++++++------------
arch/x86/events/perf_event.h | 16 ++---
arch/x86/include/asm/intel_ds.h | 3 +-
5 files changed, 84 insertions(+), 55 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c36cc606bd19..f40b03adb5c7 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -411,7 +411,7 @@ int x86_reserve_hardware(void)
if (!reserve_pmc_hardware()) {
err = -EBUSY;
} else {
- reserve_ds_buffers();
+ reserve_bts_pebs_buffers();
reserve_lbr_buffers();
}
}
@@ -427,7 +427,7 @@ void x86_release_hardware(void)
{
if (atomic_dec_and_mutex_lock(&pmc_refcount, &pmc_reserve_mutex)) {
release_pmc_hardware();
- release_ds_buffers();
+ release_bts_pebs_buffers();
release_lbr_buffers();
mutex_unlock(&pmc_reserve_mutex);
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d73d899d6b02..7775e1e1c1e9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5122,7 +5122,7 @@ static void intel_pmu_cpu_starting(int cpu)
if (is_hybrid() && !init_hybrid_pmu(cpu))
return;
- init_debug_store_on_cpu(cpu);
+ init_pebs_buf_on_cpu(cpu);
/*
* Deal with CPUs that don't clear their LBRs on power-up.
*/
@@ -5216,7 +5216,7 @@ static void free_excl_cntrs(struct cpu_hw_events *cpuc)
static void intel_pmu_cpu_dying(int cpu)
{
- fini_debug_store_on_cpu(cpu);
+ fini_pebs_buf_on_cpu(cpu);
}
void intel_cpuc_finish(struct cpu_hw_events *cpuc)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index dce2b6ee8bd1..2f2c6b7c801b 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -545,26 +545,6 @@ struct pebs_record_skl {
u64 tsc;
};
-void init_debug_store_on_cpu(int cpu)
-{
- struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
-
- if (!ds)
- return;
-
- wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA,
- (u32)((u64)(unsigned long)ds),
- (u32)((u64)(unsigned long)ds >> 32));
-}
-
-void fini_debug_store_on_cpu(int cpu)
-{
- if (!per_cpu(cpu_hw_events, cpu).ds)
- return;
-
- wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA, 0, 0);
-}
-
static DEFINE_PER_CPU(void *, insn_buffer);
static void ds_update_cea(void *cea, void *addr, size_t size, pgprot_t prot)
@@ -624,13 +604,18 @@ static int alloc_pebs_buffer(int cpu)
int max, node = cpu_to_node(cpu);
void *buffer, *insn_buff, *cea;
- if (!x86_pmu.ds_pebs)
+ if (!intel_pmu_has_pebs())
return 0;
- buffer = dsalloc_pages(bsiz, GFP_KERNEL, cpu);
+ buffer = dsalloc_pages(bsiz, preemptible() ? GFP_KERNEL : GFP_ATOMIC, cpu);
if (unlikely(!buffer))
return -ENOMEM;
+ if (x86_pmu.arch_pebs) {
+ hwev->pebs_vaddr = buffer;
+ return 0;
+ }
+
/*
* HSW+ already provides us the eventing ip; no need to allocate this
* buffer then.
@@ -643,7 +628,7 @@ static int alloc_pebs_buffer(int cpu)
}
per_cpu(insn_buffer, cpu) = insn_buff;
}
- hwev->ds_pebs_vaddr = buffer;
+ hwev->pebs_vaddr = buffer;
/* Update the cpu entry area mapping */
cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
ds->pebs_buffer_base = (unsigned long) cea;
@@ -659,17 +644,20 @@ static void release_pebs_buffer(int cpu)
struct cpu_hw_events *hwev = per_cpu_ptr(&cpu_hw_events, cpu);
void *cea;
- if (!x86_pmu.ds_pebs)
+ if (!intel_pmu_has_pebs())
return;
- kfree(per_cpu(insn_buffer, cpu));
- per_cpu(insn_buffer, cpu) = NULL;
+ if (x86_pmu.ds_pebs) {
+ kfree(per_cpu(insn_buffer, cpu));
+ per_cpu(insn_buffer, cpu) = NULL;
- /* Clear the fixmap */
- cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
- ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
- dsfree_pages(hwev->ds_pebs_vaddr, x86_pmu.pebs_buffer_size);
- hwev->ds_pebs_vaddr = NULL;
+ /* Clear the fixmap */
+ cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
+ ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
+ }
+
+ dsfree_pages(hwev->pebs_vaddr, x86_pmu.pebs_buffer_size);
+ hwev->pebs_vaddr = NULL;
}
static int alloc_bts_buffer(int cpu)
@@ -730,11 +718,11 @@ static void release_ds_buffer(int cpu)
per_cpu(cpu_hw_events, cpu).ds = NULL;
}
-void release_ds_buffers(void)
+void release_bts_pebs_buffers(void)
{
int cpu;
- if (!x86_pmu.bts && !x86_pmu.ds_pebs)
+ if (!x86_pmu.bts && !intel_pmu_has_pebs())
return;
for_each_possible_cpu(cpu)
@@ -746,7 +734,7 @@ void release_ds_buffers(void)
* observe cpu_hw_events.ds and not program the DS_AREA when
* they come up.
*/
- fini_debug_store_on_cpu(cpu);
+ fini_pebs_buf_on_cpu(cpu);
}
for_each_possible_cpu(cpu) {
@@ -755,7 +743,7 @@ void release_ds_buffers(void)
}
}
-void reserve_ds_buffers(void)
+void reserve_bts_pebs_buffers(void)
{
int bts_err = 0, pebs_err = 0;
int cpu;
@@ -763,19 +751,20 @@ void reserve_ds_buffers(void)
x86_pmu.bts_active = 0;
x86_pmu.pebs_active = 0;
- if (!x86_pmu.bts && !x86_pmu.ds_pebs)
+ if (!x86_pmu.bts && !intel_pmu_has_pebs())
return;
if (!x86_pmu.bts)
bts_err = 1;
- if (!x86_pmu.ds_pebs)
+ if (!intel_pmu_has_pebs())
pebs_err = 1;
for_each_possible_cpu(cpu) {
if (alloc_ds_buffer(cpu)) {
bts_err = 1;
- pebs_err = 1;
+ if (x86_pmu.ds_pebs)
+ pebs_err = 1;
}
if (!bts_err && alloc_bts_buffer(cpu))
@@ -805,7 +794,7 @@ void reserve_ds_buffers(void)
if (x86_pmu.bts && !bts_err)
x86_pmu.bts_active = 1;
- if (x86_pmu.ds_pebs && !pebs_err)
+ if (intel_pmu_has_pebs() && !pebs_err)
x86_pmu.pebs_active = 1;
for_each_possible_cpu(cpu) {
@@ -813,11 +802,50 @@ void reserve_ds_buffers(void)
* Ignores wrmsr_on_cpu() errors for offline CPUs they
* will get this call through intel_pmu_cpu_starting().
*/
- init_debug_store_on_cpu(cpu);
+ init_pebs_buf_on_cpu(cpu);
}
}
}
+void init_pebs_buf_on_cpu(int cpu)
+{
+ struct cpu_hw_events *cpuc = per_cpu_ptr(&cpu_hw_events, cpu);
+
+ if (x86_pmu.arch_pebs) {
+ u64 arch_pebs_base;
+
+ if (!cpuc->pebs_vaddr)
+ return;
+
+ /*
+ * 4KB-aligned pointer of the output buffer
+ * (__alloc_pages_node() return page aligned address)
+ * Buffer Size = 4KB * 2^SIZE
+ * contiguous physical buffer (__alloc_pages_node() with order)
+ */
+ arch_pebs_base = virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT;
+
+ wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE,
+ (u32)arch_pebs_base,
+ (u32)(arch_pebs_base >> 32));
+ } else if (cpuc->ds) {
+ /* legacy PEBS */
+ wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA,
+ (u32)((u64)(unsigned long)cpuc->ds),
+ (u32)((u64)(unsigned long)cpuc->ds >> 32));
+ }
+}
+
+void fini_pebs_buf_on_cpu(int cpu)
+{
+ struct cpu_hw_events *cpuc = per_cpu_ptr(&cpu_hw_events, cpu);
+
+ if (x86_pmu.arch_pebs)
+ wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0, 0);
+ else if (cpuc->ds)
+ wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA, 0, 0);
+}
+
/*
* BTS
*/
@@ -2850,8 +2878,8 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
return;
}
- base = cpuc->ds_pebs_vaddr;
- top = (void *)((u64)cpuc->ds_pebs_vaddr +
+ base = cpuc->pebs_vaddr;
+ top = (void *)((u64)cpuc->pebs_vaddr +
(index.split.wr << ARCH_PEBS_INDEX_WR_SHIFT));
mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 85cb36ad5520..a3c4374fe7f3 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -266,11 +266,11 @@ struct cpu_hw_events {
int is_fake;
/*
- * Intel DebugStore bits
+ * Intel DebugStore/PEBS bits
*/
struct debug_store *ds;
- void *ds_pebs_vaddr;
void *ds_bts_vaddr;
+ void *pebs_vaddr;
u64 pebs_enabled;
int n_pebs;
int n_large_pebs;
@@ -1594,13 +1594,13 @@ extern void intel_cpuc_finish(struct cpu_hw_events *cpuc);
int intel_pmu_init(void);
-void init_debug_store_on_cpu(int cpu);
+void init_pebs_buf_on_cpu(int cpu);
-void fini_debug_store_on_cpu(int cpu);
+void fini_pebs_buf_on_cpu(int cpu);
-void release_ds_buffers(void);
+void release_bts_pebs_buffers(void);
-void reserve_ds_buffers(void);
+void reserve_bts_pebs_buffers(void);
void release_lbr_buffers(void);
@@ -1787,11 +1787,11 @@ static inline bool intel_pmu_has_pebs(void)
#else /* CONFIG_CPU_SUP_INTEL */
-static inline void reserve_ds_buffers(void)
+static inline void reserve_bts_pebs_buffers(void)
{
}
-static inline void release_ds_buffers(void)
+static inline void release_bts_pebs_buffers(void)
{
}
diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
index 5dbeac48a5b9..023c2883f9f3 100644
--- a/arch/x86/include/asm/intel_ds.h
+++ b/arch/x86/include/asm/intel_ds.h
@@ -4,7 +4,8 @@
#include <linux/percpu-defs.h>
#define BTS_BUFFER_SIZE (PAGE_SIZE << 4)
-#define PEBS_BUFFER_SIZE (PAGE_SIZE << 4)
+#define PEBS_BUFFER_SHIFT 4
+#define PEBS_BUFFER_SIZE (PAGE_SIZE << PEBS_BUFFER_SHIFT)
/* The maximal number of PEBS events: */
#define MAX_PEBS_EVENTS_FMT4 8
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (9 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 10/20] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-27 16:07 ` Liang, Kan
2025-01-23 14:07 ` [PATCH 12/20] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
` (8 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Arch-PEBS provides CPUID sub-leaves to enumerate which counters support
PEBS sampling and precise-distribution (pdist) PEBS sampling. Thus the
PEBS constraints can be configured dynamically based on these counter
and precise-distribution bitmaps instead of being defined statically.
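For illustration only (not part of the patch), a standalone sketch of
the constraint narrowing done below in intel_get_event_constraints(),
assuming the counter/pdist bitmaps enumerated by CPUID.23H:
#include <stdint.h>
static uint64_t constrain_pebs_counters(uint64_t constraint_mask,
					uint64_t pebs_counters,
					uint64_t pdist_counters,
					int precise_ip)
{
	/* precise_ip >= 3 requires precise-distribution capable counters */
	uint64_t allowed = (precise_ip >= 3) ? pdist_counters : pebs_counters;

	return constraint_mask & allowed;
}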
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
arch/x86/events/intel/ds.c | 1 +
2 files changed, 21 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 7775e1e1c1e9..0f1be36113fa 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3728,6 +3728,7 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
struct perf_event *event)
{
struct event_constraint *c1, *c2;
+ struct pmu *pmu = event->pmu;
c1 = cpuc->event_constraint[idx];
@@ -3754,6 +3755,25 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
c2->weight = hweight64(c2->idxmsk64);
}
+ if (x86_pmu.arch_pebs && event->attr.precise_ip) {
+ u64 pebs_cntrs_mask;
+ u64 cntrs_mask;
+
+ if (event->attr.precise_ip >= 3)
+ pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).pdists;
+ else
+ pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).counters;
+
+ cntrs_mask = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
+ hybrid(pmu, cntr_mask64);
+
+ if (pebs_cntrs_mask != cntrs_mask) {
+ c2 = dyn_constraint(cpuc, c2, idx);
+ c2->idxmsk64 &= pebs_cntrs_mask;
+ c2->weight = hweight64(c2->idxmsk64);
+ }
+ }
+
return c2;
}
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 2f2c6b7c801b..a573ce0e576a 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2941,6 +2941,7 @@ static void __init intel_arch_pebs_init(void)
x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
x86_pmu.pebs_capable = ~0ULL;
+ x86_pmu.flags |= PMU_FL_PEBS_ALL;
}
/*
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 12/20] perf/x86/intel: Setup PEBS data configuration and enable legacy groups
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (10 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS Dapeng Mi
` (7 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Unlike legacy PEBS, arch-PEBS provides per-counter PEBS data
configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs.
This patch derives the PEBS data configuration from the event
attributes, writes it to the corresponding IA32_PMC_GPx/FXx_CFG_C MSR
and enables the corresponding PEBS groups.
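For illustration only (not part of the patch), a standalone sketch of
how the per-counter CFG_C value is derived from the event's PEBS data
configuration and the CPUID-enumerated capabilities, mirroring
intel_pmu_enable_event_ext() below (bit values taken from the
msr-index.h additions in this patch):
#include <stdint.h>
#define PEBS_DATACFG_MEMINFO	(1ULL << 0)
#define PEBS_DATACFG_GP		(1ULL << 1)
#define ARCH_PEBS_RELOAD	0xffffffffULL
#define ARCH_PEBS_GPR		(1ULL << 61)
#define ARCH_PEBS_AUX		(1ULL << 62)
#define ARCH_PEBS_EN		(1ULL << 63)
static uint64_t cfg_c_value(uint64_t pebs_data_cfg, uint64_t caps,
			    uint64_t sample_period)
{
	/* enable bit plus the reload value (-sample_period) in the low 32 bits */
	uint64_t ext = ARCH_PEBS_EN | ((0 - sample_period) & ARCH_PEBS_RELOAD);

	if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
		ext |= ARCH_PEBS_AUX & caps;
	if (pebs_data_cfg & PEBS_DATACFG_GP)
		ext |= ARCH_PEBS_GPR & caps;

	return ext;
}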
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 127 +++++++++++++++++++++++++++++++
arch/x86/events/intel/ds.c | 17 +++++
arch/x86/events/perf_event.h | 15 ++++
arch/x86/include/asm/intel_ds.h | 7 ++
arch/x86/include/asm/msr-index.h | 10 +++
5 files changed, 176 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 0f1be36113fa..cb88ae60de8e 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2558,6 +2558,39 @@ static void intel_pmu_disable_fixed(struct perf_event *event)
cpuc->fixed_ctrl_val &= ~mask;
}
+static inline void __intel_pmu_update_event_ext(int idx, u64 ext)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ u32 msr = idx < INTEL_PMC_IDX_FIXED ?
+ x86_pmu_cfg_c_addr(idx, true) :
+ x86_pmu_cfg_c_addr(idx - INTEL_PMC_IDX_FIXED, false);
+
+ cpuc->cfg_c_val[idx] = ext;
+ wrmsrl(msr, ext);
+}
+
+static void intel_pmu_disable_event_ext(struct perf_event *event)
+{
+ if (!x86_pmu.arch_pebs)
+ return;
+
+ /*
+ * Only clear CFG_C MSR for PEBS counter group events,
+ * it avoids the HW counter's value to be added into
+ * other PEBS records incorrectly after PEBS counter
+ * group events are disabled.
+ *
+ * For other events, it's unnecessary to clear CFG_C MSRs
+ * since CFG_C doesn't take effect if counter is in
+ * disabled state. That helps to reduce the WRMSR overhead
+ * in context switches.
+ */
+ if (!is_pebs_counter_event_group(event))
+ return;
+
+ __intel_pmu_update_event_ext(event->hw.idx, 0);
+}
+
static void intel_pmu_disable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -2566,9 +2599,12 @@ static void intel_pmu_disable_event(struct perf_event *event)
switch (idx) {
case 0 ... INTEL_PMC_IDX_FIXED - 1:
intel_clear_masks(event, idx);
+ intel_pmu_disable_event_ext(event);
x86_pmu_disable_event(event);
break;
case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
+ intel_pmu_disable_event_ext(event);
+ fallthrough;
case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
intel_pmu_disable_fixed(event);
break;
@@ -2888,6 +2924,66 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
cpuc->fixed_ctrl_val |= bits;
}
+static void intel_pmu_enable_event_ext(struct perf_event *event)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ struct hw_perf_event *hwc = &event->hw;
+ union arch_pebs_index cached, index;
+ struct arch_pebs_cap cap;
+ u64 ext = 0;
+
+ if (!x86_pmu.arch_pebs)
+ return;
+
+ cap = hybrid(cpuc->pmu, arch_pebs_cap);
+
+ if (event->attr.precise_ip) {
+ u64 pebs_data_cfg = intel_get_arch_pebs_data_config(event);
+
+ ext |= ARCH_PEBS_EN;
+ ext |= (-hwc->sample_period) & ARCH_PEBS_RELOAD;
+
+ if (pebs_data_cfg && cap.caps) {
+ if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
+ ext |= ARCH_PEBS_AUX & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_GP)
+ ext |= ARCH_PEBS_GPR & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_XMMS)
+ ext |= ARCH_PEBS_VECR_XMM & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_LBRS)
+ ext |= ARCH_PEBS_LBR & cap.caps;
+ }
+
+ if (cpuc->n_pebs == cpuc->n_large_pebs)
+ index.split.thresh = ARCH_PEBS_THRESH_MUL;
+ else
+ index.split.thresh = ARCH_PEBS_THRESH_SINGLE;
+
+ rdmsrl(MSR_IA32_PEBS_INDEX, cached.full);
+ if (index.split.thresh != cached.split.thresh || !cached.split.en) {
+ if (cached.split.thresh == ARCH_PEBS_THRESH_MUL &&
+ cached.split.wr > 0) {
+ /*
+ * Large PEBS was enabled.
+ * Drain PEBS buffer before applying the single PEBS.
+ */
+ intel_pmu_drain_pebs_buffer();
+ } else {
+ index.split.wr = 0;
+ index.split.full = 0;
+ index.split.en = 1;
+ wrmsrl(MSR_IA32_PEBS_INDEX, index.full);
+ }
+ }
+ }
+
+ if (cpuc->cfg_c_val[hwc->idx] != ext)
+ __intel_pmu_update_event_ext(hwc->idx, ext);
+}
+
static void intel_pmu_enable_event(struct perf_event *event)
{
u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
@@ -2902,9 +2998,12 @@ static void intel_pmu_enable_event(struct perf_event *event)
if (branch_sample_counters(event))
enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
intel_set_masks(event, idx);
+ intel_pmu_enable_event_ext(event);
__x86_pmu_enable_event(hwc, enable_mask);
break;
case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
+ intel_pmu_enable_event_ext(event);
+ fallthrough;
case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
intel_pmu_enable_fixed(event);
break;
@@ -4984,6 +5083,29 @@ static inline bool intel_pmu_broken_perf_cap(void)
return false;
}
+static inline void __intel_update_pmu_caps(struct pmu *pmu)
+{
+ struct pmu *dest_pmu = pmu ? pmu : x86_get_pmu(smp_processor_id());
+
+ if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
+ dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
+}
+
+static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
+{
+ u64 caps = hybrid(pmu, arch_pebs_cap).caps;
+
+ x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
+ if (caps & ARCH_PEBS_LBR)
+ x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
+
+ if (!(caps & ARCH_PEBS_AUX))
+ x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
+ if (!(caps & ARCH_PEBS_GPR))
+ x86_pmu.large_pebs_flags &=
+ ~(PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER);
+}
+
static void update_pmu_cap(struct pmu *pmu)
{
unsigned int sub_bitmaps, eax, ebx, ecx, edx;
@@ -5012,6 +5134,9 @@ static void update_pmu_cap(struct pmu *pmu)
&eax, &ebx, &ecx, &edx);
hybrid(pmu, arch_pebs_cap).counters = ((u64)ecx << 32) | eax;
hybrid(pmu, arch_pebs_cap).pdists = ((u64)edx << 32) | ebx;
+
+ __intel_update_pmu_caps(pmu);
+ __intel_update_large_pebs_flags(pmu);
} else {
WARN_ON(x86_pmu.arch_pebs == 1);
x86_pmu.arch_pebs = 0;
@@ -5178,6 +5303,8 @@ static void intel_pmu_cpu_starting(int cpu)
}
}
+ __intel_update_pmu_caps(cpuc->pmu);
+
if (!cpuc->shared_regs)
return;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a573ce0e576a..5d8c5c8d5e24 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1492,6 +1492,18 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
}
}
+u64 intel_get_arch_pebs_data_config(struct perf_event *event)
+{
+ u64 pebs_data_cfg = 0;
+
+ if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
+ return 0;
+
+ pebs_data_cfg |= pebs_update_adaptive_cfg(event);
+
+ return pebs_data_cfg;
+}
+
void intel_pmu_pebs_add(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -2927,6 +2939,11 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
index.split.wr = 0;
index.split.full = 0;
+ index.split.en = 1;
+ if (cpuc->n_pebs == cpuc->n_large_pebs)
+ index.split.thresh = ARCH_PEBS_THRESH_MUL;
+ else
+ index.split.thresh = ARCH_PEBS_THRESH_SINGLE;
wrmsrl(MSR_IA32_PEBS_INDEX, index.full);
}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index a3c4374fe7f3..3acb03a5c214 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -286,6 +286,9 @@ struct cpu_hw_events {
u64 fixed_ctrl_val;
u64 active_fixed_ctrl_val;
+ /* Cached CFG_C values */
+ u64 cfg_c_val[X86_PMC_IDX_MAX];
+
/*
* Intel LBR bits
*/
@@ -1194,6 +1197,14 @@ static inline unsigned int x86_pmu_fixed_ctr_addr(int index)
x86_pmu.addr_offset(index, false) : index);
}
+static inline unsigned int x86_pmu_cfg_c_addr(int index, bool gp)
+{
+ u32 base = gp ? MSR_IA32_PMC_V6_GP0_CFG_C : MSR_IA32_PMC_V6_FX0_CFG_C;
+
+ return base + (x86_pmu.addr_offset ? x86_pmu.addr_offset(index, false) :
+ index * MSR_IA32_PMC_V6_STEP);
+}
+
static inline int x86_pmu_rdpmc_index(int index)
{
return x86_pmu.rdpmc_index ? x86_pmu.rdpmc_index(index) : index;
@@ -1615,6 +1626,8 @@ void intel_pmu_disable_bts(void);
int intel_pmu_drain_bts_buffer(void);
+void intel_pmu_drain_pebs_buffer(void);
+
u64 grt_latency_data(struct perf_event *event, u64 status);
u64 cmt_latency_data(struct perf_event *event, u64 status);
@@ -1748,6 +1761,8 @@ void intel_pmu_pebs_data_source_cmt(void);
void intel_pmu_pebs_data_source_lnl(void);
+u64 intel_get_arch_pebs_data_config(struct perf_event *event);
+
int intel_pmu_setup_lbr_filter(struct perf_event *event);
void intel_pt_interrupt(void);
diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
index 023c2883f9f3..7bb80c993bef 100644
--- a/arch/x86/include/asm/intel_ds.h
+++ b/arch/x86/include/asm/intel_ds.h
@@ -7,6 +7,13 @@
#define PEBS_BUFFER_SHIFT 4
#define PEBS_BUFFER_SIZE (PAGE_SIZE << PEBS_BUFFER_SHIFT)
+/*
+ * The largest PEBS record could consume a page, ensure
+ * a record at least can be written after triggering PMI.
+ */
+#define ARCH_PEBS_THRESH_MUL ((PEBS_BUFFER_SIZE - PAGE_SIZE) >> PEBS_BUFFER_SHIFT)
+#define ARCH_PEBS_THRESH_SINGLE 1
+
/* The maximal number of PEBS events: */
#define MAX_PEBS_EVENTS_FMT4 8
#define MAX_PEBS_EVENTS 32
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 59d3a050985e..a3fad7e910eb 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -318,6 +318,14 @@
#define ARCH_PEBS_OFFSET_MASK 0x7fffff
#define ARCH_PEBS_INDEX_WR_SHIFT 4
+#define ARCH_PEBS_RELOAD 0xffffffff
+#define ARCH_PEBS_LBR_SHIFT 40
+#define ARCH_PEBS_LBR (0x3ull << ARCH_PEBS_LBR_SHIFT)
+#define ARCH_PEBS_VECR_XMM BIT_ULL(49)
+#define ARCH_PEBS_GPR BIT_ULL(61)
+#define ARCH_PEBS_AUX BIT_ULL(62)
+#define ARCH_PEBS_EN BIT_ULL(63)
+
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
#define RTIT_CTL_CYCLEACC BIT(1)
@@ -597,7 +605,9 @@
/* V6 PMON MSR range */
#define MSR_IA32_PMC_V6_GP0_CTR 0x1900
#define MSR_IA32_PMC_V6_GP0_CFG_A 0x1901
+#define MSR_IA32_PMC_V6_GP0_CFG_C 0x1903
#define MSR_IA32_PMC_V6_FX0_CTR 0x1980
+#define MSR_IA32_PMC_V6_FX0_CFG_C 0x1983
#define MSR_IA32_PMC_V6_STEP 4
/* KeyID partitioning between MKTME and TDX */
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (11 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 12/20] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-24 5:16 ` Andi Kleen
2025-01-23 14:07 ` [PATCH 14/20] perf/x86/intel: Add counter group " Dapeng Mi
` (6 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Arch-PEBS supports capturing the SSP (shadow stack pointer) register as
part of the GPR group. This patch adds support for reading and
outputting this register.
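For illustration only (not part of the patch), a user-space sketch of
requesting SSP via PERF_SAMPLE_REGS_INTR; the PERF_REG_X86_SSP define
below stands in for the enum value added to uapi/asm/perf_regs.h by
this patch (PERF_REG_X86_R15 + 1), and per the x86_pmu_hw_config()
change it is only accepted for precise (arch-PEBS) events:
#include <linux/perf_event.h>
#include <string.h>
#define PERF_REG_X86_SSP	24	/* PERF_REG_X86_R15 + 1, added by this patch */
static void setup_ssp_sampling(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100003;
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_REGS_INTR;
	attr->sample_regs_intr = 1ULL << PERF_REG_X86_SSP;
	attr->precise_ip = 2;	/* SSP capture requires a precise event */
}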
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/core.c | 10 ++++++++++
arch/x86/events/intel/ds.c | 3 +++
arch/x86/include/asm/perf_event.h | 1 +
arch/x86/include/uapi/asm/perf_regs.h | 3 ++-
arch/x86/kernel/perf_regs.c | 5 +++++
5 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f40b03adb5c7..7ed80f01f15d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -646,6 +646,16 @@ int x86_pmu_hw_config(struct perf_event *event)
return -EINVAL;
}
+ /* sample_regs_user never support SSP register. */
+ if (unlikely(event->attr.sample_regs_user & BIT_ULL(PERF_REG_X86_SSP)))
+ return -EINVAL;
+
+ if (unlikely(event->attr.sample_regs_intr & BIT_ULL(PERF_REG_X86_SSP))) {
+ /* Only arch-PEBS supports to capture SSP register. */
+ if (!x86_pmu.arch_pebs || !event->attr.precise_ip)
+ return -EINVAL;
+ }
+
/* sample_regs_user never support XMM registers */
if (unlikely(event->attr.sample_regs_user & PERF_REG_EXTENDED_MASK))
return -EINVAL;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 5d8c5c8d5e24..a7e101f6f2d6 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2209,6 +2209,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
perf_regs = container_of(regs, struct x86_perf_regs, regs);
perf_regs->xmm_regs = NULL;
+ perf_regs->ssp = 0;
format_group = basic->format_group;
@@ -2325,6 +2326,7 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
perf_regs = container_of(regs, struct x86_perf_regs, regs);
perf_regs->xmm_regs = NULL;
+ perf_regs->ssp = 0;
__setup_perf_sample_data(event, iregs, data);
@@ -2361,6 +2363,7 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
__setup_pebs_gpr_group(event, regs, (struct pebs_gprs *)gprs,
sample_type);
+ perf_regs->ssp = gprs->ssp;
}
if (header->aux) {
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index d0a3a13b8dae..cca8a0d68cbc 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -671,6 +671,7 @@ extern void perf_events_lapic_init(void);
struct pt_regs;
struct x86_perf_regs {
struct pt_regs regs;
+ u64 ssp;
u64 *xmm_regs;
};
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index 7c9d2bb3833b..2e88fdebd259 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -27,9 +27,10 @@ enum perf_event_x86_regs {
PERF_REG_X86_R13,
PERF_REG_X86_R14,
PERF_REG_X86_R15,
+ PERF_REG_X86_SSP,
/* These are the limits for the GPRs. */
PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
- PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
+ PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
/* These all need two bits set because they are 128bit */
PERF_REG_X86_XMM0 = 32,
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index 624703af80a1..4b15c7488ec1 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -54,6 +54,8 @@ static unsigned int pt_regs_offset[PERF_REG_X86_MAX] = {
PT_REGS_OFFSET(PERF_REG_X86_R13, r13),
PT_REGS_OFFSET(PERF_REG_X86_R14, r14),
PT_REGS_OFFSET(PERF_REG_X86_R15, r15),
+ /* The pt_regs struct does not store Shadow stack pointer. */
+ (unsigned int) -1,
#endif
};
@@ -68,6 +70,9 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
return perf_regs->xmm_regs[idx - PERF_REG_X86_XMM0];
}
+ if (idx == PERF_REG_X86_SSP)
+ return perf_regs->ssp;
+
if (WARN_ON_ONCE(idx >= ARRAY_SIZE(pt_regs_offset)))
return 0;
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 14/20] perf/x86/intel: Add counter group support for arch-PEBS
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (12 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 15/20] perf/core: Support to capture higher width vector registers Dapeng Mi
` (5 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Based on the previous adaptive PEBS counter snapshot support, add
counter group support for architectural PEBS. Since arch-PEBS shares
the same counter group layout with adaptive PEBS, directly reuse the
__setup_pebs_counter_group() helper to process the arch-PEBS counter
group.
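For illustration only (not part of the patch), a standalone sketch of
how many u64 counter values follow an arch_pebs_cntr_header in a
record, mirroring the parsing added to setup_arch_pebs_sample_data()
below:
#include <stdint.h>
struct arch_pebs_cntr_header {
	uint32_t cntr;		/* bitmap of included GP counters */
	uint32_t fixed;		/* bitmap of included fixed counters */
	uint32_t metrics;	/* INTEL_CNTR_METRICS when metrics are included */
	uint32_t reserved;
};
static unsigned int cntr_payload_qwords(const struct arch_pebs_cntr_header *h,
					uint32_t intel_cntr_metrics)
{
	unsigned int nr = __builtin_popcount(h->cntr) +
			  __builtin_popcount(h->fixed);

	if (h->metrics == intel_cntr_metrics)
		nr += 2;	/* two extra values when metrics are included */

	return nr;
}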
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/core.c | 36 +++++++++++++++++++++++++++++--
arch/x86/events/intel/ds.c | 27 ++++++++++++++++++++++-
arch/x86/events/perf_event.h | 2 ++
arch/x86/include/asm/msr-index.h | 6 ++++++
arch/x86/include/asm/perf_event.h | 13 ++++++++---
5 files changed, 78 insertions(+), 6 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cb88ae60de8e..9c5b44a73ca2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2955,6 +2955,17 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
if (pebs_data_cfg & PEBS_DATACFG_LBRS)
ext |= ARCH_PEBS_LBR & cap.caps;
+
+ if (pebs_data_cfg &
+ (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT))
+ ext |= ARCH_PEBS_CNTR_GP & cap.caps;
+
+ if (pebs_data_cfg &
+ (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT))
+ ext |= ARCH_PEBS_CNTR_FIXED & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_METRICS)
+ ext |= ARCH_PEBS_CNTR_METRICS & cap.caps;
}
if (cpuc->n_pebs == cpuc->n_large_pebs)
@@ -2980,6 +2991,9 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
}
}
+ if (is_pebs_counter_event_group(event))
+ ext |= ARCH_PEBS_CNTR_ALLOW;
+
if (cpuc->cfg_c_val[hwc->idx] != ext)
__intel_pmu_update_event_ext(hwc->idx, ext);
}
@@ -4131,6 +4145,20 @@ static inline bool intel_pmu_has_cap(struct perf_event *event, int idx)
return test_bit(idx, (unsigned long *)&intel_cap->capabilities);
}
+static inline bool intel_pmu_has_pebs_counter_group(struct pmu *pmu)
+{
+ u64 caps;
+
+ if (x86_pmu.intel_cap.pebs_format >= 6 && x86_pmu.intel_cap.pebs_baseline)
+ return true;
+
+ caps = hybrid(pmu, arch_pebs_cap).caps;
+ if (x86_pmu.arch_pebs && (caps & ARCH_PEBS_CNTR_MASK))
+ return true;
+
+ return false;
+}
+
static int intel_pmu_hw_config(struct perf_event *event)
{
int ret = x86_pmu_hw_config(event);
@@ -4243,8 +4271,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
}
if ((event->attr.sample_type & PERF_SAMPLE_READ) &&
- (x86_pmu.intel_cap.pebs_format >= 6) &&
- x86_pmu.intel_cap.pebs_baseline &&
+ intel_pmu_has_pebs_counter_group(event->pmu) &&
is_sampling_event(event) &&
event->attr.precise_ip)
event->group_leader->hw.flags |= PERF_X86_EVENT_PEBS_CNTR;
@@ -5089,6 +5116,9 @@ static inline void __intel_update_pmu_caps(struct pmu *pmu)
if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
+
+ if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_CNTR_MASK)
+ x86_pmu.late_setup = intel_pmu_late_setup;
}
static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
@@ -5098,6 +5128,8 @@ static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
if (caps & ARCH_PEBS_LBR)
x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
+ if (caps & ARCH_PEBS_CNTR_MASK)
+ x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
if (!(caps & ARCH_PEBS_AUX))
x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a7e101f6f2d6..32a44e3571cb 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1383,7 +1383,7 @@ static void __intel_pmu_pebs_update_cfg(struct perf_event *event,
}
-static void intel_pmu_late_setup(void)
+void intel_pmu_late_setup(void)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct perf_event *event;
@@ -1494,13 +1494,20 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
u64 intel_get_arch_pebs_data_config(struct perf_event *event)
{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
u64 pebs_data_cfg = 0;
+ u64 cntr_mask;
if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
return 0;
pebs_data_cfg |= pebs_update_adaptive_cfg(event);
+ cntr_mask = (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT) |
+ (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT) |
+ PEBS_DATACFG_CNTR | PEBS_DATACFG_METRICS;
+ pebs_data_cfg |= cpuc->pebs_data_cfg & cntr_mask;
+
return pebs_data_cfg;
}
@@ -2404,6 +2411,24 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
}
}
+ if (header->cntr) {
+ struct arch_pebs_cntr_header *cntr = next_record;
+ unsigned int nr;
+
+ next_record += sizeof(struct arch_pebs_cntr_header);
+
+ if (is_pebs_counter_event_group(event)) {
+ __setup_pebs_counter_group(cpuc, event,
+ (struct pebs_cntr_header *)cntr, next_record);
+ data->sample_flags |= PERF_SAMPLE_READ;
+ }
+
+ nr = hweight32(cntr->cntr) + hweight32(cntr->fixed);
+ if (cntr->metrics == INTEL_CNTR_METRICS)
+ nr += 2;
+ next_record += nr * sizeof(u64);
+ }
+
/* Parse followed fragments if there are. */
if (arch_pebs_record_continued(header)) {
at = at + header->size;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 3acb03a5c214..ce8757cb229c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1688,6 +1688,8 @@ void intel_pmu_drain_pebs_buffer(void);
void intel_pmu_store_pebs_lbrs(struct lbr_entry *lbr);
+void intel_pmu_late_setup(void);
+
void intel_pebs_init(void);
void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a3fad7e910eb..6235df132ee0 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -319,12 +319,18 @@
#define ARCH_PEBS_INDEX_WR_SHIFT 4
#define ARCH_PEBS_RELOAD 0xffffffff
+#define ARCH_PEBS_CNTR_ALLOW BIT_ULL(35)
+#define ARCH_PEBS_CNTR_GP BIT_ULL(36)
+#define ARCH_PEBS_CNTR_FIXED BIT_ULL(37)
+#define ARCH_PEBS_CNTR_METRICS BIT_ULL(38)
#define ARCH_PEBS_LBR_SHIFT 40
#define ARCH_PEBS_LBR (0x3ull << ARCH_PEBS_LBR_SHIFT)
#define ARCH_PEBS_VECR_XMM BIT_ULL(49)
#define ARCH_PEBS_GPR BIT_ULL(61)
#define ARCH_PEBS_AUX BIT_ULL(62)
#define ARCH_PEBS_EN BIT_ULL(63)
+#define ARCH_PEBS_CNTR_MASK (ARCH_PEBS_CNTR_GP | ARCH_PEBS_CNTR_FIXED | \
+ ARCH_PEBS_CNTR_METRICS)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index cca8a0d68cbc..a38d791cd0c2 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -137,16 +137,16 @@
#define ARCH_PERFMON_EVENTS_COUNT 7
#define PEBS_DATACFG_MEMINFO BIT_ULL(0)
-#define PEBS_DATACFG_GP BIT_ULL(1)
+#define PEBS_DATACFG_GP BIT_ULL(1)
#define PEBS_DATACFG_XMMS BIT_ULL(2)
#define PEBS_DATACFG_LBRS BIT_ULL(3)
-#define PEBS_DATACFG_LBR_SHIFT 24
#define PEBS_DATACFG_CNTR BIT_ULL(4)
+#define PEBS_DATACFG_METRICS BIT_ULL(5)
+#define PEBS_DATACFG_LBR_SHIFT 24
#define PEBS_DATACFG_CNTR_SHIFT 32
#define PEBS_DATACFG_CNTR_MASK GENMASK_ULL(15, 0)
#define PEBS_DATACFG_FIX_SHIFT 48
#define PEBS_DATACFG_FIX_MASK GENMASK_ULL(7, 0)
-#define PEBS_DATACFG_METRICS BIT_ULL(5)
/* Steal the highest bit of pebs_data_cfg for SW usage */
#define PEBS_UPDATE_DS_SW BIT_ULL(63)
@@ -573,6 +573,13 @@ struct arch_pebs_lbr_header {
u64 ler_info;
};
+struct arch_pebs_cntr_header {
+ u32 cntr;
+ u32 fixed;
+ u32 metrics;
+ u32 reserved;
+};
+
/*
* AMD Extended Performance Monitoring and Debug cpuid feature detection
*/
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 15/20] perf/core: Support to capture higher width vector registers
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (13 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 14/20] perf/x86/intel: Add counter group " Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 16/20] perf/x86/intel: Support arch-PEBS vector registers group capturing Dapeng Mi
` (4 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Arch-PEBS supports capturing more vector registers, such as the
OPMASK/YMM/ZMM registers, in addition to the XMM registers. This patch
extends the PERF_SAMPLE_REGS_INTR attribute to support capturing these
wider vector registers.
The array sample_regs_intr_ext[] is added to the perf_event_attr
structure to record the user-configured extended register bitmap, and a
helper perf_reg_ext_validate() is added to validate whether these
registers are supported on a specific PMU.
This patch only adds the common perf/core support; the x86/intel
specific support will be added in the next patch.
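For illustration only (not part of the patch), a standalone sketch of
how user space would set a bit in the new sample_regs_intr_ext[] array:
register indices at or above PERF_REG_EXTENDED_OFFSET (64) are
expressed as bit positions in this array, matching how
perf_output_sample_regs_ext() walks the bitmap below:
#include <stdint.h>
#define PERF_REG_EXTENDED_OFFSET	64
#define PERF_EXT_REGS_ARRAY_SIZE	4
static void request_ext_reg(uint64_t sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE],
			    unsigned int reg_idx)
{
	unsigned int bit = reg_idx - PERF_REG_EXTENDED_OFFSET;

	sample_regs_intr_ext[bit / 64] |= 1ULL << (bit % 64);
}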
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/arm/kernel/perf_regs.c | 6 ++
arch/arm64/kernel/perf_regs.c | 6 ++
arch/csky/kernel/perf_regs.c | 5 ++
arch/loongarch/kernel/perf_regs.c | 5 ++
arch/mips/kernel/perf_regs.c | 5 ++
arch/powerpc/perf/perf_regs.c | 5 ++
arch/riscv/kernel/perf_regs.c | 5 ++
arch/s390/kernel/perf_regs.c | 5 ++
arch/x86/include/asm/perf_event.h | 4 ++
arch/x86/include/uapi/asm/perf_regs.h | 83 ++++++++++++++++++++++++++-
arch/x86/kernel/perf_regs.c | 50 +++++++++++++++-
include/linux/perf_event.h | 2 +
include/linux/perf_regs.h | 10 ++++
include/uapi/linux/perf_event.h | 10 ++++
kernel/events/core.c | 53 ++++++++++++++++-
15 files changed, 249 insertions(+), 5 deletions(-)
diff --git a/arch/arm/kernel/perf_regs.c b/arch/arm/kernel/perf_regs.c
index 0529f90395c9..86b2002d0846 100644
--- a/arch/arm/kernel/perf_regs.c
+++ b/arch/arm/kernel/perf_regs.c
@@ -37,3 +37,9 @@ void perf_get_regs_user(struct perf_regs *regs_user,
regs_user->regs = task_pt_regs(current);
regs_user->abi = perf_reg_abi(current);
}
+
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index b4eece3eb17d..1c91fd3530d5 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -104,3 +104,9 @@ void perf_get_regs_user(struct perf_regs *regs_user,
regs_user->regs = task_pt_regs(current);
regs_user->abi = perf_reg_abi(current);
}
+
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
diff --git a/arch/csky/kernel/perf_regs.c b/arch/csky/kernel/perf_regs.c
index 09b7f88a2d6a..d2e2af0bf1ad 100644
--- a/arch/csky/kernel/perf_regs.c
+++ b/arch/csky/kernel/perf_regs.c
@@ -26,6 +26,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_abi(struct task_struct *task)
{
return PERF_SAMPLE_REGS_ABI_32;
diff --git a/arch/loongarch/kernel/perf_regs.c b/arch/loongarch/kernel/perf_regs.c
index 263ac4ab5af6..e1df67e3fab4 100644
--- a/arch/loongarch/kernel/perf_regs.c
+++ b/arch/loongarch/kernel/perf_regs.c
@@ -34,6 +34,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_value(struct pt_regs *regs, int idx)
{
if (WARN_ON_ONCE((u32)idx >= PERF_REG_LOONGARCH_MAX))
diff --git a/arch/mips/kernel/perf_regs.c b/arch/mips/kernel/perf_regs.c
index e686780d1647..bbb5f25b9191 100644
--- a/arch/mips/kernel/perf_regs.c
+++ b/arch/mips/kernel/perf_regs.c
@@ -37,6 +37,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_value(struct pt_regs *regs, int idx)
{
long v;
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index 350dccb0143c..d919c628aee3 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -132,6 +132,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_abi(struct task_struct *task)
{
if (is_tsk_32bit_task(task))
diff --git a/arch/riscv/kernel/perf_regs.c b/arch/riscv/kernel/perf_regs.c
index fd304a248de6..5beb60544c9a 100644
--- a/arch/riscv/kernel/perf_regs.c
+++ b/arch/riscv/kernel/perf_regs.c
@@ -26,6 +26,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_abi(struct task_struct *task)
{
#if __riscv_xlen == 64
diff --git a/arch/s390/kernel/perf_regs.c b/arch/s390/kernel/perf_regs.c
index a6b058ee4a36..9247573229b0 100644
--- a/arch/s390/kernel/perf_regs.c
+++ b/arch/s390/kernel/perf_regs.c
@@ -42,6 +42,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_abi(struct task_struct *task)
{
if (test_tsk_thread_flag(task, TIF_31BIT))
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index a38d791cd0c2..54125b344b2b 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -680,6 +680,10 @@ struct x86_perf_regs {
struct pt_regs regs;
u64 ssp;
u64 *xmm_regs;
+ u64 *opmask_regs;
+ u64 *ymmh_regs;
+ u64 **zmmh_regs;
+ u64 **h16zmm_regs;
};
extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index 2e88fdebd259..6651e5af448d 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -32,7 +32,7 @@ enum perf_event_x86_regs {
PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
- /* These all need two bits set because they are 128bit */
+ /* These all need two bits set because they are 128 bits */
PERF_REG_X86_XMM0 = 32,
PERF_REG_X86_XMM1 = 34,
PERF_REG_X86_XMM2 = 36,
@@ -52,6 +52,87 @@ enum perf_event_x86_regs {
/* These include both GPRs and XMMX registers */
PERF_REG_X86_XMM_MAX = PERF_REG_X86_XMM15 + 2,
+
+ /*
+ * YMM upper bits need two bits set because they are 128 bits.
+ * PERF_REG_X86_YMMH0 = 64
+ */
+ PERF_REG_X86_YMMH0 = PERF_REG_X86_XMM_MAX,
+ PERF_REG_X86_YMMH1 = PERF_REG_X86_YMMH0 + 2,
+ PERF_REG_X86_YMMH2 = PERF_REG_X86_YMMH1 + 2,
+ PERF_REG_X86_YMMH3 = PERF_REG_X86_YMMH2 + 2,
+ PERF_REG_X86_YMMH4 = PERF_REG_X86_YMMH3 + 2,
+ PERF_REG_X86_YMMH5 = PERF_REG_X86_YMMH4 + 2,
+ PERF_REG_X86_YMMH6 = PERF_REG_X86_YMMH5 + 2,
+ PERF_REG_X86_YMMH7 = PERF_REG_X86_YMMH6 + 2,
+ PERF_REG_X86_YMMH8 = PERF_REG_X86_YMMH7 + 2,
+ PERF_REG_X86_YMMH9 = PERF_REG_X86_YMMH8 + 2,
+ PERF_REG_X86_YMMH10 = PERF_REG_X86_YMMH9 + 2,
+ PERF_REG_X86_YMMH11 = PERF_REG_X86_YMMH10 + 2,
+ PERF_REG_X86_YMMH12 = PERF_REG_X86_YMMH11 + 2,
+ PERF_REG_X86_YMMH13 = PERF_REG_X86_YMMH12 + 2,
+ PERF_REG_X86_YMMH14 = PERF_REG_X86_YMMH13 + 2,
+ PERF_REG_X86_YMMH15 = PERF_REG_X86_YMMH14 + 2,
+ PERF_REG_X86_YMMH_MAX = PERF_REG_X86_YMMH15 + 2,
+
+ /*
+ * ZMM0-15 upper bits need four bits set because they are 256 bits
+ * PERF_REG_X86_ZMMH0 = 96
+ */
+ PERF_REG_X86_ZMMH0 = PERF_REG_X86_YMMH_MAX,
+ PERF_REG_X86_ZMMH1 = PERF_REG_X86_ZMMH0 + 4,
+ PERF_REG_X86_ZMMH2 = PERF_REG_X86_ZMMH1 + 4,
+ PERF_REG_X86_ZMMH3 = PERF_REG_X86_ZMMH2 + 4,
+ PERF_REG_X86_ZMMH4 = PERF_REG_X86_ZMMH3 + 4,
+ PERF_REG_X86_ZMMH5 = PERF_REG_X86_ZMMH4 + 4,
+ PERF_REG_X86_ZMMH6 = PERF_REG_X86_ZMMH5 + 4,
+ PERF_REG_X86_ZMMH7 = PERF_REG_X86_ZMMH6 + 4,
+ PERF_REG_X86_ZMMH8 = PERF_REG_X86_ZMMH7 + 4,
+ PERF_REG_X86_ZMMH9 = PERF_REG_X86_ZMMH8 + 4,
+ PERF_REG_X86_ZMMH10 = PERF_REG_X86_ZMMH9 + 4,
+ PERF_REG_X86_ZMMH11 = PERF_REG_X86_ZMMH10 + 4,
+ PERF_REG_X86_ZMMH12 = PERF_REG_X86_ZMMH11 + 4,
+ PERF_REG_X86_ZMMH13 = PERF_REG_X86_ZMMH12 + 4,
+ PERF_REG_X86_ZMMH14 = PERF_REG_X86_ZMMH13 + 4,
+ PERF_REG_X86_ZMMH15 = PERF_REG_X86_ZMMH14 + 4,
+ PERF_REG_X86_ZMMH_MAX = PERF_REG_X86_ZMMH15 + 4,
+
+ /*
+ * ZMM16-31 need eight bits set because they are 512 bits
+ * PERF_REG_X86_ZMM16 = 160
+ */
+ PERF_REG_X86_ZMM16 = PERF_REG_X86_ZMMH_MAX,
+ PERF_REG_X86_ZMM17 = PERF_REG_X86_ZMM16 + 8,
+ PERF_REG_X86_ZMM18 = PERF_REG_X86_ZMM17 + 8,
+ PERF_REG_X86_ZMM19 = PERF_REG_X86_ZMM18 + 8,
+ PERF_REG_X86_ZMM20 = PERF_REG_X86_ZMM19 + 8,
+ PERF_REG_X86_ZMM21 = PERF_REG_X86_ZMM20 + 8,
+ PERF_REG_X86_ZMM22 = PERF_REG_X86_ZMM21 + 8,
+ PERF_REG_X86_ZMM23 = PERF_REG_X86_ZMM22 + 8,
+ PERF_REG_X86_ZMM24 = PERF_REG_X86_ZMM23 + 8,
+ PERF_REG_X86_ZMM25 = PERF_REG_X86_ZMM24 + 8,
+ PERF_REG_X86_ZMM26 = PERF_REG_X86_ZMM25 + 8,
+ PERF_REG_X86_ZMM27 = PERF_REG_X86_ZMM26 + 8,
+ PERF_REG_X86_ZMM28 = PERF_REG_X86_ZMM27 + 8,
+ PERF_REG_X86_ZMM29 = PERF_REG_X86_ZMM28 + 8,
+ PERF_REG_X86_ZMM30 = PERF_REG_X86_ZMM29 + 8,
+ PERF_REG_X86_ZMM31 = PERF_REG_X86_ZMM30 + 8,
+ PERF_REG_X86_ZMM_MAX = PERF_REG_X86_ZMM31 + 8,
+
+ /*
+ * OPMASK Registers
+ * PERF_REG_X86_OPMASK0 = 288
+ */
+ PERF_REG_X86_OPMASK0 = PERF_REG_X86_ZMM_MAX,
+ PERF_REG_X86_OPMASK1 = PERF_REG_X86_OPMASK0 + 1,
+ PERF_REG_X86_OPMASK2 = PERF_REG_X86_OPMASK1 + 1,
+ PERF_REG_X86_OPMASK3 = PERF_REG_X86_OPMASK2 + 1,
+ PERF_REG_X86_OPMASK4 = PERF_REG_X86_OPMASK3 + 1,
+ PERF_REG_X86_OPMASK5 = PERF_REG_X86_OPMASK4 + 1,
+ PERF_REG_X86_OPMASK6 = PERF_REG_X86_OPMASK5 + 1,
+ PERF_REG_X86_OPMASK7 = PERF_REG_X86_OPMASK6 + 1,
+
+ PERF_REG_X86_VEC_MAX = PERF_REG_X86_OPMASK7 + 1,
};
#define PERF_REG_EXTENDED_MASK (~((1ULL << PERF_REG_X86_XMM0) - 1))
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index 4b15c7488ec1..1447cd341868 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -59,12 +59,41 @@ static unsigned int pt_regs_offset[PERF_REG_X86_MAX] = {
#endif
};
-u64 perf_reg_value(struct pt_regs *regs, int idx)
+static u64 perf_reg_ext_value(struct pt_regs *regs, int idx)
{
struct x86_perf_regs *perf_regs;
+ perf_regs = container_of(regs, struct x86_perf_regs, regs);
+
+ switch (idx) {
+ case PERF_REG_X86_YMMH0 ... PERF_REG_X86_YMMH_MAX - 1:
+ idx -= PERF_REG_X86_YMMH0;
+ return !perf_regs->ymmh_regs ? 0 : perf_regs->ymmh_regs[idx];
+ case PERF_REG_X86_ZMMH0 ... PERF_REG_X86_ZMMH_MAX - 1:
+ idx -= PERF_REG_X86_ZMMH0;
+ return !perf_regs->zmmh_regs ? 0 : perf_regs->zmmh_regs[idx / 4][idx % 4];
+ case PERF_REG_X86_ZMM16 ... PERF_REG_X86_ZMM_MAX - 1:
+ idx -= PERF_REG_X86_ZMM16;
+ return !perf_regs->h16zmm_regs ? 0 : perf_regs->h16zmm_regs[idx / 8][idx % 8];
+ case PERF_REG_X86_OPMASK0 ... PERF_REG_X86_OPMASK7:
+ idx -= PERF_REG_X86_OPMASK0;
+ return !perf_regs->opmask_regs ? 0 : perf_regs->opmask_regs[idx];
+ default:
+ WARN_ON_ONCE(1);
+ break;
+ }
+
+ return 0;
+}
+
+u64 perf_reg_value(struct pt_regs *regs, int idx)
+{
+ struct x86_perf_regs *perf_regs = container_of(regs, struct x86_perf_regs, regs);
+
+ if (idx >= PERF_REG_EXTENDED_OFFSET)
+ return perf_reg_ext_value(regs, idx);
+
if (idx >= PERF_REG_X86_XMM0 && idx < PERF_REG_X86_XMM_MAX) {
- perf_regs = container_of(regs, struct x86_perf_regs, regs);
if (!perf_regs->xmm_regs)
return 0;
return perf_regs->xmm_regs[idx - PERF_REG_X86_XMM0];
@@ -100,6 +129,11 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ return -EINVAL;
+}
+
u64 perf_reg_abi(struct task_struct *task)
{
return PERF_SAMPLE_REGS_ABI_32;
@@ -125,6 +159,18 @@ int perf_reg_validate(u64 mask)
return 0;
}
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size)
+{
+ if (!mask || !size || size > PERF_NUM_EXT_REGS)
+ return -EINVAL;
+
+ if (find_last_bit(mask, size) >
+ (PERF_REG_X86_VEC_MAX - PERF_REG_EXTENDED_OFFSET))
+ return -EINVAL;
+
+ return 0;
+}
+
u64 perf_reg_abi(struct task_struct *task)
{
if (!user_64bit_mode(task_pt_regs(task)))
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2d07bc1193f3..3612ef66f86c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -301,6 +301,7 @@ struct perf_event_pmu_context;
#define PERF_PMU_CAP_AUX_OUTPUT 0x0080
#define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100
#define PERF_PMU_CAP_AUX_PAUSE 0x0200
+#define PERF_PMU_CAP_MORE_EXT_REGS 0x0400
/**
* pmu::scope
@@ -1389,6 +1390,7 @@ static inline void perf_clear_branch_entry_bitfields(struct perf_branch_entry *b
br->reserved = 0;
}
+extern bool has_more_extended_regs(struct perf_event *event);
extern void perf_output_sample(struct perf_output_handle *handle,
struct perf_event_header *header,
struct perf_sample_data *data,
diff --git a/include/linux/perf_regs.h b/include/linux/perf_regs.h
index f632c5725f16..aa4dfb5af552 100644
--- a/include/linux/perf_regs.h
+++ b/include/linux/perf_regs.h
@@ -9,6 +9,8 @@ struct perf_regs {
struct pt_regs *regs;
};
+#define PERF_REG_EXTENDED_OFFSET 64
+
#ifdef CONFIG_HAVE_PERF_REGS
#include <asm/perf_regs.h>
@@ -21,6 +23,8 @@ int perf_reg_validate(u64 mask);
u64 perf_reg_abi(struct task_struct *task);
void perf_get_regs_user(struct perf_regs *regs_user,
struct pt_regs *regs);
+int perf_reg_ext_validate(unsigned long *mask, unsigned int size);
+
#else
#define PERF_REG_EXTENDED_MASK 0
@@ -35,6 +39,12 @@ static inline int perf_reg_validate(u64 mask)
return mask ? -ENOSYS : 0;
}
+static inline int perf_reg_ext_validate(unsigned long *mask,
+ unsigned int size)
+{
+ return -EINVAL;
+}
+
static inline u64 perf_reg_abi(struct task_struct *task)
{
return PERF_SAMPLE_REGS_ABI_NONE;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0524d541d4e3..575cd653291c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -379,6 +379,10 @@ enum perf_event_read_format {
#define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
#define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
#define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
+#define PERF_ATTR_SIZE_VER9 168 /* add: sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE] */
+
+#define PERF_EXT_REGS_ARRAY_SIZE 4
+#define PERF_NUM_EXT_REGS (PERF_EXT_REGS_ARRAY_SIZE * 64)
/*
* Hardware event_id to monitor via a performance monitoring event:
@@ -531,6 +535,12 @@ struct perf_event_attr {
__u64 sig_data;
__u64 config3; /* extension of config2 */
+
+ /*
+ * Extension sets of regs to dump for each sample.
+ * See asm/perf_regs.h for details.
+ */
+ __u64 sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE];
};
/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0f8c55990783..0da480b5e025 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7081,6 +7081,21 @@ perf_output_sample_regs(struct perf_output_handle *handle,
}
}
+static void
+perf_output_sample_regs_ext(struct perf_output_handle *handle,
+ struct pt_regs *regs,
+ unsigned long *mask,
+ unsigned int size)
+{
+ int bit;
+ u64 val;
+
+ for_each_set_bit(bit, mask, size) {
+ val = perf_reg_value(regs, bit + PERF_REG_EXTENDED_OFFSET);
+ perf_output_put(handle, val);
+ }
+}
+
static void perf_sample_regs_user(struct perf_regs *regs_user,
struct pt_regs *regs)
{
@@ -7509,6 +7524,13 @@ static void perf_output_read(struct perf_output_handle *handle,
perf_output_read_one(handle, event, enabled, running);
}
+inline bool has_more_extended_regs(struct perf_event *event)
+{
+ return !!bitmap_weight(
+ (unsigned long *)event->attr.sample_regs_intr_ext,
+ PERF_NUM_EXT_REGS);
+}
+
void perf_output_sample(struct perf_output_handle *handle,
struct perf_event_header *header,
struct perf_sample_data *data,
@@ -7666,6 +7688,12 @@ void perf_output_sample(struct perf_output_handle *handle,
perf_output_sample_regs(handle,
data->regs_intr.regs,
mask);
+ if (has_more_extended_regs(event)) {
+ perf_output_sample_regs_ext(
+ handle, data->regs_intr.regs,
+ (unsigned long *)event->attr.sample_regs_intr_ext,
+ PERF_NUM_EXT_REGS);
+ }
}
}
@@ -7980,6 +8008,12 @@ void perf_prepare_sample(struct perf_sample_data *data,
u64 mask = event->attr.sample_regs_intr;
size += hweight64(mask) * sizeof(u64);
+
+ if (has_more_extended_regs(event)) {
+ size += bitmap_weight(
+ (unsigned long *)event->attr.sample_regs_intr_ext,
+ PERF_NUM_EXT_REGS) * sizeof(u64);
+ }
}
data->dyn_size += size;
@@ -11991,6 +12025,10 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
has_extended_regs(event))
ret = -EOPNOTSUPP;
+ if (!(pmu->capabilities & PERF_PMU_CAP_MORE_EXT_REGS) &&
+ has_more_extended_regs(event))
+ ret = -EOPNOTSUPP;
+
if (pmu->capabilities & PERF_PMU_CAP_NO_EXCLUDE &&
event_has_any_exclude_flag(event))
ret = -EINVAL;
@@ -12561,8 +12599,19 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
if (!attr->sample_max_stack)
attr->sample_max_stack = sysctl_perf_event_max_stack;
- if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
- ret = perf_reg_validate(attr->sample_regs_intr);
+ if (attr->sample_type & PERF_SAMPLE_REGS_INTR) {
+ if (attr->sample_regs_intr != 0)
+ ret = perf_reg_validate(attr->sample_regs_intr);
+ if (ret)
+ return ret;
+ if (!!bitmap_weight((unsigned long *)attr->sample_regs_intr_ext,
+ PERF_NUM_EXT_REGS))
+ ret = perf_reg_ext_validate(
+ (unsigned long *)attr->sample_regs_intr_ext,
+ PERF_NUM_EXT_REGS);
+ if (ret)
+ return ret;
+ }
#ifndef CONFIG_CGROUP_PERF
if (attr->sample_type & PERF_SAMPLE_CGROUP)
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 16/20] perf/x86/intel: Support arch-PEBS vector registers group capturing
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (14 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 15/20] perf/core: Support to capture higher width vector registers Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 17/20] perf tools: Support to show SSP register Dapeng Mi
` (3 subsequent siblings)
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Add x86/intel specific vector register (VECR) group capturing for
arch-PEBS. Enable the corresponding VECR group bits in the
GPx_CFG_C/FX0_CFG_C MSRs if users configure these vector registers'
bitmap in perf_event_attr, and parse the VECR groups in the arch-PEBS
record.
Currently vector register capturing is only supported by PEBS based
sampling; the PMU driver returns an error if PMI based sampling tries to
capture these vector registers.
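For illustration, a minimal user-space sketch of selecting one of the
new vector register groups through the extended selector is shown below.
It assumes the uapi additions from this series (sample_regs_intr_ext[],
PERF_REG_X86_YMMH0); the open_ymmh0_event() helper name is made up.
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>
#include <asm/perf_regs.h>
/*
 * Minimal sketch, assuming the sample_regs_intr_ext[] attr field and the
 * PERF_REG_X86_YMMH0 id added by this series.  precise_ip is set because
 * only PEBS based sampling may capture these vector register groups.
 */
static int open_ymmh0_event(void)
{
	struct perf_event_attr attr = {
		.type		= PERF_TYPE_HARDWARE,
		.config		= PERF_COUNT_HW_CPU_CYCLES,
		.size		= sizeof(attr),
		.sample_period	= 100000,
		.sample_type	= PERF_SAMPLE_IP | PERF_SAMPLE_REGS_INTR,
		.precise_ip	= 2,
	};
	int bit = PERF_REG_X86_YMMH0 - 64;	/* extended ids start at 64 */
	/* YMMH0 takes two bits in the extended bitmap (128-bit register) */
	attr.sample_regs_intr_ext[bit / 64] |= 0x3ULL << (bit % 64);
	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}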
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/core.c | 59 ++++++++++++++++++++++
arch/x86/events/intel/core.c | 15 ++++++
arch/x86/events/intel/ds.c | 82 ++++++++++++++++++++++++++++---
arch/x86/include/asm/msr-index.h | 6 +++
arch/x86/include/asm/perf_event.h | 20 ++++++++
5 files changed, 175 insertions(+), 7 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7ed80f01f15d..f17a8c9c6391 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -576,6 +576,39 @@ int x86_pmu_max_precise(struct pmu *pmu)
return precise;
}
+static bool has_vec_regs(struct perf_event *event, int start, int end)
+{
+ /* -1 to subtract PERF_REG_EXTENDED_OFFSET */
+ int idx = start / 64 - 1;
+ int s = start % 64;
+ int e = end % 64;
+
+ return event->attr.sample_regs_intr_ext[idx] & GENMASK_ULL(e, s);
+}
+
+static inline bool has_ymmh_regs(struct perf_event *event)
+{
+ return has_vec_regs(event, PERF_REG_X86_YMMH0, PERF_REG_X86_YMMH15 + 1);
+}
+
+static inline bool has_zmmh_regs(struct perf_event *event)
+{
+ return has_vec_regs(event, PERF_REG_X86_ZMMH0, PERF_REG_X86_ZMMH7 + 3) ||
+ has_vec_regs(event, PERF_REG_X86_ZMMH8, PERF_REG_X86_ZMMH15 + 3);
+}
+
+static inline bool has_h16zmm_regs(struct perf_event *event)
+{
+ return has_vec_regs(event, PERF_REG_X86_ZMM16, PERF_REG_X86_ZMM19 + 7) ||
+ has_vec_regs(event, PERF_REG_X86_ZMM20, PERF_REG_X86_ZMM27 + 7) ||
+ has_vec_regs(event, PERF_REG_X86_ZMM28, PERF_REG_X86_ZMM31 + 7);
+}
+
+static inline bool has_opmask_regs(struct perf_event *event)
+{
+ return has_vec_regs(event, PERF_REG_X86_OPMASK0, PERF_REG_X86_OPMASK7);
+}
+
int x86_pmu_hw_config(struct perf_event *event)
{
if (event->attr.precise_ip) {
@@ -671,6 +704,32 @@ int x86_pmu_hw_config(struct perf_event *event)
return -EINVAL;
}
+ /*
+ * Architectural PEBS supports capturing more vector registers besides
+ * the XMM registers, like the YMM, OPMASK and ZMM registers.
+ */
+ if (unlikely(has_more_extended_regs(event))) {
+ u64 caps = hybrid(event->pmu, arch_pebs_cap).caps;
+
+ if (!(event->pmu->capabilities & PERF_PMU_CAP_MORE_EXT_REGS))
+ return -EINVAL;
+
+ if (has_opmask_regs(event) && !(caps & ARCH_PEBS_VECR_OPMASK))
+ return -EINVAL;
+
+ if (has_ymmh_regs(event) && !(caps & ARCH_PEBS_VECR_YMM))
+ return -EINVAL;
+
+ if (has_zmmh_regs(event) && !(caps & ARCH_PEBS_VECR_ZMMH))
+ return -EINVAL;
+
+ if (has_h16zmm_regs(event) && !(caps & ARCH_PEBS_VECR_H16ZMM))
+ return -EINVAL;
+
+ if (!event->attr.precise_ip)
+ return -EINVAL;
+ }
+
return x86_setup_perfctr(event);
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9c5b44a73ca2..0c828a42b1ad 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2953,6 +2953,18 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
if (pebs_data_cfg & PEBS_DATACFG_XMMS)
ext |= ARCH_PEBS_VECR_XMM & cap.caps;
+ if (pebs_data_cfg & PEBS_DATACFG_YMMS)
+ ext |= ARCH_PEBS_VECR_YMM & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_OPMASKS)
+ ext |= ARCH_PEBS_VECR_OPMASK & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_ZMMHS)
+ ext |= ARCH_PEBS_VECR_ZMMH & cap.caps;
+
+ if (pebs_data_cfg & PEBS_DATACFG_H16ZMMS)
+ ext |= ARCH_PEBS_VECR_H16ZMM & cap.caps;
+
if (pebs_data_cfg & PEBS_DATACFG_LBRS)
ext |= ARCH_PEBS_LBR & cap.caps;
@@ -5117,6 +5129,9 @@ static inline void __intel_update_pmu_caps(struct pmu *pmu)
if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
+ if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_EXT)
+ dest_pmu->capabilities |= PERF_PMU_CAP_MORE_EXT_REGS;
+
if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_CNTR_MASK)
x86_pmu.late_setup = intel_pmu_late_setup;
}
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 32a44e3571cb..fc5716b257d7 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1413,6 +1413,7 @@ static u64 pebs_update_adaptive_cfg(struct perf_event *event)
u64 sample_type = attr->sample_type;
u64 pebs_data_cfg = 0;
bool gprs, tsx_weight;
+ int bit = 0;
if (!(sample_type & ~(PERF_SAMPLE_IP|PERF_SAMPLE_TIME)) &&
attr->precise_ip > 1)
@@ -1437,9 +1438,37 @@ static u64 pebs_update_adaptive_cfg(struct perf_event *event)
if (gprs || (attr->precise_ip < 2) || tsx_weight)
pebs_data_cfg |= PEBS_DATACFG_GP;
- if ((sample_type & PERF_SAMPLE_REGS_INTR) &&
- (attr->sample_regs_intr & PERF_REG_EXTENDED_MASK))
- pebs_data_cfg |= PEBS_DATACFG_XMMS;
+ if (sample_type & PERF_SAMPLE_REGS_INTR) {
+ if (attr->sample_regs_intr & PERF_REG_EXTENDED_MASK)
+ pebs_data_cfg |= PEBS_DATACFG_XMMS;
+
+ for_each_set_bit_from(bit,
+ (unsigned long *)event->attr.sample_regs_intr_ext,
+ PERF_NUM_EXT_REGS) {
+ switch (bit + PERF_REG_EXTENDED_OFFSET) {
+ case PERF_REG_X86_OPMASK0 ... PERF_REG_X86_OPMASK7:
+ pebs_data_cfg |= PEBS_DATACFG_OPMASKS;
+ bit = PERF_REG_X86_YMMH0 -
+ PERF_REG_EXTENDED_OFFSET - 1;
+ break;
+ case PERF_REG_X86_YMMH0 ... PERF_REG_X86_ZMMH0 - 1:
+ pebs_data_cfg |= PEBS_DATACFG_YMMS;
+ bit = PERF_REG_X86_ZMMH0 -
+ PERF_REG_EXTENDED_OFFSET - 1;
+ break;
+ case PERF_REG_X86_ZMMH0 ... PERF_REG_X86_ZMM16 - 1:
+ pebs_data_cfg |= PEBS_DATACFG_ZMMHS;
+ bit = PERF_REG_X86_ZMM16 -
+ PERF_REG_EXTENDED_OFFSET - 1;
+ break;
+ case PERF_REG_X86_ZMM16 ... PERF_REG_X86_ZMM_MAX - 1:
+ pebs_data_cfg |= PEBS_DATACFG_H16ZMMS;
+ bit = PERF_REG_X86_ZMM_MAX -
+ PERF_REG_EXTENDED_OFFSET - 1;
+ break;
+ }
+ }
+ }
if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
/*
@@ -2216,6 +2245,10 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
perf_regs = container_of(regs, struct x86_perf_regs, regs);
perf_regs->xmm_regs = NULL;
+ perf_regs->ymmh_regs = NULL;
+ perf_regs->opmask_regs = NULL;
+ perf_regs->zmmh_regs = NULL;
+ perf_regs->h16zmm_regs = NULL;
perf_regs->ssp = 0;
format_group = basic->format_group;
@@ -2333,6 +2366,10 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
perf_regs = container_of(regs, struct x86_perf_regs, regs);
perf_regs->xmm_regs = NULL;
+ perf_regs->ymmh_regs = NULL;
+ perf_regs->opmask_regs = NULL;
+ perf_regs->zmmh_regs = NULL;
+ perf_regs->h16zmm_regs = NULL;
perf_regs->ssp = 0;
__setup_perf_sample_data(event, iregs, data);
@@ -2383,14 +2420,45 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
meminfo->tsx_tuning, ax);
}
- if (header->xmm) {
+ if (header->xmm || header->ymmh || header->opmask ||
+ header->zmmh || header->h16zmm) {
struct arch_pebs_xmm *xmm;
+ struct arch_pebs_ymmh *ymmh;
+ struct arch_pebs_zmmh *zmmh;
+ struct arch_pebs_h16zmm *h16zmm;
+ struct arch_pebs_opmask *opmask;
next_record += sizeof(struct arch_pebs_xer_header);
- xmm = next_record;
- perf_regs->xmm_regs = xmm->xmm;
- next_record = xmm + 1;
+ if (header->xmm) {
+ xmm = next_record;
+ perf_regs->xmm_regs = xmm->xmm;
+ next_record = xmm + 1;
+ }
+
+ if (header->ymmh) {
+ ymmh = next_record;
+ perf_regs->ymmh_regs = ymmh->ymmh;
+ next_record = ymmh + 1;
+ }
+
+ if (header->opmask) {
+ opmask = next_record;
+ perf_regs->opmask_regs = opmask->opmask;
+ next_record = opmask + 1;
+ }
+
+ if (header->zmmh) {
+ zmmh = next_record;
+ perf_regs->zmmh_regs = (u64 **)zmmh->zmmh;
+ next_record = zmmh + 1;
+ }
+
+ if (header->h16zmm) {
+ h16zmm = next_record;
+ perf_regs->h16zmm_regs = (u64 **)h16zmm->h16zmm;
+ next_record = h16zmm + 1;
+ }
}
if (header->lbr) {
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 6235df132ee0..e017ee8556e5 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -326,6 +326,12 @@
#define ARCH_PEBS_LBR_SHIFT 40
#define ARCH_PEBS_LBR (0x3ull << ARCH_PEBS_LBR_SHIFT)
#define ARCH_PEBS_VECR_XMM BIT_ULL(49)
+#define ARCH_PEBS_VECR_YMM BIT_ULL(50)
+#define ARCH_PEBS_VECR_OPMASK BIT_ULL(53)
+#define ARCH_PEBS_VECR_ZMMH BIT_ULL(54)
+#define ARCH_PEBS_VECR_H16ZMM BIT_ULL(55)
+#define ARCH_PEBS_VECR_EXT_SHIFT 50
+#define ARCH_PEBS_VECR_EXT (0x3full << ARCH_PEBS_VECR_EXT_SHIFT)
#define ARCH_PEBS_GPR BIT_ULL(61)
#define ARCH_PEBS_AUX BIT_ULL(62)
#define ARCH_PEBS_EN BIT_ULL(63)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 54125b344b2b..79368ece2bf9 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -142,6 +142,10 @@
#define PEBS_DATACFG_LBRS BIT_ULL(3)
#define PEBS_DATACFG_CNTR BIT_ULL(4)
#define PEBS_DATACFG_METRICS BIT_ULL(5)
+#define PEBS_DATACFG_YMMS BIT_ULL(6)
+#define PEBS_DATACFG_OPMASKS BIT_ULL(7)
+#define PEBS_DATACFG_ZMMHS BIT_ULL(8)
+#define PEBS_DATACFG_H16ZMMS BIT_ULL(9)
#define PEBS_DATACFG_LBR_SHIFT 24
#define PEBS_DATACFG_CNTR_SHIFT 32
#define PEBS_DATACFG_CNTR_MASK GENMASK_ULL(15, 0)
@@ -559,6 +563,22 @@ struct arch_pebs_xmm {
u64 xmm[16*2]; /* two entries for each register */
};
+struct arch_pebs_ymmh {
+ u64 ymmh[16*2]; /* two entries for each register */
+};
+
+struct arch_pebs_opmask {
+ u64 opmask[8];
+};
+
+struct arch_pebs_zmmh {
+ u64 zmmh[16][4]; /* four entries for each register */
+};
+
+struct arch_pebs_h16zmm {
+ u64 h16zmm[16][8]; /* eight entries for each register */
+};
+
#define ARCH_PEBS_LBR_NAN 0x0
#define ARCH_PEBS_LBR_NUM_8 0x1
#define ARCH_PEBS_LBR_NUM_16 0x2
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 17/20] perf tools: Support to show SSP register
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (15 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 16/20] perf/x86/intel: Support arch-PEBS vector registers group capturing Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 16:15 ` Ian Rogers
2025-01-23 14:07 ` [PATCH 18/20] perf tools: Support to capture more vector registers (common part) Dapeng Mi
` (2 subsequent siblings)
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Add SSP register support.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
tools/arch/x86/include/uapi/asm/perf_regs.h | 4 +++-
tools/perf/arch/x86/util/perf_regs.c | 2 ++
tools/perf/util/intel-pt.c | 2 +-
tools/perf/util/perf-regs-arch/perf_regs_x86.c | 2 ++
4 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/tools/arch/x86/include/uapi/asm/perf_regs.h b/tools/arch/x86/include/uapi/asm/perf_regs.h
index 7c9d2bb3833b..158e353070c3 100644
--- a/tools/arch/x86/include/uapi/asm/perf_regs.h
+++ b/tools/arch/x86/include/uapi/asm/perf_regs.h
@@ -27,9 +27,11 @@ enum perf_event_x86_regs {
PERF_REG_X86_R13,
PERF_REG_X86_R14,
PERF_REG_X86_R15,
+ PERF_REG_X86_SSP,
/* These are the limits for the GPRs. */
PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
- PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
+ PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
+ PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
/* These all need two bits set because they are 128bit */
PERF_REG_X86_XMM0 = 32,
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 12fd93f04802..9f492568f3b4 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -36,6 +36,8 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG(R14, PERF_REG_X86_R14),
SMPL_REG(R15, PERF_REG_X86_R15),
#endif
+ SMPL_REG(SSP, PERF_REG_X86_SSP),
+
SMPL_REG2(XMM0, PERF_REG_X86_XMM0),
SMPL_REG2(XMM1, PERF_REG_X86_XMM1),
SMPL_REG2(XMM2, PERF_REG_X86_XMM2),
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 30be6dfe09eb..86196275c1e7 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2139,7 +2139,7 @@ static u64 *intel_pt_add_gp_regs(struct regs_dump *intr_regs, u64 *pos,
u32 bit;
int i;
- for (i = 0, bit = 1; i < PERF_REG_X86_64_MAX; i++, bit <<= 1) {
+ for (i = 0, bit = 1; i < PERF_REG_INTEL_PT_MAX; i++, bit <<= 1) {
/* Get the PEBS gp_regs array index */
int n = pebs_gp_regs[i] - 1;
diff --git a/tools/perf/util/perf-regs-arch/perf_regs_x86.c b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
index 708954a9d35d..9a909f02bc04 100644
--- a/tools/perf/util/perf-regs-arch/perf_regs_x86.c
+++ b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
@@ -54,6 +54,8 @@ const char *__perf_reg_name_x86(int id)
return "R14";
case PERF_REG_X86_R15:
return "R15";
+ case PERF_REG_X86_SSP:
+ return "ssp";
#define XMM(x) \
case PERF_REG_X86_XMM ## x: \
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 18/20] perf tools: Support to capture more vector registers (common part)
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (16 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 17/20] perf tools: Support to show SSP register Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 16:42 ` Ian Rogers
2025-01-23 14:07 ` [PATCH 19/20] perf tools: Support to capture more vector registers (x86/Intel part) Dapeng Mi
2025-01-23 14:07 ` [PATCH 20/20] perf tools/tests: Add vector registers PEBS sampling test Dapeng Mi
19 siblings, 1 reply; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Intel architectural PEBS supports capturing more vector registers, like
the OPMASK/YMM/ZMM registers, besides the already supported XMM
registers.
arch-PEBS vector register (VECR) capturing in the perf core and the
Intel PMU driver has been added by the previous patches. This patch adds
the perf tool part. In detail, add support for the new
sample_regs_intr_ext register selector in perf_event_attr. This 32-byte
bitmap is used to select the new VECR register groups OPMASK, YMMH,
ZMMH and ZMM. Update perf regs to introduce the new registers.
This patch only introduces the common support; the x86/intel specific
support is added in the next patch.
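For reference, below is a minimal sketch of how a combined tool-side
mask could be split between the legacy and extended attr fields,
mirroring the evsel__config() change further down. The set_intr_regs()
helper and the intr_mask[] layout (word 0 = legacy sample_regs_intr
bits, words 1..4 = extended bits) are illustrative only.
#include <string.h>
#include <linux/types.h>
#include <linux/perf_event.h>
/*
 * Illustrative helper: copy the legacy word into sample_regs_intr and
 * the remaining words into the new sample_regs_intr_ext[] field.
 */
static void set_intr_regs(struct perf_event_attr *attr,
			  const __u64 intr_mask[PERF_NUM_INTR_REGS])
{
	attr->sample_regs_intr = intr_mask[0];
	memcpy(attr->sample_regs_intr_ext, &intr_mask[1],
	       PERF_NUM_EXT_REGS / 8);	/* 4 x u64 == 32 bytes */
}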
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
tools/include/uapi/linux/perf_event.h | 13 +++++++++
tools/perf/arch/arm/util/perf_regs.c | 5 +---
tools/perf/arch/arm64/util/perf_regs.c | 5 +---
tools/perf/arch/csky/util/perf_regs.c | 5 +---
tools/perf/arch/loongarch/util/perf_regs.c | 5 +---
tools/perf/arch/mips/util/perf_regs.c | 5 +---
tools/perf/arch/powerpc/util/perf_regs.c | 9 ++++---
tools/perf/arch/riscv/util/perf_regs.c | 5 +---
tools/perf/arch/s390/util/perf_regs.c | 5 +---
tools/perf/arch/x86/util/perf_regs.c | 9 ++++---
tools/perf/builtin-script.c | 19 ++++++++++---
tools/perf/util/evsel.c | 14 +++++++---
tools/perf/util/parse-regs-options.c | 23 +++++++++-------
tools/perf/util/perf_regs.c | 5 ----
tools/perf/util/perf_regs.h | 18 +++++++++++--
tools/perf/util/record.h | 2 +-
tools/perf/util/sample.h | 6 ++++-
tools/perf/util/session.c | 31 +++++++++++++---------
tools/perf/util/synthetic-events.c | 7 +++--
19 files changed, 116 insertions(+), 75 deletions(-)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 4842c36fdf80..02d8f55f6247 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -379,6 +379,13 @@ enum perf_event_read_format {
#define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
#define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
#define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
+#define PERF_ATTR_SIZE_VER9 168 /* add: sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE] */
+
+#define PERF_EXT_REGS_ARRAY_SIZE 4
+#define PERF_NUM_EXT_REGS (PERF_EXT_REGS_ARRAY_SIZE * 64)
+
+#define PERF_NUM_INTR_REGS (PERF_EXT_REGS_ARRAY_SIZE + 1)
+#define PERF_NUM_INTR_REGS_SIZE ((PERF_NUM_INTR_REGS) * 64)
/*
* Hardware event_id to monitor via a performance monitoring event:
@@ -522,6 +529,12 @@ struct perf_event_attr {
__u64 sig_data;
__u64 config3; /* extension of config2 */
+
+ /*
+ * Extension sets of regs to dump for each sample.
+ * See asm/perf_regs.h for details.
+ */
+ __u64 sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE];
};
/*
diff --git a/tools/perf/arch/arm/util/perf_regs.c b/tools/perf/arch/arm/util/perf_regs.c
index f94a0210c7b7..3a3c2779efd4 100644
--- a/tools/perf/arch/arm/util/perf_regs.c
+++ b/tools/perf/arch/arm/util/perf_regs.c
@@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG_END
};
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/arm64/util/perf_regs.c b/tools/perf/arch/arm64/util/perf_regs.c
index 09308665e28a..754bb8423733 100644
--- a/tools/perf/arch/arm64/util/perf_regs.c
+++ b/tools/perf/arch/arm64/util/perf_regs.c
@@ -140,10 +140,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
return SDT_ARG_VALID;
}
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/csky/util/perf_regs.c b/tools/perf/arch/csky/util/perf_regs.c
index 6b1665f41180..9d132150ecb6 100644
--- a/tools/perf/arch/csky/util/perf_regs.c
+++ b/tools/perf/arch/csky/util/perf_regs.c
@@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG_END
};
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/loongarch/util/perf_regs.c b/tools/perf/arch/loongarch/util/perf_regs.c
index f94a0210c7b7..3a3c2779efd4 100644
--- a/tools/perf/arch/loongarch/util/perf_regs.c
+++ b/tools/perf/arch/loongarch/util/perf_regs.c
@@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG_END
};
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/mips/util/perf_regs.c b/tools/perf/arch/mips/util/perf_regs.c
index 6b1665f41180..9d132150ecb6 100644
--- a/tools/perf/arch/mips/util/perf_regs.c
+++ b/tools/perf/arch/mips/util/perf_regs.c
@@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG_END
};
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index e8e6e6fc6f17..08ab9ed692fb 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -186,7 +186,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
return SDT_ARG_VALID;
}
-uint64_t arch__intr_reg_mask(void)
+void arch__intr_reg_mask(unsigned long *mask)
{
struct perf_event_attr attr = {
.type = PERF_TYPE_HARDWARE,
@@ -198,7 +198,9 @@ uint64_t arch__intr_reg_mask(void)
};
int fd;
u32 version;
- u64 extended_mask = 0, mask = PERF_REGS_MASK;
+ u64 extended_mask = 0;
+
+ *(u64 *)mask = PERF_REGS_MASK;
/*
* Get the PVR value to set the extended
@@ -223,9 +225,8 @@ uint64_t arch__intr_reg_mask(void)
fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
if (fd != -1) {
close(fd);
- mask |= extended_mask;
+ *(u64 *)mask |= extended_mask;
}
- return mask;
}
uint64_t arch__user_reg_mask(void)
diff --git a/tools/perf/arch/riscv/util/perf_regs.c b/tools/perf/arch/riscv/util/perf_regs.c
index 6b1665f41180..9d132150ecb6 100644
--- a/tools/perf/arch/riscv/util/perf_regs.c
+++ b/tools/perf/arch/riscv/util/perf_regs.c
@@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG_END
};
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/s390/util/perf_regs.c b/tools/perf/arch/s390/util/perf_regs.c
index 6b1665f41180..9d132150ecb6 100644
--- a/tools/perf/arch/s390/util/perf_regs.c
+++ b/tools/perf/arch/s390/util/perf_regs.c
@@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG_END
};
-uint64_t arch__intr_reg_mask(void)
-{
- return PERF_REGS_MASK;
-}
+void arch__intr_reg_mask(unsigned long *mask) {}
uint64_t arch__user_reg_mask(void)
{
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 9f492568f3b4..52f08498d005 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -283,7 +283,7 @@ const struct sample_reg *arch__sample_reg_masks(void)
return sample_reg_masks;
}
-uint64_t arch__intr_reg_mask(void)
+void arch__intr_reg_mask(unsigned long *mask)
{
struct perf_event_attr attr = {
.type = PERF_TYPE_HARDWARE,
@@ -295,6 +295,9 @@ uint64_t arch__intr_reg_mask(void)
.exclude_kernel = 1,
};
int fd;
+
+ *(u64 *)mask = PERF_REGS_MASK;
+
/*
* In an unnamed union, init it here to build on older gcc versions
*/
@@ -320,10 +323,8 @@ uint64_t arch__intr_reg_mask(void)
fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
if (fd != -1) {
close(fd);
- return (PERF_REG_EXTENDED_MASK | PERF_REGS_MASK);
+ *(u64 *)mask |= PERF_REG_EXTENDED_MASK;
}
-
- return PERF_REGS_MASK;
}
uint64_t arch__user_reg_mask(void)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 9e47905f75a6..66d3923e4040 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -704,10 +704,11 @@ static int perf_session__check_output_opt(struct perf_session *session)
}
static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask, const char *arch,
- FILE *fp)
+ unsigned long *mask_ext, FILE *fp)
{
unsigned i = 0, r;
int printed = 0;
+ u64 val;
if (!regs || !regs->regs)
return 0;
@@ -715,7 +716,15 @@ static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask, cons
printed += fprintf(fp, " ABI:%" PRIu64 " ", regs->abi);
for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
- u64 val = regs->regs[i++];
+ val = regs->regs[i++];
+ printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r, arch), val);
+ }
+
+ if (!mask_ext)
+ return printed;
+
+ for_each_set_bit(r, mask_ext, PERF_NUM_EXT_REGS) {
+ val = regs->regs[i++];
printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r, arch), val);
}
@@ -776,14 +785,16 @@ static int perf_sample__fprintf_iregs(struct perf_sample *sample,
struct perf_event_attr *attr, const char *arch, FILE *fp)
{
return perf_sample__fprintf_regs(&sample->intr_regs,
- attr->sample_regs_intr, arch, fp);
+ attr->sample_regs_intr, arch,
+ (unsigned long *)attr->sample_regs_intr_ext,
+ fp);
}
static int perf_sample__fprintf_uregs(struct perf_sample *sample,
struct perf_event_attr *attr, const char *arch, FILE *fp)
{
return perf_sample__fprintf_regs(&sample->user_regs,
- attr->sample_regs_user, arch, fp);
+ attr->sample_regs_user, arch, NULL, fp);
}
static int perf_sample__fprintf_start(struct perf_script *script,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f745723d486b..297b960ac446 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1314,9 +1314,11 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
if (callchain && callchain->enabled && !evsel->no_aux_samples)
evsel__config_callchain(evsel, opts, callchain);
- if (opts->sample_intr_regs && !evsel->no_aux_samples &&
- !evsel__is_dummy_event(evsel)) {
- attr->sample_regs_intr = opts->sample_intr_regs;
+ if (bitmap_weight(opts->sample_intr_regs, PERF_NUM_INTR_REGS_SIZE) &&
+ !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) {
+ attr->sample_regs_intr = opts->sample_intr_regs[0];
+ memcpy(attr->sample_regs_intr_ext, &opts->sample_intr_regs[1],
+ PERF_NUM_EXT_REGS / 8);
evsel__set_sample_bit(evsel, REGS_INTR);
}
@@ -3097,10 +3099,16 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
u64 mask = evsel->core.attr.sample_regs_intr;
+ unsigned long *mask_ext =
+ (unsigned long *)evsel->core.attr.sample_regs_intr_ext;
+ u64 *intr_regs_mask;
sz = hweight64(mask) * sizeof(u64);
+ sz += bitmap_weight(mask_ext, PERF_NUM_EXT_REGS) * sizeof(u64);
OVERFLOW_CHECK(array, sz, max_size);
data->intr_regs.mask = mask;
+ intr_regs_mask = (u64 *)&data->intr_regs.mask_ext;
+ memcpy(&intr_regs_mask[1], mask_ext, PERF_NUM_EXT_REGS / 8);
data->intr_regs.regs = (u64 *)array;
array = (void *)array + sz;
}
diff --git a/tools/perf/util/parse-regs-options.c b/tools/perf/util/parse-regs-options.c
index cda1c620968e..666c2a172ef2 100644
--- a/tools/perf/util/parse-regs-options.c
+++ b/tools/perf/util/parse-regs-options.c
@@ -12,11 +12,13 @@
static int
__parse_regs(const struct option *opt, const char *str, int unset, bool intr)
{
+ unsigned int size = intr ? PERF_NUM_INTR_REGS * 64 : 64;
uint64_t *mode = (uint64_t *)opt->value;
const struct sample_reg *r = NULL;
char *s, *os = NULL, *p;
int ret = -1;
- uint64_t mask;
+ DECLARE_BITMAP(mask, size);
+ DECLARE_BITMAP(mask_tmp, size);
if (unset)
return 0;
@@ -24,13 +26,14 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
/*
* cannot set it twice
*/
- if (*mode)
+ if (bitmap_weight((unsigned long *)mode, size))
return -1;
+ bitmap_zero(mask, size);
if (intr)
- mask = arch__intr_reg_mask();
+ arch__intr_reg_mask(mask);
else
- mask = arch__user_reg_mask();
+ *(uint64_t *)mask = arch__user_reg_mask();
/* str may be NULL in case no arg is passed to -I */
if (str) {
@@ -47,7 +50,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
if (!strcmp(s, "?")) {
fprintf(stderr, "available registers: ");
for (r = arch__sample_reg_masks(); r->name; r++) {
- if (r->mask & mask)
+ bitmap_and(mask_tmp, mask, r->mask_ext, size);
+ if (bitmap_weight(mask_tmp, size))
fprintf(stderr, "%s ", r->name);
}
fputc('\n', stderr);
@@ -55,7 +59,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
goto error;
}
for (r = arch__sample_reg_masks(); r->name; r++) {
- if ((r->mask & mask) && !strcasecmp(s, r->name))
+ bitmap_and(mask_tmp, mask, r->mask_ext, size);
+ if (bitmap_weight(mask_tmp, size) && !strcasecmp(s, r->name))
break;
}
if (!r || !r->name) {
@@ -64,7 +69,7 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
goto error;
}
- *mode |= r->mask;
+ bitmap_or((unsigned long *)mode, (unsigned long *)mode, r->mask_ext, size);
if (!p)
break;
@@ -75,8 +80,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
ret = 0;
/* default to all possible regs */
- if (*mode == 0)
- *mode = mask;
+ if (!bitmap_weight((unsigned long *)mode, size))
+ bitmap_or((unsigned long *)mode, (unsigned long *)mode, mask, size);
error:
free(os);
return ret;
diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
index 44b90bbf2d07..b36eafc10e84 100644
--- a/tools/perf/util/perf_regs.c
+++ b/tools/perf/util/perf_regs.c
@@ -11,11 +11,6 @@ int __weak arch_sdt_arg_parse_op(char *old_op __maybe_unused,
return SDT_ARG_SKIP;
}
-uint64_t __weak arch__intr_reg_mask(void)
-{
- return 0;
-}
-
uint64_t __weak arch__user_reg_mask(void)
{
return 0;
diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
index f2d0736d65cc..5018b8d040ee 100644
--- a/tools/perf/util/perf_regs.h
+++ b/tools/perf/util/perf_regs.h
@@ -4,18 +4,32 @@
#include <linux/types.h>
#include <linux/compiler.h>
+#include <linux/bitmap.h>
+#include <linux/perf_event.h>
+#include "util/record.h"
struct regs_dump;
struct sample_reg {
const char *name;
- uint64_t mask;
+ union {
+ uint64_t mask;
+ DECLARE_BITMAP(mask_ext, PERF_NUM_INTR_REGS * 64);
+ };
};
#define SMPL_REG_MASK(b) (1ULL << (b))
#define SMPL_REG(n, b) { .name = #n, .mask = SMPL_REG_MASK(b) }
#define SMPL_REG2_MASK(b) (3ULL << (b))
#define SMPL_REG2(n, b) { .name = #n, .mask = SMPL_REG2_MASK(b) }
+#define SMPL_REG_EXT(n, b) \
+ { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0x1ULL << (b % __BITS_PER_LONG) }
+#define SMPL_REG2_EXT(n, b) \
+ { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0x3ULL << (b % __BITS_PER_LONG) }
+#define SMPL_REG4_EXT(n, b) \
+ { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0xfULL << (b % __BITS_PER_LONG) }
+#define SMPL_REG8_EXT(n, b) \
+ { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0xffULL << (b % __BITS_PER_LONG) }
#define SMPL_REG_END { .name = NULL }
enum {
@@ -24,7 +38,7 @@ enum {
};
int arch_sdt_arg_parse_op(char *old_op, char **new_op);
-uint64_t arch__intr_reg_mask(void);
+void arch__intr_reg_mask(unsigned long *mask);
uint64_t arch__user_reg_mask(void);
const struct sample_reg *arch__sample_reg_masks(void);
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index a6566134e09e..16e44a640e57 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -57,7 +57,7 @@ struct record_opts {
unsigned int auxtrace_mmap_pages;
unsigned int user_freq;
u64 branch_stack;
- u64 sample_intr_regs;
+ u64 sample_intr_regs[PERF_NUM_INTR_REGS];
u64 sample_user_regs;
u64 default_interval;
u64 user_interval;
diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
index 70b2c3135555..98c9c4260de6 100644
--- a/tools/perf/util/sample.h
+++ b/tools/perf/util/sample.h
@@ -4,13 +4,17 @@
#include <linux/perf_event.h>
#include <linux/types.h>
+#include <linux/bitmap.h>
/* number of register is bound by the number of bits in regs_dump::mask (64) */
#define PERF_SAMPLE_REGS_CACHE_SIZE (8 * sizeof(u64))
struct regs_dump {
u64 abi;
- u64 mask;
+ union {
+ u64 mask;
+ DECLARE_BITMAP(mask_ext, PERF_NUM_INTR_REGS * 64);
+ };
u64 *regs;
/* Cached values/mask filled by first register access. */
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 507e6cba9545..995f5c2963bc 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -909,12 +909,13 @@ static void branch_stack__printf(struct perf_sample *sample,
}
}
-static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)
+static void regs_dump__printf(bool intr, struct regs_dump *regs, const char *arch)
{
+ unsigned int size = intr ? PERF_NUM_INTR_REGS * 64 : 64;
unsigned rid, i = 0;
- for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
- u64 val = regs[i++];
+ for_each_set_bit(rid, regs->mask_ext, size) {
+ u64 val = regs->regs[i++];
printf(".... %-5s 0x%016" PRIx64 "\n",
perf_reg_name(rid, arch), val);
@@ -935,16 +936,22 @@ static inline const char *regs_dump_abi(struct regs_dump *d)
return regs_abi[d->abi];
}
-static void regs__printf(const char *type, struct regs_dump *regs, const char *arch)
+static void regs__printf(bool intr, struct regs_dump *regs, const char *arch)
{
- u64 mask = regs->mask;
+ if (intr) {
+ u64 *mask = (u64 *)®s->mask_ext;
- printf("... %s regs: mask 0x%" PRIx64 " ABI %s\n",
- type,
- mask,
- regs_dump_abi(regs));
+ printf("... intr regs: mask 0x");
+ for (int i = 0; i < PERF_NUM_INTR_REGS; i++)
+ printf("%" PRIx64 "", mask[i]);
+ printf(" ABI %s\n", regs_dump_abi(regs));
+ } else {
+ printf("... user regs: mask 0x%" PRIx64 " ABI %s\n",
+ regs->mask,
+ regs_dump_abi(regs));
+ }
- regs_dump__printf(mask, regs->regs, arch);
+ regs_dump__printf(intr, regs, arch);
}
static void regs_user__printf(struct perf_sample *sample, const char *arch)
@@ -952,7 +959,7 @@ static void regs_user__printf(struct perf_sample *sample, const char *arch)
struct regs_dump *user_regs = &sample->user_regs;
if (user_regs->regs)
- regs__printf("user", user_regs, arch);
+ regs__printf(false, user_regs, arch);
}
static void regs_intr__printf(struct perf_sample *sample, const char *arch)
@@ -960,7 +967,7 @@ static void regs_intr__printf(struct perf_sample *sample, const char *arch)
struct regs_dump *intr_regs = &sample->intr_regs;
if (intr_regs->regs)
- regs__printf("intr", intr_regs, arch);
+ regs__printf(true, intr_regs, arch);
}
static void stack_user__printf(struct stack_dump *dump)
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index a58444c4aed1..35c5d58aa45f 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -1538,7 +1538,9 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
if (type & PERF_SAMPLE_REGS_INTR) {
if (sample->intr_regs.abi) {
result += sizeof(u64);
- sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
+ sz = bitmap_weight(sample->intr_regs.mask_ext,
+ PERF_NUM_INTR_REGS * 64) *
+ sizeof(u64);
result += sz;
} else {
result += sizeof(u64);
@@ -1741,7 +1743,8 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, u64 read_fo
if (type & PERF_SAMPLE_REGS_INTR) {
if (sample->intr_regs.abi) {
*array++ = sample->intr_regs.abi;
- sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
+ sz = bitmap_weight(sample->intr_regs.mask_ext,
+ PERF_NUM_INTR_REGS * 64) * sizeof(u64);
memcpy(array, sample->intr_regs.regs, sz);
array = (void *)array + sz;
} else {
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 19/20] perf tools: Support to capture more vector registers (x86/Intel part)
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (17 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 18/20] perf tools: Support to capture more vector registers (common part) Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
2025-01-23 14:07 ` [PATCH 20/20] perf tools/tests: Add vector registers PEBS sampling test Dapeng Mi
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Intel architectural PEBS supports capturing more vector registers, like
the OPMASK/YMM/ZMM registers, besides the already supported XMM
registers. This patch adds the Intel specific support to capture these
new vector registers in perf tools.
Besides, add SSP to perf regs. SSP is stored in the general register
group and is selected by sample_regs_intr.
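As a quick reference, the index math behind the SMPL_REG*_EXT entries
and the probe masks below is sketched here; the ext_reg_pos() helper is
illustrative and not part of the patch.
/*
 * An extended register id from asm/perf_regs.h maps to a word/bit
 * position in sample_regs_intr_ext[] after subtracting the extended
 * base (64).
 */
static void ext_reg_pos(int reg, int *word, int *bit)
{
	int ext = reg - 64;	/* extended register ids start at 64 */
	*word = ext / 64;	/* index into sample_regs_intr_ext[] */
	*bit  = ext % 64;	/* bit position within that word */
}
/*
 * e.g. PERF_REG_X86_OPMASK0 (288) -> word 3, bit 32, which matches the
 * GENMASK_ULL(39, 32) OPMASK probe mask applied to index 3 below.
 */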
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
tools/arch/x86/include/uapi/asm/perf_regs.h | 83 +++++++++++++++-
tools/perf/arch/x86/util/perf_regs.c | 99 +++++++++++++++++++
.../perf/util/perf-regs-arch/perf_regs_x86.c | 88 +++++++++++++++++
3 files changed, 269 insertions(+), 1 deletion(-)
diff --git a/tools/arch/x86/include/uapi/asm/perf_regs.h b/tools/arch/x86/include/uapi/asm/perf_regs.h
index 158e353070c3..f723e8bf9963 100644
--- a/tools/arch/x86/include/uapi/asm/perf_regs.h
+++ b/tools/arch/x86/include/uapi/asm/perf_regs.h
@@ -33,7 +33,7 @@ enum perf_event_x86_regs {
PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
- /* These all need two bits set because they are 128bit */
+ /* These all need two bits set because they are 128 bits */
PERF_REG_X86_XMM0 = 32,
PERF_REG_X86_XMM1 = 34,
PERF_REG_X86_XMM2 = 36,
@@ -53,6 +53,87 @@ enum perf_event_x86_regs {
/* These include both GPRs and XMMX registers */
PERF_REG_X86_XMM_MAX = PERF_REG_X86_XMM15 + 2,
+
+ /*
+ * YMM upper bits need two bits set because they are 128 bits.
+ * PERF_REG_X86_YMMH0 = 64
+ */
+ PERF_REG_X86_YMMH0 = PERF_REG_X86_XMM_MAX,
+ PERF_REG_X86_YMMH1 = PERF_REG_X86_YMMH0 + 2,
+ PERF_REG_X86_YMMH2 = PERF_REG_X86_YMMH1 + 2,
+ PERF_REG_X86_YMMH3 = PERF_REG_X86_YMMH2 + 2,
+ PERF_REG_X86_YMMH4 = PERF_REG_X86_YMMH3 + 2,
+ PERF_REG_X86_YMMH5 = PERF_REG_X86_YMMH4 + 2,
+ PERF_REG_X86_YMMH6 = PERF_REG_X86_YMMH5 + 2,
+ PERF_REG_X86_YMMH7 = PERF_REG_X86_YMMH6 + 2,
+ PERF_REG_X86_YMMH8 = PERF_REG_X86_YMMH7 + 2,
+ PERF_REG_X86_YMMH9 = PERF_REG_X86_YMMH8 + 2,
+ PERF_REG_X86_YMMH10 = PERF_REG_X86_YMMH9 + 2,
+ PERF_REG_X86_YMMH11 = PERF_REG_X86_YMMH10 + 2,
+ PERF_REG_X86_YMMH12 = PERF_REG_X86_YMMH11 + 2,
+ PERF_REG_X86_YMMH13 = PERF_REG_X86_YMMH12 + 2,
+ PERF_REG_X86_YMMH14 = PERF_REG_X86_YMMH13 + 2,
+ PERF_REG_X86_YMMH15 = PERF_REG_X86_YMMH14 + 2,
+ PERF_REG_X86_YMMH_MAX = PERF_REG_X86_YMMH15 + 2,
+
+ /*
+ * ZMM0-15 upper bits need four bits set because they are 256 bits
+ * PERF_REG_X86_ZMMH0 = 96
+ */
+ PERF_REG_X86_ZMMH0 = PERF_REG_X86_YMMH_MAX,
+ PERF_REG_X86_ZMMH1 = PERF_REG_X86_ZMMH0 + 4,
+ PERF_REG_X86_ZMMH2 = PERF_REG_X86_ZMMH1 + 4,
+ PERF_REG_X86_ZMMH3 = PERF_REG_X86_ZMMH2 + 4,
+ PERF_REG_X86_ZMMH4 = PERF_REG_X86_ZMMH3 + 4,
+ PERF_REG_X86_ZMMH5 = PERF_REG_X86_ZMMH4 + 4,
+ PERF_REG_X86_ZMMH6 = PERF_REG_X86_ZMMH5 + 4,
+ PERF_REG_X86_ZMMH7 = PERF_REG_X86_ZMMH6 + 4,
+ PERF_REG_X86_ZMMH8 = PERF_REG_X86_ZMMH7 + 4,
+ PERF_REG_X86_ZMMH9 = PERF_REG_X86_ZMMH8 + 4,
+ PERF_REG_X86_ZMMH10 = PERF_REG_X86_ZMMH9 + 4,
+ PERF_REG_X86_ZMMH11 = PERF_REG_X86_ZMMH10 + 4,
+ PERF_REG_X86_ZMMH12 = PERF_REG_X86_ZMMH11 + 4,
+ PERF_REG_X86_ZMMH13 = PERF_REG_X86_ZMMH12 + 4,
+ PERF_REG_X86_ZMMH14 = PERF_REG_X86_ZMMH13 + 4,
+ PERF_REG_X86_ZMMH15 = PERF_REG_X86_ZMMH14 + 4,
+ PERF_REG_X86_ZMMH_MAX = PERF_REG_X86_ZMMH15 + 4,
+
+ /*
+ * ZMM16-31 need eight bits set because they are 512 bits
+ * PERF_REG_X86_ZMM16 = 160
+ */
+ PERF_REG_X86_ZMM16 = PERF_REG_X86_ZMMH_MAX,
+ PERF_REG_X86_ZMM17 = PERF_REG_X86_ZMM16 + 8,
+ PERF_REG_X86_ZMM18 = PERF_REG_X86_ZMM17 + 8,
+ PERF_REG_X86_ZMM19 = PERF_REG_X86_ZMM18 + 8,
+ PERF_REG_X86_ZMM20 = PERF_REG_X86_ZMM19 + 8,
+ PERF_REG_X86_ZMM21 = PERF_REG_X86_ZMM20 + 8,
+ PERF_REG_X86_ZMM22 = PERF_REG_X86_ZMM21 + 8,
+ PERF_REG_X86_ZMM23 = PERF_REG_X86_ZMM22 + 8,
+ PERF_REG_X86_ZMM24 = PERF_REG_X86_ZMM23 + 8,
+ PERF_REG_X86_ZMM25 = PERF_REG_X86_ZMM24 + 8,
+ PERF_REG_X86_ZMM26 = PERF_REG_X86_ZMM25 + 8,
+ PERF_REG_X86_ZMM27 = PERF_REG_X86_ZMM26 + 8,
+ PERF_REG_X86_ZMM28 = PERF_REG_X86_ZMM27 + 8,
+ PERF_REG_X86_ZMM29 = PERF_REG_X86_ZMM28 + 8,
+ PERF_REG_X86_ZMM30 = PERF_REG_X86_ZMM29 + 8,
+ PERF_REG_X86_ZMM31 = PERF_REG_X86_ZMM30 + 8,
+ PERF_REG_X86_ZMM_MAX = PERF_REG_X86_ZMM31 + 8,
+
+ /*
+ * OPMASK Registers
+ * PERF_REG_X86_OPMASK0 = 288
+ */
+ PERF_REG_X86_OPMASK0 = PERF_REG_X86_ZMM_MAX,
+ PERF_REG_X86_OPMASK1 = PERF_REG_X86_OPMASK0 + 1,
+ PERF_REG_X86_OPMASK2 = PERF_REG_X86_OPMASK1 + 1,
+ PERF_REG_X86_OPMASK3 = PERF_REG_X86_OPMASK2 + 1,
+ PERF_REG_X86_OPMASK4 = PERF_REG_X86_OPMASK3 + 1,
+ PERF_REG_X86_OPMASK5 = PERF_REG_X86_OPMASK4 + 1,
+ PERF_REG_X86_OPMASK6 = PERF_REG_X86_OPMASK5 + 1,
+ PERF_REG_X86_OPMASK7 = PERF_REG_X86_OPMASK6 + 1,
+
+ PERF_REG_X86_VEC_MAX = PERF_REG_X86_OPMASK7 + 1,
};
#define PERF_REG_EXTENDED_MASK (~((1ULL << PERF_REG_X86_XMM0) - 1))
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 52f08498d005..e233e6fe2c72 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -54,6 +54,67 @@ static const struct sample_reg sample_reg_masks[] = {
SMPL_REG2(XMM13, PERF_REG_X86_XMM13),
SMPL_REG2(XMM14, PERF_REG_X86_XMM14),
SMPL_REG2(XMM15, PERF_REG_X86_XMM15),
+
+ SMPL_REG2_EXT(YMMH0, PERF_REG_X86_YMMH0),
+ SMPL_REG2_EXT(YMMH1, PERF_REG_X86_YMMH1),
+ SMPL_REG2_EXT(YMMH2, PERF_REG_X86_YMMH2),
+ SMPL_REG2_EXT(YMMH3, PERF_REG_X86_YMMH3),
+ SMPL_REG2_EXT(YMMH4, PERF_REG_X86_YMMH4),
+ SMPL_REG2_EXT(YMMH5, PERF_REG_X86_YMMH5),
+ SMPL_REG2_EXT(YMMH6, PERF_REG_X86_YMMH6),
+ SMPL_REG2_EXT(YMMH7, PERF_REG_X86_YMMH7),
+ SMPL_REG2_EXT(YMMH8, PERF_REG_X86_YMMH8),
+ SMPL_REG2_EXT(YMMH9, PERF_REG_X86_YMMH9),
+ SMPL_REG2_EXT(YMMH10, PERF_REG_X86_YMMH10),
+ SMPL_REG2_EXT(YMMH11, PERF_REG_X86_YMMH11),
+ SMPL_REG2_EXT(YMMH12, PERF_REG_X86_YMMH12),
+ SMPL_REG2_EXT(YMMH13, PERF_REG_X86_YMMH13),
+ SMPL_REG2_EXT(YMMH14, PERF_REG_X86_YMMH14),
+ SMPL_REG2_EXT(YMMH15, PERF_REG_X86_YMMH15),
+
+ SMPL_REG4_EXT(ZMMH0, PERF_REG_X86_ZMMH0),
+ SMPL_REG4_EXT(ZMMH1, PERF_REG_X86_ZMMH1),
+ SMPL_REG4_EXT(ZMMH2, PERF_REG_X86_ZMMH2),
+ SMPL_REG4_EXT(ZMMH3, PERF_REG_X86_ZMMH3),
+ SMPL_REG4_EXT(ZMMH4, PERF_REG_X86_ZMMH4),
+ SMPL_REG4_EXT(ZMMH5, PERF_REG_X86_ZMMH5),
+ SMPL_REG4_EXT(ZMMH6, PERF_REG_X86_ZMMH6),
+ SMPL_REG4_EXT(ZMMH7, PERF_REG_X86_ZMMH7),
+ SMPL_REG4_EXT(ZMMH8, PERF_REG_X86_ZMMH8),
+ SMPL_REG4_EXT(ZMMH9, PERF_REG_X86_ZMMH9),
+ SMPL_REG4_EXT(ZMMH10, PERF_REG_X86_ZMMH10),
+ SMPL_REG4_EXT(ZMMH11, PERF_REG_X86_ZMMH11),
+ SMPL_REG4_EXT(ZMMH12, PERF_REG_X86_ZMMH12),
+ SMPL_REG4_EXT(ZMMH13, PERF_REG_X86_ZMMH13),
+ SMPL_REG4_EXT(ZMMH14, PERF_REG_X86_ZMMH14),
+ SMPL_REG4_EXT(ZMMH15, PERF_REG_X86_ZMMH15),
+
+ SMPL_REG8_EXT(ZMM16, PERF_REG_X86_ZMM16),
+ SMPL_REG8_EXT(ZMM17, PERF_REG_X86_ZMM17),
+ SMPL_REG8_EXT(ZMM18, PERF_REG_X86_ZMM18),
+ SMPL_REG8_EXT(ZMM19, PERF_REG_X86_ZMM19),
+ SMPL_REG8_EXT(ZMM20, PERF_REG_X86_ZMM20),
+ SMPL_REG8_EXT(ZMM21, PERF_REG_X86_ZMM21),
+ SMPL_REG8_EXT(ZMM22, PERF_REG_X86_ZMM22),
+ SMPL_REG8_EXT(ZMM23, PERF_REG_X86_ZMM23),
+ SMPL_REG8_EXT(ZMM24, PERF_REG_X86_ZMM24),
+ SMPL_REG8_EXT(ZMM25, PERF_REG_X86_ZMM25),
+ SMPL_REG8_EXT(ZMM26, PERF_REG_X86_ZMM26),
+ SMPL_REG8_EXT(ZMM27, PERF_REG_X86_ZMM27),
+ SMPL_REG8_EXT(ZMM28, PERF_REG_X86_ZMM28),
+ SMPL_REG8_EXT(ZMM29, PERF_REG_X86_ZMM29),
+ SMPL_REG8_EXT(ZMM30, PERF_REG_X86_ZMM30),
+ SMPL_REG8_EXT(ZMM31, PERF_REG_X86_ZMM31),
+
+ SMPL_REG_EXT(OPMASK0, PERF_REG_X86_OPMASK0),
+ SMPL_REG_EXT(OPMASK1, PERF_REG_X86_OPMASK1),
+ SMPL_REG_EXT(OPMASK2, PERF_REG_X86_OPMASK2),
+ SMPL_REG_EXT(OPMASK3, PERF_REG_X86_OPMASK3),
+ SMPL_REG_EXT(OPMASK4, PERF_REG_X86_OPMASK4),
+ SMPL_REG_EXT(OPMASK5, PERF_REG_X86_OPMASK5),
+ SMPL_REG_EXT(OPMASK6, PERF_REG_X86_OPMASK6),
+ SMPL_REG_EXT(OPMASK7, PERF_REG_X86_OPMASK7),
+
SMPL_REG_END
};
@@ -283,6 +344,32 @@ const struct sample_reg *arch__sample_reg_masks(void)
return sample_reg_masks;
}
+static void check_intr_reg_ext_mask(struct perf_event_attr *attr, int idx,
+ u64 fmask, unsigned long *mask)
+{
+ u64 src_mask[PERF_NUM_INTR_REGS] = { 0 };
+ int fd;
+
+ attr->sample_regs_intr = 0;
+ attr->sample_regs_intr_ext[idx] = fmask;
+ src_mask[idx + 1] = fmask;
+
+ fd = sys_perf_event_open(attr, 0, -1, -1, 0);
+ if (fd != -1) {
+ close(fd);
+ bitmap_or(mask, mask, (unsigned long *)src_mask,
+ PERF_NUM_INTR_REGS * 64);
+ }
+}
+
+#define PERF_REG_EXTENDED_YMMH_MASK GENMASK_ULL(31, 0)
+#define PERF_REG_EXTENDED_ZMMH_1ST_MASK GENMASK_ULL(63, 32)
+#define PERF_REG_EXTENDED_ZMMH_2ND_MASK GENMASK_ULL(31, 0)
+#define PERF_REG_EXTENDED_ZMM_1ST_MASK GENMASK_ULL(63, 32)
+#define PERF_REG_EXTENDED_ZMM_2ND_MASK GENMASK_ULL(63, 0)
+#define PERF_REG_EXTENDED_ZMM_3RD_MASK GENMASK_ULL(31, 0)
+#define PERF_REG_EXTENDED_OPMASK_MASK GENMASK_ULL(39, 32)
+
void arch__intr_reg_mask(unsigned long *mask)
{
struct perf_event_attr attr = {
@@ -325,6 +412,18 @@ void arch__intr_reg_mask(unsigned long *mask)
close(fd);
*(u64 *)mask |= PERF_REG_EXTENDED_MASK;
}
+
+ /* Check YMMH regs */
+ check_intr_reg_ext_mask(&attr, 0, PERF_REG_EXTENDED_YMMH_MASK, mask);
+ /* Check ZMMH0-15 regs */
+ check_intr_reg_ext_mask(&attr, 0, PERF_REG_EXTENDED_ZMMH_1ST_MASK, mask);
+ check_intr_reg_ext_mask(&attr, 1, PERF_REG_EXTENDED_ZMMH_2ND_MASK, mask);
+ /* Check ZMM16-31 regs */
+ check_intr_reg_ext_mask(&attr, 1, PERF_REG_EXTENDED_ZMM_1ST_MASK, mask);
+ check_intr_reg_ext_mask(&attr, 2, PERF_REG_EXTENDED_ZMM_2ND_MASK, mask);
+ check_intr_reg_ext_mask(&attr, 3, PERF_REG_EXTENDED_ZMM_3RD_MASK, mask);
+ /* Check OPMASK regs */
+ check_intr_reg_ext_mask(&attr, 3, PERF_REG_EXTENDED_OPMASK_MASK, mask);
}
uint64_t arch__user_reg_mask(void)
diff --git a/tools/perf/util/perf-regs-arch/perf_regs_x86.c b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
index 9a909f02bc04..c926046ebddc 100644
--- a/tools/perf/util/perf-regs-arch/perf_regs_x86.c
+++ b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
@@ -78,6 +78,94 @@ const char *__perf_reg_name_x86(int id)
XMM(14)
XMM(15)
#undef XMM
+
+#define YMMH(x) \
+ case PERF_REG_X86_YMMH ## x: \
+ case PERF_REG_X86_YMMH ## x + 1: \
+ return "YMMH" #x;
+ YMMH(0)
+ YMMH(1)
+ YMMH(2)
+ YMMH(3)
+ YMMH(4)
+ YMMH(5)
+ YMMH(6)
+ YMMH(7)
+ YMMH(8)
+ YMMH(9)
+ YMMH(10)
+ YMMH(11)
+ YMMH(12)
+ YMMH(13)
+ YMMH(14)
+ YMMH(15)
+#undef YMMH
+
+#define ZMMH(x) \
+ case PERF_REG_X86_ZMMH ## x: \
+ case PERF_REG_X86_ZMMH ## x + 1: \
+ case PERF_REG_X86_ZMMH ## x + 2: \
+ case PERF_REG_X86_ZMMH ## x + 3: \
+ return "ZMMLH" #x;
+ ZMMH(0)
+ ZMMH(1)
+ ZMMH(2)
+ ZMMH(3)
+ ZMMH(4)
+ ZMMH(5)
+ ZMMH(6)
+ ZMMH(7)
+ ZMMH(8)
+ ZMMH(9)
+ ZMMH(10)
+ ZMMH(11)
+ ZMMH(12)
+ ZMMH(13)
+ ZMMH(14)
+ ZMMH(15)
+#undef ZMMH
+
+#define ZMM(x) \
+ case PERF_REG_X86_ZMM ## x: \
+ case PERF_REG_X86_ZMM ## x + 1: \
+ case PERF_REG_X86_ZMM ## x + 2: \
+ case PERF_REG_X86_ZMM ## x + 3: \
+ case PERF_REG_X86_ZMM ## x + 4: \
+ case PERF_REG_X86_ZMM ## x + 5: \
+ case PERF_REG_X86_ZMM ## x + 6: \
+ case PERF_REG_X86_ZMM ## x + 7: \
+ return "ZMM" #x;
+ ZMM(16)
+ ZMM(17)
+ ZMM(18)
+ ZMM(19)
+ ZMM(20)
+ ZMM(21)
+ ZMM(22)
+ ZMM(23)
+ ZMM(24)
+ ZMM(25)
+ ZMM(26)
+ ZMM(27)
+ ZMM(28)
+ ZMM(29)
+ ZMM(30)
+ ZMM(31)
+#undef ZMM
+
+#define OPMASK(x) \
+ case PERF_REG_X86_OPMASK ## x: \
+ return "opmask" #x;
+
+ OPMASK(0)
+ OPMASK(1)
+ OPMASK(2)
+ OPMASK(3)
+ OPMASK(4)
+ OPMASK(5)
+ OPMASK(6)
+ OPMASK(7)
+#undef OPMASK
default:
return NULL;
}
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 20/20] perf tools/tests: Add vector registers PEBS sampling test
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
` (18 preceding siblings ...)
2025-01-23 14:07 ` [PATCH 19/20] perf tools: Support to capture more vector registers (x86/Intel part) Dapeng Mi
@ 2025-01-23 14:07 ` Dapeng Mi
19 siblings, 0 replies; 47+ messages in thread
From: Dapeng Mi @ 2025-01-23 14:07 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
The current adaptive PEBS supports capturing some vector registers, like
the XMM registers, and arch-PEBS supports capturing wider vector
registers, like the YMM and ZMM registers. This patch adds a perf test
case to verify that these vector registers can be captured correctly.
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
tools/perf/tests/shell/record.sh | 55 ++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/tools/perf/tests/shell/record.sh b/tools/perf/tests/shell/record.sh
index 0fc7a909ae9b..521eaa1972f9 100755
--- a/tools/perf/tests/shell/record.sh
+++ b/tools/perf/tests/shell/record.sh
@@ -116,6 +116,60 @@ test_register_capture() {
echo "Register capture test [Success]"
}
+test_vec_register_capture() {
+ echo "Vector register capture test"
+ if ! perf record -o /dev/null --quiet -e instructions:p true 2> /dev/null
+ then
+ echo "Vector register capture test [Skipped missing event]"
+ return
+ fi
+ if ! perf record --intr-regs=\? 2>&1 | grep -q 'XMM0'
+ then
+ echo "Vector register capture test [Skipped missing XMM registers]"
+ return
+ fi
+ if ! perf record -o - --intr-regs=xmm0 -e instructions:p \
+ -c 100000 ${testprog} 2> /dev/null \
+ | perf script -F ip,sym,iregs -i - 2> /dev/null \
+ | grep -q "XMM0:"
+ then
+ echo "Vector register capture test [Failed missing XMM output]"
+ err=1
+ return
+ fi
+ echo "Vector registe (XMM) capture test [Success]"
+ if ! perf record --intr-regs=\? 2>&1 | grep -q 'YMMH0'
+ then
+ echo "Vector register capture test [Skipped missing YMM registers]"
+ return
+ fi
+ if ! perf record -o - --intr-regs=ymmh0 -e instructions:p \
+ -c 100000 ${testprog} 2> /dev/null \
+ | perf script -F ip,sym,iregs -i - 2> /dev/null \
+ | grep -q "YMMH0:"
+ then
+ echo "Vector register capture test [Failed missing YMMH output]"
+ err=1
+ return
+ fi
+ echo "Vector registe (YMM) capture test [Success]"
+ if ! perf record --intr-regs=\? 2>&1 | grep -q 'ZMMH0'
+ then
+ echo "Vector register capture test [Skipped missing ZMM registers]"
+ return
+ fi
+ if ! perf record -o - --intr-regs=zmmh0 -e instructions:p \
+ -c 100000 ${testprog} 2> /dev/null \
+ | perf script -F ip,sym,iregs -i - 2> /dev/null \
+ | grep -q "ZMMH0:"
+ then
+ echo "Vector register capture test [Failed missing ZMMH output]"
+ err=1
+ return
+ fi
+ echo "Vector registe (ZMM) capture test [Success]"
+}
+
test_system_wide() {
echo "Basic --system-wide mode test"
if ! perf record -aB --synth=no -o "${perfdata}" ${testprog} 2> /dev/null
@@ -303,6 +357,7 @@ fi
test_per_thread
test_register_capture
+test_vec_register_capture
test_system_wide
test_workload
test_branch_counter
--
2.40.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH 17/20] perf tools: Support to show SSP register
2025-01-23 14:07 ` [PATCH 17/20] perf tools: Support to show SSP register Dapeng Mi
@ 2025-01-23 16:15 ` Ian Rogers
2025-02-06 2:57 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Ian Rogers @ 2025-01-23 16:15 UTC (permalink / raw)
To: Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, Kan Liang,
Andi Kleen, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi
On Wed, Jan 22, 2025 at 10:21 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Add SSP register support.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> tools/arch/x86/include/uapi/asm/perf_regs.h | 4 +++-
> tools/perf/arch/x86/util/perf_regs.c | 2 ++
> tools/perf/util/intel-pt.c | 2 +-
> tools/perf/util/perf-regs-arch/perf_regs_x86.c | 2 ++
> 4 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tools/arch/x86/include/uapi/asm/perf_regs.h b/tools/arch/x86/include/uapi/asm/perf_regs.h
> index 7c9d2bb3833b..158e353070c3 100644
> --- a/tools/arch/x86/include/uapi/asm/perf_regs.h
> +++ b/tools/arch/x86/include/uapi/asm/perf_regs.h
> @@ -27,9 +27,11 @@ enum perf_event_x86_regs {
> PERF_REG_X86_R13,
> PERF_REG_X86_R14,
> PERF_REG_X86_R15,
> + PERF_REG_X86_SSP,
nit: Would it be worth a comment here? SSP may not be apparent to
everyone. Perhaps something like:
```
/* Shadow stack pointer (SSP) present on Clearwater Forest and newer models. */
```
> /* These are the limits for the GPRs. */
> PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
> - PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
> + PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
> + PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
nit: It's a little peculiar to me the "+1" here - but that's
pre-existing. Perhaps comments above here too:
```
/* The MAX_REG_X86_64 used generally, for PEBS, etc. */
PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
/* The MAX_REG_INTEL_PT ignores the SSP register. */
PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
```
Otherwise:
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
>
> /* These all need two bits set because they are 128bit */
> PERF_REG_X86_XMM0 = 32,
> diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
> index 12fd93f04802..9f492568f3b4 100644
> --- a/tools/perf/arch/x86/util/perf_regs.c
> +++ b/tools/perf/arch/x86/util/perf_regs.c
> @@ -36,6 +36,8 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG(R14, PERF_REG_X86_R14),
> SMPL_REG(R15, PERF_REG_X86_R15),
> #endif
> + SMPL_REG(SSP, PERF_REG_X86_SSP),
> +
> SMPL_REG2(XMM0, PERF_REG_X86_XMM0),
> SMPL_REG2(XMM1, PERF_REG_X86_XMM1),
> SMPL_REG2(XMM2, PERF_REG_X86_XMM2),
> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
> index 30be6dfe09eb..86196275c1e7 100644
> --- a/tools/perf/util/intel-pt.c
> +++ b/tools/perf/util/intel-pt.c
> @@ -2139,7 +2139,7 @@ static u64 *intel_pt_add_gp_regs(struct regs_dump *intr_regs, u64 *pos,
> u32 bit;
> int i;
>
> - for (i = 0, bit = 1; i < PERF_REG_X86_64_MAX; i++, bit <<= 1) {
> + for (i = 0, bit = 1; i < PERF_REG_INTEL_PT_MAX; i++, bit <<= 1) {
> /* Get the PEBS gp_regs array index */
> int n = pebs_gp_regs[i] - 1;
>
> diff --git a/tools/perf/util/perf-regs-arch/perf_regs_x86.c b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
> index 708954a9d35d..9a909f02bc04 100644
> --- a/tools/perf/util/perf-regs-arch/perf_regs_x86.c
> +++ b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
> @@ -54,6 +54,8 @@ const char *__perf_reg_name_x86(int id)
> return "R14";
> case PERF_REG_X86_R15:
> return "R15";
> + case PERF_REG_X86_SSP:
> + return "ssp";
>
> #define XMM(x) \
> case PERF_REG_X86_XMM ## x: \
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 18/20] perf tools: Support to capture more vector registers (common part)
2025-01-23 14:07 ` [PATCH 18/20] perf tools: Support to capture more vector registers (common part) Dapeng Mi
@ 2025-01-23 16:42 ` Ian Rogers
2025-01-27 15:50 ` Liang, Kan
0 siblings, 1 reply; 47+ messages in thread
From: Ian Rogers @ 2025-01-23 16:42 UTC (permalink / raw)
To: Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, Kan Liang,
Andi Kleen, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi
On Wed, Jan 22, 2025 at 10:21 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Intel architectural PEBS supports capturing more vector registers, such as
> OPMASK/YMM/ZMM, in addition to the already supported XMM registers.
>
> arch-PEBS vector register (VECR) capturing in the perf core/PMU driver
> (Intel) was added by the previous patches. This patch adds the perf
> tool side support. In detail, add support for the new
> sample_regs_intr_ext register selector in perf_event_attr. This 32-byte
> bitmap is used to select the new register groups OPMASK, YMMH, ZMMH and
> ZMM in VECR. Update perf regs to introduce the new registers.
>
> This patch only introduces the common support; the x86/Intel specific
> support is added in the next patch.
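If I'm reading the new selector right, on the tool side it amounts to
roughly the following (hypothetical usage sketch; the bit position is a
placeholder, the real indices come from the x86-specific follow-up patch):
```
/* Hypothetical usage sketch: field names from this patch, bit index is a placeholder. */
struct perf_event_attr attr = {
	.type        = PERF_TYPE_HARDWARE,
	.config      = PERF_COUNT_HW_CPU_CYCLES,
	.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_REGS_INTR,
	.precise_ip  = 1,
};
int bit = 0;	/* placeholder: index of e.g. OPMASK0 within the _ext bitmap */

attr.sample_regs_intr_ext[bit / 64] |= 1ULL << (bit % 64);
```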
Could you break down what the individual changes are? I see quite a
few, some in printing, some with functions like arch__intr_reg_mask.
I'm sure the changes are well motivated but there is little detail in
the commit message. Perhaps there is some chance to separate each
change into its own patch. By detail I mean something like, "change
arch__intr_reg_mask to taking a pointer so that REG_MASK and array
initialization is possible."
It is a shame arch__intr_reg_mask doesn't match arch__user_reg_mask
following this change. Perhaps update them both for the sake of
consistency.
Out of scope here, I wonder in general how we can get this code out of
the arch directory? For example, it would be nice if, say, an arm perf
command running under qemu-user on an x86 host could still compute the
appropriate reg_mask for x86.
Thanks,
Ian
> Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> tools/include/uapi/linux/perf_event.h | 13 +++++++++
> tools/perf/arch/arm/util/perf_regs.c | 5 +---
> tools/perf/arch/arm64/util/perf_regs.c | 5 +---
> tools/perf/arch/csky/util/perf_regs.c | 5 +---
> tools/perf/arch/loongarch/util/perf_regs.c | 5 +---
> tools/perf/arch/mips/util/perf_regs.c | 5 +---
> tools/perf/arch/powerpc/util/perf_regs.c | 9 ++++---
> tools/perf/arch/riscv/util/perf_regs.c | 5 +---
> tools/perf/arch/s390/util/perf_regs.c | 5 +---
> tools/perf/arch/x86/util/perf_regs.c | 9 ++++---
> tools/perf/builtin-script.c | 19 ++++++++++---
> tools/perf/util/evsel.c | 14 +++++++---
> tools/perf/util/parse-regs-options.c | 23 +++++++++-------
> tools/perf/util/perf_regs.c | 5 ----
> tools/perf/util/perf_regs.h | 18 +++++++++++--
> tools/perf/util/record.h | 2 +-
> tools/perf/util/sample.h | 6 ++++-
> tools/perf/util/session.c | 31 +++++++++++++---------
> tools/perf/util/synthetic-events.c | 7 +++--
> 19 files changed, 116 insertions(+), 75 deletions(-)
>
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index 4842c36fdf80..02d8f55f6247 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -379,6 +379,13 @@ enum perf_event_read_format {
> #define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
> #define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
> #define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
> +#define PERF_ATTR_SIZE_VER9 168 /* add: sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE] */
> +
> +#define PERF_EXT_REGS_ARRAY_SIZE 4
> +#define PERF_NUM_EXT_REGS (PERF_EXT_REGS_ARRAY_SIZE * 64)
> +
> +#define PERF_NUM_INTR_REGS (PERF_EXT_REGS_ARRAY_SIZE + 1)
> +#define PERF_NUM_INTR_REGS_SIZE ((PERF_NUM_INTR_REGS) * 64)
>
> /*
> * Hardware event_id to monitor via a performance monitoring event:
> @@ -522,6 +529,12 @@ struct perf_event_attr {
> __u64 sig_data;
>
> __u64 config3; /* extension of config2 */
> +
> + /*
> + * Extension sets of regs to dump for each sample.
> + * See asm/perf_regs.h for details.
> + */
> + __u64 sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE];
> };
>
> /*
> diff --git a/tools/perf/arch/arm/util/perf_regs.c b/tools/perf/arch/arm/util/perf_regs.c
> index f94a0210c7b7..3a3c2779efd4 100644
> --- a/tools/perf/arch/arm/util/perf_regs.c
> +++ b/tools/perf/arch/arm/util/perf_regs.c
> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG_END
> };
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/arm64/util/perf_regs.c b/tools/perf/arch/arm64/util/perf_regs.c
> index 09308665e28a..754bb8423733 100644
> --- a/tools/perf/arch/arm64/util/perf_regs.c
> +++ b/tools/perf/arch/arm64/util/perf_regs.c
> @@ -140,10 +140,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
> return SDT_ARG_VALID;
> }
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/csky/util/perf_regs.c b/tools/perf/arch/csky/util/perf_regs.c
> index 6b1665f41180..9d132150ecb6 100644
> --- a/tools/perf/arch/csky/util/perf_regs.c
> +++ b/tools/perf/arch/csky/util/perf_regs.c
> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG_END
> };
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/loongarch/util/perf_regs.c b/tools/perf/arch/loongarch/util/perf_regs.c
> index f94a0210c7b7..3a3c2779efd4 100644
> --- a/tools/perf/arch/loongarch/util/perf_regs.c
> +++ b/tools/perf/arch/loongarch/util/perf_regs.c
> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG_END
> };
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/mips/util/perf_regs.c b/tools/perf/arch/mips/util/perf_regs.c
> index 6b1665f41180..9d132150ecb6 100644
> --- a/tools/perf/arch/mips/util/perf_regs.c
> +++ b/tools/perf/arch/mips/util/perf_regs.c
> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG_END
> };
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
> index e8e6e6fc6f17..08ab9ed692fb 100644
> --- a/tools/perf/arch/powerpc/util/perf_regs.c
> +++ b/tools/perf/arch/powerpc/util/perf_regs.c
> @@ -186,7 +186,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
> return SDT_ARG_VALID;
> }
>
> -uint64_t arch__intr_reg_mask(void)
> +void arch__intr_reg_mask(unsigned long *mask)
> {
> struct perf_event_attr attr = {
> .type = PERF_TYPE_HARDWARE,
> @@ -198,7 +198,9 @@ uint64_t arch__intr_reg_mask(void)
> };
> int fd;
> u32 version;
> - u64 extended_mask = 0, mask = PERF_REGS_MASK;
> + u64 extended_mask = 0;
> +
> + *(u64 *)mask = PERF_REGS_MASK;
>
> /*
> * Get the PVR value to set the extended
> @@ -223,9 +225,8 @@ uint64_t arch__intr_reg_mask(void)
> fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> if (fd != -1) {
> close(fd);
> - mask |= extended_mask;
> + *(u64 *)mask |= extended_mask;
> }
> - return mask;
> }
>
> uint64_t arch__user_reg_mask(void)
> diff --git a/tools/perf/arch/riscv/util/perf_regs.c b/tools/perf/arch/riscv/util/perf_regs.c
> index 6b1665f41180..9d132150ecb6 100644
> --- a/tools/perf/arch/riscv/util/perf_regs.c
> +++ b/tools/perf/arch/riscv/util/perf_regs.c
> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG_END
> };
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/s390/util/perf_regs.c b/tools/perf/arch/s390/util/perf_regs.c
> index 6b1665f41180..9d132150ecb6 100644
> --- a/tools/perf/arch/s390/util/perf_regs.c
> +++ b/tools/perf/arch/s390/util/perf_regs.c
> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
> SMPL_REG_END
> };
>
> -uint64_t arch__intr_reg_mask(void)
> -{
> - return PERF_REGS_MASK;
> -}
> +void arch__intr_reg_mask(unsigned long *mask) {}
>
> uint64_t arch__user_reg_mask(void)
> {
> diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
> index 9f492568f3b4..52f08498d005 100644
> --- a/tools/perf/arch/x86/util/perf_regs.c
> +++ b/tools/perf/arch/x86/util/perf_regs.c
> @@ -283,7 +283,7 @@ const struct sample_reg *arch__sample_reg_masks(void)
> return sample_reg_masks;
> }
>
> -uint64_t arch__intr_reg_mask(void)
> +void arch__intr_reg_mask(unsigned long *mask)
> {
> struct perf_event_attr attr = {
> .type = PERF_TYPE_HARDWARE,
> @@ -295,6 +295,9 @@ uint64_t arch__intr_reg_mask(void)
> .exclude_kernel = 1,
> };
> int fd;
> +
> + *(u64 *)mask = PERF_REGS_MASK;
> +
> /*
> * In an unnamed union, init it here to build on older gcc versions
> */
> @@ -320,10 +323,8 @@ uint64_t arch__intr_reg_mask(void)
> fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> if (fd != -1) {
> close(fd);
> - return (PERF_REG_EXTENDED_MASK | PERF_REGS_MASK);
> + *(u64 *)mask |= PERF_REG_EXTENDED_MASK;
> }
> -
> - return PERF_REGS_MASK;
> }
>
> uint64_t arch__user_reg_mask(void)
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 9e47905f75a6..66d3923e4040 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -704,10 +704,11 @@ static int perf_session__check_output_opt(struct perf_session *session)
> }
>
> static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask, const char *arch,
> - FILE *fp)
> + unsigned long *mask_ext, FILE *fp)
> {
> unsigned i = 0, r;
> int printed = 0;
> + u64 val;
>
> if (!regs || !regs->regs)
> return 0;
> @@ -715,7 +716,15 @@ static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask, cons
> printed += fprintf(fp, " ABI:%" PRIu64 " ", regs->abi);
>
> for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
> - u64 val = regs->regs[i++];
> + val = regs->regs[i++];
> + printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r, arch), val);
> + }
> +
> + if (!mask_ext)
> + return printed;
> +
> + for_each_set_bit(r, mask_ext, PERF_NUM_EXT_REGS) {
> + val = regs->regs[i++];
> printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r, arch), val);
> }
>
> @@ -776,14 +785,16 @@ static int perf_sample__fprintf_iregs(struct perf_sample *sample,
> struct perf_event_attr *attr, const char *arch, FILE *fp)
> {
> return perf_sample__fprintf_regs(&sample->intr_regs,
> - attr->sample_regs_intr, arch, fp);
> + attr->sample_regs_intr, arch,
> + (unsigned long *)attr->sample_regs_intr_ext,
> + fp);
> }
>
> static int perf_sample__fprintf_uregs(struct perf_sample *sample,
> struct perf_event_attr *attr, const char *arch, FILE *fp)
> {
> return perf_sample__fprintf_regs(&sample->user_regs,
> - attr->sample_regs_user, arch, fp);
> + attr->sample_regs_user, arch, NULL, fp);
> }
>
> static int perf_sample__fprintf_start(struct perf_script *script,
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index f745723d486b..297b960ac446 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1314,9 +1314,11 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
> if (callchain && callchain->enabled && !evsel->no_aux_samples)
> evsel__config_callchain(evsel, opts, callchain);
>
> - if (opts->sample_intr_regs && !evsel->no_aux_samples &&
> - !evsel__is_dummy_event(evsel)) {
> - attr->sample_regs_intr = opts->sample_intr_regs;
> + if (bitmap_weight(opts->sample_intr_regs, PERF_NUM_INTR_REGS_SIZE) &&
> + !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) {
> + attr->sample_regs_intr = opts->sample_intr_regs[0];
> + memcpy(attr->sample_regs_intr_ext, &opts->sample_intr_regs[1],
> + PERF_NUM_EXT_REGS / 8);
> evsel__set_sample_bit(evsel, REGS_INTR);
> }
>
> @@ -3097,10 +3099,16 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>
> if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
> u64 mask = evsel->core.attr.sample_regs_intr;
> + unsigned long *mask_ext =
> + (unsigned long *)evsel->core.attr.sample_regs_intr_ext;
> + u64 *intr_regs_mask;
>
> sz = hweight64(mask) * sizeof(u64);
> + sz += bitmap_weight(mask_ext, PERF_NUM_EXT_REGS) * sizeof(u64);
> OVERFLOW_CHECK(array, sz, max_size);
> data->intr_regs.mask = mask;
> + intr_regs_mask = (u64 *)&data->intr_regs.mask_ext;
> + memcpy(&intr_regs_mask[1], mask_ext, PERF_NUM_EXT_REGS);
> data->intr_regs.regs = (u64 *)array;
> array = (void *)array + sz;
> }
> diff --git a/tools/perf/util/parse-regs-options.c b/tools/perf/util/parse-regs-options.c
> index cda1c620968e..666c2a172ef2 100644
> --- a/tools/perf/util/parse-regs-options.c
> +++ b/tools/perf/util/parse-regs-options.c
> @@ -12,11 +12,13 @@
> static int
> __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
> {
> + unsigned int size = intr ? PERF_NUM_INTR_REGS * 64 : 64;
> uint64_t *mode = (uint64_t *)opt->value;
> const struct sample_reg *r = NULL;
> char *s, *os = NULL, *p;
> int ret = -1;
> - uint64_t mask;
> + DECLARE_BITMAP(mask, size);
> + DECLARE_BITMAP(mask_tmp, size);
>
> if (unset)
> return 0;
> @@ -24,13 +26,14 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
> /*
> * cannot set it twice
> */
> - if (*mode)
> + if (bitmap_weight((unsigned long *)mode, size))
> return -1;
>
> + bitmap_zero(mask, size);
> if (intr)
> - mask = arch__intr_reg_mask();
> + arch__intr_reg_mask(mask);
> else
> - mask = arch__user_reg_mask();
> + *(uint64_t *)mask = arch__user_reg_mask();
>
> /* str may be NULL in case no arg is passed to -I */
> if (str) {
> @@ -47,7 +50,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
> if (!strcmp(s, "?")) {
> fprintf(stderr, "available registers: ");
> for (r = arch__sample_reg_masks(); r->name; r++) {
> - if (r->mask & mask)
> + bitmap_and(mask_tmp, mask, r->mask_ext, size);
> + if (bitmap_weight(mask_tmp, size))
> fprintf(stderr, "%s ", r->name);
> }
> fputc('\n', stderr);
> @@ -55,7 +59,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
> goto error;
> }
> for (r = arch__sample_reg_masks(); r->name; r++) {
> - if ((r->mask & mask) && !strcasecmp(s, r->name))
> + bitmap_and(mask_tmp, mask, r->mask_ext, size);
> + if (bitmap_weight(mask_tmp, size) && !strcasecmp(s, r->name))
> break;
> }
> if (!r || !r->name) {
> @@ -64,7 +69,7 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
> goto error;
> }
>
> - *mode |= r->mask;
> + bitmap_or((unsigned long *)mode, (unsigned long *)mode, r->mask_ext, size);
>
> if (!p)
> break;
> @@ -75,8 +80,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
> ret = 0;
>
> /* default to all possible regs */
> - if (*mode == 0)
> - *mode = mask;
> + if (!bitmap_weight((unsigned long *)mode, size))
> + bitmap_or((unsigned long *)mode, (unsigned long *)mode, mask, size);
> error:
> free(os);
> return ret;
> diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
> index 44b90bbf2d07..b36eafc10e84 100644
> --- a/tools/perf/util/perf_regs.c
> +++ b/tools/perf/util/perf_regs.c
> @@ -11,11 +11,6 @@ int __weak arch_sdt_arg_parse_op(char *old_op __maybe_unused,
> return SDT_ARG_SKIP;
> }
>
> -uint64_t __weak arch__intr_reg_mask(void)
> -{
> - return 0;
> -}
> -
> uint64_t __weak arch__user_reg_mask(void)
> {
> return 0;
> diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
> index f2d0736d65cc..5018b8d040ee 100644
> --- a/tools/perf/util/perf_regs.h
> +++ b/tools/perf/util/perf_regs.h
> @@ -4,18 +4,32 @@
>
> #include <linux/types.h>
> #include <linux/compiler.h>
> +#include <linux/bitmap.h>
> +#include <linux/perf_event.h>
> +#include "util/record.h"
>
> struct regs_dump;
>
> struct sample_reg {
> const char *name;
> - uint64_t mask;
> + union {
> + uint64_t mask;
> + DECLARE_BITMAP(mask_ext, PERF_NUM_INTR_REGS * 64);
> + };
> };
>
> #define SMPL_REG_MASK(b) (1ULL << (b))
> #define SMPL_REG(n, b) { .name = #n, .mask = SMPL_REG_MASK(b) }
> #define SMPL_REG2_MASK(b) (3ULL << (b))
> #define SMPL_REG2(n, b) { .name = #n, .mask = SMPL_REG2_MASK(b) }
> +#define SMPL_REG_EXT(n, b) \
> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0x1ULL << (b % __BITS_PER_LONG) }
> +#define SMPL_REG2_EXT(n, b) \
> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0x3ULL << (b % __BITS_PER_LONG) }
> +#define SMPL_REG4_EXT(n, b) \
> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0xfULL << (b % __BITS_PER_LONG) }
> +#define SMPL_REG8_EXT(n, b) \
> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0xffULL << (b % __BITS_PER_LONG) }
> #define SMPL_REG_END { .name = NULL }
>
> enum {
> @@ -24,7 +38,7 @@ enum {
> };
>
> int arch_sdt_arg_parse_op(char *old_op, char **new_op);
> -uint64_t arch__intr_reg_mask(void);
> +void arch__intr_reg_mask(unsigned long *mask);
> uint64_t arch__user_reg_mask(void);
> const struct sample_reg *arch__sample_reg_masks(void);
>
> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
> index a6566134e09e..16e44a640e57 100644
> --- a/tools/perf/util/record.h
> +++ b/tools/perf/util/record.h
> @@ -57,7 +57,7 @@ struct record_opts {
> unsigned int auxtrace_mmap_pages;
> unsigned int user_freq;
> u64 branch_stack;
> - u64 sample_intr_regs;
> + u64 sample_intr_regs[PERF_NUM_INTR_REGS];
> u64 sample_user_regs;
> u64 default_interval;
> u64 user_interval;
> diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
> index 70b2c3135555..98c9c4260de6 100644
> --- a/tools/perf/util/sample.h
> +++ b/tools/perf/util/sample.h
> @@ -4,13 +4,17 @@
>
> #include <linux/perf_event.h>
> #include <linux/types.h>
> +#include <linux/bitmap.h>
>
> /* number of register is bound by the number of bits in regs_dump::mask (64) */
> #define PERF_SAMPLE_REGS_CACHE_SIZE (8 * sizeof(u64))
>
> struct regs_dump {
> u64 abi;
> - u64 mask;
> + union {
> + u64 mask;
> + DECLARE_BITMAP(mask_ext, PERF_NUM_INTR_REGS * 64);
> + };
> u64 *regs;
>
> /* Cached values/mask filled by first register access. */
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 507e6cba9545..995f5c2963bc 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -909,12 +909,13 @@ static void branch_stack__printf(struct perf_sample *sample,
> }
> }
>
> -static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)
> +static void regs_dump__printf(bool intr, struct regs_dump *regs, const char *arch)
> {
> + unsigned int size = intr ? PERF_NUM_INTR_REGS * 64 : 64;
> unsigned rid, i = 0;
>
> - for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
> - u64 val = regs[i++];
> + for_each_set_bit(rid, regs->mask_ext, size) {
> + u64 val = regs->regs[i++];
>
> printf(".... %-5s 0x%016" PRIx64 "\n",
> perf_reg_name(rid, arch), val);
> @@ -935,16 +936,22 @@ static inline const char *regs_dump_abi(struct regs_dump *d)
> return regs_abi[d->abi];
> }
>
> -static void regs__printf(const char *type, struct regs_dump *regs, const char *arch)
> +static void regs__printf(bool intr, struct regs_dump *regs, const char *arch)
> {
> - u64 mask = regs->mask;
> + if (intr) {
> + u64 *mask = (u64 *)®s->mask_ext;
>
> - printf("... %s regs: mask 0x%" PRIx64 " ABI %s\n",
> - type,
> - mask,
> - regs_dump_abi(regs));
> + printf("... intr regs: mask 0x");
> + for (int i = 0; i < PERF_NUM_INTR_REGS; i++)
> + printf("%" PRIx64 "", mask[i]);
> + printf(" ABI %s\n", regs_dump_abi(regs));
> + } else {
> + printf("... user regs: mask 0x%" PRIx64 " ABI %s\n",
> + regs->mask,
> + regs_dump_abi(regs));
> + }
>
> - regs_dump__printf(mask, regs->regs, arch);
> + regs_dump__printf(intr, regs, arch);
> }
>
> static void regs_user__printf(struct perf_sample *sample, const char *arch)
> @@ -952,7 +959,7 @@ static void regs_user__printf(struct perf_sample *sample, const char *arch)
> struct regs_dump *user_regs = &sample->user_regs;
>
> if (user_regs->regs)
> - regs__printf("user", user_regs, arch);
> + regs__printf(false, user_regs, arch);
> }
>
> static void regs_intr__printf(struct perf_sample *sample, const char *arch)
> @@ -960,7 +967,7 @@ static void regs_intr__printf(struct perf_sample *sample, const char *arch)
> struct regs_dump *intr_regs = &sample->intr_regs;
>
> if (intr_regs->regs)
> - regs__printf("intr", intr_regs, arch);
> + regs__printf(true, intr_regs, arch);
> }
>
> static void stack_user__printf(struct stack_dump *dump)
> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> index a58444c4aed1..35c5d58aa45f 100644
> --- a/tools/perf/util/synthetic-events.c
> +++ b/tools/perf/util/synthetic-events.c
> @@ -1538,7 +1538,9 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
> if (type & PERF_SAMPLE_REGS_INTR) {
> if (sample->intr_regs.abi) {
> result += sizeof(u64);
> - sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
> + sz = bitmap_weight(sample->intr_regs.mask_ext,
> + PERF_NUM_INTR_REGS * 64) *
> + sizeof(u64);
> result += sz;
> } else {
> result += sizeof(u64);
> @@ -1741,7 +1743,8 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, u64 read_fo
> if (type & PERF_SAMPLE_REGS_INTR) {
> if (sample->intr_regs.abi) {
> *array++ = sample->intr_regs.abi;
> - sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
> + sz = bitmap_weight(sample->intr_regs.mask_ext,
> + PERF_NUM_INTR_REGS * 64) * sizeof(u64);
> memcpy(array, sample->intr_regs.regs, sz);
> array = (void *)array + sz;
> } else {
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
2025-01-23 14:07 ` [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs Dapeng Mi
@ 2025-01-23 18:58 ` Andi Kleen
2025-01-27 15:19 ` Liang, Kan
0 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2025-01-23 18:58 UTC (permalink / raw)
To: Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi
> + /*
> + * The archPerfmonExt (0x23) includes an enhanced enumeration of
> + * PMU architectural features with a per-core view. For non-hybrid,
> + * each core has the same PMU capabilities. It's good enough to
> + * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
> + * is used to keep the common capabilities. Still keep the values
> + * from the leaf 0xa. The core specific update will be done later
> + * when a new type is online.
> + */
> + if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
> + update_pmu_cap(NULL);
It seems ugly to have these different code paths. Couldn't non-hybrid
use x86_pmu in the same way? I assume it would be a larger patch.
-Andi
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS
2025-01-23 14:07 ` [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS Dapeng Mi
@ 2025-01-24 5:16 ` Andi Kleen
2025-01-27 15:38 ` Liang, Kan
0 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2025-01-24 5:16 UTC (permalink / raw)
To: Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index f40b03adb5c7..7ed80f01f15d 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -646,6 +646,16 @@ int x86_pmu_hw_config(struct perf_event *event)
> return -EINVAL;
> }
>
> + /* sample_regs_user never support SSP register. */
> + if (unlikely(event->attr.sample_regs_user & BIT_ULL(PERF_REG_X86_SSP)))
> + return -EINVAL;
Why not? It's somewhere.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
2025-01-23 18:58 ` Andi Kleen
@ 2025-01-27 15:19 ` Liang, Kan
2025-01-27 16:44 ` Peter Zijlstra
0 siblings, 1 reply; 47+ messages in thread
From: Liang, Kan @ 2025-01-27 15:19 UTC (permalink / raw)
To: Andi Kleen, Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 2025-01-23 1:58 p.m., Andi Kleen wrote:
>> + /*
>> + * The archPerfmonExt (0x23) includes an enhanced enumeration of
>> + * PMU architectural features with a per-core view. For non-hybrid,
>> + * each core has the same PMU capabilities. It's good enough to
>> + * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
>> + * is used to keep the common capabilities. Still keep the values
>> + * from the leaf 0xa. The core specific update will be done later
>> + * when a new type is online.
>> + */
>> + if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
>> + update_pmu_cap(NULL);
>
> It seems ugly to have these different code paths. Couldn't non hybrid
> use x86_pmu in the same way? I assume it would be a larger patch.
The current non-hybrid path is initialized in intel_pmu_init(), but some
of the initialization code for the hybrid path is in
intel_pmu_cpu_starting(). Yes, it would be better to move them together,
but that would be a larger patch. Since it impacts other features, a
separate patch set would be required.
Thanks,
Kan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS
2025-01-24 5:16 ` Andi Kleen
@ 2025-01-27 15:38 ` Liang, Kan
0 siblings, 0 replies; 47+ messages in thread
From: Liang, Kan @ 2025-01-27 15:38 UTC (permalink / raw)
To: Andi Kleen, Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 2025-01-24 12:16 a.m., Andi Kleen wrote:
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index f40b03adb5c7..7ed80f01f15d 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -646,6 +646,16 @@ int x86_pmu_hw_config(struct perf_event *event)
>> return -EINVAL;
>> }
>>
>> + /* sample_regs_user never support SSP register. */
>> + if (unlikely(event->attr.sample_regs_user & BIT_ULL(PERF_REG_X86_SSP)))
>> + return -EINVAL;
>
> Why not? It's somewhere.
The current REGS_USER only returns the registers in struct pt_regs, and
SSP is not part of it. So it is only supported with REGS_INTR. Is that
enough?
If we want to support SSP with REGS_USER, I think an arch-specific
function would be required early on to bypass perf_sample_regs_user().
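Just to illustrate the idea, something like the below (names and signature
are purely hypothetical, not a proposal of the final interface):
```
/*
 * Hypothetical sketch only: a weak hook the generic code could call for the
 * SSP slot instead of going through perf_sample_regs_user()/pt_regs. x86
 * would override it to read the user shadow stack pointer when available.
 */
u64 __weak arch_perf_user_ssp(void)
{
	return 0;	/* arches without a shadow stack report nothing */
}
```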
Thanks,
Kan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 18/20] perf tools: Support to capture more vector registers (common part)
2025-01-23 16:42 ` Ian Rogers
@ 2025-01-27 15:50 ` Liang, Kan
2025-02-06 3:12 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Liang, Kan @ 2025-01-27 15:50 UTC (permalink / raw)
To: Ian Rogers, Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 2025-01-23 11:42 a.m., Ian Rogers wrote:
> On Wed, Jan 22, 2025 at 10:21 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>>
>> Intel architectural PEBS supports to capture more vector registers like
>> OPMASK/YMM/ZMM registers besides already supported XMM registers.
>>
>> arch-PEBS vector registers (VCER) capturing on perf core/pmu driver
>> (Intel) has been supported by previous patches. This patch adds perf
>> tool's part support. In detail, add support for the new
>> sample_regs_intr_ext register selector in perf_event_attr. This 32 bytes
>> bitmap is used to select the new register group OPMASK, YMMH, ZMMH and
>> ZMM in VECR. Update perf regs to introduce the new registers.
>>
>> This single patch only introduces the common support, x86/intel specific
>> support would be added in next patch.
>
> Could you break down what the individual changes are? I see quite a
> few, some in printing, some with functions like arch__intr_reg_mask.
> I'm sure the changes are well motivated but there is little detail in
> the commit message. Perhaps there is some chance to separate each
> change into its own patch. By detail I mean something like, "change
> arch__intr_reg_mask to taking a pointer so that REG_MASK and array
> initialization is possible."
>
> It is a shame arch__intr_reg_mask doesn't match arch__user_reg_mask
> following this change. Perhaps update them both for the sake of
> consistency.
Yes, it sounds cleaner. The same size but different mask. It may waste
some space but it should be OK.
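e.g. both would then have the same shape (sketch of the prototypes only):
```
/* Sketch: consistent signatures; same bitmap size, different contents. */
void arch__intr_reg_mask(unsigned long *mask);
void arch__user_reg_mask(unsigned long *mask);
```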
>
> Out of scope here, I wonder in general how we can get this code out of
> the arch directory? For example, it would be nice if we have say an
> arm perf command running on qemu-user on an x86 that we perhaps want
> to do the appropriate reg_mask for x86.
Each arch has a different pt_regs. It seems hard to generate a
generic reg list.
Thanks,
Kan
>
> Thanks,
> Ian
>
>> [...]
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map
2025-01-23 14:07 ` [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map Dapeng Mi
@ 2025-01-27 16:07 ` Liang, Kan
2025-02-06 2:47 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Liang, Kan @ 2025-01-27 16:07 UTC (permalink / raw)
To: Dapeng Mi, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi
On 2025-01-23 9:07 a.m., Dapeng Mi wrote:
> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
> sampling and precise distribution PEBS sampling. Thus PEBS constraints
> can be dynamically configured based on these counter and precise
> distribution bitmaps instead of being defined statically.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
> arch/x86/events/intel/ds.c | 1 +
> 2 files changed, 21 insertions(+)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 7775e1e1c1e9..0f1be36113fa 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3728,6 +3728,7 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> struct perf_event *event)
> {
> struct event_constraint *c1, *c2;
> + struct pmu *pmu = event->pmu;
>
> c1 = cpuc->event_constraint[idx];
>
> @@ -3754,6 +3755,25 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> c2->weight = hweight64(c2->idxmsk64);
> }
>
> + if (x86_pmu.arch_pebs && event->attr.precise_ip) {
> + u64 pebs_cntrs_mask;
> + u64 cntrs_mask;
> +
> + if (event->attr.precise_ip >= 3)
> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).pdists;
> + else
> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).counters;
> +
> + cntrs_mask = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
> + hybrid(pmu, cntr_mask64);
> +
> + if (pebs_cntrs_mask != cntrs_mask) {
> + c2 = dyn_constraint(cpuc, c2, idx);
> + c2->idxmsk64 &= pebs_cntrs_mask;
> + c2->weight = hweight64(c2->idxmsk64);
> + }
> + }
The pebs_cntrs_mask and cntrs_mask don't change after the machine boots,
so it isn't efficient to recalculate them on every invocation.
Maybe add a local pebs_event_constraints_pdist[] and update both
pebs_event_constraints[] and pebs_event_constraints_pdist[] with the
enumerated mask at initialization time.
Then update intel_pebs_constraints() to use the corresponding array
according to the precise_ip.
That would avoid the code above.
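Roughly along these lines (sketch of the direction only, flag handling etc.
omitted; pebs_event_constraints_pdist[] is the hypothetical table suggested
above, with both tables fixed up from the enumerated masks at init time):
```
/* Sketch: pick the constraint table by the requested precision. */
static struct event_constraint *
intel_pebs_constraints(struct perf_event *event)
{
	struct event_constraint *table, *c;

	table = event->attr.precise_ip >= 3 ? pebs_event_constraints_pdist :
					      pebs_event_constraints;

	for_each_event_constraint(c, table) {
		if (constraint_match(c, event->hw.config))
			return c;
	}

	return &emptyconstraint;
}
```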
Thanks,
Kan
> +
> return c2;
> }
>
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 2f2c6b7c801b..a573ce0e576a 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -2941,6 +2941,7 @@ static void __init intel_arch_pebs_init(void)
> x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
> x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
> x86_pmu.pebs_capable = ~0ULL;
> + x86_pmu.flags |= PMU_FL_PEBS_ALL;
> }
>
> /*
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 01/20] perf/x86/intel: Add PMU support for Clearwater Forest
2025-01-23 14:07 ` [PATCH 01/20] perf/x86/intel: Add PMU support " Dapeng Mi
@ 2025-01-27 16:26 ` Peter Zijlstra
2025-02-06 1:31 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2025-01-27 16:26 UTC (permalink / raw)
To: Dapeng Mi
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On Thu, Jan 23, 2025 at 02:07:02PM +0000, Dapeng Mi wrote:
> From PMU's perspective, Clearwater Forest is similar to the previous
> generation Sierra Forest.
>
> The key differences are the arch-PEBS feature and the 3 newly added fixed
> counters for the topdown L1 metrics events.
>
> Arch-PEBS is supported by the following patches. This patch provides
> support for the basic perfmon features and the 3 newly added fixed counters.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> arch/x86/events/intel/core.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index b140c1473a9d..5e8521a54474 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2220,6 +2220,18 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
> EVENT_EXTRA_END
> };
>
> +EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01");
> +EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02");
> +EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02");
> +
> +static struct attribute *skt_events_attrs[] = {
> + EVENT_PTR(td_fe_bound_skt),
> + EVENT_PTR(td_retiring_skt),
> + EVENT_PTR(td_bad_spec_cmt),
> + EVENT_PTR(td_be_bound_skt),
> + NULL,
> +};
The skt here is skymont, which is what Sierra Forest was based on, and
you just said that these counters are new with Darkmont, and as such the
lot should be called: dmt or whatever the proper trigraph is.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
2025-01-23 14:07 ` [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF Dapeng Mi
@ 2025-01-27 16:29 ` Peter Zijlstra
2025-01-27 16:43 ` Liang, Kan
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2025-01-27 16:29 UTC (permalink / raw)
To: Dapeng Mi
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
stable
On Thu, Jan 23, 2025 at 02:07:03PM +0000, Dapeng Mi wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
>
> The EAX of the CPUID Leaf 023H enumerates the mask of valid sub-leaves.
> To tell the availability of sub-leaf 1 (which enumerates the counter mask),
> perf should check bit 1 (0x2) of EAX, rather than bit 0 (0x1).
>
> The error is not user-visible on bare metal because sub-leaf 0 and
> sub-leaf 1 are always available. However, it may bring issues in a
> virtualization environment when a VMM only enumerates sub-leaf 0.
>
> Fixes: eb467aaac21e ("perf/x86/intel: Support Architectural PerfMon Extension leaf")
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
> arch/x86/events/intel/core.c | 4 ++--
> arch/x86/include/asm/perf_event.h | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 5e8521a54474..12eb96219740 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4966,8 +4966,8 @@ static void update_pmu_cap(struct x86_hybrid_pmu *pmu)
> if (ebx & ARCH_PERFMON_EXT_EQ)
> pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;
>
> - if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF_BIT) {
> - cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
> + if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF) {
> + cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF_BIT,
> &eax, &ebx, &ecx, &edx);
> pmu->cntr_mask64 = eax;
> pmu->fixed_cntr_mask64 = ebx;
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index adaeb8ca3a8a..71e2ae021374 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -197,7 +197,7 @@ union cpuid10_edx {
> #define ARCH_PERFMON_EXT_UMASK2 0x1
> #define ARCH_PERFMON_EXT_EQ 0x2
> #define ARCH_PERFMON_NUM_COUNTER_LEAF_BIT 0x1
> -#define ARCH_PERFMON_NUM_COUNTER_LEAF 0x1
> +#define ARCH_PERFMON_NUM_COUNTER_LEAF BIT(ARCH_PERFMON_NUM_COUNTER_LEAF_BIT)
if you'll look around, you'll note this file uses BIT_ULL(), please stay
consistent.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
2025-01-27 16:29 ` Peter Zijlstra
@ 2025-01-27 16:43 ` Liang, Kan
2025-01-27 21:29 ` Peter Zijlstra
0 siblings, 1 reply; 47+ messages in thread
From: Liang, Kan @ 2025-01-27 16:43 UTC (permalink / raw)
To: Peter Zijlstra, Dapeng Mi
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
linux-kernel, linux-perf-users, Dapeng Mi, stable
On 2025-01-27 11:29 a.m., Peter Zijlstra wrote:
> On Thu, Jan 23, 2025 at 02:07:03PM +0000, Dapeng Mi wrote:
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The EAX of the CPUID Leaf 023H enumerates the mask of valid sub-leaves.
>> To tell the availability of sub-leaf 1 (which enumerates the counter mask),
>> perf should check bit 1 (0x2) of EAX, rather than bit 0 (0x1).
>>
>> The error is not user-visible on bare metal because sub-leaf 0 and
>> sub-leaf 1 are always available. However, it may bring issues in a
>> virtualization environment when a VMM only enumerates sub-leaf 0.
>>
>> Fixes: eb467aaac21e ("perf/x86/intel: Support Architectural PerfMon Extension leaf")
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> Cc: stable@vger.kernel.org
>> ---
>> arch/x86/events/intel/core.c | 4 ++--
>> arch/x86/include/asm/perf_event.h | 2 +-
>> 2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 5e8521a54474..12eb96219740 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4966,8 +4966,8 @@ static void update_pmu_cap(struct x86_hybrid_pmu *pmu)
>> if (ebx & ARCH_PERFMON_EXT_EQ)
>> pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;
>>
>> - if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF_BIT) {
>> - cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
>> + if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF) {
>> + cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF_BIT,
>> &eax, &ebx, &ecx, &edx);
>> pmu->cntr_mask64 = eax;
>> pmu->fixed_cntr_mask64 = ebx;
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index adaeb8ca3a8a..71e2ae021374 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -197,7 +197,7 @@ union cpuid10_edx {
>> #define ARCH_PERFMON_EXT_UMASK2 0x1
>> #define ARCH_PERFMON_EXT_EQ 0x2
>> #define ARCH_PERFMON_NUM_COUNTER_LEAF_BIT 0x1
>> -#define ARCH_PERFMON_NUM_COUNTER_LEAF 0x1
>> +#define ARCH_PERFMON_NUM_COUNTER_LEAF BIT(ARCH_PERFMON_NUM_COUNTER_LEAF_BIT)
>
> if you'll look around, you'll note this file uses BIT_ULL(), please stay
> consistent.
But they are used for a 64-bit register.
The ARCH_PERFMON_NUM_COUNTER_LEAF is for the CPUID enumeration, which is
a u32.
Thanks,
Kan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
2025-01-27 15:19 ` Liang, Kan
@ 2025-01-27 16:44 ` Peter Zijlstra
2025-02-06 2:09 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2025-01-27 16:44 UTC (permalink / raw)
To: Liang, Kan
Cc: Andi Kleen, Dapeng Mi, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On Mon, Jan 27, 2025 at 10:19:34AM -0500, Liang, Kan wrote:
>
>
> On 2025-01-23 1:58 p.m., Andi Kleen wrote:
> >> + /*
> >> + * The archPerfmonExt (0x23) includes an enhanced enumeration of
> >> + * PMU architectural features with a per-core view. For non-hybrid,
> >> + * each core has the same PMU capabilities. It's good enough to
> >> + * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
> >> + * is used to keep the common capabilities. Still keep the values
> >> + * from the leaf 0xa. The core specific update will be done later
> >> + * when a new type is online.
> >> + */
> >> + if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
> >> + update_pmu_cap(NULL);
> >
> > It seems ugly to have these different code paths. Couldn't non hybrid
> > use x86_pmu in the same way? I assume it would be a larger patch.
>
> The current non-hybrid is initialized in the intel_pmu_init(). But some
> of the initialization code for the hybrid is in the
> intel_pmu_cpu_starting(). Yes, it's better to move it together. It
> should be a larger patch. Since it impacts other features, a separate
> patch set should be required.
IIRC the problem was that there were SKUs with the same FMS that were
both hybrid and non-hybrid and we wouldn't know until we brought up the
CPUs.
Thomas rewrote the topology bits since, so maybe we can do better these
days.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
2025-01-27 16:43 ` Liang, Kan
@ 2025-01-27 21:29 ` Peter Zijlstra
2025-01-28 0:28 ` Liang, Kan
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2025-01-27 21:29 UTC (permalink / raw)
To: Liang, Kan
Cc: Dapeng Mi, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Ian Rogers, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
stable
On Mon, Jan 27, 2025 at 11:43:53AM -0500, Liang, Kan wrote:
> But they are used for a 64-bit register.
> The ARCH_PERFMON_NUM_COUNTER_LEAF is for the CPUID enumeration, which is
> a u32.
Ah well, but CPUID should be using unions, no?
We have cpuid10_e[abd]x and cpuid28_e[abc]x, so where's cpuid23_e?x at?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
2025-01-27 21:29 ` Peter Zijlstra
@ 2025-01-28 0:28 ` Liang, Kan
0 siblings, 0 replies; 47+ messages in thread
From: Liang, Kan @ 2025-01-28 0:28 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Dapeng Mi, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Ian Rogers, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
stable
On 2025-01-27 4:29 p.m., Peter Zijlstra wrote:
> On Mon, Jan 27, 2025 at 11:43:53AM -0500, Liang, Kan wrote:
>
>> But they are used for a 64-bit register.
>> The ARCH_PERFMON_NUM_COUNTER_LEAF is for the CPUID enumeration, which is
>> a u32.
>
> Ah well, but CPUID should be using unions, no?
>
> We have cpuid10_e[abd]x and cpuid28_e[abc]x, so where's cpuid23_e?x at?
>
Sure, I will add a cpuid23_e?x to make them consistent.
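Purely for illustration, something along these lines, following the
existing cpuid10_* style (the bit meanings come from the defines above;
the union and field names here are made up):

	union cpuid23_0_ebx {
		struct {
			unsigned int	umask2:1;	/* ARCH_PERFMON_EXT_UMASK2 */
			unsigned int	eq:1;		/* ARCH_PERFMON_EXT_EQ */
			unsigned int	reserved:30;
		} split;
		unsigned int	full;
	};

The sub-leaf 1 outputs (the general and fixed counter bitmaps) are plain
bitmaps, so their unions would mostly just carry a full-width member for
consistency.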
Thanks,
Kan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS
2025-01-23 14:07 ` [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
@ 2025-01-28 11:22 ` Peter Zijlstra
2025-02-06 2:25 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2025-01-28 11:22 UTC (permalink / raw)
To: Dapeng Mi
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On Thu, Jan 23, 2025 at 02:07:07PM +0000, Dapeng Mi wrote:
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index e8a06c8486af..1b33a6a60584 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1537,6 +1537,9 @@ void intel_pmu_pebs_enable(struct perf_event *event)
>
> cpuc->pebs_enabled |= 1ULL << hwc->idx;
>
> + if (x86_pmu.arch_pebs)
> + return;
> +
> if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
> cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
> else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> @@ -1606,6 +1609,11 @@ void intel_pmu_pebs_disable(struct perf_event *event)
>
> cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
>
> + hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
> +
> + if (x86_pmu.arch_pebs)
> + return;
> +
> if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
> (x86_pmu.version < 5))
> cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
> @@ -1616,15 +1624,13 @@ void intel_pmu_pebs_disable(struct perf_event *event)
>
> if (cpuc->enabled)
> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
> -
> - hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
> }
>
> void intel_pmu_pebs_enable_all(void)
> {
> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>
> - if (cpuc->pebs_enabled)
> + if (!x86_pmu.arch_pebs && cpuc->pebs_enabled)
> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
> }
>
> @@ -1632,7 +1638,7 @@ void intel_pmu_pebs_disable_all(void)
> {
> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>
> - if (cpuc->pebs_enabled)
> + if (!x86_pmu.arch_pebs && cpuc->pebs_enabled)
> __intel_pmu_pebs_disable_all();
> }
So there's a ton of if (arch_pebs) sprinkled around. Can't we avoid that
by using a few static_call()s ?
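For illustration only, one possible shape, mirroring the existing
x86_pmu_* static_call wrappers (the intel_pmu_arch_pebs_* names are made
up here, not from this series):

	DEFINE_STATIC_CALL_NULL(x86_pmu_pebs_enable, intel_pmu_pebs_enable);
	DEFINE_STATIC_CALL_NULL(x86_pmu_pebs_disable, intel_pmu_pebs_disable);

	/* Pick the legacy DS or the arch-PEBS implementation once, at init time. */
	static void intel_pmu_pebs_static_call_update(void)
	{
		if (x86_pmu.arch_pebs) {
			static_call_update(x86_pmu_pebs_enable, intel_pmu_arch_pebs_enable);
			static_call_update(x86_pmu_pebs_disable, intel_pmu_arch_pebs_disable);
		} else {
			static_call_update(x86_pmu_pebs_enable, intel_pmu_pebs_enable);
			static_call_update(x86_pmu_pebs_disable, intel_pmu_pebs_disable);
		}
	}

The hot paths would then call e.g. static_call(x86_pmu_pebs_enable)(event)
instead of branching on x86_pmu.arch_pebs.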
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 01/20] perf/x86/intel: Add PMU support for Clearwater Forest
2025-01-27 16:26 ` Peter Zijlstra
@ 2025-02-06 1:31 ` Mi, Dapeng
2025-02-06 7:53 ` Peter Zijlstra
0 siblings, 1 reply; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 1:31 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 1/28/2025 12:26 AM, Peter Zijlstra wrote:
> On Thu, Jan 23, 2025 at 02:07:02PM +0000, Dapeng Mi wrote:
>> From PMU's perspective, Clearwater Forest is similar to the previous
>> generation Sierra Forest.
>>
>> The key differences are the ARCH PEBS feature and the new added 3 fixed
>> counters for topdown L1 metrics events.
>>
>> The ARCH PEBS is supported in the following patches. This patch provides
>> support for basic perfmon features and 3 new added fixed counters.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>> arch/x86/events/intel/core.c | 24 ++++++++++++++++++++++++
>> 1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index b140c1473a9d..5e8521a54474 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -2220,6 +2220,18 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
>> EVENT_EXTRA_END
>> };
>>
>> +EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01");
>> +EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02");
>> +EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02");
>> +
>> +static struct attribute *skt_events_attrs[] = {
>> + EVENT_PTR(td_fe_bound_skt),
>> + EVENT_PTR(td_retiring_skt),
>> + EVENT_PTR(td_bad_spec_cmt),
>> + EVENT_PTR(td_be_bound_skt),
>> + NULL,
>> +};
> The skt here is skymont, which is what Sierra Forest was based on, and
> you just said that these counters are new with Darkmont, and as such the
> lot should be called: dmt or whatever the proper trigraph is.
Sorry for late response since the Chinese new year holiday.
Sierra Forest is based on Crestmont instead of Skymont. The 3 new fixed
counters are introduced from Skymont and Darkmont inherits them. So these
attributes are named with "skt" suffix.
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
2025-01-27 16:44 ` Peter Zijlstra
@ 2025-02-06 2:09 ` Mi, Dapeng
0 siblings, 0 replies; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 2:09 UTC (permalink / raw)
To: Peter Zijlstra, Liang, Kan
Cc: Andi Kleen, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Ian Rogers, Adrian Hunter, Alexander Shishkin, Eranian Stephane,
linux-kernel, linux-perf-users, Dapeng Mi
On 1/28/2025 12:44 AM, Peter Zijlstra wrote:
> On Mon, Jan 27, 2025 at 10:19:34AM -0500, Liang, Kan wrote:
>>
>> On 2025-01-23 1:58 p.m., Andi Kleen wrote:
>>>> + /*
>>>> + * The archPerfmonExt (0x23) includes an enhanced enumeration of
>>>> + * PMU architectural features with a per-core view. For non-hybrid,
>>>> + * each core has the same PMU capabilities. It's good enough to
>>>> + * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
>>>> + * is used to keep the common capabilities. Still keep the values
>>>> + * from the leaf 0xa. The core specific update will be done later
>>>> + * when a new type is online.
>>>> + */
>>>> + if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
>>>> + update_pmu_cap(NULL);
>>> It seems ugly to have these different code paths. Couldn't non hybrid
>>> use x86_pmu in the same way? I assume it would be a larger patch.
>> The current non-hybrid is initialized in the intel_pmu_init(). But some
>> of the initialization code for the hybrid is in the
>> intel_pmu_cpu_starting(). Yes, it's better to move it together. It
>> should be a larger patch. Since it impacts other features, a separate
>> patch set should be required.
> IIRC the problem was that there were SKUs with the same FMS that were
> both hybrid and non-hybrid and we wouldn't know until we brought up the
> CPUs.
>
> Thomas rewrote the topology bits since, so maybe we can do beter these
> days.
This optimization would be a fundamental and large change. As Kan said,
we'd better make it a separate patch series, so it won't block these
arch-PEBS enabling patches.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS
2025-01-28 11:22 ` Peter Zijlstra
@ 2025-02-06 2:25 ` Mi, Dapeng
0 siblings, 0 replies; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 2:25 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 1/28/2025 7:22 PM, Peter Zijlstra wrote:
> On Thu, Jan 23, 2025 at 02:07:07PM +0000, Dapeng Mi wrote:
>
>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index e8a06c8486af..1b33a6a60584 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -1537,6 +1537,9 @@ void intel_pmu_pebs_enable(struct perf_event *event)
>>
>> cpuc->pebs_enabled |= 1ULL << hwc->idx;
>>
>> + if (x86_pmu.arch_pebs)
>> + return;
>> +
>> if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
>> cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
>> else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
>> @@ -1606,6 +1609,11 @@ void intel_pmu_pebs_disable(struct perf_event *event)
>>
>> cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
>>
>> + hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
>> +
>> + if (x86_pmu.arch_pebs)
>> + return;
>> +
>> if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
>> (x86_pmu.version < 5))
>> cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
>> @@ -1616,15 +1624,13 @@ void intel_pmu_pebs_disable(struct perf_event *event)
>>
>> if (cpuc->enabled)
>> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>> -
>> - hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
>> }
>>
>> void intel_pmu_pebs_enable_all(void)
>> {
>> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>
>> - if (cpuc->pebs_enabled)
>> + if (!x86_pmu.arch_pebs && cpuc->pebs_enabled)
>> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>> }
>>
>> @@ -1632,7 +1638,7 @@ void intel_pmu_pebs_disable_all(void)
>> {
>> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>
>> - if (cpuc->pebs_enabled)
>> + if (!x86_pmu.arch_pebs && cpuc->pebs_enabled)
>> __intel_pmu_pebs_disable_all();
>> }
> So there's a ton of if (arch_pebs) sprinkled around. Can't we avoid that
> by using a few static_call()s ?
Sure. Let me try it.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map
2025-01-27 16:07 ` Liang, Kan
@ 2025-02-06 2:47 ` Mi, Dapeng
2025-02-06 15:01 ` Liang, Kan
0 siblings, 1 reply; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 2:47 UTC (permalink / raw)
To: Liang, Kan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi
On 1/28/2025 12:07 AM, Liang, Kan wrote:
>
> On 2025-01-23 9:07 a.m., Dapeng Mi wrote:
>> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
>> sampling and precise distribution PEBS sampling. Thus PEBS constraints
>> can be dynamically configured based on these counter and precise
>> distribution bitmaps instead of being defined statically.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>> arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
>> arch/x86/events/intel/ds.c | 1 +
>> 2 files changed, 21 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 7775e1e1c1e9..0f1be36113fa 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -3728,6 +3728,7 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>> struct perf_event *event)
>> {
>> struct event_constraint *c1, *c2;
>> + struct pmu *pmu = event->pmu;
>>
>> c1 = cpuc->event_constraint[idx];
>>
>> @@ -3754,6 +3755,25 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>> c2->weight = hweight64(c2->idxmsk64);
>> }
>>
>> + if (x86_pmu.arch_pebs && event->attr.precise_ip) {
>> + u64 pebs_cntrs_mask;
>> + u64 cntrs_mask;
>> +
>> + if (event->attr.precise_ip >= 3)
>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).pdists;
>> + else
>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).counters;
>> +
>> + cntrs_mask = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
>> + hybrid(pmu, cntr_mask64);
>> +
>> + if (pebs_cntrs_mask != cntrs_mask) {
>> + c2 = dyn_constraint(cpuc, c2, idx);
>> + c2->idxmsk64 &= pebs_cntrs_mask;
>> + c2->weight = hweight64(c2->idxmsk64);
>> + }
>> + }
> The pebs_cntrs_mask and cntrs_mask won't change after machine boot. I
> don't think it's efficient to calculate them every time.
>
> Maybe add a local pebs_event_constraints_pdist[] and update both
> pebs_event_constraints[] and pebs_event_constraints_pdist[] with the
> enumerated mask at initialization time.
>
> Update intel_pebs_constraints() to utilize the corresponding array
> according to the precise_ip.
>
> The dynamic calculation above could then be avoided.
Even if we have these two arrays, we still need the dynamic constraint, right?
We can't predict what the event is; the event may be mapped to a quite
specific event constraint and we can't know it in advance.
>
> Thanks,
> Kan
>
>> +
>> return c2;
>> }
>>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index 2f2c6b7c801b..a573ce0e576a 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -2941,6 +2941,7 @@ static void __init intel_arch_pebs_init(void)
>> x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>> x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>> x86_pmu.pebs_capable = ~0ULL;
>> + x86_pmu.flags |= PMU_FL_PEBS_ALL;
>> }
>>
>> /*
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 17/20] perf tools: Support to show SSP register
2025-01-23 16:15 ` Ian Rogers
@ 2025-02-06 2:57 ` Mi, Dapeng
0 siblings, 0 replies; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 2:57 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, Kan Liang,
Andi Kleen, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi
On 1/24/2025 12:15 AM, Ian Rogers wrote:
> On Wed, Jan 22, 2025 at 10:21 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> Add SSP register support.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>> tools/arch/x86/include/uapi/asm/perf_regs.h | 4 +++-
>> tools/perf/arch/x86/util/perf_regs.c | 2 ++
>> tools/perf/util/intel-pt.c | 2 +-
>> tools/perf/util/perf-regs-arch/perf_regs_x86.c | 2 ++
>> 4 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/arch/x86/include/uapi/asm/perf_regs.h b/tools/arch/x86/include/uapi/asm/perf_regs.h
>> index 7c9d2bb3833b..158e353070c3 100644
>> --- a/tools/arch/x86/include/uapi/asm/perf_regs.h
>> +++ b/tools/arch/x86/include/uapi/asm/perf_regs.h
>> @@ -27,9 +27,11 @@ enum perf_event_x86_regs {
>> PERF_REG_X86_R13,
>> PERF_REG_X86_R14,
>> PERF_REG_X86_R15,
>> + PERF_REG_X86_SSP,
> nit: Would it be worth a comment here? SSP may not be apparent to
> everyone. Perhaps something like:
> ```
> /* Shadow stack pointer (SSP) present on Clearwater Forest and newer models. */
Sure.
> ```
>> /* These are the limits for the GPRs. */
>> PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
>> - PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
>> + PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
>> + PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
> nit: It's a little peculiar to me the "+1" here - but that's
> pre-existing. Perhaps comments above here too:
> ```
> /* The MAX_REG_X86_64 used generally, for PEBS, etc. */
> PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
> /* The MAX_REG_INTEL_PT ignores the SSP register. */
> PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
> ```
> Otherwise:
> Reviewed-by: Ian Rogers <irogers@google.com>
Sure. Thanks.
>
> Thanks,
> Ian
>
>> /* These all need two bits set because they are 128bit */
>> PERF_REG_X86_XMM0 = 32,
>> diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
>> index 12fd93f04802..9f492568f3b4 100644
>> --- a/tools/perf/arch/x86/util/perf_regs.c
>> +++ b/tools/perf/arch/x86/util/perf_regs.c
>> @@ -36,6 +36,8 @@ static const struct sample_reg sample_reg_masks[] = {
>> SMPL_REG(R14, PERF_REG_X86_R14),
>> SMPL_REG(R15, PERF_REG_X86_R15),
>> #endif
>> + SMPL_REG(SSP, PERF_REG_X86_SSP),
>> +
>> SMPL_REG2(XMM0, PERF_REG_X86_XMM0),
>> SMPL_REG2(XMM1, PERF_REG_X86_XMM1),
>> SMPL_REG2(XMM2, PERF_REG_X86_XMM2),
>> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
>> index 30be6dfe09eb..86196275c1e7 100644
>> --- a/tools/perf/util/intel-pt.c
>> +++ b/tools/perf/util/intel-pt.c
>> @@ -2139,7 +2139,7 @@ static u64 *intel_pt_add_gp_regs(struct regs_dump *intr_regs, u64 *pos,
>> u32 bit;
>> int i;
>>
>> - for (i = 0, bit = 1; i < PERF_REG_X86_64_MAX; i++, bit <<= 1) {
>> + for (i = 0, bit = 1; i < PERF_REG_INTEL_PT_MAX; i++, bit <<= 1) {
>> /* Get the PEBS gp_regs array index */
>> int n = pebs_gp_regs[i] - 1;
>>
>> diff --git a/tools/perf/util/perf-regs-arch/perf_regs_x86.c b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
>> index 708954a9d35d..9a909f02bc04 100644
>> --- a/tools/perf/util/perf-regs-arch/perf_regs_x86.c
>> +++ b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
>> @@ -54,6 +54,8 @@ const char *__perf_reg_name_x86(int id)
>> return "R14";
>> case PERF_REG_X86_R15:
>> return "R15";
>> + case PERF_REG_X86_SSP:
>> + return "ssp";
>>
>> #define XMM(x) \
>> case PERF_REG_X86_XMM ## x: \
>> --
>> 2.40.1
>>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 18/20] perf tools: Support to capture more vector registers (common part)
2025-01-27 15:50 ` Liang, Kan
@ 2025-02-06 3:12 ` Mi, Dapeng
0 siblings, 0 replies; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 3:12 UTC (permalink / raw)
To: Liang, Kan, Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 1/27/2025 11:50 PM, Liang, Kan wrote:
>
> On 2025-01-23 11:42 a.m., Ian Rogers wrote:
>> On Wed, Jan 22, 2025 at 10:21 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>>> Intel architectural PEBS supports capturing more vector registers, like
>>> OPMASK/YMM/ZMM registers, besides the already supported XMM registers.
>>>
>>> arch-PEBS vector register (VECR) capturing in the perf core/PMU driver
>>> (Intel) has been supported by the previous patches. This patch adds the
>>> perf tool side support. In detail, add support for the new
>>> sample_regs_intr_ext register selector in perf_event_attr. This 32-byte
>>> bitmap is used to select the new register groups OPMASK, YMMH, ZMMH and
>>> ZMM in VECR. Update perf regs to introduce the new registers.
>>>
>>> This patch only introduces the common support; the x86/Intel specific
>>> support would be added in the next patch.
>> Could you break down what the individual changes are? I see quite a
>> few, some in printing, some with functions like arch__intr_reg_mask.
>> I'm sure the changes are well motivated but there is little detail in
>> the commit message. Perhaps there is some chance to separate each
>> change into its own patch. By detail I mean something like, "change
>> arch__intr_reg_mask to taking a pointer so that REG_MASK and array
>> initialization is possible."
Sure.
>>
>> It is a shame arch__intr_reg_mask doesn't match arch__user_reg_mask
>> following this change. Perhaps update them both for the sake of
>> consistency.
> Yes, it sounds cleaner. The same size but different mask. It may waste
> some space but it should be OK.
Good idea. Thanks.
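For what it's worth, a small sketch of the symmetric form (illustrative
only, not the final patch):

	/* perf_regs.h: make the two prototypes match */
	void arch__intr_reg_mask(unsigned long *mask);
	void arch__user_reg_mask(unsigned long *mask);

	/* tools/perf/arch/x86/util/perf_regs.c: user side mirrors the intr side */
	void arch__user_reg_mask(unsigned long *mask)
	{
		*(u64 *)mask = PERF_REGS_MASK;	/* user regs still fit in one u64 */
	}

with record_opts::sample_user_regs widened to the same array size as
sample_intr_regs, as noted above.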
>
>> Out of scope here, I wonder in general how we can get this code out of
>> the arch directory? For example, it would be nice if we have say an
>> arm perf command running on qemu-user on an x86 that we perhaps want
>> to do the appropriate reg_mask for x86.
> Different ARCHs have different pt_regs. It seems hard to generate a
> generic reg list.
>
> Thanks,
> Kan
>> Thanks,
>> Ian
>>
>>> Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>> ---
>>> tools/include/uapi/linux/perf_event.h | 13 +++++++++
>>> tools/perf/arch/arm/util/perf_regs.c | 5 +---
>>> tools/perf/arch/arm64/util/perf_regs.c | 5 +---
>>> tools/perf/arch/csky/util/perf_regs.c | 5 +---
>>> tools/perf/arch/loongarch/util/perf_regs.c | 5 +---
>>> tools/perf/arch/mips/util/perf_regs.c | 5 +---
>>> tools/perf/arch/powerpc/util/perf_regs.c | 9 ++++---
>>> tools/perf/arch/riscv/util/perf_regs.c | 5 +---
>>> tools/perf/arch/s390/util/perf_regs.c | 5 +---
>>> tools/perf/arch/x86/util/perf_regs.c | 9 ++++---
>>> tools/perf/builtin-script.c | 19 ++++++++++---
>>> tools/perf/util/evsel.c | 14 +++++++---
>>> tools/perf/util/parse-regs-options.c | 23 +++++++++-------
>>> tools/perf/util/perf_regs.c | 5 ----
>>> tools/perf/util/perf_regs.h | 18 +++++++++++--
>>> tools/perf/util/record.h | 2 +-
>>> tools/perf/util/sample.h | 6 ++++-
>>> tools/perf/util/session.c | 31 +++++++++++++---------
>>> tools/perf/util/synthetic-events.c | 7 +++--
>>> 19 files changed, 116 insertions(+), 75 deletions(-)
>>>
>>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>>> index 4842c36fdf80..02d8f55f6247 100644
>>> --- a/tools/include/uapi/linux/perf_event.h
>>> +++ b/tools/include/uapi/linux/perf_event.h
>>> @@ -379,6 +379,13 @@ enum perf_event_read_format {
>>> #define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
>>> #define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
>>> #define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
>>> +#define PERF_ATTR_SIZE_VER9 168 /* add: sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE] */
>>> +
>>> +#define PERF_EXT_REGS_ARRAY_SIZE 4
>>> +#define PERF_NUM_EXT_REGS (PERF_EXT_REGS_ARRAY_SIZE * 64)
>>> +
>>> +#define PERF_NUM_INTR_REGS (PERF_EXT_REGS_ARRAY_SIZE + 1)
>>> +#define PERF_NUM_INTR_REGS_SIZE ((PERF_NUM_INTR_REGS) * 64)
>>>
>>> /*
>>> * Hardware event_id to monitor via a performance monitoring event:
>>> @@ -522,6 +529,12 @@ struct perf_event_attr {
>>> __u64 sig_data;
>>>
>>> __u64 config3; /* extension of config2 */
>>> +
>>> + /*
>>> + * Extension sets of regs to dump for each sample.
>>> + * See asm/perf_regs.h for details.
>>> + */
>>> + __u64 sample_regs_intr_ext[PERF_EXT_REGS_ARRAY_SIZE];
>>> };
>>>
>>> /*
>>> diff --git a/tools/perf/arch/arm/util/perf_regs.c b/tools/perf/arch/arm/util/perf_regs.c
>>> index f94a0210c7b7..3a3c2779efd4 100644
>>> --- a/tools/perf/arch/arm/util/perf_regs.c
>>> +++ b/tools/perf/arch/arm/util/perf_regs.c
>>> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
>>> SMPL_REG_END
>>> };
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/arm64/util/perf_regs.c b/tools/perf/arch/arm64/util/perf_regs.c
>>> index 09308665e28a..754bb8423733 100644
>>> --- a/tools/perf/arch/arm64/util/perf_regs.c
>>> +++ b/tools/perf/arch/arm64/util/perf_regs.c
>>> @@ -140,10 +140,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
>>> return SDT_ARG_VALID;
>>> }
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/csky/util/perf_regs.c b/tools/perf/arch/csky/util/perf_regs.c
>>> index 6b1665f41180..9d132150ecb6 100644
>>> --- a/tools/perf/arch/csky/util/perf_regs.c
>>> +++ b/tools/perf/arch/csky/util/perf_regs.c
>>> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
>>> SMPL_REG_END
>>> };
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/loongarch/util/perf_regs.c b/tools/perf/arch/loongarch/util/perf_regs.c
>>> index f94a0210c7b7..3a3c2779efd4 100644
>>> --- a/tools/perf/arch/loongarch/util/perf_regs.c
>>> +++ b/tools/perf/arch/loongarch/util/perf_regs.c
>>> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
>>> SMPL_REG_END
>>> };
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/mips/util/perf_regs.c b/tools/perf/arch/mips/util/perf_regs.c
>>> index 6b1665f41180..9d132150ecb6 100644
>>> --- a/tools/perf/arch/mips/util/perf_regs.c
>>> +++ b/tools/perf/arch/mips/util/perf_regs.c
>>> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
>>> SMPL_REG_END
>>> };
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
>>> index e8e6e6fc6f17..08ab9ed692fb 100644
>>> --- a/tools/perf/arch/powerpc/util/perf_regs.c
>>> +++ b/tools/perf/arch/powerpc/util/perf_regs.c
>>> @@ -186,7 +186,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
>>> return SDT_ARG_VALID;
>>> }
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> +void arch__intr_reg_mask(unsigned long *mask)
>>> {
>>> struct perf_event_attr attr = {
>>> .type = PERF_TYPE_HARDWARE,
>>> @@ -198,7 +198,9 @@ uint64_t arch__intr_reg_mask(void)
>>> };
>>> int fd;
>>> u32 version;
>>> - u64 extended_mask = 0, mask = PERF_REGS_MASK;
>>> + u64 extended_mask = 0;
>>> +
>>> + *(u64 *)mask = PERF_REGS_MASK;
>>>
>>> /*
>>> * Get the PVR value to set the extended
>>> @@ -223,9 +225,8 @@ uint64_t arch__intr_reg_mask(void)
>>> fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
>>> if (fd != -1) {
>>> close(fd);
>>> - mask |= extended_mask;
>>> + *(u64 *)mask |= extended_mask;
>>> }
>>> - return mask;
>>> }
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> diff --git a/tools/perf/arch/riscv/util/perf_regs.c b/tools/perf/arch/riscv/util/perf_regs.c
>>> index 6b1665f41180..9d132150ecb6 100644
>>> --- a/tools/perf/arch/riscv/util/perf_regs.c
>>> +++ b/tools/perf/arch/riscv/util/perf_regs.c
>>> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
>>> SMPL_REG_END
>>> };
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/s390/util/perf_regs.c b/tools/perf/arch/s390/util/perf_regs.c
>>> index 6b1665f41180..9d132150ecb6 100644
>>> --- a/tools/perf/arch/s390/util/perf_regs.c
>>> +++ b/tools/perf/arch/s390/util/perf_regs.c
>>> @@ -6,10 +6,7 @@ static const struct sample_reg sample_reg_masks[] = {
>>> SMPL_REG_END
>>> };
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> -{
>>> - return PERF_REGS_MASK;
>>> -}
>>> +void arch__intr_reg_mask(unsigned long *mask) {}
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> {
>>> diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
>>> index 9f492568f3b4..52f08498d005 100644
>>> --- a/tools/perf/arch/x86/util/perf_regs.c
>>> +++ b/tools/perf/arch/x86/util/perf_regs.c
>>> @@ -283,7 +283,7 @@ const struct sample_reg *arch__sample_reg_masks(void)
>>> return sample_reg_masks;
>>> }
>>>
>>> -uint64_t arch__intr_reg_mask(void)
>>> +void arch__intr_reg_mask(unsigned long *mask)
>>> {
>>> struct perf_event_attr attr = {
>>> .type = PERF_TYPE_HARDWARE,
>>> @@ -295,6 +295,9 @@ uint64_t arch__intr_reg_mask(void)
>>> .exclude_kernel = 1,
>>> };
>>> int fd;
>>> +
>>> + *(u64 *)mask = PERF_REGS_MASK;
>>> +
>>> /*
>>> * In an unnamed union, init it here to build on older gcc versions
>>> */
>>> @@ -320,10 +323,8 @@ uint64_t arch__intr_reg_mask(void)
>>> fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
>>> if (fd != -1) {
>>> close(fd);
>>> - return (PERF_REG_EXTENDED_MASK | PERF_REGS_MASK);
>>> + *(u64 *)mask |= PERF_REG_EXTENDED_MASK;
>>> }
>>> -
>>> - return PERF_REGS_MASK;
>>> }
>>>
>>> uint64_t arch__user_reg_mask(void)
>>> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
>>> index 9e47905f75a6..66d3923e4040 100644
>>> --- a/tools/perf/builtin-script.c
>>> +++ b/tools/perf/builtin-script.c
>>> @@ -704,10 +704,11 @@ static int perf_session__check_output_opt(struct perf_session *session)
>>> }
>>>
>>> static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask, const char *arch,
>>> - FILE *fp)
>>> + unsigned long *mask_ext, FILE *fp)
>>> {
>>> unsigned i = 0, r;
>>> int printed = 0;
>>> + u64 val;
>>>
>>> if (!regs || !regs->regs)
>>> return 0;
>>> @@ -715,7 +716,15 @@ static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask, cons
>>> printed += fprintf(fp, " ABI:%" PRIu64 " ", regs->abi);
>>>
>>> for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
>>> - u64 val = regs->regs[i++];
>>> + val = regs->regs[i++];
>>> + printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r, arch), val);
>>> + }
>>> +
>>> + if (!mask_ext)
>>> + return printed;
>>> +
>>> + for_each_set_bit(r, mask_ext, PERF_NUM_EXT_REGS) {
>>> + val = regs->regs[i++];
>>> printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r, arch), val);
>>> }
>>>
>>> @@ -776,14 +785,16 @@ static int perf_sample__fprintf_iregs(struct perf_sample *sample,
>>> struct perf_event_attr *attr, const char *arch, FILE *fp)
>>> {
>>> return perf_sample__fprintf_regs(&sample->intr_regs,
>>> - attr->sample_regs_intr, arch, fp);
>>> + attr->sample_regs_intr, arch,
>>> + (unsigned long *)attr->sample_regs_intr_ext,
>>> + fp);
>>> }
>>>
>>> static int perf_sample__fprintf_uregs(struct perf_sample *sample,
>>> struct perf_event_attr *attr, const char *arch, FILE *fp)
>>> {
>>> return perf_sample__fprintf_regs(&sample->user_regs,
>>> - attr->sample_regs_user, arch, fp);
>>> + attr->sample_regs_user, arch, NULL, fp);
>>> }
>>>
>>> static int perf_sample__fprintf_start(struct perf_script *script,
>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>> index f745723d486b..297b960ac446 100644
>>> --- a/tools/perf/util/evsel.c
>>> +++ b/tools/perf/util/evsel.c
>>> @@ -1314,9 +1314,11 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
>>> if (callchain && callchain->enabled && !evsel->no_aux_samples)
>>> evsel__config_callchain(evsel, opts, callchain);
>>>
>>> - if (opts->sample_intr_regs && !evsel->no_aux_samples &&
>>> - !evsel__is_dummy_event(evsel)) {
>>> - attr->sample_regs_intr = opts->sample_intr_regs;
>>> + if (bitmap_weight(opts->sample_intr_regs, PERF_NUM_INTR_REGS_SIZE) &&
>>> + !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) {
>>> + attr->sample_regs_intr = opts->sample_intr_regs[0];
>>> + memcpy(attr->sample_regs_intr_ext, &opts->sample_intr_regs[1],
>>> + PERF_NUM_EXT_REGS / 8);
>>> evsel__set_sample_bit(evsel, REGS_INTR);
>>> }
>>>
>>> @@ -3097,10 +3099,16 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>>>
>>> if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
>>> u64 mask = evsel->core.attr.sample_regs_intr;
>>> + unsigned long *mask_ext =
>>> + (unsigned long *)evsel->core.attr.sample_regs_intr_ext;
>>> + u64 *intr_regs_mask;
>>>
>>> sz = hweight64(mask) * sizeof(u64);
>>> + sz += bitmap_weight(mask_ext, PERF_NUM_EXT_REGS) * sizeof(u64);
>>> OVERFLOW_CHECK(array, sz, max_size);
>>> data->intr_regs.mask = mask;
>>> + intr_regs_mask = (u64 *)&data->intr_regs.mask_ext;
>>> + memcpy(&intr_regs_mask[1], mask_ext, PERF_NUM_EXT_REGS);
>>> data->intr_regs.regs = (u64 *)array;
>>> array = (void *)array + sz;
>>> }
>>> diff --git a/tools/perf/util/parse-regs-options.c b/tools/perf/util/parse-regs-options.c
>>> index cda1c620968e..666c2a172ef2 100644
>>> --- a/tools/perf/util/parse-regs-options.c
>>> +++ b/tools/perf/util/parse-regs-options.c
>>> @@ -12,11 +12,13 @@
>>> static int
>>> __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
>>> {
>>> + unsigned int size = intr ? PERF_NUM_INTR_REGS * 64 : 64;
>>> uint64_t *mode = (uint64_t *)opt->value;
>>> const struct sample_reg *r = NULL;
>>> char *s, *os = NULL, *p;
>>> int ret = -1;
>>> - uint64_t mask;
>>> + DECLARE_BITMAP(mask, size);
>>> + DECLARE_BITMAP(mask_tmp, size);
>>>
>>> if (unset)
>>> return 0;
>>> @@ -24,13 +26,14 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
>>> /*
>>> * cannot set it twice
>>> */
>>> - if (*mode)
>>> + if (bitmap_weight((unsigned long *)mode, size))
>>> return -1;
>>>
>>> + bitmap_zero(mask, size);
>>> if (intr)
>>> - mask = arch__intr_reg_mask();
>>> + arch__intr_reg_mask(mask);
>>> else
>>> - mask = arch__user_reg_mask();
>>> + *(uint64_t *)mask = arch__user_reg_mask();
>>>
>>> /* str may be NULL in case no arg is passed to -I */
>>> if (str) {
>>> @@ -47,7 +50,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
>>> if (!strcmp(s, "?")) {
>>> fprintf(stderr, "available registers: ");
>>> for (r = arch__sample_reg_masks(); r->name; r++) {
>>> - if (r->mask & mask)
>>> + bitmap_and(mask_tmp, mask, r->mask_ext, size);
>>> + if (bitmap_weight(mask_tmp, size))
>>> fprintf(stderr, "%s ", r->name);
>>> }
>>> fputc('\n', stderr);
>>> @@ -55,7 +59,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
>>> goto error;
>>> }
>>> for (r = arch__sample_reg_masks(); r->name; r++) {
>>> - if ((r->mask & mask) && !strcasecmp(s, r->name))
>>> + bitmap_and(mask_tmp, mask, r->mask_ext, size);
>>> + if (bitmap_weight(mask_tmp, size) && !strcasecmp(s, r->name))
>>> break;
>>> }
>>> if (!r || !r->name) {
>>> @@ -64,7 +69,7 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
>>> goto error;
>>> }
>>>
>>> - *mode |= r->mask;
>>> + bitmap_or((unsigned long *)mode, (unsigned long *)mode, r->mask_ext, size);
>>>
>>> if (!p)
>>> break;
>>> @@ -75,8 +80,8 @@ __parse_regs(const struct option *opt, const char *str, int unset, bool intr)
>>> ret = 0;
>>>
>>> /* default to all possible regs */
>>> - if (*mode == 0)
>>> - *mode = mask;
>>> + if (!bitmap_weight((unsigned long *)mode, size))
>>> + bitmap_or((unsigned long *)mode, (unsigned long *)mode, mask, size);
>>> error:
>>> free(os);
>>> return ret;
>>> diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
>>> index 44b90bbf2d07..b36eafc10e84 100644
>>> --- a/tools/perf/util/perf_regs.c
>>> +++ b/tools/perf/util/perf_regs.c
>>> @@ -11,11 +11,6 @@ int __weak arch_sdt_arg_parse_op(char *old_op __maybe_unused,
>>> return SDT_ARG_SKIP;
>>> }
>>>
>>> -uint64_t __weak arch__intr_reg_mask(void)
>>> -{
>>> - return 0;
>>> -}
>>> -
>>> uint64_t __weak arch__user_reg_mask(void)
>>> {
>>> return 0;
>>> diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
>>> index f2d0736d65cc..5018b8d040ee 100644
>>> --- a/tools/perf/util/perf_regs.h
>>> +++ b/tools/perf/util/perf_regs.h
>>> @@ -4,18 +4,32 @@
>>>
>>> #include <linux/types.h>
>>> #include <linux/compiler.h>
>>> +#include <linux/bitmap.h>
>>> +#include <linux/perf_event.h>
>>> +#include "util/record.h"
>>>
>>> struct regs_dump;
>>>
>>> struct sample_reg {
>>> const char *name;
>>> - uint64_t mask;
>>> + union {
>>> + uint64_t mask;
>>> + DECLARE_BITMAP(mask_ext, PERF_NUM_INTR_REGS * 64);
>>> + };
>>> };
>>>
>>> #define SMPL_REG_MASK(b) (1ULL << (b))
>>> #define SMPL_REG(n, b) { .name = #n, .mask = SMPL_REG_MASK(b) }
>>> #define SMPL_REG2_MASK(b) (3ULL << (b))
>>> #define SMPL_REG2(n, b) { .name = #n, .mask = SMPL_REG2_MASK(b) }
>>> +#define SMPL_REG_EXT(n, b) \
>>> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0x1ULL << (b % __BITS_PER_LONG) }
>>> +#define SMPL_REG2_EXT(n, b) \
>>> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0x3ULL << (b % __BITS_PER_LONG) }
>>> +#define SMPL_REG4_EXT(n, b) \
>>> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0xfULL << (b % __BITS_PER_LONG) }
>>> +#define SMPL_REG8_EXT(n, b) \
>>> + { .name = #n, .mask_ext[b / __BITS_PER_LONG] = 0xffULL << (b % __BITS_PER_LONG) }
>>> #define SMPL_REG_END { .name = NULL }
>>>
>>> enum {
>>> @@ -24,7 +38,7 @@ enum {
>>> };
>>>
>>> int arch_sdt_arg_parse_op(char *old_op, char **new_op);
>>> -uint64_t arch__intr_reg_mask(void);
>>> +void arch__intr_reg_mask(unsigned long *mask);
>>> uint64_t arch__user_reg_mask(void);
>>> const struct sample_reg *arch__sample_reg_masks(void);
>>>
>>> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
>>> index a6566134e09e..16e44a640e57 100644
>>> --- a/tools/perf/util/record.h
>>> +++ b/tools/perf/util/record.h
>>> @@ -57,7 +57,7 @@ struct record_opts {
>>> unsigned int auxtrace_mmap_pages;
>>> unsigned int user_freq;
>>> u64 branch_stack;
>>> - u64 sample_intr_regs;
>>> + u64 sample_intr_regs[PERF_NUM_INTR_REGS];
>>> u64 sample_user_regs;
>>> u64 default_interval;
>>> u64 user_interval;
>>> diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
>>> index 70b2c3135555..98c9c4260de6 100644
>>> --- a/tools/perf/util/sample.h
>>> +++ b/tools/perf/util/sample.h
>>> @@ -4,13 +4,17 @@
>>>
>>> #include <linux/perf_event.h>
>>> #include <linux/types.h>
>>> +#include <linux/bitmap.h>
>>>
>>> /* number of register is bound by the number of bits in regs_dump::mask (64) */
>>> #define PERF_SAMPLE_REGS_CACHE_SIZE (8 * sizeof(u64))
>>>
>>> struct regs_dump {
>>> u64 abi;
>>> - u64 mask;
>>> + union {
>>> + u64 mask;
>>> + DECLARE_BITMAP(mask_ext, PERF_NUM_INTR_REGS * 64);
>>> + };
>>> u64 *regs;
>>>
>>> /* Cached values/mask filled by first register access. */
>>> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
>>> index 507e6cba9545..995f5c2963bc 100644
>>> --- a/tools/perf/util/session.c
>>> +++ b/tools/perf/util/session.c
>>> @@ -909,12 +909,13 @@ static void branch_stack__printf(struct perf_sample *sample,
>>> }
>>> }
>>>
>>> -static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)
>>> +static void regs_dump__printf(bool intr, struct regs_dump *regs, const char *arch)
>>> {
>>> + unsigned int size = intr ? PERF_NUM_INTR_REGS * 64 : 64;
>>> unsigned rid, i = 0;
>>>
>>> - for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
>>> - u64 val = regs[i++];
>>> + for_each_set_bit(rid, regs->mask_ext, size) {
>>> + u64 val = regs->regs[i++];
>>>
>>> printf(".... %-5s 0x%016" PRIx64 "\n",
>>> perf_reg_name(rid, arch), val);
>>> @@ -935,16 +936,22 @@ static inline const char *regs_dump_abi(struct regs_dump *d)
>>> return regs_abi[d->abi];
>>> }
>>>
>>> -static void regs__printf(const char *type, struct regs_dump *regs, const char *arch)
>>> +static void regs__printf(bool intr, struct regs_dump *regs, const char *arch)
>>> {
>>> - u64 mask = regs->mask;
>>> + if (intr) {
>>> + u64 *mask = (u64 *)&regs->mask_ext;
>>>
>>> - printf("... %s regs: mask 0x%" PRIx64 " ABI %s\n",
>>> - type,
>>> - mask,
>>> - regs_dump_abi(regs));
>>> + printf("... intr regs: mask 0x");
>>> + for (int i = 0; i < PERF_NUM_INTR_REGS; i++)
>>> + printf("%" PRIx64 "", mask[i]);
>>> + printf(" ABI %s\n", regs_dump_abi(regs));
>>> + } else {
>>> + printf("... user regs: mask 0x%" PRIx64 " ABI %s\n",
>>> + regs->mask,
>>> + regs_dump_abi(regs));
>>> + }
>>>
>>> - regs_dump__printf(mask, regs->regs, arch);
>>> + regs_dump__printf(intr, regs, arch);
>>> }
>>>
>>> static void regs_user__printf(struct perf_sample *sample, const char *arch)
>>> @@ -952,7 +959,7 @@ static void regs_user__printf(struct perf_sample *sample, const char *arch)
>>> struct regs_dump *user_regs = &sample->user_regs;
>>>
>>> if (user_regs->regs)
>>> - regs__printf("user", user_regs, arch);
>>> + regs__printf(false, user_regs, arch);
>>> }
>>>
>>> static void regs_intr__printf(struct perf_sample *sample, const char *arch)
>>> @@ -960,7 +967,7 @@ static void regs_intr__printf(struct perf_sample *sample, const char *arch)
>>> struct regs_dump *intr_regs = &sample->intr_regs;
>>>
>>> if (intr_regs->regs)
>>> - regs__printf("intr", intr_regs, arch);
>>> + regs__printf(true, intr_regs, arch);
>>> }
>>>
>>> static void stack_user__printf(struct stack_dump *dump)
>>> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
>>> index a58444c4aed1..35c5d58aa45f 100644
>>> --- a/tools/perf/util/synthetic-events.c
>>> +++ b/tools/perf/util/synthetic-events.c
>>> @@ -1538,7 +1538,9 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
>>> if (type & PERF_SAMPLE_REGS_INTR) {
>>> if (sample->intr_regs.abi) {
>>> result += sizeof(u64);
>>> - sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
>>> + sz = bitmap_weight(sample->intr_regs.mask_ext,
>>> + PERF_NUM_INTR_REGS * 64) *
>>> + sizeof(u64);
>>> result += sz;
>>> } else {
>>> result += sizeof(u64);
>>> @@ -1741,7 +1743,8 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, u64 read_fo
>>> if (type & PERF_SAMPLE_REGS_INTR) {
>>> if (sample->intr_regs.abi) {
>>> *array++ = sample->intr_regs.abi;
>>> - sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
>>> + sz = bitmap_weight(sample->intr_regs.mask_ext,
>>> + PERF_NUM_INTR_REGS * 64) * sizeof(u64);
>>> memcpy(array, sample->intr_regs.regs, sz);
>>> array = (void *)array + sz;
>>> } else {
>>> --
>>> 2.40.1
>>>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 01/20] perf/x86/intel: Add PMU support for Clearwater Forest
2025-02-06 1:31 ` Mi, Dapeng
@ 2025-02-06 7:53 ` Peter Zijlstra
2025-02-06 9:35 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2025-02-06 7:53 UTC (permalink / raw)
To: Mi, Dapeng
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On Thu, Feb 06, 2025 at 09:31:46AM +0800, Mi, Dapeng wrote:
>
> On 1/28/2025 12:26 AM, Peter Zijlstra wrote:
> > On Thu, Jan 23, 2025 at 02:07:02PM +0000, Dapeng Mi wrote:
> >> From PMU's perspective, Clearwater Forest is similar to the previous
> >> generation Sierra Forest.
> >>
> >> The key differences are the ARCH PEBS feature and the new added 3 fixed
> >> counters for topdown L1 metrics events.
> >>
> >> The ARCH PEBS is supported in the following patches. This patch provides
> >> support for basic perfmon features and 3 new added fixed counters.
> >>
> >> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> >> ---
> >> arch/x86/events/intel/core.c | 24 ++++++++++++++++++++++++
> >> 1 file changed, 24 insertions(+)
> >>
> >> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> >> index b140c1473a9d..5e8521a54474 100644
> >> --- a/arch/x86/events/intel/core.c
> >> +++ b/arch/x86/events/intel/core.c
> >> @@ -2220,6 +2220,18 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
> >> EVENT_EXTRA_END
> >> };
> >>
> >> +EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01");
> >> +EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02");
> >> +EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02");
> >> +
> >> +static struct attribute *skt_events_attrs[] = {
> >> + EVENT_PTR(td_fe_bound_skt),
> >> + EVENT_PTR(td_retiring_skt),
> >> + EVENT_PTR(td_bad_spec_cmt),
> >> + EVENT_PTR(td_be_bound_skt),
> >> + NULL,
> >> +};
> > The skt here is skymont, which is what Sierra Forest was based on, and
> > you just said that these counters are new with Darkmont, and as such the
> > lot should be called: dmt or whatever the proper trigraph is.
>
> Sorry for late response since the Chinese new year holiday.
>
> Sierra Forest is based on Crestmont instead of Skymont.
I hate all these names :-( But yeah, you're right.
> The 3 new fixed counters are introduced from Skymont and Darkmont
> inherits them. So these attributes are named with "skt" suffix.
Fair enough.
But how come this is new for darkmont and wasn't done for
arrowlake/lunarlake which have skymont based e-cores?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 01/20] perf/x86/intel: Add PMU support for Clearwater Forest
2025-02-06 7:53 ` Peter Zijlstra
@ 2025-02-06 9:35 ` Mi, Dapeng
2025-02-06 9:39 ` Peter Zijlstra
0 siblings, 1 reply; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-06 9:35 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On 2/6/2025 3:53 PM, Peter Zijlstra wrote:
> On Thu, Feb 06, 2025 at 09:31:46AM +0800, Mi, Dapeng wrote:
>> On 1/28/2025 12:26 AM, Peter Zijlstra wrote:
>>> On Thu, Jan 23, 2025 at 02:07:02PM +0000, Dapeng Mi wrote:
>>>> From PMU's perspective, Clearwater Forest is similar to the previous
>>>> generation Sierra Forest.
>>>>
>>>> The key differences are the ARCH PEBS feature and the new added 3 fixed
>>>> counters for topdown L1 metrics events.
>>>>
>>>> The ARCH PEBS is supported in the following patches. This patch provides
>>>> support for basic perfmon features and 3 new added fixed counters.
>>>>
>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>> ---
>>>> arch/x86/events/intel/core.c | 24 ++++++++++++++++++++++++
>>>> 1 file changed, 24 insertions(+)
>>>>
>>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>>> index b140c1473a9d..5e8521a54474 100644
>>>> --- a/arch/x86/events/intel/core.c
>>>> +++ b/arch/x86/events/intel/core.c
>>>> @@ -2220,6 +2220,18 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
>>>> EVENT_EXTRA_END
>>>> };
>>>>
>>>> +EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01");
>>>> +EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02");
>>>> +EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02");
>>>> +
>>>> +static struct attribute *skt_events_attrs[] = {
>>>> + EVENT_PTR(td_fe_bound_skt),
>>>> + EVENT_PTR(td_retiring_skt),
>>>> + EVENT_PTR(td_bad_spec_cmt),
>>>> + EVENT_PTR(td_be_bound_skt),
>>>> + NULL,
>>>> +};
>>> The skt here is skymont, which is what Sierra Forest was based on, and
>>> you just said that these counters are new with Darkmont, and as such the
>>> lot should be called: dmt or whatever the proper trigraph is.
>> Sorry for late response since the Chinese new year holiday.
>>
>> Sierra Forest is based on Crestmont instead of Skymont.
> I hate all these names :-( But yeah, you're right.
>
>> The 3 new fixed counters are introduced from Skymont and Darkmont
>> inherits them. So these attributes are named with "skt" suffix.
> Fair enough.
>
> But how come this is new for darkmont and wasn't done for
> arrowlake/lunarlake which have skymont based e-cores?
ARL/LNL are both hybrid platforms, so all the event attributes are defined
with a mixed P-core/E-core format like below.
EVENT_ATTR_STR_HYBRID(topdown-retiring, td_retiring_lnl,
"event=0xc2,umask=0x02;event=0x00,umask=0x80", hybrid_big_small);
EVENT_ATTR_STR_HYBRID(topdown-fe-bound, td_fe_bound_lnl,
"event=0x9c,umask=0x01;event=0x00,umask=0x82", hybrid_big_small);
EVENT_ATTR_STR_HYBRID(topdown-be-bound, td_be_bound_lnl,
"event=0xa4,umask=0x02;event=0x00,umask=0x83", hybrid_big_small);
static struct attribute *lnl_hybrid_events_attrs[] = {
EVENT_PTR(slots_adl),
EVENT_PTR(td_retiring_lnl),
EVENT_PTR(td_bad_spec_adl),
EVENT_PTR(td_fe_bound_lnl),
EVENT_PTR(td_be_bound_lnl),
EVENT_PTR(td_heavy_ops_adl),
EVENT_PTR(td_br_mis_adl),
EVENT_PTR(td_fetch_lat_adl),
EVENT_PTR(td_mem_bound_adl),
NULL
};
CWF is a pure E-core platform and can't directly use these existing
attributes, so these new attributes are defined.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 01/20] perf/x86/intel: Add PMU support for Clearwater Forest
2025-02-06 9:35 ` Mi, Dapeng
@ 2025-02-06 9:39 ` Peter Zijlstra
0 siblings, 0 replies; 47+ messages in thread
From: Peter Zijlstra @ 2025-02-06 9:39 UTC (permalink / raw)
To: Mi, Dapeng
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On Thu, Feb 06, 2025 at 05:35:28PM +0800, Mi, Dapeng wrote:
> ARL/LNL are both hybrid platforms, so all the event attributes are defined
> in a mixed P-core/E-core format like below.
Ah, and there is no E-core only variant of them, like ADL-N was.
Thanks!
* Re: [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map
2025-02-06 2:47 ` Mi, Dapeng
@ 2025-02-06 15:01 ` Liang, Kan
2025-02-07 1:27 ` Mi, Dapeng
0 siblings, 1 reply; 47+ messages in thread
From: Liang, Kan @ 2025-02-06 15:01 UTC (permalink / raw)
To: Mi, Dapeng, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi
On 2025-02-05 9:47 p.m., Mi, Dapeng wrote:
>
> On 1/28/2025 12:07 AM, Liang, Kan wrote:
>>
>> On 2025-01-23 9:07 a.m., Dapeng Mi wrote:
>>> arch-PEBS provides CPUID leaves to enumerate which counters support PEBS
>>> sampling and which support precise-distribution PEBS sampling. Thus PEBS
>>> constraints can be dynamically configured based on these counter and
>>> precise-distribution bitmaps instead of being defined statically.
>>>
>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>> ---
>>> arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
>>> arch/x86/events/intel/ds.c | 1 +
>>> 2 files changed, 21 insertions(+)
>>>
>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>> index 7775e1e1c1e9..0f1be36113fa 100644
>>> --- a/arch/x86/events/intel/core.c
>>> +++ b/arch/x86/events/intel/core.c
>>> @@ -3728,6 +3728,7 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>> struct perf_event *event)
>>> {
>>> struct event_constraint *c1, *c2;
>>> + struct pmu *pmu = event->pmu;
>>>
>>> c1 = cpuc->event_constraint[idx];
>>>
>>> @@ -3754,6 +3755,25 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>> c2->weight = hweight64(c2->idxmsk64);
>>> }
>>>
>>> + if (x86_pmu.arch_pebs && event->attr.precise_ip) {
>>> + u64 pebs_cntrs_mask;
>>> + u64 cntrs_mask;
>>> +
>>> + if (event->attr.precise_ip >= 3)
>>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).pdists;
>>> + else
>>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).counters;
>>> +
>>> + cntrs_mask = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
>>> + hybrid(pmu, cntr_mask64);
>>> +
>>> + if (pebs_cntrs_mask != cntrs_mask) {
>>> + c2 = dyn_constraint(cpuc, c2, idx);
>>> + c2->idxmsk64 &= pebs_cntrs_mask;
>>> + c2->weight = hweight64(c2->idxmsk64);
>>> + }
>>> + }
>> The pebs_cntrs_mask and cntrs_mask won't change after the machine boots.
>> I don't think it's efficient to calculate them every time.
>>
>> Maybe add a local pebs_event_constraints_pdist[] and update both
>> pebs_event_constraints[] and pebs_event_constraints_pdist[] with the
>> enumerated masks at initialization time.
>>
>> Update intel_pebs_constraints() to use the corresponding array
>> according to precise_ip.
>>
>> Then the code above could be avoided.
>
> Even if we have these two arrays, we still need the dynamic constraint, right?
> We can't predict what the event is; the event may be mapped to a quite
> specific event constraint, and we can't know that in advance.
The dynamic constraint is not necessary, but two arrays seem not
enough, because a PEBS event may fall back to event_constraints[] as
well. Sigh.
Four arrays would be required: pebs_event_constraints[],
pebs_event_constraints_pdist[], event_constraints_for_pebs[] and
event_constraints_for_pdist_pebs[].
But that seems too complicated; it may not be worth implementing now.

But at least the pebs_cntrs_mask and cntrs_mask can be calculated once,
in hw_config() or even intel_pmu_init(). They should not be
calculated every time in the critical path.
Thanks,
Kan
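A minimal sketch of the precomputation suggested above, using hypothetical
struct and helper names added only to illustrate the idea (the real series
may cache these masks differently): compute the masks once at init time so
the constraint path only has to pick the right one and compare.

/* Hypothetical cache, filled once from intel_pmu_init() or hw_config(). */
struct arch_pebs_cntr_masks {
	u64	pebs_cntrs;	/* counters supporting PEBS sampling */
	u64	pdist_cntrs;	/* counters supporting precise distribution */
	u64	all_cntrs;	/* fixed mask << INTEL_PMC_IDX_FIXED | GP mask */
};

static void intel_pmu_cache_pebs_cntr_masks(struct pmu *pmu,
					    struct arch_pebs_cntr_masks *m)
{
	m->pebs_cntrs  = hybrid(pmu, arch_pebs_cap).counters;
	m->pdist_cntrs = hybrid(pmu, arch_pebs_cap).pdists;
	m->all_cntrs   = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
			 hybrid(pmu, cntr_mask64);
}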
>
>
>>
>> Thanks,
>> Kan
>>
>>> +
>>> return c2;
>>> }
>>>
>>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>>> index 2f2c6b7c801b..a573ce0e576a 100644
>>> --- a/arch/x86/events/intel/ds.c
>>> +++ b/arch/x86/events/intel/ds.c
>>> @@ -2941,6 +2941,7 @@ static void __init intel_arch_pebs_init(void)
>>> x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>>> x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>>> x86_pmu.pebs_capable = ~0ULL;
>>> + x86_pmu.flags |= PMU_FL_PEBS_ALL;
>>> }
>>>
>>> /*
>
* Re: [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map
2025-02-06 15:01 ` Liang, Kan
@ 2025-02-07 1:27 ` Mi, Dapeng
0 siblings, 0 replies; 47+ messages in thread
From: Mi, Dapeng @ 2025-02-07 1:27 UTC (permalink / raw)
To: Liang, Kan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi
On 2/6/2025 11:01 PM, Liang, Kan wrote:
>
> On 2025-02-05 9:47 p.m., Mi, Dapeng wrote:
>> On 1/28/2025 12:07 AM, Liang, Kan wrote:
>>> On 2025-01-23 9:07 a.m., Dapeng Mi wrote:
>>>> arch-PEBS provides CPUID leaves to enumerate which counters support PEBS
>>>> sampling and which support precise-distribution PEBS sampling. Thus PEBS
>>>> constraints can be dynamically configured based on these counter and
>>>> precise-distribution bitmaps instead of being defined statically.
>>>>
>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>> ---
>>>> arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
>>>> arch/x86/events/intel/ds.c | 1 +
>>>> 2 files changed, 21 insertions(+)
>>>>
>>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>>> index 7775e1e1c1e9..0f1be36113fa 100644
>>>> --- a/arch/x86/events/intel/core.c
>>>> +++ b/arch/x86/events/intel/core.c
>>>> @@ -3728,6 +3728,7 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>>> struct perf_event *event)
>>>> {
>>>> struct event_constraint *c1, *c2;
>>>> + struct pmu *pmu = event->pmu;
>>>>
>>>> c1 = cpuc->event_constraint[idx];
>>>>
>>>> @@ -3754,6 +3755,25 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>>> c2->weight = hweight64(c2->idxmsk64);
>>>> }
>>>>
>>>> + if (x86_pmu.arch_pebs && event->attr.precise_ip) {
>>>> + u64 pebs_cntrs_mask;
>>>> + u64 cntrs_mask;
>>>> +
>>>> + if (event->attr.precise_ip >= 3)
>>>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).pdists;
>>>> + else
>>>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).counters;
>>>> +
>>>> + cntrs_mask = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
>>>> + hybrid(pmu, cntr_mask64);
>>>> +
>>>> + if (pebs_cntrs_mask != cntrs_mask) {
>>>> + c2 = dyn_constraint(cpuc, c2, idx);
>>>> + c2->idxmsk64 &= pebs_cntrs_mask;
>>>> + c2->weight = hweight64(c2->idxmsk64);
>>>> + }
>>>> + }
>>> The pebs_cntrs_mask and cntrs_mask won't change after the machine boots.
>>> I don't think it's efficient to calculate them every time.
>>>
>>> Maybe add a local pebs_event_constraints_pdist[] and update both
>>> pebs_event_constraints[] and pebs_event_constraints_pdist[] with the
>>> enumerated masks at initialization time.
>>>
>>> Update intel_pebs_constraints() to use the corresponding array
>>> according to precise_ip.
>>>
>>> Then the code above could be avoided.
>> Even if we have these two arrays, we still need the dynamic constraint, right?
>> We can't predict what the event is; the event may be mapped to a quite
>> specific event constraint, and we can't know that in advance.
> The dynamic constraint is not necessary, but two arrays seem not
> enough, because a PEBS event may fall back to event_constraints[] as
> well. Sigh.
> Four arrays would be required: pebs_event_constraints[],
> pebs_event_constraints_pdist[], event_constraints_for_pebs[] and
> event_constraints_for_pdist_pebs[].
> But that seems too complicated; it may not be worth implementing now.
>
> But at least the pebs_cntrs_mask and cntrs_mask can be calculated once,
> in hw_config() or even intel_pmu_init(). They should not be
> calculated every time in the critical path.
Yeah, these two counter masks don't need to be calculated on every call. It
looks like we can further optimize this based on your dynamic constraint
optimization patch. Thanks.
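For illustration, assuming the masks were cached at init time as sketched
earlier in the thread (cached_masks and its fields are hypothetical names),
the hot path in intel_get_event_constraints() would then shrink to a lookup
and a compare:

	if (x86_pmu.arch_pebs && event->attr.precise_ip) {
		/* No recomputation here; just pick the cached mask and compare. */
		u64 pebs_cntrs_mask = event->attr.precise_ip >= 3 ?
				      cached_masks.pdist_cntrs :
				      cached_masks.pebs_cntrs;

		if (pebs_cntrs_mask != cached_masks.all_cntrs) {
			c2 = dyn_constraint(cpuc, c2, idx);
			c2->idxmsk64 &= pebs_cntrs_mask;
			c2->weight = hweight64(c2->idxmsk64);
		}
	}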
>
> Thanks,
> Kan
>>
>>> Thanks,
>>> Kan
>>>
>>>> +
>>>> return c2;
>>>> }
>>>>
>>>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>>>> index 2f2c6b7c801b..a573ce0e576a 100644
>>>> --- a/arch/x86/events/intel/ds.c
>>>> +++ b/arch/x86/events/intel/ds.c
>>>> @@ -2941,6 +2941,7 @@ static void __init intel_arch_pebs_init(void)
>>>> x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>>>> x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>>>> x86_pmu.pebs_capable = ~0ULL;
>>>> + x86_pmu.flags |= PMU_FL_PEBS_ALL;
>>>> }
>>>>
>>>> /*
>
end of thread (newest message: 2025-02-07 1:27 UTC)
Thread overview: 47+ messages
2025-01-23 14:07 [PATCH 00/20] Arch-PEBS and PMU supports for Clearwater Forest Dapeng Mi
2025-01-23 14:07 ` [PATCH 01/20] perf/x86/intel: Add PMU support " Dapeng Mi
2025-01-27 16:26 ` Peter Zijlstra
2025-02-06 1:31 ` Mi, Dapeng
2025-02-06 7:53 ` Peter Zijlstra
2025-02-06 9:35 ` Mi, Dapeng
2025-02-06 9:39 ` Peter Zijlstra
2025-01-23 14:07 ` [PATCH 02/20] perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF Dapeng Mi
2025-01-27 16:29 ` Peter Zijlstra
2025-01-27 16:43 ` Liang, Kan
2025-01-27 21:29 ` Peter Zijlstra
2025-01-28 0:28 ` Liang, Kan
2025-01-23 14:07 ` [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs Dapeng Mi
2025-01-23 18:58 ` Andi Kleen
2025-01-27 15:19 ` Liang, Kan
2025-01-27 16:44 ` Peter Zijlstra
2025-02-06 2:09 ` Mi, Dapeng
2025-01-23 14:07 ` [PATCH 04/20] perf/x86/intel: Decouple BTS initialization from PEBS initialization Dapeng Mi
2025-01-23 14:07 ` [PATCH 05/20] perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs Dapeng Mi
2025-01-23 14:07 ` [PATCH 06/20] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
2025-01-28 11:22 ` Peter Zijlstra
2025-02-06 2:25 ` Mi, Dapeng
2025-01-23 14:07 ` [PATCH 07/20] perf/x86/intel/ds: Factor out common PEBS processing code to functions Dapeng Mi
2025-01-23 14:07 ` [PATCH 08/20] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
2025-01-23 14:07 ` [PATCH 09/20] perf/x86/intel: Factor out common functions to process PEBS groups Dapeng Mi
2025-01-23 14:07 ` [PATCH 10/20] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
2025-01-23 14:07 ` [PATCH 11/20] perf/x86/intel: Setup PEBS constraints base on counter & pdist map Dapeng Mi
2025-01-27 16:07 ` Liang, Kan
2025-02-06 2:47 ` Mi, Dapeng
2025-02-06 15:01 ` Liang, Kan
2025-02-07 1:27 ` Mi, Dapeng
2025-01-23 14:07 ` [PATCH 12/20] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
2025-01-23 14:07 ` [PATCH 13/20] perf/x86/intel: Add SSP register support for arch-PEBS Dapeng Mi
2025-01-24 5:16 ` Andi Kleen
2025-01-27 15:38 ` Liang, Kan
2025-01-23 14:07 ` [PATCH 14/20] perf/x86/intel: Add counter group " Dapeng Mi
2025-01-23 14:07 ` [PATCH 15/20] perf/core: Support to capture higher width vector registers Dapeng Mi
2025-01-23 14:07 ` [PATCH 16/20] perf/x86/intel: Support arch-PEBS vector registers group capturing Dapeng Mi
2025-01-23 14:07 ` [PATCH 17/20] perf tools: Support to show SSP register Dapeng Mi
2025-01-23 16:15 ` Ian Rogers
2025-02-06 2:57 ` Mi, Dapeng
2025-01-23 14:07 ` [PATCH 18/20] perf tools: Support to capture more vector registers (common part) Dapeng Mi
2025-01-23 16:42 ` Ian Rogers
2025-01-27 15:50 ` Liang, Kan
2025-02-06 3:12 ` Mi, Dapeng
2025-01-23 14:07 ` [PATCH 19/20] perf tools: Support to capture more vector registers (x86/Intel part) Dapeng Mi
2025-01-23 14:07 ` [PATCH 20/20] perf tools/tests: Add vector registers PEBS sampling test Dapeng Mi