* [PATCH 1/3] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
@ 2025-07-17 9:03 Dapeng Mi
2025-07-17 9:03 ` [PATCH 2/3] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) Dapeng Mi
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Dapeng Mi @ 2025-07-17 9:03 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Sean Christopherson, Paolo Bonzini, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane
Cc: kvm, linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi, Yi Lai
IA32_PERF_CAPABILITIES.PEBS_TIMING_INFO[bit 17] is introduced to
indicate whether timed PEBS is supported. Timed PEBS adds a new "retired
latency" field to the basic info group to show the timing info. For
details about timed PEBS, see section 8.4.1 "Timed Processor Event Based
Sampling" of "Intel Architecture Instruction Set Extensions and Future
Features".

Add the PERF_CAP_PEBS_TIMING_INFO flag so that the KVM module can
leverage it to expose the timed PEBS feature to guests.

While at it, opportunistically align the indentation of the
neighboring macros so that they are consistent.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
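For illustration only, a minimal user-space sketch of how the new flag
and the extended mask could be consumed. BIT_ULL() is re-defined locally
so the snippet builds outside the kernel, and perf_cap_has_timed_pebs()
and perf_cap_clear_pebs() are hypothetical helpers that do not exist in
the kernel; this is not the KVM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL() so this builds in user space. */
#define BIT_ULL(n)                 (1ULL << (n))

/* Same values as the definitions in this patch. */
#define PERF_CAP_PEBS_TRAP         BIT_ULL(6)
#define PERF_CAP_ARCH_REG          BIT_ULL(7)
#define PERF_CAP_PEBS_FORMAT       0xf00
#define PERF_CAP_PEBS_BASELINE     BIT_ULL(14)
#define PERF_CAP_PEBS_TIMING_INFO  BIT_ULL(17)
#define PERF_CAP_PEBS_MASK         (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
                                    PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
                                    PERF_CAP_PEBS_TIMING_INFO)

/* Compile-time check that the flag really is bit 17. */
static_assert(PERF_CAP_PEBS_TIMING_INFO == 0x20000ULL, "timed PEBS is bit 17");

/*
 * Hypothetical helper: does a raw IA32_PERF_CAPABILITIES value
 * advertise timed PEBS?
 */
static bool perf_cap_has_timed_pebs(uint64_t perf_cap)
{
	return (perf_cap & PERF_CAP_PEBS_TIMING_INFO) != 0;
}

/*
 * Hypothetical helper: strip every PEBS-related capability bit, which
 * now includes the timing-info bit because it is part of the mask.
 */
static uint64_t perf_cap_clear_pebs(uint64_t perf_cap)
{
	return perf_cap & ~PERF_CAP_PEBS_MASK;
}
```

Because PERF_CAP_PEBS_TIMING_INFO is folded into PERF_CAP_PEBS_MASK, any
code that filters capabilities through the mask picks up bit 17 without
further changes.
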
arch/x86/include/asm/msr-index.h | 14 ++++++++------
tools/arch/x86/include/asm/msr-index.h | 14 ++++++++------
2 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b7dded3c8113..48b7ed28718c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -315,12 +315,14 @@
#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
-#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
-#define PERF_CAP_ARCH_REG BIT_ULL(7)
-#define PERF_CAP_PEBS_FORMAT 0xf00
-#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
-#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
- PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
+ PERF_CAP_PEBS_TIMING_INFO)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index b7dded3c8113..48b7ed28718c 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -315,12 +315,14 @@
#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
-#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
-#define PERF_CAP_ARCH_REG BIT_ULL(7)
-#define PERF_CAP_PEBS_FORMAT 0xf00
-#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
-#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
- PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
+ PERF_CAP_PEBS_TIMING_INFO)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
base-commit: 829f5a6308ce11c3edaa31498a825f8c41b9e9aa
--
2.34.1
* [PATCH 2/3] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48)
From: Dapeng Mi @ 2025-07-17 9:03 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Sean Christopherson, Paolo Bonzini, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane
Cc: kvm, linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi, Yi Lai
Macro GLOBAL_CTRL_EN_PERF_METRICS is defined as 48 instead of
BIT_ULL(48), which is inconsistent with other similar macros. This makes
the macro easy to misuse, since users expect it to be a bit mask just
like the other similar macros.

Thus change GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) to eliminate
this potential misuse.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
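For illustration only, a small user-space sketch of the misuse that the
old definition invites. The OLD_/NEW_ prefixed macro names and the three
helper functions are hypothetical and exist only in this snippet;
BIT_ULL() is re-defined locally.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL(). */
#define BIT_ULL(n)  (1ULL << (n))

/* Old style: the macro is a bit index, so every user has to shift it. */
#define OLD_GLOBAL_CTRL_EN_PERF_METRICS  48
/* New style: the macro is already a bit mask. */
#define NEW_GLOBAL_CTRL_EN_PERF_METRICS  BIT_ULL(48)

/* Both spellings describe the same bit once the old one is shifted. */
static_assert((1ULL << OLD_GLOBAL_CTRL_EN_PERF_METRICS) ==
	      NEW_GLOBAL_CTRL_EN_PERF_METRICS, "same bit");

/*
 * The mistake: treating the old macro as a mask ORs in the value 48
 * (0x30), which sets bits 4 and 5 instead of bit 48.
 */
static uint64_t enable_metrics_misuse(uint64_t ctrl)
{
	return ctrl | OLD_GLOBAL_CTRL_EN_PERF_METRICS;
}

/* With the mask-style macro, set and clear need no extra shift. */
static uint64_t enable_metrics(uint64_t ctrl)
{
	return ctrl | NEW_GLOBAL_CTRL_EN_PERF_METRICS;
}

static uint64_t disable_metrics(uint64_t ctrl)
{
	return ctrl & ~NEW_GLOBAL_CTRL_EN_PERF_METRICS;
}
```

The misuse compiles without any warning, which is why aligning the macro
with its BIT_ULL() siblings is the safer definition.
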
arch/x86/events/intel/core.c | 8 ++++----
arch/x86/include/asm/perf_event.h | 2 +-
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c2fb729c270e..1ee4480089aa 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5318,9 +5318,9 @@ static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu)
0, x86_pmu_num_counters(&pmu->pmu), 0, 0);
if (pmu->intel_cap.perf_metrics)
- pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ pmu->intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;
else
- pmu->intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);
+ pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
intel_pmu_check_event_constraints(pmu->event_constraints,
pmu->cntr_mask64,
@@ -5455,7 +5455,7 @@ static void intel_pmu_cpu_starting(int cpu)
rdmsrq(MSR_IA32_PERF_CAPABILITIES, perf_cap.capabilities);
if (!perf_cap.perf_metrics) {
x86_pmu.intel_cap.perf_metrics = 0;
- x86_pmu.intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);
+ x86_pmu.intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
}
}
@@ -7789,7 +7789,7 @@ __init int intel_pmu_init(void)
}
if (!is_hybrid() && x86_pmu.intel_cap.perf_metrics)
- x86_pmu.intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ x86_pmu.intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;
if (x86_pmu.intel_cap.pebs_timing_info)
x86_pmu.flags |= PMU_FL_RETIRE_LATENCY;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 70d1d94aca7e..f8247ac276c4 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -430,7 +430,7 @@ static inline bool is_topdown_idx(int idx)
#define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
#define GLOBAL_STATUS_PERF_METRICS_OVF_BIT 48
-#define GLOBAL_CTRL_EN_PERF_METRICS 48
+#define GLOBAL_CTRL_EN_PERF_METRICS BIT_ULL(48)
/*
* We model guest LBR event tracing as another fixed-mode PMC like BTS.
*
--
2.34.1
* [PATCH 3/3] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK
From: Dapeng Mi @ 2025-07-17 9:03 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Sean Christopherson, Paolo Bonzini, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane
Cc: kvm, linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi, Yi Lai
ICL_FIXED_0_ADAPTIVE is missing from INTEL_FIXED_BITS_MASK. Add it and
opportunistically simplify the fixed counter enabling code.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
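For illustration only, a user-space sketch of the clear-then-set pattern
with the extended mask. The values of INTEL_FIXED_0_ANYTHREAD and
INTEL_FIXED_0_ENABLE_PMI are not visible in the hunks below and are
assumed to be bits 2 and 3, consistent with the old 0xF mask.
fixed_ctrl_update() is a hypothetical helper loosely modeled on the tail
of intel_pmu_enable_fixed(); unlike the kernel code, it takes unshifted
bits and shifts them itself.

```c
#include <assert.h>
#include <stdint.h>

#define INTEL_FIXED_BITS_STRIDE   4
#define INTEL_FIXED_0_KERNEL      (1ULL << 0)
#define INTEL_FIXED_0_USER        (1ULL << 1)
#define INTEL_FIXED_0_ANYTHREAD   (1ULL << 2)   /* assumed value */
#define INTEL_FIXED_0_ENABLE_PMI  (1ULL << 3)   /* assumed value */
#define ICL_FIXED_0_ADAPTIVE      (1ULL << 32)

#define INTEL_FIXED_BITS_MASK \
	(INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
	 INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
	 ICL_FIXED_0_ADAPTIVE)

#define intel_fixed_bits_by_idx(_idx, _bits) \
	((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))

/* The mask now covers the low nibble plus the adaptive bit at bit 32. */
static_assert(INTEL_FIXED_BITS_MASK == 0x10000000fULL, "nibble + adaptive");

/*
 * Hypothetical helper: clear every control bit of fixed counter 'idx',
 * including a stale adaptive bit, then set the requested bits.
 */
static uint64_t fixed_ctrl_update(uint64_t fixed_ctrl_val, int idx,
				  uint64_t bits)
{
	fixed_ctrl_val &= ~intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
	fixed_ctrl_val |= intel_fixed_bits_by_idx(idx, bits);
	return fixed_ctrl_val;
}
```

With the adaptive bit in the mask, a single clear removes a stale
adaptive bit for that counter, so the caller no longer needs to build a
separate mask for the precise_ip case.
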
arch/x86/events/intel/core.c | 10 +++-------
arch/x86/include/asm/perf_event.h | 6 +++++-
arch/x86/kvm/pmu.h | 2 +-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 1ee4480089aa..b79efae717f7 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2845,8 +2845,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
- u64 mask, bits = 0;
int idx = hwc->idx;
+ u64 bits = 0;
if (is_topdown_idx(idx)) {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -2885,14 +2885,10 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
idx -= INTEL_PMC_IDX_FIXED;
bits = intel_fixed_bits_by_idx(idx, bits);
- mask = intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
-
- if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
+ if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip)
bits |= intel_fixed_bits_by_idx(idx, ICL_FIXED_0_ADAPTIVE);
- mask |= intel_fixed_bits_by_idx(idx, ICL_FIXED_0_ADAPTIVE);
- }
- cpuc->fixed_ctrl_val &= ~mask;
+ cpuc->fixed_ctrl_val &= ~intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
cpuc->fixed_ctrl_val |= bits;
}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index f8247ac276c4..49a4d442f3fc 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -35,7 +35,6 @@
#define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36)
#define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40)
-#define INTEL_FIXED_BITS_MASK 0xFULL
#define INTEL_FIXED_BITS_STRIDE 4
#define INTEL_FIXED_0_KERNEL (1ULL << 0)
#define INTEL_FIXED_0_USER (1ULL << 1)
@@ -48,6 +47,11 @@
#define ICL_EVENTSEL_ADAPTIVE (1ULL << 34)
#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)
+#define INTEL_FIXED_BITS_MASK \
+ (INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
+ INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
+ ICL_FIXED_0_ADAPTIVE)
+
#define intel_fixed_bits_by_idx(_idx, _bits) \
((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index ad89d0bd6005..103604c4b33b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -13,7 +13,7 @@
#define MSR_IA32_MISC_ENABLE_PMU_RO_MASK (MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL | \
MSR_IA32_MISC_ENABLE_BTS_UNAVAIL)
-/* retrieve the 4 bits for EN and PMI out of IA32_FIXED_CTR_CTRL */
+/* retrieve the control bits of a fixed counter out of IA32_FIXED_CTR_CTRL */
#define fixed_ctrl_field(ctrl_reg, idx) \
(((ctrl_reg) >> ((idx) * INTEL_FIXED_BITS_STRIDE)) & INTEL_FIXED_BITS_MASK)
--
2.34.1
* Re: [PATCH 1/3] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
From: Mi, Dapeng @ 2025-07-17 9:09 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Sean Christopherson, Paolo Bonzini, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane
Cc: kvm, linux-kernel, linux-perf-users, Dapeng Mi, Yi Lai
Tests: ran basic perf counting, PMI-based sampling and PEBS-based
sampling on Intel Sapphire Rapids, Granite Rapids and Sierra Forest
platforms; no issues were found.

On 7/17/2025 5:03 PM, Dapeng Mi wrote:
> IA32_PERF_CAPABILITIES.PEBS_TIMING_INFO[bit 17] is introduced to
> indicate whether timed PEBS is supported. Timed PEBS adds a new "retired
> latency" field to the basic info group to show the timing info. For
> details about timed PEBS, see section 8.4.1 "Timed Processor Event Based
> Sampling" of "Intel Architecture Instruction Set Extensions and Future
> Features".
>
> Add the PERF_CAP_PEBS_TIMING_INFO flag so that the KVM module can
> leverage it to expose the timed PEBS feature to guests.
>
> While at it, opportunistically align the indentation of the
> neighboring macros so that they are consistent.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Tested-by: Yi Lai <yi1.lai@intel.com>
> ---
> arch/x86/include/asm/msr-index.h | 14 ++++++++------
> tools/arch/x86/include/asm/msr-index.h | 14 ++++++++------
> 2 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index b7dded3c8113..48b7ed28718c 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -315,12 +315,14 @@
> #define PERF_CAP_PT_IDX 16
>
> #define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
> -#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
> -#define PERF_CAP_ARCH_REG BIT_ULL(7)
> -#define PERF_CAP_PEBS_FORMAT 0xf00
> -#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
> -#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
> - PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
> +#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
> +#define PERF_CAP_ARCH_REG BIT_ULL(7)
> +#define PERF_CAP_PEBS_FORMAT 0xf00
> +#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
> +#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
> +#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
> + PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
> + PERF_CAP_PEBS_TIMING_INFO)
>
> #define MSR_IA32_RTIT_CTL 0x00000570
> #define RTIT_CTL_TRACEEN BIT(0)
> diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
> index b7dded3c8113..48b7ed28718c 100644
> --- a/tools/arch/x86/include/asm/msr-index.h
> +++ b/tools/arch/x86/include/asm/msr-index.h
> @@ -315,12 +315,14 @@
> #define PERF_CAP_PT_IDX 16
>
> #define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
> -#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
> -#define PERF_CAP_ARCH_REG BIT_ULL(7)
> -#define PERF_CAP_PEBS_FORMAT 0xf00
> -#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
> -#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
> - PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
> +#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
> +#define PERF_CAP_ARCH_REG BIT_ULL(7)
> +#define PERF_CAP_PEBS_FORMAT 0xf00
> +#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
> +#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
> +#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
> + PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
> + PERF_CAP_PEBS_TIMING_INFO)
>
> #define MSR_IA32_RTIT_CTL 0x00000570
> #define RTIT_CTL_TRACEEN BIT(0)
>
> base-commit: 829f5a6308ce11c3edaa31498a825f8c41b9e9aa