public inbox for linux-perf-users@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH V5 1/2] perf/x86/intel: Support RDPMC metrics clear mode
@ 2024-12-11 16:03 kan.liang
  2024-12-11 16:03 ` [PATCH V5 2/2] perf doc: Update perf tools topdown documentation kan.liang
  0 siblings, 1 reply; 2+ messages in thread
From: kan.liang @ 2024-12-11 16:03 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, ak, james.clark, linux-kernel,
	linux-perf-users
  Cc: irogers, eranian, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The new RDPMC enhancement, metrics clear mode, is to clear the
PERF_METRICS-related resources as well as the fixed-function performance
monitoring counter 3 after the read is performed. It is available for
ring 3. The feature is enumerated by the
IA32_PERF_CAPABILITIES.RDPMC_CLEAR_METRICS[bit 19]. To enable the
feature, the IA32_FIXED_CTR_CTRL.METRICS_CLEAR_EN[bit 14] must be set.

Two ways were considered to enable the feature.
- Expose a knob in the sysfs globally. One user may affect the
  measurement of other users when changing the knob. The solution is
  dropped.
- Introduce a new event format, metrics_clear, for the slots event to
  disable/enable the feature only for the current process. Users can
  utilize the feature as needed.
The latter solution is implemented in the patch.

The current KVM doesn't support the perf metrics yet. For
virtualization, the feature can be enabled later separately.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---

The V4 can be found at
https://lore.kernel.org/lkml/20240926184558.3797290-1-kan.liang@linux.intel.com/

Change since V4:
- Split the doc update into a dedicated patch


 arch/x86/events/intel/core.c      | 20 +++++++++++++++++++-
 arch/x86/events/perf_event.h      |  1 +
 arch/x86/include/asm/perf_event.h |  4 ++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 2e1e26846050..e76e892f44cd 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2816,6 +2816,9 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 			return;
 
 		idx = INTEL_PMC_IDX_FIXED_SLOTS;
+
+		if (event->attr.config1 & INTEL_TD_CFG_METRIC_CLEAR)
+			bits |= INTEL_FIXED_3_METRICS_CLEAR;
 	}
 
 	intel_set_masks(event, idx);
@@ -4071,7 +4074,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	 * is used in a metrics group, it too cannot support sampling.
 	 */
 	if (intel_pmu_has_cap(event, PERF_CAP_METRICS_IDX) && is_topdown_event(event)) {
-		if (event->attr.config1 || event->attr.config2)
+		/* The metrics_clear can only be set for the slots event */
+		if (event->attr.config1 &&
+		    (!is_slots_event(event) || (event->attr.config1 & ~INTEL_TD_CFG_METRIC_CLEAR)))
+			return -EINVAL;
+
+		if (event->attr.config2)
 			return -EINVAL;
 
 		/*
@@ -4680,6 +4688,8 @@ PMU_FORMAT_ATTR(in_tx,  "config:32"	);
 PMU_FORMAT_ATTR(in_tx_cp, "config:33"	);
 PMU_FORMAT_ATTR(eq,	"config:36"	); /* v6 + */
 
+PMU_FORMAT_ATTR(metrics_clear,	"config1:0"); /* PERF_CAPABILITIES.RDPMC_METRICS_CLEAR */
+
 static ssize_t umask2_show(struct device *dev,
 			   struct device_attribute *attr,
 			   char *page)
@@ -4699,6 +4709,7 @@ static struct device_attribute format_attr_umask2  =
 static struct attribute *format_evtsel_ext_attrs[] = {
 	&format_attr_umask2.attr,
 	&format_attr_eq.attr,
+	&format_attr_metrics_clear.attr,
 	NULL
 };
 
@@ -4723,6 +4734,13 @@ evtsel_ext_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 	if (i == 1)
 		return (mask & ARCH_PERFMON_EVENTSEL_EQ) ? attr->mode : 0;
 
+	/* PERF_CAPABILITIES.RDPMC_METRICS_CLEAR */
+	if (i == 2) {
+		union perf_capabilities intel_cap = hybrid(dev_get_drvdata(dev), intel_cap);
+
+		return intel_cap.rdpmc_metrics_clear ? attr->mode : 0;
+	}
+
 	return 0;
 }
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 82c6f45ce975..31c2771545a6 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -624,6 +624,7 @@ union perf_capabilities {
 		u64	pebs_output_pt_available:1;
 		u64	pebs_timing_info:1;
 		u64	anythread_deprecated:1;
+		u64	rdpmc_metrics_clear:1;
 	};
 	u64	capabilities;
 };
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index cb9c4679f45c..1ac79f361645 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -41,6 +41,7 @@
 #define INTEL_FIXED_0_USER				(1ULL << 1)
 #define INTEL_FIXED_0_ANYTHREAD			(1ULL << 2)
 #define INTEL_FIXED_0_ENABLE_PMI			(1ULL << 3)
+#define INTEL_FIXED_3_METRICS_CLEAR			(1ULL << 2)
 
 #define HSW_IN_TX					(1ULL << 32)
 #define HSW_IN_TX_CHECKPOINTED				(1ULL << 33)
@@ -372,6 +373,9 @@ static inline bool use_fixed_pseudo_encoding(u64 code)
 #define INTEL_TD_METRIC_MAX			INTEL_TD_METRIC_MEM_BOUND
 #define INTEL_TD_METRIC_NUM			8
 
+#define INTEL_TD_CFG_METRIC_CLEAR_BIT		0
+#define INTEL_TD_CFG_METRIC_CLEAR		BIT_ULL(INTEL_TD_CFG_METRIC_CLEAR_BIT)
+
 static inline bool is_metric_idx(int idx)
 {
 	return (unsigned)(idx - INTEL_PMC_IDX_METRIC_BASE) < INTEL_TD_METRIC_NUM;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [PATCH V5 2/2] perf doc: Update perf tools topdown documentation
  2024-12-11 16:03 [PATCH V5 1/2] perf/x86/intel: Support RDPMC metrics clear mode kan.liang
@ 2024-12-11 16:03 ` kan.liang
  0 siblings, 0 replies; 2+ messages in thread
From: kan.liang @ 2024-12-11 16:03 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, ak, james.clark, linux-kernel,
	linux-perf-users
  Cc: irogers, eranian, Kan Liang

From: Andi Kleen <ak@linux.intel.com>

- Document and give examples for the Lunar Lake perf metrics reset mode.
- Fix the error handling for mmap (it returns -1 on error, not 0)
- Clarify the slots placement documentation. It is not Icelake specific
and also applies for non sampling.

Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---

The doc update patch is originally from
https://lore.kernel.org/linux-perf-users/20241210193554.93013-1-ak@linux.intel.com/

 tools/perf/Documentation/topdown.txt | 33 +++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/topdown.txt b/tools/perf/Documentation/topdown.txt
index 5c17fff694ee..44644e917ffd 100644
--- a/tools/perf/Documentation/topdown.txt
+++ b/tools/perf/Documentation/topdown.txt
@@ -87,7 +87,7 @@ if (slots_fd < 0)
 
 /* Memory mapping the fd permits _rdpmc calls from userspace */
 void *slots_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, slots_fd, 0);
-if (!slot_p)
+if (slot_p == (void*)-1L)
 	.... error ...
 
 /*
@@ -107,7 +107,7 @@ if (metrics_fd < 0)
 
 /* Memory mapping the fd permits _rdpmc calls from userspace */
 void *metrics_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, metrics_fd, 0);
-if (!metrics_p)
+if (metrics_p == (void*)-1L)
 	... error ...
 
 Note: the file descriptors returned by the perf_event_open calls must be memory
@@ -290,15 +290,38 @@ This "opens" a new measurement period.
 A program using RDPMC for TopDown should schedule such a reset
 regularly, as in every few seconds.
 
-Limits on Intel Ice Lake
-========================
+Newer Intel CPUs (Lunar Lake, Arrow Lake+) support automatically
+resetting the perf metrics RDPMC, which can avoid a system call or
+needing to schedule special resets.
+
+This is available if /sys/devices/cpu*/format/metrics_clear
+exists, and requires setting an opt-in bit when opening the
+slots counter:
+
+if (access("/sys/devices/cpu/format/metrics_clear") &&
+    access("/sys/devices/cpu_core/format/metrics_clear"))
+     ... functionality not supported ...
+
+#define INTEL_TD_CFG_METRIC_CLEAR 1ULL
+
+struct perf_event_attr slots_event = {
+	... same as slots example above ...
+	.config1 = INTEL_TD_CFG_METRIC_CLEAR,
+};
+
+... open and map counter same as example above ...
+
+Then any metric read will reset the metrics and slots.
+
+Using perf metrics with perf stat
+=================================
 
 Four pseudo TopDown metric events are exposed for the end-users,
 topdown-retiring, topdown-bad-spec, topdown-fe-bound and topdown-be-bound.
 They can be used to collect the TopDown value under the following
 rules:
 - All the TopDown metric events must be in a group with the SLOTS event.
-- The SLOTS event must be the leader of the group.
+- The SLOTS event must be the first entry of the group.
 - The PERF_FORMAT_GROUP flag must be applied for each TopDown metric
   events
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-12-11 16:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-11 16:03 [PATCH V5 1/2] perf/x86/intel: Support RDPMC metrics clear mode kan.liang
2024-12-11 16:03 ` [PATCH V5 2/2] perf doc: Update perf tools topdown documentation kan.liang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox