From: Tony Luck <tony.luck@intel.com>
To: Fenghua Yu <fenghuay@nvidia.com>,
Reinette Chatre <reinette.chatre@intel.com>,
Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
Peter Newman <peternewman@google.com>,
James Morse <james.morse@arm.com>,
Babu Moger <babu.moger@amd.com>,
Drew Fustini <dfustini@baylibre.com>,
Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
patches@lists.linux.dev, Tony Luck <tony.luck@intel.com>
Subject: [PATCH v11 14/31] x86/resctrl: Discover hardware telemetry events
Date: Thu, 25 Sep 2025 13:03:08 -0700
Message-ID: <20250925200328.64155-15-tony.luck@intel.com>
In-Reply-To: <20250925200328.64155-1-tony.luck@intel.com>
Each CPU collects data for telemetry events that it sends to the nearest
telemetry event aggregator either when the value of IA32_PQR_ASSOC.RMID
changes, or when a two millisecond timer expires.
The telemetry event aggregators maintain per-RMID per-event counts of the
total seen for all the CPUs. There may be more than one set of telemetry
event aggregators per package.
There are separate sets of aggregators for each type of event, but all
aggregators for a given type are symmetric, keeping counts for the same
set of events for the CPUs that provide data to them.
Each telemetry event aggregator is responsible for a specific group of
events. E.g. on the Intel Clearwater Forest CPU there are two types of
aggregators. One type tracks a pair of energy related events. The other
type tracks a subset of "perf" type events.
The event counts are made available to Linux in a region of MMIO space
for each aggregator. All details about the layout of counters in each
aggregator MMIO region are described in XML files published by Intel and
made available in a GitHub repository [1].
The key to matching a specific telemetry aggregator to the XML file that
describes the MMIO layout is a 32-bit value. The Linux telemetry subsystem
refers to this as a "guid" while the XML files call it a "uniqueid".
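As an illustration of the guid matching described above, here is a minimal
userspace sketch. It is not the code in this patch: struct known_group and
find_known_group() are hypothetical names, and the table simply pairs the
two guids used later in this patch with the XML files they correspond to.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one known event group, keyed by guid. */
struct known_group {
	uint32_t guid;		/* called "uniqueid" in the XML file */
	const char *xml_path;	/* path inside the Intel-PMT repository */
};

static const struct known_group known_groups[] = {
	{ 0x26696143, "xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml" },
	{ 0x26557651, "xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml" },
};

/* Return the entry whose guid matches, or NULL for an unknown aggregator. */
static const struct known_group *find_known_group(uint32_t guid)
{
	size_t i;

	for (i = 0; i < sizeof(known_groups) / sizeof(known_groups[0]); i++) {
		if (known_groups[i].guid == guid)
			return &known_groups[i];
	}
	return NULL;
}
```

An aggregator whose guid is not in the table has no known layout, so a
caller would be expected to skip it rather than guess.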
Each XML file provides the following information:
1) Which telemetry events are included in the group.
2) The order in which the event counters appear for each RMID.
3) The value type of each event counter (integer or fixed-point).
4) The number of RMIDs supported.
5) Which additional aggregator status registers are included.
6) The total size of the MMIO region for an aggregator.
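To show how items 1, 2, 4, 5 and 6 could combine, here is a rough sketch
that locates one counter inside an aggregator region. It rests on
assumptions that are not stated in this commit message: every counter is
one 64-bit word, the counters for one RMID are contiguous, and the status
registers follow the counters. struct group_layout, counter_offset() and
expected_region_size() are hypothetical names, not part of this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what one XML file says about an aggregator. */
struct group_layout {
	unsigned int num_events;	/* events in the group (item 1) */
	unsigned int num_rmids;		/* RMIDs supported (item 4) */
	size_t status_bytes;		/* status registers (item 5) */
};

/* Assumption: every counter occupies one 64-bit word. */
#define COUNTER_BYTES sizeof(uint64_t)

/*
 * Byte offset of the counter for (rmid, event_idx), where event_idx is
 * the position of the event in the per-RMID order (item 2).
 * Returns (size_t)-1 when either index is out of range.
 */
static size_t counter_offset(const struct group_layout *l,
			     unsigned int rmid, unsigned int event_idx)
{
	if (rmid >= l->num_rmids || event_idx >= l->num_events)
		return (size_t)-1;

	return ((size_t)rmid * l->num_events + event_idx) * COUNTER_BYTES;
}

/* Region size implied by the layout under the assumptions above (item 6). */
static size_t expected_region_size(const struct group_layout *l)
{
	return (size_t)l->num_rmids * l->num_events * COUNTER_BYTES +
	       l->status_bytes;
}
```

Comparing the computed size with the size of the region actually reported
for an aggregator is one way such a description could be sanity checked.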
The INTEL_PMT_TELEMETRY driver enumerates support for telemetry events.
This driver provides intel_pmt_get_regions_by_feature() to list all
available telemetry event aggregators. The list includes the "guid",
the base address in MMIO space for the region where the event counters
are exposed, and the package id where all the CPUs that report to this
aggregator are located.
Add a new Kconfig option CONFIG_X86_CPU_RESCTRL_INTEL_AET for the Intel
specific parts of telemetry code. This depends on the INTEL_PMT_TELEMETRY
and INTEL_TPMI drivers being built-in to the kernel for enumeration of
telemetry features.
Use INTEL_PMT_TELEMETRY's intel_pmt_get_regions_by_feature() with
each per-RMID telemetry feature id to obtain a private copy of
struct pmt_feature_group that contains all discovered/enumerated
telemetry aggregator data for all event groups (known and unknown
to resctrl) of that feature id. Further processing on this structure
will enable all supported events in resctrl. Return the structure to
INTEL_PMT_TELEMETRY at resctrl exit time.
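The get/keep/return lifecycle described above can be sketched in userspace
as follows. This is only a model: the kernel code uses __free() and
no_free_ptr() from <linux/cleanup.h>, while the sketch imitates them with
the GCC/Clang cleanup attribute. struct feature_group, get_group(),
put_group(), discover() and teardown() are hypothetical stand-ins, and the
"num_regions > 0" test is a placeholder for "a known guid was found".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Mock of the structure handed out by the provider (not the kernel's). */
struct feature_group {
	int num_regions;
};

static int live_groups;	/* copies handed out and not yet returned */

static struct feature_group *get_group(int num_regions)
{
	struct feature_group *p = calloc(1, sizeof(*p));

	if (p) {
		p->num_regions = num_regions;
		live_groups++;
	}
	return p;
}

static void put_group(struct feature_group *p)
{
	if (p) {
		live_groups--;
		free(p);
	}
}

/* Runs automatically when the annotated pointer goes out of scope. */
static void auto_put(struct feature_group **pp)
{
	put_group(*pp);
}

static struct feature_group *retained;	/* stand-in for event_group::pfg */

/* Keep the copy only if something in it is usable. */
static bool discover(int num_regions)
{
	struct feature_group *p __attribute__((cleanup(auto_put))) = NULL;

	p = get_group(num_regions);
	if (!p)
		return false;

	if (p->num_regions > 0) {	/* placeholder for "known guid found" */
		retained = p;
		p = NULL;	/* ownership handed over: cleanup sees NULL */
		return true;
	}

	return false;		/* cleanup returns the unused copy */
}

/* Counterpart of the exit path: give the retained copy back. */
static void teardown(void)
{
	put_group(retained);
	retained = NULL;
}
```

The point of the pattern is that every early return gives the copy back
without an explicit error path, while the success path keeps it until
teardown.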
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://github.com/intel/Intel-PMT # [1]
---
Note that checkpatch complains about this:
DEFINE_FREE(intel_pmt_put_feature_group, struct pmt_feature_group *,
if (!IS_ERR_OR_NULL(_T))
intel_pmt_put_feature_group(_T))
with:
CHECK: Alignment should match open parenthesis
But if the alignment is fixed, it then complains:
WARNING: Statements should start on a tabstop
---
arch/x86/kernel/cpu/resctrl/internal.h | 8 ++
arch/x86/kernel/cpu/resctrl/core.c | 5 +
arch/x86/kernel/cpu/resctrl/intel_aet.c | 144 ++++++++++++++++++++++++
arch/x86/Kconfig | 13 +++
arch/x86/kernel/cpu/resctrl/Makefile | 1 +
5 files changed, 171 insertions(+)
create mode 100644 arch/x86/kernel/cpu/resctrl/intel_aet.c
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 14fadcff0d2b..886261a82b81 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -217,4 +217,12 @@ void __init intel_rdt_mbm_apply_quirk(void);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
+#ifdef CONFIG_X86_CPU_RESCTRL_INTEL_AET
+bool intel_aet_get_events(void);
+void __exit intel_aet_exit(void);
+#else
+static inline bool intel_aet_get_events(void) { return false; }
+static inline void __exit intel_aet_exit(void) { }
+#endif
+
#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 64c6f507b7bc..9003a6344410 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -734,6 +734,9 @@ void resctrl_arch_pre_mount(void)
if (!atomic_try_cmpxchg(&only_once, &old, 1))
return;
+
+ if (!intel_aet_get_events())
+ return;
}
enum {
@@ -1091,6 +1094,8 @@ late_initcall(resctrl_arch_late_init);
static void __exit resctrl_arch_exit(void)
{
+ intel_aet_exit();
+
cpuhp_remove_state(rdt_online);
resctrl_exit();
diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
new file mode 100644
index 000000000000..966c840f0d6b
--- /dev/null
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Resource Director Technology(RDT)
+ * - Intel Application Energy Telemetry
+ *
+ * Copyright (C) 2025 Intel Corporation
+ *
+ * Author:
+ * Tony Luck <tony.luck@intel.com>
+ */
+
+#define pr_fmt(fmt) "resctrl: " fmt
+
+#include <linux/array_size.h>
+#include <linux/cleanup.h>
+#include <linux/cpu.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/intel_pmt_features.h>
+#include <linux/intel_vsec.h>
+#include <linux/overflow.h>
+#include <linux/resctrl.h>
+#include <linux/stddef.h>
+#include <linux/types.h>
+
+#include "internal.h"
+
+/**
+ * struct event_group - All information about a group of telemetry events.
+ * @pfg: Points to the aggregated telemetry space information
+ * returned by the intel_pmt_get_regions_by_feature()
+ * call to the INTEL_PMT_TELEMETRY driver that contains
+ * data for all telemetry regions of a specific type.
+ * Valid if the system supports the event group.
+ * NULL otherwise.
+ * @guid: Unique number per XML description file.
+ */
+struct event_group {
+ /* Data fields for additional structures to manage this group. */
+ struct pmt_feature_group *pfg;
+
+ /* Remaining fields initialized from XML file. */
+ u32 guid;
+};
+
+/*
+ * Link: https://github.com/intel/Intel-PMT
+ * File: xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml
+ */
+static struct event_group energy_0x26696143 = {
+ .guid = 0x26696143,
+};
+
+/*
+ * Link: https://github.com/intel/Intel-PMT
+ * File: xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml
+ */
+static struct event_group perf_0x26557651 = {
+ .guid = 0x26557651,
+};
+
+static struct event_group *known_energy_event_groups[] = {
+ &energy_0x26696143,
+};
+
+static struct event_group *known_perf_event_groups[] = {
+ &perf_0x26557651,
+};
+
+#define for_each_enabled_event_group(_peg, _grp) \
+ for (_peg = (_grp); _peg < &_grp[ARRAY_SIZE(_grp)]; _peg++) \
+ if ((*_peg)->pfg)
+
+/* Stub for now */
+static bool enable_events(struct event_group *e, struct pmt_feature_group *p)
+{
+ return false;
+}
+
+DEFINE_FREE(intel_pmt_put_feature_group, struct pmt_feature_group *,
+ if (!IS_ERR_OR_NULL(_T))
+ intel_pmt_put_feature_group(_T))
+
+/*
+ * Make a request to the INTEL_PMT_TELEMETRY driver for a copy of the
+ * pmt_feature_group for a specific feature. If there is one, the returned
+ * structure has an array of telemetry_region structures. Each describes
+ * one telemetry aggregator.
+ * Try to use every telemetry aggregator with a known guid.
+ */
+static bool get_pmt_feature(enum pmt_feature_id feature, struct event_group **evgs,
+ unsigned int num_evg)
+{
+ struct pmt_feature_group *p __free(intel_pmt_put_feature_group) = NULL;
+ struct event_group **peg;
+ bool ret;
+
+ p = intel_pmt_get_regions_by_feature(feature);
+
+ if (IS_ERR_OR_NULL(p))
+ return false;
+
+ for (peg = evgs; peg < &evgs[num_evg]; peg++) {
+ ret = enable_events(*peg, p);
+ if (ret) {
+ (*peg)->pfg = no_free_ptr(p);
+ return true;
+ }
+ }
+
+ return false;
+}
+
+/*
+ * Ask INTEL_PMT_TELEMETRY driver for all the RMID based telemetry groups
+ * that it supports.
+ */
+bool intel_aet_get_events(void)
+{
+ bool ret1, ret2;
+
+ ret1 = get_pmt_feature(FEATURE_PER_RMID_ENERGY_TELEM,
+ known_energy_event_groups,
+ ARRAY_SIZE(known_energy_event_groups));
+ ret2 = get_pmt_feature(FEATURE_PER_RMID_PERF_TELEM,
+ known_perf_event_groups,
+ ARRAY_SIZE(known_perf_event_groups));
+
+ return ret1 || ret2;
+}
+
+void __exit intel_aet_exit(void)
+{
+ struct event_group **peg;
+
+ for_each_enabled_event_group(peg, known_energy_event_groups) {
+ intel_pmt_put_feature_group((*peg)->pfg);
+ (*peg)->pfg = NULL;
+ }
+ for_each_enabled_event_group(peg, known_perf_event_groups) {
+ intel_pmt_put_feature_group((*peg)->pfg);
+ (*peg)->pfg = NULL;
+ }
+}
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 52c8910ba2ef..ce9d086625c1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -525,6 +525,19 @@ config X86_CPU_RESCTRL
Say N if unsure.
+config X86_CPU_RESCTRL_INTEL_AET
+ bool "Intel Application Energy Telemetry"
+ depends on X86_CPU_RESCTRL && CPU_SUP_INTEL && INTEL_PMT_TELEMETRY=y && INTEL_TPMI=y
+ help
+ Enable per-RMID telemetry events in resctrl.
+
+ Intel feature that collects per-RMID execution data
+ about energy consumption, measure of frequency independent
+ activity and other performance metrics. Data is aggregated
+ per package.
+
+ Say N if unsure.
+
config X86_FRED
bool "Flexible Return and Event Delivery"
depends on X86_64
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index d8a04b195da2..273ddfa30836 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o
+obj-$(CONFIG_X86_CPU_RESCTRL_INTEL_AET) += intel_aet.o
obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
# To allow define_trace.h's recursive include:
--
2.51.0