From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghuay@nvidia.com>,
Reinette Chatre <reinette.chatre@intel.com>,
Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
Peter Newman <peternewman@google.com>,
James Morse <james.morse@arm.com>,
Babu Moger <babu.moger@amd.com>,
Drew Fustini <dfustini@baylibre.com>,
Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>,
x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
patches@lists.linux.dev
Subject: Re: [PATCH v9 17/31] x86/resctrl: Discover hardware telemetry events
Date: Mon, 1 Sep 2025 11:39:56 +0300 (EEST)
Message-ID: <1cc035b1-7bd8-8fa9-e3d5-f530bcdec517@linux.intel.com>
In-Reply-To: <20250829193346.31565-18-tony.luck@intel.com>

On Fri, 29 Aug 2025, Tony Luck wrote:
> Each CPU collects data for telemetry events that it sends to a nearby
> telemetry event aggregator either when the value of IA32_PQR_ASSOC.RMID
> changed, or when a two millisecond timer expires.
>
> The telemetry event aggregators maintain per-RMID per-event counts of
> the total seen for all the CPUs. There may be more than one telemetry
> event aggregator per package.
>
> Each telemetry event aggregator is responsible for a specific group of
> events. E.g. on the Intel Clearwater Forest CPU there are two types of
> aggregators. One type tracks a pair of energy related events. The other
> type tracks a subset of "perf" type events.
>
> The event counts are made available to Linux in a region of MMIO space
> for each aggregator. All details about the layout of counters in each
> aggregator MMIO region are described in XML files published by Intel and
> made available in a GitHub repository: https://github.com/intel/Intel-PMT.
>
> The key to matching a specific telemetry aggregator to the XML file that
> describes the MMIO layout is a 32-bit value. The Linux telemetry subsystem
> refers to this as a "guid" while the XML files call it a "uniqueid".
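To illustrate the matching described above, here is a minimal user-space sketch, not the code from this series. The struct known_group table and the find_known_group() helper are hypothetical names; only the two guid values and the XML paths are taken from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one per XML description file. */
struct known_group {
	uint32_t guid;		/* called "uniqueid" in the XML file */
	const char *xml_path;	/* path within the Intel-PMT repository */
};

/* The guids and paths are the ones named in this patch. */
static const struct known_group known_groups[] = {
	{ 0x26696143, "xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml" },
	{ 0x26557651, "xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml" },
};

/* Return the entry whose guid matches, or NULL for an unknown aggregator. */
static const struct known_group *find_known_group(uint32_t guid)
{
	size_t i;

	for (i = 0; i < sizeof(known_groups) / sizeof(known_groups[0]); i++) {
		if (known_groups[i].guid == guid)
			return &known_groups[i];
	}

	return NULL;
}
```

An aggregator that reports a guid missing from the table simply yields NULL and would be skipped.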
>
> Each XML file provides the following information:
> 1) Which telemetry events are included in the group.
> 2) The order in which the event counters appear for each RMID.
> 3) The value type of each event counter (integer or fixed-point).
> 4) The number of RMIDs supported.
> 5) Which additional aggregator status registers are included.
> 6) The total size of the MMIO region for this aggregator.
>
> The INTEL_PMT_TELEMETRY driver enumerates support for telemetry events.
> This driver provides intel_pmt_get_regions_by_feature() to list all
> available telemetry event aggregators. The list includes the "guid",
> the base address in MMIO space for the region where the event counters
> are exposed, and the package id where the CPUs that report to this
> aggregator are located.
>
> Add a new Kconfig option CONFIG_X86_CPU_RESCTRL_INTEL_AET for the Intel
> specific parts of telemetry code. This depends on the INTEL_PMT_TELEMETRY
> and INTEL_TPMI drivers being built-in to the kernel for enumeration of
> telemetry features.
>
> Call intel_pmt_get_regions_by_feature() for each pmt_feature_id that
> indicates per-RMID telemetry.
>
> Save the returned pmt_feature_group pointers with guids that are known
> to resctrl for use at run time. Those pointers are returned to the
There is an extra space in the line above.
> INTEL_PMT_TELEMETRY subsystem at resctrl_arch_exit() time.
>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> Note that checkpatch complains about this:
>
> DEFINE_FREE(intel_pmt_put_feature_group, struct pmt_feature_group *,
> if (!IS_ERR_OR_NULL(_T))
> intel_pmt_put_feature_group(_T))
> with:
> CHECK: Alignment should match open parenthesis
>
> But if the alignment is fixed, it then complains:
> WARNING: Statements should start on a tabstop
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 8 ++
> arch/x86/kernel/cpu/resctrl/core.c | 5 +
> arch/x86/kernel/cpu/resctrl/intel_aet.c | 133 ++++++++++++++++++++++++
> arch/x86/Kconfig | 13 +++
> arch/x86/kernel/cpu/resctrl/Makefile | 1 +
> 5 files changed, 160 insertions(+)
> create mode 100644 arch/x86/kernel/cpu/resctrl/intel_aet.c
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 6b3f3203edc4..9ddfbbe5c3cf 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -169,4 +169,12 @@ void __init intel_rdt_mbm_apply_quirk(void);
>
> void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>
> +#ifdef CONFIG_X86_CPU_RESCTRL_INTEL_AET
> +bool intel_aet_get_events(void);
> +void __exit intel_aet_exit(void);
> +#else
> +static inline bool intel_aet_get_events(void) { return false; }
> +static inline void __exit intel_aet_exit(void) { }
> +#endif
> +
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 566530c6dbc3..57b34e1dc088 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -735,6 +735,9 @@ void resctrl_arch_pre_mount(void)
>
> if (!atomic_try_cmpxchg(&only_once, &old, 1))
> return;
> +
> + if (!intel_aet_get_events())
> + return;
> }
>
> enum {
> @@ -1087,6 +1090,8 @@ late_initcall(resctrl_arch_late_init);
>
> static void __exit resctrl_arch_exit(void)
> {
> + intel_aet_exit();
> +
> cpuhp_remove_state(rdt_online);
>
> resctrl_exit();
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> new file mode 100644
> index 000000000000..45cadbb87dc8
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -0,0 +1,133 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Resource Director Technology(RDT)
> + * - Intel Application Energy Telemetry
> + *
> + * Copyright (C) 2025 Intel Corporation
> + *
> + * Author:
> + * Tony Luck <tony.luck@intel.com>
> + */
> +
> +#define pr_fmt(fmt) "resctrl: " fmt
> +
> +#include <linux/cleanup.h>
> +#include <linux/cpu.h>
> +#include <linux/intel_vsec.h>
> +#include <linux/resctrl.h>
> +
> +#include "internal.h"
> +
> +/**
> + * struct event_group - All information about a group of telemetry events.
> + * @pfg: Points to the aggregated telemetry space information
> + * within the INTEL_PMT_TELEMETRY driver that contains data for all
> + * telemetry regions.
> + * @guid: Unique number per XML description file.
> + */
> +struct event_group {
> + /* Data fields for additional structures to manage this group. */
> + struct pmt_feature_group *pfg;
> +
> + /* Remaining fields initialized from XML file. */
> + u32 guid;
> +};
> +
> +/*
> + * Link: https://github.com/intel/Intel-PMT
> + * File: xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml
> + */
> +static struct event_group energy_0x26696143 = {
> + .guid = 0x26696143,
> +};
> +
> +/*
> + * Link: https://github.com/intel/Intel-PMT
> + * File: xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml
> + */
> +static struct event_group perf_0x26557651 = {
> + .guid = 0x26557651,
> +};
> +
> +static struct event_group *known_energy_event_groups[] = {
> + &energy_0x26696143,
> +};
> +
> +static struct event_group *known_perf_event_groups[] = {
> + &perf_0x26557651,
> +};
> +
> +#define for_each_enabled_event_group(_peg, _grp) \
> + for (_peg = _grp; _peg < &_grp[ARRAY_SIZE(_grp)]; _peg++) \
Please also include linux/array_size.h for ARRAY_SIZE().

Wouldn't it be wiser to parenthesize the macro arguments, even if the
current users seem fine?
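A rough user-space sketch of what protected arguments could look like. The ARRAY_SIZE() definition, the trimmed struct event_group, the demo_groups[] array and count_enabled_demo_groups() are stand-ins made up for illustration; only the shape of the macro follows the patch.

```c
#include <stddef.h>

/* User-space stand-in; the kernel provides its own ARRAY_SIZE(). */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

struct pmt_feature_group;	/* opaque in this sketch */

struct event_group {
	struct pmt_feature_group *pfg;
	unsigned int guid;
};

/* Same iteration as in the patch, with every macro argument parenthesized. */
#define for_each_enabled_event_group(_peg, _grp)				\
	for ((_peg) = (_grp); (_peg) < &(_grp)[ARRAY_SIZE(_grp)]; (_peg)++)	\
		if ((*(_peg))->pfg)

/* Hypothetical demo data: two groups, neither enabled initially. */
static struct event_group grp_a = { .guid = 0x26696143 };
static struct event_group grp_b = { .guid = 0x26557651 };
static struct event_group *demo_groups[] = { &grp_a, &grp_b };

/* Count the demo groups that currently have a pfg attached. */
static size_t count_enabled_demo_groups(void)
{
	struct event_group **peg;
	size_t n = 0;

	for_each_enabled_event_group(peg, demo_groups)
		n++;

	return n;
}
```

With the parentheses in place, an expression passed as either argument is evaluated as a unit rather than being re-associated by the operators inside the macro body.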
> + if ((*_peg)->pfg)
> +
> +/* Stub for now */
> +static bool enable_events(struct event_group *e, struct pmt_feature_group *p)
> +{
> + return false;
> +}
> +
> +DEFINE_FREE(intel_pmt_put_feature_group, struct pmt_feature_group *,
> + if (!IS_ERR_OR_NULL(_T))
> + intel_pmt_put_feature_group(_T))
> +
> +/*
> + * Make a request to the INTEL_PMT_TELEMETRY driver for the pmt_feature_group
> + * for a specific feature. If there is one the returned structure has an array
Missing comma after "If there is one"?
> + * of telemetry_region structures. Each describes one telemetry aggregator.
> + * Try to use every telemetry aggregator with a known guid.
> + */
> +static bool get_pmt_feature(enum pmt_feature_id feature, struct event_group **evgs,
> + unsigned int num_evg)
> +{
> + struct pmt_feature_group *p __free(intel_pmt_put_feature_group) = NULL;
> + struct event_group **peg;
> + int ret;
> +
> + p = intel_pmt_get_regions_by_feature(feature);
> +
> + if (IS_ERR_OR_NULL(p))
Please also include linux/err.h for IS_ERR_OR_NULL().
> + return false;
> +
> + for (peg = evgs; peg < &evgs[num_evg]; peg++) {
> + ret = enable_events(*peg, p);
> + if (ret) {
This is super misleading. The return value of a bool-returning function
is stored in "int ret" and then used as a bool.
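One way to avoid the mismatch is to test the bool result directly and drop "ret" altogether. The following is a user-space sketch under stated assumptions: enable_events_demo() and first_usable() are hypothetical stand-ins, struct event_group is trimmed down, and the ownership handling of the real function is left out.

```c
#include <stdbool.h>
#include <stddef.h>

struct event_group {
	unsigned int guid;
};

/* Hypothetical stand-in for enable_events(): accept one particular guid. */
static bool enable_events_demo(const struct event_group *e, unsigned int wanted)
{
	return e->guid == wanted;
}

/* Return the first group that can be enabled, or NULL if there is none. */
static struct event_group *first_usable(struct event_group **evgs,
					unsigned int num_evg,
					unsigned int wanted)
{
	struct event_group **peg;

	for (peg = evgs; peg < &evgs[num_evg]; peg++) {
		/* The bool result is used directly; no "int ret" in between. */
		if (enable_events_demo(*peg, wanted))
			return *peg;
	}

	return NULL;
}
```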
> + (*peg)->pfg = no_free_ptr(p);
> + return true;
> + }
> + }
> +
> + return false;
> +}
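For readers less familiar with the __free()/no_free_ptr() pattern used above, here is a user-space approximation built on the GCC/Clang cleanup attribute, which the kernel helpers are based on. All names here (struct feature_group, get_group(), put_group(), auto_put_group(), claim_group()) are hypothetical, and the reference count only models the get/put pairing, not the real driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for struct pmt_feature_group. */
struct feature_group {
	int refs;
};

static struct feature_group *get_group(struct feature_group *fg)
{
	fg->refs++;
	return fg;
}

static void put_group(struct feature_group *fg)
{
	fg->refs--;
}

/* Cleanup handler: runs when the annotated variable goes out of scope. */
static void auto_put_group(struct feature_group **pp)
{
	if (*pp)
		put_group(*pp);
}

/*
 * Take a reference and keep it in *slot only if the group is usable.
 * On every other return path the cleanup handler drops the reference.
 */
static bool claim_group(struct feature_group *fg, struct feature_group **slot,
			bool usable)
{
	struct feature_group *p __attribute__((cleanup(auto_put_group))) = get_group(fg);

	if (!usable)
		return false;

	/* Hand ownership to *slot and disarm the handler, like no_free_ptr(). */
	*slot = p;
	p = NULL;
	return true;
}
```

The early return needs no explicit put, which is the point of the pattern: only the success path has to say that ownership moved elsewhere.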
> +
> +/*
> + * Ask INTEL_PMT_TELEMETRY driver for all the RMID based telemetry groups
> + * that it supports.
> + */
> +bool intel_aet_get_events(void)
> +{
> + bool ret1, ret2;
> +
> + ret1 = get_pmt_feature(FEATURE_PER_RMID_ENERGY_TELEM,
> + known_energy_event_groups,
> + ARRAY_SIZE(known_energy_event_groups));
> + ret2 = get_pmt_feature(FEATURE_PER_RMID_PERF_TELEM,
> + known_perf_event_groups,
> + ARRAY_SIZE(known_perf_event_groups));
> +
> + return ret1 || ret2;
> +}
> +
> +void __exit intel_aet_exit(void)
> +{
> + struct event_group **peg;
> +
> + for_each_enabled_event_group(peg, known_energy_event_groups) {
> + intel_pmt_put_feature_group((*peg)->pfg);
> + (*peg)->pfg = NULL;
> + }
> + for_each_enabled_event_group(peg, known_perf_event_groups) {
> + intel_pmt_put_feature_group((*peg)->pfg);
> + (*peg)->pfg = NULL;
> + }
> +}
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 58d890fe2100..50051fdf4659 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -525,6 +525,19 @@ config X86_CPU_RESCTRL
>
> Say N if unsure.
>
> +config X86_CPU_RESCTRL_INTEL_AET
> + bool "Intel Application Energy Telemetry" if INTEL_PMT_TELEMETRY=y && INTEL_TPMI=y
> + depends on X86_CPU_RESCTRL && CPU_SUP_INTEL
> + help
> + Enable per-RMID telemetry events in resctrl.
> +
> + Intel feature that collects per-RMID execution data
> + about energy consumption, measure of frequency independent
> + activity and other performance metrics. Data is aggregated
> + per package.
> +
> + Say N if unsure.
> +
> config X86_FRED
> bool "Flexible Return and Event Delivery"
> depends on X86_64
> diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
> index d8a04b195da2..273ddfa30836 100644
> --- a/arch/x86/kernel/cpu/resctrl/Makefile
> +++ b/arch/x86/kernel/cpu/resctrl/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
> obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o
> +obj-$(CONFIG_X86_CPU_RESCTRL_INTEL_AET) += intel_aet.o
> obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
>
> # To allow define_trace.h's recursive include:
>
--
i.