Archive-only list for patches
 help / color / mirror / Atom feed
From: Tony Luck <tony.luck@intel.com>
To: Fenghua Yu <fenghuay@nvidia.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	Drew Fustini <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Anil Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: linux-kernel@vger.kernel.org, patches@lists.linux.dev,
	Tony Luck <tony.luck@intel.com>
Subject: [PATCH v3 19/26] x86/resctrl: Sanity check telemetry RMID values
Date: Mon,  7 Apr 2025 16:40:21 -0700	[thread overview]
Message-ID: <20250407234032.241215-20-tony.luck@intel.com> (raw)
In-Reply-To: <20250407234032.241215-1-tony.luck@intel.com>

There are three values of interest:
1) The number of RMIDs supported by the CPU core. This is enumerated by
   CPUID leaf 0xF. Linux saves the value in boot_cpu_data.x86_cache_max_rmid.
2) The number of counter registers in each telemetry region. This is
   described in the XML file for the region. Linux hard codes it into
   the struct telem_entry..num_rmids field.
3) The maximum number of RMIDs that can be tracked simultaneously for
   a telemetry region. This is provided in the structures received from
   the intel_pmt_get_regions_by_feature() calls.

Print appropriate warnings if these values do not match.

TODO: Need a better UI. The number of implemented counters can be
different per telemetry region.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/resctrl/intel_aet.c | 31 +++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
index 67a1245858dc..0bcbac326bee 100644
--- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -13,6 +13,7 @@
 
 #include <linux/cpu.h>
 #include <linux/cleanup.h>
+#include <linux/minmax.h>
 #include "fake_intel_aet_features.h"
 #include <linux/intel_vsec.h>
 #include <linux/resctrl.h>
@@ -51,6 +52,7 @@ struct pmt_event {
  * @last_overflow_tstamp_off:	Offset of overflow timestamp
  * @last_update_tstamp_off:	Offset of last update timestamp
  * @active:			Marks this group as active on this system
+ * @rmid_warned:		Set to stop multiple rmid sanity warnings
  * @num_events:			Size of @evts array
  * @evts:			Telemetry events in this group
  */
@@ -63,6 +65,7 @@ struct telem_entry {
 	int	last_overflow_tstamp_off;
 	int	last_update_tstamp_off;
 	bool	active;
+	bool	rmid_warned;
 	int	num_events;
 	struct pmt_event evts[];
 };
@@ -84,6 +87,33 @@ static struct telem_entry *telem_entry[] = {
 	NULL
 };
 
+static void rmid_sanity_check(struct telemetry_region *tr, struct telem_entry *tentry)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
+	int system_rmids = boot_cpu_data.x86_cache_max_rmid + 1;
+
+	if (tentry->rmid_warned)
+		return;
+
+	if (tentry->num_rmids != system_rmids) {
+		pr_info("Telemetry region %s has %d RMIDs system supports %d\n",
+			tentry->name, tentry->num_rmids, system_rmids);
+		tentry->rmid_warned = true;
+	}
+
+	if (tr->num_rmids < tentry->num_rmids) {
+		pr_info("Telemetry region %s only supports %d simultaneous RMIDS\n",
+			tentry->name, tr->num_rmids);
+		tentry->rmid_warned = true;
+	}
+
+	/* info/PKG_PERF_MON/num_rmids reports number of guaranteed counters */
+	if (!r->num_rmid)
+		r->num_rmid = tr->num_rmids;
+	else
+		r->num_rmid = min((u32)r->num_rmid, tr->num_rmids);
+}
+
 /*
  * Scan a feature group looking for guids recognized
  * and update the per-package counts of known groups.
@@ -109,6 +139,7 @@ static bool count_events(struct pkg_info *pkg, int max_pkgs, struct pmt_feature_
 					pr_warn_once("MMIO region for guid 0x%x too small\n", tr->guid);
 					continue;
 				}
+				rmid_sanity_check(tr, *tentry);
 				found = true;
 				(*tentry)->active = true;
 				pkg[tr->plat_info.package_id].count++;
-- 
2.48.1


  parent reply	other threads:[~2025-04-07 23:41 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-07 23:40 [PATCH v3 00/26] x86/resctrl telemetry monitoring Tony Luck
2025-04-07 23:40 ` [PATCH v3 01/26] fs/resctrl: Simplify allocation of mon_data structures Tony Luck
2025-04-18 21:13   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 02/26] fs-x86/resctrl: Prepare for more monitor events Tony Luck
2025-04-18 21:17   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 03/26] fs/resctrl: Change how events are initialized Tony Luck
2025-04-18 21:22   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 04/26] fs/resctrl: Set up Kconfig options for telemetry events Tony Luck
2025-04-18 21:23   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 05/26] x86/rectrl: Fake OOBMSM interface Tony Luck
2025-04-18 21:27   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 06/26] fs-x86/rectrl: Improve domain type checking Tony Luck
2025-04-18 21:40   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 07/26] x86/resctrl: Move L3 initialization out of domain_add_cpu_mon() Tony Luck
2025-04-18 21:51   ` Reinette Chatre
2025-04-21 20:01     ` Luck, Tony
2025-04-22 18:18       ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 08/26] x86/resctrl: Refactor domain_remove_cpu_mon() ready for new domain types Tony Luck
2025-04-18 21:53   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 09/26] x86/resctrl: Change generic monitor functions to use struct rdt_domain_hdr Tony Luck
2025-04-18 22:42   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 10/26] fs/resctrl: Improve handling for events that can be read from any CPU Tony Luck
2025-04-18 22:54   ` Reinette Chatre
2025-04-21 20:28     ` Luck, Tony
2025-04-22 18:19       ` Reinette Chatre
2025-04-23  0:51         ` Luck, Tony
2025-04-23  3:37           ` Reinette Chatre
2025-04-23 13:27         ` Peter Newman
2025-04-23 15:47           ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 11/26] fs/resctrl: Add support for additional monitor event display formats Tony Luck
2025-04-18 23:02   ` Reinette Chatre
2025-04-21 19:34     ` Luck, Tony
2025-04-22 18:20       ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 12/26] fs/resctrl: Add hook for architecture code to set monitor event attributes Tony Luck
2025-04-18 23:11   ` Reinette Chatre
2025-04-21 19:50     ` Luck, Tony
2025-04-22 18:20       ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 13/26] fs/resctrl: Add an architectural hook called for each mount Tony Luck
2025-04-18 23:47   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 14/26] x86/resctrl: Add first part of telemetry event enumeration Tony Luck
2025-04-19  0:08   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 15/26] x86/resctrl: Second stage " Tony Luck
2025-04-19  0:30   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 16/26] x86/resctrl: Third phase " Tony Luck
2025-04-19  0:45   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 17/26] x86/resctrl: Build a lookup table for each resctrl event id Tony Luck
2025-04-19  0:48   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 18/26] x86/resctrl: Add code to read core telemetry events Tony Luck
2025-04-19  1:53   ` Reinette Chatre
2025-04-07 23:40 ` Tony Luck [this message]
2025-04-19  5:14   ` [PATCH v3 19/26] x86/resctrl: Sanity check telemetry RMID values Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 20/26] x86/resctrl: Add and initialize rdt_resource for package scope core monitor Tony Luck
2025-04-07 23:40 ` [PATCH v3 21/26] fs-x86/resctrl: Handle RDT_RESOURCE_PERF_PKG in domain create/delete Tony Luck
2025-04-19  5:22   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 22/26] fs/resctrl: Add type define for PERF_PKG files Tony Luck
2025-04-07 23:40 ` [PATCH v3 23/26] fs/resctrl: Add new telemetry event id and structures Tony Luck
2025-04-07 23:40 ` [PATCH v3 24/26] x86/resctrl: Final steps to enable RDT_RESOURCE_PERF_PKG Tony Luck
2025-04-07 23:40 ` [PATCH v3 25/26] fs-x86/resctrl: Add detailed descriptions for Clearwater Forest events Tony Luck
2025-04-19  5:30   ` Reinette Chatre
2025-04-07 23:40 ` [PATCH v3 26/26] x86/resctrl: Update Documentation for package events Tony Luck
2025-04-19  5:40   ` Reinette Chatre
2025-04-18 21:13 ` [PATCH v3 00/26] x86/resctrl telemetry monitoring Reinette Chatre
2025-04-21 18:57   ` Luck, Tony
2025-04-21 22:59     ` Reinette Chatre
2025-04-22 16:20       ` Luck, Tony
2025-04-22 21:30         ` Reinette Chatre
2025-04-19  5:47 ` Reinette Chatre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250407234032.241215-20-tony.luck@intel.com \
    --to=tony.luck@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=babu.moger@amd.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=reinette.chatre@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox