public inbox for patches@lists.linux.dev
 help / color / mirror / Atom feed
* [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules
@ 2026-03-30 21:43 Tony Luck
  2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

Using symbol_request() eliminates the AET (Application Energy Telemetry)
requirement that INTEL_TELEMETY_PMT and INTEL_TPMI be configured as built-in
to the base kernel.

To make this work cleanly, refactor resctrl code so that architecture
code is made aware of every mount/unmount operation so that state can
be cleaned up on each unmount (instead of the notianal call from
resctrl_arch_exit()).

Signed-off-by: Tony Luck <tony.luck@intel.com>
AI-review-of-v3: https://sashiko.dev/#/patchset/20260327230208.18094-1-tony.luck%40intel.com

Changes since v3[1]
Link: [1] https://lore.kernel.org/all/20260327230208.18094-1-tony.luck@intel.com/

1) Switched from symbol_get() to symbol_request() so that required
modules will be auto-loaded when needed instead of mounting the resctrl
file system without AET events.

2) Add warnings to resctrl_{en,dis}able_mon_event() to ensure they are
only used when the resctrl file system is not mounted.

3) Move the newly adding locking around mount/unmount from architecture
to file system code. This makes it much simpler, avoiding obtaining a
mutex in one function and releasing in another.

4) Reset rdt_resource::mon_capable to false as part of cleanup on
unmount or for other mount failures. I have not updated rdt_mon_capable
which technically needs to be an integer, incremented on mount,
decremented on unmount to keep track of how many monitoring resources
are enabled in the current mount cycle.

Tony Luck (7):
  platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  x86/resctrl: Drop setting of event_group::force_off when insufficient
    RMIDs
  fs/resctrl: Add interface to disable a monitor event
  fs,x86/resctrl: Add architecture hooks for every mount/unmount
  x86/resctrl: Resolve PMT and TPMI symbols at runtime
  x86/resctrl: Delete intel_aet_exit()
  x86/resctrl: Downgrade dependency of AET on INTEL_PMT

 include/linux/resctrl.h                    |  13 ++-
 arch/x86/kernel/cpu/resctrl/internal.h     |  10 +-
 arch/x86/kernel/cpu/resctrl/core.c         |  16 +++-
 arch/x86/kernel/cpu/resctrl/intel_aet.c    | 102 ++++++++++++++++++---
 drivers/platform/x86/intel/pmt/telemetry.c |   4 +-
 fs/resctrl/monitor.c                       |  16 ++++
 fs/resctrl/rdtgroup.c                      |  43 ++++++---
 arch/x86/Kconfig                           |   2 +-
 8 files changed, 172 insertions(+), 34 deletions(-)


base-commit: 7aaa8047eafd0bd628065b15757d9b48c5f9c07d
-- 
2.53.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  2026-04-04  0:00   ` Reinette Chatre
  2026-04-08  5:07   ` Christoph Hellwig
  2026-03-30 21:43 ` [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs Tony Luck
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

The symbol_get() function requires that objects be GPL licensed.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 drivers/platform/x86/intel/pmt/telemetry.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/intel/pmt/telemetry.c b/drivers/platform/x86/intel/pmt/telemetry.c
index a52803bfe124..4504fb9fd83c 100644
--- a/drivers/platform/x86/intel/pmt/telemetry.c
+++ b/drivers/platform/x86/intel/pmt/telemetry.c
@@ -287,13 +287,13 @@ struct pmt_feature_group *intel_pmt_get_regions_by_feature(enum pmt_feature_id i
 
 	return no_free_ptr(feature_group);
 }
-EXPORT_SYMBOL(intel_pmt_get_regions_by_feature);
+EXPORT_SYMBOL_GPL(intel_pmt_get_regions_by_feature);
 
 void intel_pmt_put_feature_group(struct pmt_feature_group *feature_group)
 {
 	kref_put(&feature_group->kref, pmt_feature_group_release);
 }
-EXPORT_SYMBOL(intel_pmt_put_feature_group);
+EXPORT_SYMBOL_GPL(intel_pmt_put_feature_group);
 
 int pmt_telem_read(struct telem_endpoint *ep, u32 id, u64 *data, u32 count)
 {
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
  2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  2026-04-04  0:01   ` Reinette Chatre
  2026-03-30 21:43 ` [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event Tony Luck
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

Setting the force_off flag does not matter when AET features are only
enumerated once (on first mount).

In preparation for enumeration on every mount, drop this because it
overrides user request with rdt= boot option to force enable a feature.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/resctrl/intel_aet.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
index 89b8b619d5d5..015627401ed2 100644
--- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -214,10 +214,8 @@ static bool all_regions_have_sufficient_rmid(struct event_group *e, struct pmt_f
 		if (!p->regions[i].addr)
 			continue;
 		tr = &p->regions[i];
-		if (tr->num_rmids < e->num_rmid) {
-			e->force_off = true;
+		if (tr->num_rmids < e->num_rmid)
 			return false;
-		}
 	}
 
 	return true;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
  2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
  2026-03-30 21:43 ` [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  2026-04-04  0:03   ` Reinette Chatre
  2026-03-30 21:43 ` [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount Tony Luck
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

Architecture code can ask file system code to enable events. But there
is no way to clean up and disable events.

Add resctrl_disable_mon_event(). Adding/removing events is only
possible when the file system is not mounted. Add a WARN_ON() to
catch misuse.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 include/linux/resctrl.h |  1 +
 fs/resctrl/monitor.c    | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 006e57fd7ca5..b312aaf76974 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -416,6 +416,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
 bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
 			      unsigned int binary_bits, void *arch_priv);
+void resctrl_disable_mon_event(enum resctrl_event_id eventid);
 
 bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid);
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 49f3f6b846b2..50eb5534767c 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -990,6 +990,8 @@ struct mon_evt mon_event_all[QOS_NUM_EVENTS] = {
 bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
 			      unsigned int binary_bits, void *arch_priv)
 {
+	if (WARN_ON(resctrl_mounted))
+		return false;
 	if (WARN_ON_ONCE(eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS ||
 			 binary_bits > MAX_BINARY_BITS))
 		return false;
@@ -1010,6 +1012,20 @@ bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
 	return true;
 }
 
+void resctrl_disable_mon_event(enum resctrl_event_id eventid)
+{
+	if (WARN_ON(resctrl_mounted))
+		return;
+	if (WARN_ON_ONCE(eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS))
+		return;
+	if (!mon_event_all[eventid].enabled) {
+		pr_warn("Repeat disable for event %d\n", eventid);
+		return;
+	}
+
+	mon_event_all[eventid].enabled = false;
+}
+
 bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid)
 {
 	return eventid >= QOS_FIRST_EVENT && eventid < QOS_NUM_EVENTS &&
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
                   ` (2 preceding siblings ...)
  2026-03-30 21:43 ` [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  2026-04-04  0:52   ` Reinette Chatre
  2026-03-30 21:43 ` [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime Tony Luck
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

Add hooks for every mount/unmount of the resctrl file system so that
architecture code can allocate on mount and free on unmount.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---

Note this patch disables enumeration of AET monitor events because the
new mount/unmount hooks do not call intel_aet_get_events() (which is
not ready for the change from "just on first mount" to "called on
every mount"). That is resolved in the next patch.

 include/linux/resctrl.h                 | 12 +++++--
 arch/x86/kernel/cpu/resctrl/internal.h  |  8 +++--
 arch/x86/kernel/cpu/resctrl/core.c      | 14 +++++++-
 arch/x86/kernel/cpu/resctrl/intel_aet.c | 13 ++++++++
 fs/resctrl/rdtgroup.c                   | 43 ++++++++++++++++++-------
 5 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index b312aaf76974..489c7d4ae3e9 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -518,11 +518,19 @@ void resctrl_online_cpu(unsigned int cpu);
 void resctrl_offline_cpu(unsigned int cpu);
 
 /*
- * Architecture hook called at beginning of first file system mount attempt.
- * No locks are held.
+ * Architecture hooks for resctrl mount/unmount.
+ * Each is called with resctrl_mount_lock held.
  */
+
+/* Called at beginning of each file system mount attempt. */
 void resctrl_arch_pre_mount(void);
 
+/* Called to report success/failure of mount. */
+void resctrl_arch_mount_result(int ret);
+
+/* Called to report unmount. */
+void resctrl_arch_unmount(void);
+
 /**
  * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
  *			      for this resource and domain.
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e3cfa0c10e92..6f322818a9e6 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -234,14 +234,18 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
 
 #ifdef CONFIG_X86_CPU_RESCTRL_INTEL_AET
-bool intel_aet_get_events(void);
+bool intel_aet_pre_mount(void);
+void intel_aet_mount_result(int ret);
+void intel_aet_unmount(void);
 void __exit intel_aet_exit(void);
 int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val);
 void intel_aet_mon_domain_setup(int cpu, int id, struct rdt_resource *r,
 				struct list_head *add_pos);
 bool intel_handle_aet_option(bool force_off, char *tok);
 #else
-static inline bool intel_aet_get_events(void) { return false; }
+static inline bool intel_aet_pre_mount(void) { return false; }
+static inline void intel_aet_mount_result(int ret) { }
+static inline void intel_aet_unmount(void) { }
 static inline void __exit intel_aet_exit(void) { }
 static inline int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val)
 {
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7667cf7c4e94..162eca2cfcdb 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -769,8 +769,10 @@ void resctrl_arch_pre_mount(void)
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
 	int cpu;
 
-	if (!intel_aet_get_events())
+	if (!intel_aet_pre_mount()) {
+		r->mon_capable = false;
 		return;
+	}
 
 	/*
 	 * Late discovery of telemetry events means the domains for the
@@ -786,6 +788,16 @@ void resctrl_arch_pre_mount(void)
 	cpus_read_unlock();
 }
 
+void resctrl_arch_mount_result(int ret)
+{
+	intel_aet_mount_result(ret);
+}
+
+void resctrl_arch_unmount(void)
+{
+	intel_aet_unmount();
+}
+
 enum {
 	RDT_FLAG_CMT,
 	RDT_FLAG_MBM_TOTAL,
diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
index 015627401ed2..743a1894fe9a 100644
--- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -335,6 +335,19 @@ void __exit intel_aet_exit(void)
 	}
 }
 
+bool intel_aet_pre_mount(void)
+{
+	return false; // Temporary stub
+}
+
+void intel_aet_mount_result(int ret)
+{
+}
+
+void intel_aet_unmount(void)
+{
+}
+
 #define DATA_VALID	BIT_ULL(63)
 #define DATA_BITS	GENMASK_ULL(62, 0)
 
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 5da305bd36c9..2ce11ae2d2cd 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -18,7 +18,6 @@
 #include <linux/fs_parser.h>
 #include <linux/sysfs.h>
 #include <linux/kernfs.h>
-#include <linux/once.h>
 #include <linux/resctrl.h>
 #include <linux/seq_buf.h>
 #include <linux/seq_file.h>
@@ -30,6 +29,9 @@
 
 #include "internal.h"
 
+/* Mutex protecting mount/unmount operations */
+static DEFINE_MUTEX(resctrl_mount_lock);
+
 /* Mutex to protect rdtgroup access. */
 DEFINE_MUTEX(rdtgroup_mutex);
 
@@ -2788,17 +2790,8 @@ static int rdt_get_tree(struct fs_context *fc)
 	struct rdt_resource *r;
 	int ret;
 
-	DO_ONCE_SLEEPABLE(resctrl_arch_pre_mount);
-
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
-	/*
-	 * resctrl file system can only be mounted once.
-	 */
-	if (resctrl_mounted) {
-		ret = -EBUSY;
-		goto out;
-	}
 
 	ret = setup_rmid_lru_list();
 	if (ret)
@@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
 	return ret;
 }
 
+static int rdt_get_tree_wrapper(struct fs_context *fc)
+{
+	int ret;
+
+	mutex_lock(&resctrl_mount_lock);
+
+	/*
+	 * resctrl file system can only be mounted once.
+	 */
+	if (resctrl_mounted) {
+		mutex_unlock(&resctrl_mount_lock);
+		return -EBUSY;
+	}
+
+	resctrl_arch_pre_mount();
+
+	ret = rdt_get_tree(fc);
+
+	resctrl_arch_mount_result(ret);
+	mutex_unlock(&resctrl_mount_lock);
+
+	return ret;
+}
+
 enum rdt_param {
 	Opt_cdp,
 	Opt_cdpl2,
@@ -2959,7 +2976,7 @@ static void rdt_fs_context_free(struct fs_context *fc)
 static const struct fs_context_operations rdt_fs_context_ops = {
 	.free		= rdt_fs_context_free,
 	.parse_param	= rdt_parse_param,
-	.get_tree	= rdt_get_tree,
+	.get_tree	= rdt_get_tree_wrapper,
 };
 
 static int rdt_init_fs_context(struct fs_context *fc)
@@ -3169,6 +3186,7 @@ static void rdt_kill_sb(struct super_block *sb)
 {
 	struct rdt_resource *r;
 
+	mutex_lock(&resctrl_mount_lock);
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
 
@@ -3187,6 +3205,9 @@ static void rdt_kill_sb(struct super_block *sb)
 	kernfs_kill_sb(sb);
 	mutex_unlock(&rdtgroup_mutex);
 	cpus_read_unlock();
+
+	resctrl_arch_unmount();
+	mutex_unlock(&resctrl_mount_lock);
 }
 
 static struct file_system_type rdt_fs_type = {
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
                   ` (3 preceding siblings ...)
  2026-03-30 21:43 ` [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  2026-04-04  0:56   ` Reinette Chatre
  2026-03-30 21:43 ` [PATCH v4 6/7] x86/resctrl: Delete intel_aet_exit() Tony Luck
  2026-03-30 21:43 ` [PATCH v4 7/7] x86/resctrl: Downgrade dependency of AET on INTEL_PMT Tony Luck
  6 siblings, 1 reply; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

The resctrl subsystem is always built-in. Currently, it has a direct
dependency on INTEL_PMT_TELEMETRY and INTEL_TPMI for Application Event
Telemetry (AET) features. This forces these drivers to be built-in as well,
even though they are logically separate.

Switch to using symbol_request() to resolve AET-related symbols at
runtime. This allows PMT and TPMI to be built as loadable modules.

Update the mount/unmount logic to manage these references:
- Acquire symbol references and pin the modules during resctrl mount.
- Ensure AET structures are cleaned up and events disabled on unmount.
- Release symbol references and unpin modules on unmount or if
  mounting fails.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/resctrl/intel_aet.c | 93 ++++++++++++++++++++++---
 1 file changed, 83 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
index 743a1894fe9a..14b472106f52 100644
--- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -23,6 +23,7 @@
 #include <linux/intel_vsec.h>
 #include <linux/io.h>
 #include <linux/minmax.h>
+#include <linux/module.h>
 #include <linux/printk.h>
 #include <linux/rculist.h>
 #include <linux/rcupdate.h>
@@ -289,6 +290,9 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
 	return FEATURE_INVALID;
 }
 
+static struct pmt_feature_group *(*get_feature)(enum pmt_feature_id id);
+static void (*put_feature)(struct pmt_feature_group *p);
+
 /*
  * Request a copy of struct pmt_feature_group for each event group. If there is
  * one, the returned structure has an array of telemetry_region structures,
@@ -300,7 +304,7 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
  * struct pmt_feature_group to indicate that its events are successfully
  * enabled.
  */
-bool intel_aet_get_events(void)
+static bool aet_get_events(void)
 {
 	struct pmt_feature_group *p;
 	enum pmt_feature_id pfid;
@@ -309,14 +313,14 @@ bool intel_aet_get_events(void)
 
 	for_each_event_group(peg) {
 		pfid = lookup_pfid((*peg)->pfname);
-		p = intel_pmt_get_regions_by_feature(pfid);
+		p = get_feature(pfid);
 		if (IS_ERR_OR_NULL(p))
 			continue;
 		if (enable_events(*peg, p)) {
 			(*peg)->pfg = p;
 			ret = true;
 		} else {
-			intel_pmt_put_feature_group(p);
+			put_feature(p);
 		}
 	}
 
@@ -325,27 +329,96 @@ bool intel_aet_get_events(void)
 
 void __exit intel_aet_exit(void)
 {
-	struct event_group **peg;
+}
 
-	for_each_event_group(peg) {
-		if ((*peg)->pfg) {
-			intel_pmt_put_feature_group((*peg)->pfg);
-			(*peg)->pfg = NULL;
-		}
+static bool get_pmt_references(void)
+{
+	get_feature = symbol_request(intel_pmt_get_regions_by_feature);
+	if (!get_feature)
+		return false;
+	put_feature = symbol_request(intel_pmt_put_feature_group);
+	if (!put_feature) {
+		symbol_put(intel_pmt_get_regions_by_feature);
+		get_feature = NULL;
+		return false;
+	}
+
+	return true;
+}
+
+static void put_pmt_references(void)
+{
+	if (get_feature) {
+		symbol_put(intel_pmt_get_regions_by_feature);
+		get_feature = NULL;
+	}
+	if (put_feature) {
+		symbol_put(intel_pmt_put_feature_group);
+		put_feature = NULL;
 	}
 }
 
+static enum {
+	AET_UNINITIALIZED,
+	AET_PRESENT,
+	AET_NOT_PRESENT
+} aet_state;
+
 bool intel_aet_pre_mount(void)
 {
-	return false; // Temporary stub
+	bool ret;
+
+	if (aet_state == AET_PRESENT)
+		return true;
+
+	if (aet_state == AET_NOT_PRESENT || !get_pmt_references())
+		return false;
+
+	ret = aet_get_events();
+
+	if (ret) {
+		aet_state = AET_PRESENT;
+	} else {
+		aet_state = AET_NOT_PRESENT;
+		put_pmt_references();
+	}
+
+	return ret;
+}
+
+static void aet_cleanup(void)
+{
+	struct event_group **peg;
+
+	if (aet_state == AET_PRESENT) {
+		for_each_event_group(peg) {
+			if ((*peg)->pfg) {
+				struct event_group *e = *peg;
+
+				for (int j = 0; j < e->num_events; j++)
+					resctrl_disable_mon_event(e->evts[j].id);
+				put_feature((*peg)->pfg);
+				(*peg)->pfg = NULL;
+			}
+		}
+		put_pmt_references();
+		aet_state = AET_UNINITIALIZED;
+	}
 }
 
 void intel_aet_mount_result(int ret)
 {
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
+
+	if (ret) {
+		r->mon_capable = false;
+		aet_cleanup();
+	}
 }
 
 void intel_aet_unmount(void)
 {
+	aet_cleanup();
 }
 
 #define DATA_VALID	BIT_ULL(63)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 6/7] x86/resctrl: Delete intel_aet_exit()
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
                   ` (4 preceding siblings ...)
  2026-03-30 21:43 ` [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  2026-03-30 21:43 ` [PATCH v4 7/7] x86/resctrl: Downgrade dependency of AET on INTEL_PMT Tony Luck
  6 siblings, 0 replies; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

All cleanup for AET is now handled during file system unmount.

Drop intel_aet_exit().

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/resctrl/internal.h  | 2 --
 arch/x86/kernel/cpu/resctrl/core.c      | 2 --
 arch/x86/kernel/cpu/resctrl/intel_aet.c | 4 ----
 3 files changed, 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6f322818a9e6..2230b64fa2ba 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -237,7 +237,6 @@ void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
 bool intel_aet_pre_mount(void);
 void intel_aet_mount_result(int ret);
 void intel_aet_unmount(void);
-void __exit intel_aet_exit(void);
 int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val);
 void intel_aet_mon_domain_setup(int cpu, int id, struct rdt_resource *r,
 				struct list_head *add_pos);
@@ -246,7 +245,6 @@ bool intel_handle_aet_option(bool force_off, char *tok);
 static inline bool intel_aet_pre_mount(void) { return false; }
 static inline void intel_aet_mount_result(int ret) { }
 static inline void intel_aet_unmount(void) { }
-static inline void __exit intel_aet_exit(void) { }
 static inline int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val)
 {
 	return -EINVAL;
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 162eca2cfcdb..1951df519dea 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -1172,8 +1172,6 @@ late_initcall(resctrl_arch_late_init);
 
 static void __exit resctrl_arch_exit(void)
 {
-	intel_aet_exit();
-
 	cpuhp_remove_state(rdt_online);
 
 	resctrl_exit();
diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
index 14b472106f52..b708c94a4a83 100644
--- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -327,10 +327,6 @@ static bool aet_get_events(void)
 	return ret;
 }
 
-void __exit intel_aet_exit(void)
-{
-}
-
 static bool get_pmt_references(void)
 {
 	get_feature = symbol_request(intel_pmt_get_regions_by_feature);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 7/7] x86/resctrl: Downgrade dependency of AET on INTEL_PMT
  2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
                   ` (5 preceding siblings ...)
  2026-03-30 21:43 ` [PATCH v4 6/7] x86/resctrl: Delete intel_aet_exit() Tony Luck
@ 2026-03-30 21:43 ` Tony Luck
  6 siblings, 0 replies; 31+ messages in thread
From: Tony Luck @ 2026-03-30 21:43 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches, Tony Luck

The INTEL_AET code now using symbol_get() to access the INTEL_PMT
enumeration functions so no requirement that they be built-in to
the base kernel.

Update the Kconfig dependencies.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e2df1b147184..1a60da62da33 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -542,7 +542,7 @@ config X86_CPU_RESCTRL
 
 config X86_CPU_RESCTRL_INTEL_AET
 	bool "Intel Application Energy Telemetry"
-	depends on X86_64 && X86_CPU_RESCTRL && CPU_SUP_INTEL && INTEL_PMT_TELEMETRY=y && INTEL_TPMI=y
+	depends on X86_64 && X86_CPU_RESCTRL && CPU_SUP_INTEL && INTEL_PMT_TELEMETRY != n && INTEL_TPMI != n
 	help
 	  Enable per-RMID telemetry events in resctrl.
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
@ 2026-04-04  0:00   ` Reinette Chatre
  2026-04-06 18:07     ` David Box
  2026-04-08  5:07   ` Christoph Hellwig
  1 sibling, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-04  0:00 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches

Hi Tony,

On 3/30/26 2:43 PM, Tony Luck wrote:
> The symbol_get() function requires that objects be GPL licensed.

I do not know if this cryptic note is sufficient for the driver's maintainers.
Looking at the dependencies among symbols it does look possible to add more
context here that makes it clear that this is the right and obvious thing to do
for this driver. Doing so makes inclusion of this change just a formality. If
it is not the right and obvious thing to do then I similarly expect a better
motivation than some planned usage since any planned usage may cause a debate
about whether it should be supported.

Reinette


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs
  2026-03-30 21:43 ` [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs Tony Luck
@ 2026-04-04  0:01   ` Reinette Chatre
  0 siblings, 0 replies; 31+ messages in thread
From: Reinette Chatre @ 2026-04-04  0:01 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches

Hi Tony,

On 3/30/26 2:43 PM, Tony Luck wrote:
> Setting the force_off flag does not matter when AET features are only
> enumerated once (on first mount).

Missing motivation here. Consider for example:
Enumerating AET features multiple times is needed because ... ?

> 
> In preparation for enumeration on every mount, drop this because it
> overrides user request with rdt= boot option to force enable a feature.

Missing answer to:
event_group::force_off is intended to reflect both the user and architecture's
opinion. This removes the architecture setting entirely. This is safe because ...

Missing:
Change definition of event_group::force_off.

Reinette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event
  2026-03-30 21:43 ` [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event Tony Luck
@ 2026-04-04  0:03   ` Reinette Chatre
  2026-04-06 18:35     ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-04  0:03 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches

Hi Tony,

On 3/30/26 2:43 PM, Tony Luck wrote:
> Architecture code can ask file system code to enable events. But there
> is no way to clean up and disable events.

Missing why it is required to disable events.

> 
> Add resctrl_disable_mon_event().

(can be seen from patch)

> Adding/removing events is only
> possible when the file system is not mounted. Add a WARN_ON() to

While this is accurate that it should not be possible to enable/disable
events while resctrl is mounted, this is *not* the *only* time when it
should not be possible. Here it is unfortunately not straight forward
since only some events require per-domain state which requires the event
to be enabled before any domain comes online, potentially very early in
initialization.

To me the addition of this warning adds false security.

Also, consider that resctrl_mounted is protected by rdtgroup_mutex and
this addition gives architecture code free access without any protection.

To do this right resctrl may need to add more state to an event but how
that may look is unclear since an architecture may require per-domain
architectural state for an event while resctrl fs needs none. At this
time these scenarios may just fall into the "architecture must do the
right thing" category since it has best information on how state is
managed for the events as they are enabled/disabled.

Reinette


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-03-30 21:43 ` [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount Tony Luck
@ 2026-04-04  0:52   ` Reinette Chatre
  2026-04-06 20:35     ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-04  0:52 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches

Hi Tony,

On 3/30/26 2:43 PM, Tony Luck wrote:
> Add hooks for every mount/unmount of the resctrl file system so that
> architecture code can allocate on mount and free on unmount.

Please use the changelog to describe and motivate all the other things
that this patch does.

> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> 
> Note this patch disables enumeration of AET monitor events because the
> new mount/unmount hooks do not call intel_aet_get_events() (which is
> not ready for the change from "just on first mount" to "called on
> every mount"). That is resolved in the next patch.

This could be part of the proper changelog.

Could patches be re-ordered to support incremental changes?

> 
>  include/linux/resctrl.h                 | 12 +++++--
>  arch/x86/kernel/cpu/resctrl/internal.h  |  8 +++--
>  arch/x86/kernel/cpu/resctrl/core.c      | 14 +++++++-
>  arch/x86/kernel/cpu/resctrl/intel_aet.c | 13 ++++++++
>  fs/resctrl/rdtgroup.c                   | 43 ++++++++++++++++++-------
>  5 files changed, 74 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index b312aaf76974..489c7d4ae3e9 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -518,11 +518,19 @@ void resctrl_online_cpu(unsigned int cpu);
>  void resctrl_offline_cpu(unsigned int cpu);
>  
>  /*
> - * Architecture hook called at beginning of first file system mount attempt.
> - * No locks are held.
> + * Architecture hooks for resctrl mount/unmount.
> + * Each is called with resctrl_mount_lock held.
>   */
> +
> +/* Called at beginning of each file system mount attempt. */
>  void resctrl_arch_pre_mount(void);
>  
> +/* Called to report success/failure of mount. */
> +void resctrl_arch_mount_result(int ret);

Why is this needed? Why not just always call resctrl_arch_unmount()?

> +
> +/* Called to report unmount. */
> +void resctrl_arch_unmount(void);
> +
>  /**
>   * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
>   *			      for this resource and domain.
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index e3cfa0c10e92..6f322818a9e6 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -234,14 +234,18 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>  void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
>  
>  #ifdef CONFIG_X86_CPU_RESCTRL_INTEL_AET
> -bool intel_aet_get_events(void);
> +bool intel_aet_pre_mount(void);
> +void intel_aet_mount_result(int ret);
> +void intel_aet_unmount(void);
>  void __exit intel_aet_exit(void);
>  int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val);
>  void intel_aet_mon_domain_setup(int cpu, int id, struct rdt_resource *r,
>  				struct list_head *add_pos);
>  bool intel_handle_aet_option(bool force_off, char *tok);
>  #else
> -static inline bool intel_aet_get_events(void) { return false; }
> +static inline bool intel_aet_pre_mount(void) { return false; }
> +static inline void intel_aet_mount_result(int ret) { }
> +static inline void intel_aet_unmount(void) { }
>  static inline void __exit intel_aet_exit(void) { }
>  static inline int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val)
>  {
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 7667cf7c4e94..162eca2cfcdb 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -769,8 +769,10 @@ void resctrl_arch_pre_mount(void)
>  	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
>  	int cpu;
>  
> -	if (!intel_aet_get_events())
> +	if (!intel_aet_pre_mount()) {
> +		r->mon_capable = false;

Why is this needed?

>  		return;
> +	}
>  
>  	/*
>  	 * Late discovery of telemetry events means the domains for the


...

>  #include "internal.h"
>  
> +/* Mutex protecting mount/unmount operations */
> +static DEFINE_MUTEX(resctrl_mount_lock);
> +
>  /* Mutex to protect rdtgroup access. */
>  DEFINE_MUTEX(rdtgroup_mutex);
>  
> @@ -2788,17 +2790,8 @@ static int rdt_get_tree(struct fs_context *fc)
>  	struct rdt_resource *r;
>  	int ret;
>  
> -	DO_ONCE_SLEEPABLE(resctrl_arch_pre_mount);
> -
>  	cpus_read_lock();
>  	mutex_lock(&rdtgroup_mutex);
> -	/*
> -	 * resctrl file system can only be mounted once.
> -	 */
> -	if (resctrl_mounted) {
> -		ret = -EBUSY;
> -		goto out;
> -	}
>  
>  	ret = setup_rmid_lru_list();
>  	if (ret)
> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
>  	return ret;
>  }
>  
> +static int rdt_get_tree_wrapper(struct fs_context *fc)
> +{
> +	int ret;
> +
> +	mutex_lock(&resctrl_mount_lock);
> +
> +	/*
> +	 * resctrl file system can only be mounted once.
> +	 */
> +	if (resctrl_mounted) {
> +		mutex_unlock(&resctrl_mount_lock);
> +		return -EBUSY;
> +	}
> +

This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
but resctrl is not changed to respect this throughout resulting in unsafe access of
resctrl_mounted.

Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
needed synchronization belongs in the architecture. Could this instead be accomplished
with a private mutex within the AET code?

> +	resctrl_arch_pre_mount();
> +
> +	ret = rdt_get_tree(fc);
> +
> +	resctrl_arch_mount_result(ret);

Could this instead just call resctrl_arch_unmount() on failure?

Reinette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime
  2026-03-30 21:43 ` [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime Tony Luck
@ 2026-04-04  0:56   ` Reinette Chatre
  2026-04-07 18:13     ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-04  0:56 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86
  Cc: linux-kernel, patches

Hi Tony,

On 3/30/26 2:43 PM, Tony Luck wrote:
> The resctrl subsystem is always built-in. Currently, it has a direct
> dependency on INTEL_PMT_TELEMETRY and INTEL_TPMI for Application Event
> Telemetry (AET) features. This forces these drivers to be built-in as well,
> even though they are logically separate.
> 
> Switch to using symbol_request() to resolve AET-related symbols at
> runtime. This allows PMT and TPMI to be built as loadable modules.
> 
> Update the mount/unmount logic to manage these references:
> - Acquire symbol references and pin the modules during resctrl mount.
> - Ensure AET structures are cleaned up and events disabled on unmount.
> - Release symbol references and unpin modules on unmount or if
>   mounting fails.

If I remember correctly resctrl depends on the mount being done much later
than driver initialization to ensure all PMT load and enumeration is
complete by the time resctrl fs is mounted.
Now it looks like these modules can be loaded dynamically on resctrl mount
with the functions learning about PMT feature groups called right after.
Considering the change in timing, is it guaranteed that if
intel_pmt_get_regions_by_feature() is called right after module probe that
the telemetry driver would be done with its enumeration?

> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/resctrl/intel_aet.c | 93 ++++++++++++++++++++++---
>  1 file changed, 83 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> index 743a1894fe9a..14b472106f52 100644
> --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -23,6 +23,7 @@
>  #include <linux/intel_vsec.h>
>  #include <linux/io.h>
>  #include <linux/minmax.h>
> +#include <linux/module.h>
>  #include <linux/printk.h>
>  #include <linux/rculist.h>
>  #include <linux/rcupdate.h>
> @@ -289,6 +290,9 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
>  	return FEATURE_INVALID;
>  }
>  
> +static struct pmt_feature_group *(*get_feature)(enum pmt_feature_id id);
> +static void (*put_feature)(struct pmt_feature_group *p);
> +
>  /*
>   * Request a copy of struct pmt_feature_group for each event group. If there is
>   * one, the returned structure has an array of telemetry_region structures,
> @@ -300,7 +304,7 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
>   * struct pmt_feature_group to indicate that its events are successfully
>   * enabled.
>   */
> -bool intel_aet_get_events(void)
> +static bool aet_get_events(void)
>  {
>  	struct pmt_feature_group *p;
>  	enum pmt_feature_id pfid;
> @@ -309,14 +313,14 @@ bool intel_aet_get_events(void)
>  
>  	for_each_event_group(peg) {
>  		pfid = lookup_pfid((*peg)->pfname);
> -		p = intel_pmt_get_regions_by_feature(pfid);
> +		p = get_feature(pfid);
>  		if (IS_ERR_OR_NULL(p))
>  			continue;
>  		if (enable_events(*peg, p)) {
>  			(*peg)->pfg = p;
>  			ret = true;
>  		} else {
> -			intel_pmt_put_feature_group(p);
> +			put_feature(p);
>  		}
>  	}
>  
> @@ -325,27 +329,96 @@ bool intel_aet_get_events(void)
>  
>  void __exit intel_aet_exit(void)
>  {
> -	struct event_group **peg;
> +}
>  
> -	for_each_event_group(peg) {
> -		if ((*peg)->pfg) {
> -			intel_pmt_put_feature_group((*peg)->pfg);
> -			(*peg)->pfg = NULL;
> -		}
> +static bool get_pmt_references(void)
> +{
> +	get_feature = symbol_request(intel_pmt_get_regions_by_feature);
> +	if (!get_feature)
> +		return false;
> +	put_feature = symbol_request(intel_pmt_put_feature_group);
> +	if (!put_feature) {
> +		symbol_put(intel_pmt_get_regions_by_feature);
> +		get_feature = NULL;
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void put_pmt_references(void)
> +{
> +	if (get_feature) {
> +		symbol_put(intel_pmt_get_regions_by_feature);
> +		get_feature = NULL;
> +	}
> +	if (put_feature) {
> +		symbol_put(intel_pmt_put_feature_group);
> +		put_feature = NULL;
>  	}
>  }
>  
> +static enum {
> +	AET_UNINITIALIZED,
> +	AET_PRESENT,
> +	AET_NOT_PRESENT
> +} aet_state;
> +
>  bool intel_aet_pre_mount(void)
>  {
> -	return false; // Temporary stub
> +	bool ret;
> +
> +	if (aet_state == AET_PRESENT)
> +		return true;
> +
> +	if (aet_state == AET_NOT_PRESENT || !get_pmt_references())
> +		return false;
> +
> +	ret = aet_get_events();
> +
> +	if (ret) {
> +		aet_state = AET_PRESENT;
> +	} else {
> +		aet_state = AET_NOT_PRESENT;
> +		put_pmt_references();
> +	}
> +
> +	return ret;
> +}

I think I am missing something here. This seems to build a new state
machine on top of existing state tracking. 
What if, instead, just do:
	
	aet_get_events() 
	{

		for_each_event_group(peg) {
			if ((*peg)->pfg) /* This means "AET_PRESENT" ? */
				continue;
			...
		}

	}

Similarly, get_feature and put_feature maintains state about
the symbols.

> +
> +static void aet_cleanup(void)
> +{
> +	struct event_group **peg;
> +
> +	if (aet_state == AET_PRESENT) {

I do not see why this check/state machine is needed, if enumeration was not
done the loop should just do nothing because (*peg)->pfg will always be NULL. 

> +		for_each_event_group(peg) {
> +			if ((*peg)->pfg) {
> +				struct event_group *e = *peg;
> +
> +				for (int j = 0; j < e->num_events; j++)
> +					resctrl_disable_mon_event(e->evts[j].id);

Can this be expected to clean up/reset all changes from enable_events()? For example,
how about event_group::num_rmid and rdt_resource::resctrl_mon::num_rmid?

> +				put_feature((*peg)->pfg);
> +				(*peg)->pfg = NULL;
> +			}
> +		}
> +		put_pmt_references();

Similarly here, put_pmt_references() could just always be called and it will not
do anything if there aren't references?

> +		aet_state = AET_UNINITIALIZED;
> +	}
>  }
>  
>  void intel_aet_mount_result(int ret)
>  {
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
> +
> +	if (ret) {
> +		r->mon_capable = false;
> +		aet_cleanup();
> +	}
>  }
>  
>  void intel_aet_unmount(void)
>  {
> +	aet_cleanup();
>  }

I am expecting intel_aet_mount_result() and intel_aet_unmount() to look the same and that
intel_aet_mount_result() thus becomes unnecessary.
Just setting r->mon_capable to false does not seem enough though ... how about all those
monitoring domains that were created? This flow looks unexpected to me, what am I missing?

Reinette



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  2026-04-04  0:00   ` Reinette Chatre
@ 2026-04-06 18:07     ` David Box
  0 siblings, 0 replies; 31+ messages in thread
From: David Box @ 2026-04-06 18:07 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Tony Luck, Fenghua Yu, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86, linux-kernel, patches

On Fri, Apr 03, 2026 at 05:00:05PM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 3/30/26 2:43 PM, Tony Luck wrote:
> > The symbol_get() function requires that objects be GPL licensed.
> 
> I do not know if this cryptic note is sufficient for the driver's maintainers.
> Looking at the dependencies among symbols it does look possible to add more
> context here that makes it clear that this is the right and obvious thing to do
> for this driver. Doing so makes inclusion of this change just a formality. If
> it is not the right and obvious thing to do then I similarly expect a better
> motivation than some planned usage since any planned usage may cause a debate
> about whether it should be supported.
> 
> Reinette

Tony,

Something along the lines of this would make it clearer:

"Prepare PMT enumeration interfaces for GPL-only users by converting
intel_pmt_get_regions_by_feature() and intel_pmt_put_feature_group() to
EXPORT_SYMBOL_GPL(). This enables future users of these interfaces to
access them via symbol_get(), which requires GPL-exported symbols."

With that clarification, this looks good to me.

Acked-by: David E. Box <david.e.box@linux.intel.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event
  2026-04-04  0:03   ` Reinette Chatre
@ 2026-04-06 18:35     ` Luck, Tony
  2026-04-06 21:13       ` Reinette Chatre
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-06 18:35 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Reinette,

On Fri, Apr 03, 2026 at 05:03:28PM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 3/30/26 2:43 PM, Tony Luck wrote:
> > Architecture code can ask file system code to enable events. But there
> > is no way to clean up and disable events.
> 
> Missing why it is required to disable events.

Will add rationale.

> > 
> > Add resctrl_disable_mon_event().
> 
> (can be seen from patch)

Will drop this part of comment.

> > Adding/removing events is only
> > possible when the file system is not mounted. Add a WARN_ON() to
> 
> While this is accurate that it should not be possible to enable/disable
> events while resctrl is mounted, this is *not* the *only* time when it
> should not be possible. Here it is unfortunately not straight forward
> since only some events require per-domain state which requires the event
> to be enabled before any domain comes online, potentially very early in
> initialization.
> 
> To me the addition of this warning adds false security.
> 
> Also, consider that resctrl_mounted is protected by rdtgroup_mutex and
> this addition gives architecture code free access without any protection.

This is a problem.

> To do this right resctrl may need to add more state to an event but how
> that may look is unclear since an architecture may require per-domain
> architectural state for an event while resctrl fs needs none.

The extra state for an event could be a pair of pointers to file system functions
to be used by resctrl_{en,disable}_mon_event() to allocate/clean up any state needed
by file system code for each event.

But this might lead to a rabbit hole of adding complexity. May someday
be useful if we ever make resctrl a loadable module.

>                                                               At this
> time these scenarios may just fall into the "architecture must do the
> right thing" category since it has best information on how state is
> managed for the events as they are enabled/disabled.

Are you suggesting to just drop the check for resctrl_mounted (as both
a locking issue, and an incomplete solution)?

> Reinette

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-04  0:52   ` Reinette Chatre
@ 2026-04-06 20:35     ` Luck, Tony
  2026-04-06 21:16       ` Reinette Chatre
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-06 20:35 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Reinette,

On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 3/30/26 2:43 PM, Tony Luck wrote:
> > Add hooks for every mount/unmount of the resctrl file system so that
> > architecture code can allocate on mount and free on unmount.
> 
> Please use the changelog to describe and motivate all the other things
> that this patch does.

OK. I will expand.

> > 
> > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > ---
> > 
> > Note this patch disables enumeration of AET monitor events because the
> > new mount/unmount hooks do not call intel_aet_get_events() (which is
> > not ready for the change from "just on first mount" to "called on
> > every mount"). That is resolved in the next patch.
> 
> This could be part of the proper changelog.
> 
> Could patches be re-ordered to support incremental changes?

I'll look again because several things have changed since I ordered
the series this way. But some bits got overly complicated trying to
make AET ready to be called multiple times. If I can't solve elegantly
I'll move this into the proper changelog.

> > 
> >  include/linux/resctrl.h                 | 12 +++++--
> >  arch/x86/kernel/cpu/resctrl/internal.h  |  8 +++--
> >  arch/x86/kernel/cpu/resctrl/core.c      | 14 +++++++-
> >  arch/x86/kernel/cpu/resctrl/intel_aet.c | 13 ++++++++
> >  fs/resctrl/rdtgroup.c                   | 43 ++++++++++++++++++-------
> >  5 files changed, 74 insertions(+), 16 deletions(-)
> > 
> > diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> > index b312aaf76974..489c7d4ae3e9 100644
> > --- a/include/linux/resctrl.h
> > +++ b/include/linux/resctrl.h
> > @@ -518,11 +518,19 @@ void resctrl_online_cpu(unsigned int cpu);
> >  void resctrl_offline_cpu(unsigned int cpu);
> >  
> >  /*
> > - * Architecture hook called at beginning of first file system mount attempt.
> > - * No locks are held.
> > + * Architecture hooks for resctrl mount/unmount.
> > + * Each is called with resctrl_mount_lock held.
> >   */
> > +
> > +/* Called at beginning of each file system mount attempt. */
> >  void resctrl_arch_pre_mount(void);
> >  
> > +/* Called to report success/failure of mount. */
> > +void resctrl_arch_mount_result(int ret);
> 
> Why is this needed? Why not just always call resctrl_arch_unmount()?

You are right. If the mount fails in the resctrl code, resctrl_arch_unmount()
can clean things up.

> > +
> > +/* Called to report unmount. */
> > +void resctrl_arch_unmount(void);
> > +
> >  /**
> >   * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
> >   *			      for this resource and domain.
> > diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> > index e3cfa0c10e92..6f322818a9e6 100644
> > --- a/arch/x86/kernel/cpu/resctrl/internal.h
> > +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> > @@ -234,14 +234,18 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
> >  void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
> >  
> >  #ifdef CONFIG_X86_CPU_RESCTRL_INTEL_AET
> > -bool intel_aet_get_events(void);
> > +bool intel_aet_pre_mount(void);
> > +void intel_aet_mount_result(int ret);
> > +void intel_aet_unmount(void);
> >  void __exit intel_aet_exit(void);
> >  int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val);
> >  void intel_aet_mon_domain_setup(int cpu, int id, struct rdt_resource *r,
> >  				struct list_head *add_pos);
> >  bool intel_handle_aet_option(bool force_off, char *tok);
> >  #else
> > -static inline bool intel_aet_get_events(void) { return false; }
> > +static inline bool intel_aet_pre_mount(void) { return false; }
> > +static inline void intel_aet_mount_result(int ret) { }
> > +static inline void intel_aet_unmount(void) { }
> >  static inline void __exit intel_aet_exit(void) { }
> >  static inline int intel_aet_read_event(int domid, u32 rmid, void *arch_priv, u64 *val)
> >  {
> > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> > index 7667cf7c4e94..162eca2cfcdb 100644
> > --- a/arch/x86/kernel/cpu/resctrl/core.c
> > +++ b/arch/x86/kernel/cpu/resctrl/core.c
> > @@ -769,8 +769,10 @@ void resctrl_arch_pre_mount(void)
> >  	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
> >  	int cpu;
> >  
> > -	if (!intel_aet_get_events())
> > +	if (!intel_aet_pre_mount()) {
> > +		r->mon_capable = false;
> 
> Why is this needed?

I'll move to the next patch. It's needed for the case when AET works on some
mount and then fails on a subsequent one. But that can't happen at this point
in this patch series.

> >  		return;
> > +	}
> >  
> >  	/*
> >  	 * Late discovery of telemetry events means the domains for the
> 
> 
> ...
> 
> >  #include "internal.h"
> >  
> > +/* Mutex protecting mount/unmount operations */
> > +static DEFINE_MUTEX(resctrl_mount_lock);
> > +
> >  /* Mutex to protect rdtgroup access. */
> >  DEFINE_MUTEX(rdtgroup_mutex);
> >  
> > @@ -2788,17 +2790,8 @@ static int rdt_get_tree(struct fs_context *fc)
> >  	struct rdt_resource *r;
> >  	int ret;
> >  
> > -	DO_ONCE_SLEEPABLE(resctrl_arch_pre_mount);
> > -
> >  	cpus_read_lock();
> >  	mutex_lock(&rdtgroup_mutex);
> > -	/*
> > -	 * resctrl file system can only be mounted once.
> > -	 */
> > -	if (resctrl_mounted) {
> > -		ret = -EBUSY;
> > -		goto out;
> > -	}
> >  
> >  	ret = setup_rmid_lru_list();
> >  	if (ret)
> > @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
> >  	return ret;
> >  }
> >  
> > +static int rdt_get_tree_wrapper(struct fs_context *fc)
> > +{
> > +	int ret;
> > +
> > +	mutex_lock(&resctrl_mount_lock);
> > +
> > +	/*
> > +	 * resctrl file system can only be mounted once.
> > +	 */
> > +	if (resctrl_mounted) {
> > +		mutex_unlock(&resctrl_mount_lock);
> > +		return -EBUSY;
> > +	}
> > +
> 
> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
> but resctrl is not changed to respect this throughout resulting in unsafe access of
> resctrl_mounted.
> 
> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
> needed synchronization belongs in the architecture. Could this instead be accomplished
> with a private mutex within the AET code?

If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
AET code. But there were some complications.

1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
which is legal, but makes code more complex when call chains need to be compared to
check that the mutex is being released correctly.

2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
in next patch may be redundant) because intel_aet_pre_mount() is called, but
needs to do nothing.

Adding resctrl_mount_lock to the file system code made things simpler. The
pre-mount code can't be called with rdtgroup_mutex held because it needs to
build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);

I need to add more comments on locking. resctrl_mounted is only modified when both
resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
read the value of resctrl_mounted with just rdtgroup_mutex held.

> > +	resctrl_arch_pre_mount();
> > +
> > +	ret = rdt_get_tree(fc);
> > +
> > +	resctrl_arch_mount_result(ret);
> 
> Could this instead just call resctrl_arch_unmount() on failure?

Yes. Thanks for the suggestion.

> Reinette

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event
  2026-04-06 18:35     ` Luck, Tony
@ 2026-04-06 21:13       ` Reinette Chatre
  2026-04-07 18:40         ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-06 21:13 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Tony,

On 4/6/26 11:35 AM, Luck, Tony wrote:
> On Fri, Apr 03, 2026 at 05:03:28PM -0700, Reinette Chatre wrote:

...

> 
>>                                                               At this
>> time these scenarios may just fall into the "architecture must do the
>> right thing" category since it has best information on how state is
>> managed for the events as they are enabled/disabled.
> 
> Are you suggesting to just drop the check for resctrl_mounted (as both
> a locking issue, and an incomplete solution)?

I am indeed suggesting to "drop the check for resctrl_mounted" but instead
of "just" doing that I think it worthwhile to add function comments to these
two arch helpers in include/linux/resctrl.h that describes what needs to be
considered when calling them. That is, describe "architecture must do the
right thing" with some documentation about what needs to be considered.
Such documentation may help us to start putting some boundaries on how
these helpers can/should be used to help guide any future enhancements to
make this more robust.

Reinette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-06 20:35     ` Luck, Tony
@ 2026-04-06 21:16       ` Reinette Chatre
  2026-04-09 20:35         ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-06 21:16 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Tony,

On 4/6/26 1:35 PM, Luck, Tony wrote:
> On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
>> On 3/30/26 2:43 PM, Tony Luck wrote:
>>> Add hooks for every mount/unmount of the resctrl file system so that
>>> architecture code can allocate on mount and free on unmount.
>>
>> Please use the changelog to describe and motivate all the other things
>> that this patch does.
> 
> OK. I will expand.
> 
>>>
>>> Signed-off-by: Tony Luck <tony.luck@intel.com>
>>> ---
>>>
>>> Note this patch disables enumeration of AET monitor events because the
>>> new mount/unmount hooks do not call intel_aet_get_events() (which is
>>> not ready for the change from "just on first mount" to "called on
>>> every mount"). That is resolved in the next patch.
>>
>> This could be part of the proper changelog.
>>
>> Could patches be re-ordered to support incremental changes?
> 
> I'll look again because several things have changed since I ordered
> the series this way. But some bits got overly complicated trying to
> make AET ready to be called multiple times. If I can't solve elegantly
> I'll move this into the proper changelog.

Please mark patches as RFC when still working out details. When reviewing it
helps to know whether something is being submitted for inclusion or not.

...

>>> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
>>>  	return ret;
>>>  }
>>>  
>>> +static int rdt_get_tree_wrapper(struct fs_context *fc)
>>> +{
>>> +	int ret;
>>> +
>>> +	mutex_lock(&resctrl_mount_lock);
>>> +
>>> +	/*
>>> +	 * resctrl file system can only be mounted once.
>>> +	 */
>>> +	if (resctrl_mounted) {
>>> +		mutex_unlock(&resctrl_mount_lock);
>>> +		return -EBUSY;
>>> +	}
>>> +
>>
>> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
>> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
>> but resctrl is not changed to respect this throughout resulting in unsafe access of
>> resctrl_mounted.
>>
>> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
>> needed synchronization belongs in the architecture. Could this instead be accomplished
>> with a private mutex within the AET code?
> 
> If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
> AET code. But there were some complications.
> 
> 1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
> which is legal, but makes code more complex when call chains need to be compared to
> check that the mutex is being released correctly.

Why was it needed to hold mutex for so long? I cannot find explanation here or in changelog
of v3. I did not remember correctly and considered the AET code to be doing the domain
addition. Even so, I do think a mutex internal to the arch code can be used to manage
the synchronization. Could you please elaborate why this cannot be done?


> 2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
> in next patch may be redundant) because intel_aet_pre_mount() is called, but
> needs to do nothing.

Right, I do not see need for extra state. In fact, since it is not clear to me that
PMT enumeration will be complete when intel_pmt_get_regions_by_feature() is called it
seemed worthwhile to only rely on event_group::pfg - if PMT enumeration was not complete
during mount N it may be complete on mount N+1? This creates a poor user interface
though since user would need an alternate way to know if AET is supported and then
a "remount until it works" approach.

> 
> Adding resctrl_mount_lock to the file system code made things simpler. The

Adding complications to resctrl fs to make things simpler for x86?

> pre-mount code can't be called with rdtgroup_mutex held because it needs to
> build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);

ack. Can an arch-specific mutex be used instead?

> I need to add more comments on locking. resctrl_mounted is only modified when both
> resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
> read the value of resctrl_mounted with just rdtgroup_mutex held.

...but not to read it with only resctrl_mount_lock held as in snippet above.

Reinette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime
  2026-04-04  0:56   ` Reinette Chatre
@ 2026-04-07 18:13     ` Luck, Tony
  2026-04-07 18:40       ` Reinette Chatre
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-07 18:13 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Reinette,

On Fri, Apr 03, 2026 at 05:56:35PM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 3/30/26 2:43 PM, Tony Luck wrote:
> > The resctrl subsystem is always built-in. Currently, it has a direct
> > dependency on INTEL_PMT_TELEMETRY and INTEL_TPMI for Application Event
> > Telemetry (AET) features. This forces these drivers to be built-in as well,
> > even though they are logically separate.
> > 
> > Switch to using symbol_request() to resolve AET-related symbols at
> > runtime. This allows PMT and TPMI to be built as loadable modules.
> > 
> > Update the mount/unmount logic to manage these references:
> > - Acquire symbol references and pin the modules during resctrl mount.
> > - Ensure AET structures are cleaned up and events disabled on unmount.
> > - Release symbol references and unpin modules on unmount or if
> >   mounting fails.
> 
> If I remember correctly resctrl depends on the mount being done much later
> than driver initialization to ensure all PMT load and enumeration is
> complete by the time resctrl fs is mounted.
> Now it looks like these modules can be loaded dynamically on resctrl mount
> with the functions learning about PMT feature groups called right after.
> Considering the change in timing, is it guaranteed that if
> intel_pmt_get_regions_by_feature() is called right after module probe that
> the telemetry driver would be done with its enumeration?

You do remmeber correctly. Your concerns are valid. I'd seen that the
pmt_telemetry module was auto-loaded, but hadn't tracked how/when that
happens.

The sequence is the intel_vsec driver is auto-loaded based on the OOBMSM PCIe
device ID. That sets up an auxillary bus which pulls in the pmt_telemetry
module. Adding a debug pr_info() I saw that the probe routine for the
pmt discover module was called at kernel timestamp 42.720156 for socket 0
and 42.725646 for socket 1.

Adding a "resctrl" line to /etc/fstab attempts the mount at 39.667. Three
seconds too early. No PMT events are found, and code in this V4 version
of the patch series marks the system as AET_NOT_PRESENT and will never
look again :-(

I can drop the AET_NOT_PRESENT state so that a retry will succeed. I don't
see another fix other than to document this limitation.

Workarounds are:
1) Change the CONFIG to build pmt_telemetry into the kernel (where we
   are today, but haven't heard from Linux distros like Red Hat, SUSE etc.
   on whether this is acceptable.)
2) Delay mounting the resctrl file system.

> 
> > 
> > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > ---
> >  arch/x86/kernel/cpu/resctrl/intel_aet.c | 93 ++++++++++++++++++++++---
> >  1 file changed, 83 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> > index 743a1894fe9a..14b472106f52 100644
> > --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> > +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/intel_vsec.h>
> >  #include <linux/io.h>
> >  #include <linux/minmax.h>
> > +#include <linux/module.h>
> >  #include <linux/printk.h>
> >  #include <linux/rculist.h>
> >  #include <linux/rcupdate.h>
> > @@ -289,6 +290,9 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
> >  	return FEATURE_INVALID;
> >  }
> >  
> > +static struct pmt_feature_group *(*get_feature)(enum pmt_feature_id id);
> > +static void (*put_feature)(struct pmt_feature_group *p);
> > +
> >  /*
> >   * Request a copy of struct pmt_feature_group for each event group. If there is
> >   * one, the returned structure has an array of telemetry_region structures,
> > @@ -300,7 +304,7 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
> >   * struct pmt_feature_group to indicate that its events are successfully
> >   * enabled.
> >   */
> > -bool intel_aet_get_events(void)
> > +static bool aet_get_events(void)
> >  {
> >  	struct pmt_feature_group *p;
> >  	enum pmt_feature_id pfid;
> > @@ -309,14 +313,14 @@ bool intel_aet_get_events(void)
> >  
> >  	for_each_event_group(peg) {
> >  		pfid = lookup_pfid((*peg)->pfname);
> > -		p = intel_pmt_get_regions_by_feature(pfid);
> > +		p = get_feature(pfid);
> >  		if (IS_ERR_OR_NULL(p))
> >  			continue;
> >  		if (enable_events(*peg, p)) {
> >  			(*peg)->pfg = p;
> >  			ret = true;
> >  		} else {
> > -			intel_pmt_put_feature_group(p);
> > +			put_feature(p);
> >  		}
> >  	}
> >  
> > @@ -325,27 +329,96 @@ bool intel_aet_get_events(void)
> >  
> >  void __exit intel_aet_exit(void)
> >  {
> > -	struct event_group **peg;
> > +}
> >  
> > -	for_each_event_group(peg) {
> > -		if ((*peg)->pfg) {
> > -			intel_pmt_put_feature_group((*peg)->pfg);
> > -			(*peg)->pfg = NULL;
> > -		}
> > +static bool get_pmt_references(void)
> > +{
> > +	get_feature = symbol_request(intel_pmt_get_regions_by_feature);
> > +	if (!get_feature)
> > +		return false;
> > +	put_feature = symbol_request(intel_pmt_put_feature_group);
> > +	if (!put_feature) {
> > +		symbol_put(intel_pmt_get_regions_by_feature);
> > +		get_feature = NULL;
> > +		return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +
> > +static void put_pmt_references(void)
> > +{
> > +	if (get_feature) {
> > +		symbol_put(intel_pmt_get_regions_by_feature);
> > +		get_feature = NULL;
> > +	}
> > +	if (put_feature) {
> > +		symbol_put(intel_pmt_put_feature_group);
> > +		put_feature = NULL;
> >  	}
> >  }
> >  
> > +static enum {
> > +	AET_UNINITIALIZED,
> > +	AET_PRESENT,
> > +	AET_NOT_PRESENT
> > +} aet_state;
> > +
> >  bool intel_aet_pre_mount(void)
> >  {
> > -	return false; // Temporary stub
> > +	bool ret;
> > +
> > +	if (aet_state == AET_PRESENT)
> > +		return true;
> > +
> > +	if (aet_state == AET_NOT_PRESENT || !get_pmt_references())
> > +		return false;
> > +
> > +	ret = aet_get_events();
> > +
> > +	if (ret) {
> > +		aet_state = AET_PRESENT;
> > +	} else {
> > +		aet_state = AET_NOT_PRESENT;
> > +		put_pmt_references();
> > +	}
> > +
> > +	return ret;
> > +}
> 
> I think I am missing something here. This seems to build a new state
> machine on top of existing state tracking. 
> What if, instead, just do:

The state machine was built to cope with the spurious call to intel_aet_pre_mount()
in the case that user attempts to mount resctrl while it is already mounted.
I removed that issue with the refactor in the file system layer, but didn't
clean up here.

The only other part of this state machine was to avoid repeated calls to
enumerate telemetry events on platforms that don't support them. But given
the race to complete enumeration is lost on early mount, that seems to be
a bad idea too and can be dropped.

> 	aet_get_events() 
> 	{
> 
> 		for_each_event_group(peg) {
> 			if ((*peg)->pfg) /* This means "AET_PRESENT" ? */
> 				continue;
> 			...
> 		}
> 
> 	}
> 
> Similarly, get_feature and put_feature maintains state about
> the symbols.
Yes.

> > +
> > +static void aet_cleanup(void)
> > +{
> > +	struct event_group **peg;
> > +
> > +	if (aet_state == AET_PRESENT) {
> 
> I do not see why this check/state machine is needed, if enumeration was not
> done the loop should just do nothing because (*peg)->pfg will always be NULL. 
Yes.

> > +		for_each_event_group(peg) {
> > +			if ((*peg)->pfg) {
> > +				struct event_group *e = *peg;
> > +
> > +				for (int j = 0; j < e->num_events; j++)
> > +					resctrl_disable_mon_event(e->evts[j].id);
> 
> Can this be expected to clean up/reset all changes from enable_events()? For example,
> how about event_group::num_rmid and rdt_resource::resctrl_mon::num_rmid?

I don't clean these up. But I don't see that they would change from one mount to
the next.

> > +				put_feature((*peg)->pfg);
> > +				(*peg)->pfg = NULL;
> > +			}
> > +		}
> > +		put_pmt_references();
> 
> Similarly here, put_pmt_references() could just always be called and it will not
> do anything if there aren't references?
Yes.

> > +		aet_state = AET_UNINITIALIZED;
> > +	}
> >  }
> >  
> >  void intel_aet_mount_result(int ret)
> >  {
> > +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
> > +
> > +	if (ret) {
> > +		r->mon_capable = false;
> > +		aet_cleanup();
> > +	}
> >  }
> >  
> >  void intel_aet_unmount(void)
> >  {
> > +	aet_cleanup();
> >  }
> 
> I am expecting intel_aet_mount_result() and intel_aet_unmount() to look the same and that
> intel_aet_mount_result() thus becomes unnecessary.

Agreed. Per your comments on the previous patch in the series. File system can just call
the architecture unmount function in the case that the mount failed. Action for
architecture code is same for an aborted mount and for an unmount that follows a
successful mount.

> Just setting r->mon_capable to false does not seem enough though ... how about all those
> monitoring domains that were created? This flow looks unexpected to me, what am I missing?

I'm missing the code to clean up the domains. Will add to next version.

> Reinette
> 

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event
  2026-04-06 21:13       ` Reinette Chatre
@ 2026-04-07 18:40         ` Luck, Tony
  2026-04-07 23:10           ` Reinette Chatre
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-07 18:40 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

On Mon, Apr 06, 2026 at 02:13:19PM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 4/6/26 11:35 AM, Luck, Tony wrote:
> > On Fri, Apr 03, 2026 at 05:03:28PM -0700, Reinette Chatre wrote:
> 
> ...
> 
> > 
> >>                                                               At this
> >> time these scenarios may just fall into the "architecture must do the
> >> right thing" category since it has best information on how state is
> >> managed for the events as they are enabled/disabled.
> > 
> > Are you suggesting to just drop the check for resctrl_mounted (as both
> > a locking issue, and an incomplete solution)?
> 
> I am indeed suggesting to "drop the check for resctrl_mounted" but instead
> of "just" doing that I think it worthwhile to add function comments to these
> two arch helpers in include/linux/resctrl.h that describes what needs to be
> considered when calling them. That is, describe "architecture must do the
> right thing" with some documentation about what needs to be considered.
> Such documentation may help us to start putting some boundaries on how
> these helpers can/should be used to help guide any future enhancements to
> make this more robust.

Something like this:

/*
 * For events that require per-domain allocation, this routine must be called
 * before CPU hot plug state begins allocating domain structures.
 * For other events the requirement is that the file system must not be
 * mounted when enabling events.
 */
bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
                              unsigned int binary_bits, void *arch_priv)

...

/*
 * This routine must not be called for events that require per-domain allcoation.
 * For other events the requirement is that the file system must not be
 * mounted when disabling events.
 */
void resctrl_disable_mon_event(enum resctrl_event_id eventid)

> 
> Reinette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime
  2026-04-07 18:13     ` Luck, Tony
@ 2026-04-07 18:40       ` Reinette Chatre
  2026-04-07 20:33         ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-07 18:40 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Tony,

On 4/7/26 11:13 AM, Luck, Tony wrote:
...
> Adding a "resctrl" line to /etc/fstab attempts the mount at 39.667. Three
> seconds too early. No PMT events are found, and code in this V4 version
> of the patch series marks the system as AET_NOT_PRESENT and will never
> look again :-(
> 
> I can drop the AET_NOT_PRESENT state so that a retry will succeed. I don't
> see another fix other than to document this limitation.
> 
> Workarounds are:
> 1) Change the CONFIG to build pmt_telemetry into the kernel (where we
>    are today, but haven't heard from Linux distros like Red Hat, SUSE etc.
>    on whether this is acceptable.)
> 2) Delay mounting the resctrl file system.

As I mentioned in https://lore.kernel.org/lkml/e85cd466-2202-4b40-82ed-91e421d8e073@intel.com/
I find (2) to be a poor user interface since it (a) requires user space to
somehow know that the system supports AET and then (b) either delay for some
indeterminate time or repetitively (for some indeterminate count) remount resctrl
to obtain needed features. 

Considering all the complications, could you please provide the motivation
for this series? I should have checked for this first. The cover letter does not
contain this information.

Reinette




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime
  2026-04-07 18:40       ` Reinette Chatre
@ 2026-04-07 20:33         ` Luck, Tony
  0 siblings, 0 replies; 31+ messages in thread
From: Luck, Tony @ 2026-04-07 20:33 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

On Tue, Apr 07, 2026 at 11:40:49AM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 4/7/26 11:13 AM, Luck, Tony wrote:
> ...
> > Adding a "resctrl" line to /etc/fstab attempts the mount at 39.667. Three
> > seconds too early. No PMT events are found, and code in this V4 version
> > of the patch series marks the system as AET_NOT_PRESENT and will never
> > look again :-(
> > 
> > I can drop the AET_NOT_PRESENT state so that a retry will succeed. I don't
> > see another fix other than to document this limitation.
> > 
> > Workarounds are:
> > 1) Change the CONFIG to build pmt_telemetry into the kernel (where we
> >    are today, but haven't heard from Linux distros like Red Hat, SUSE etc.
> >    on whether this is acceptable.)
> > 2) Delay mounting the resctrl file system.
> 
> As I mentioned in https://lore.kernel.org/lkml/e85cd466-2202-4b40-82ed-91e421d8e073@intel.com/
> I find (2) to be a poor user interface since it (a) requires user space to
> somehow know that the system supports AET and then (b) either delay for some
> indeterminate time or repetitively (for some indeterminate count) remount resctrl
> to obtain needed features. 
> 
> Considering all the complications, could you please provide the motivation
> for this series? I should have checked for this first. The cover letter does not
> contain this information.

Reinette,

I'm concerned that AET will not be available in Linux distros. The beta for
Ubuntu 26.04 is available. I downloaded the iso and booted. It has the v7.0
kernel, so AET is in the source. But checking the config I see:

   ubuntu@ubuntu:/boot$ grep VSEC config-7.0.0-10-generic
   CONFIG_INTEL_VSEC=m
   ubuntu@ubuntu:/boot$ grep INTEL_PMT config-7.0.0-10-generic
   CONFIG_INTEL_PMT_CLASS=m
   CONFIG_INTEL_PMT_TELEMETRY=m
   CONFIG_INTEL_PMT_CRASHLOG=m
   CONFIG_INTEL_PMT_DISCOVERY=m
   ubuntu@ubuntu:/boot$ grep AET config-7.0.0-10-generic
   ubuntu@ubuntu:/boot$

The default for most distributions is to build as much as possible as
modules. So I expect to see the same when Fedora, Red Hat, SUSE.

Even our internal intel-next daily builds are configuring modules (I did
get them to switch the relevant modules to built-in for one release cycle.
But they feel they should match what is done in production systems, so
they have switched back.

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event
  2026-04-07 18:40         ` Luck, Tony
@ 2026-04-07 23:10           ` Reinette Chatre
  0 siblings, 0 replies; 31+ messages in thread
From: Reinette Chatre @ 2026-04-07 23:10 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Tony,

On 4/7/26 11:40 AM, Luck, Tony wrote:
> On Mon, Apr 06, 2026 at 02:13:19PM -0700, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 4/6/26 11:35 AM, Luck, Tony wrote:
>>> On Fri, Apr 03, 2026 at 05:03:28PM -0700, Reinette Chatre wrote:
>>
>> ...
>>
>>>
>>>>                                                               At this
>>>> time these scenarios may just fall into the "architecture must do the
>>>> right thing" category since it has best information on how state is
>>>> managed for the events as they are enabled/disabled.
>>>
>>> Are you suggesting to just drop the check for resctrl_mounted (as both
>>> a locking issue, and an incomplete solution)?
>>
>> I am indeed suggesting to "drop the check for resctrl_mounted" but instead
>> of "just" doing that I think it worthwhile to add function comments to these
>> two arch helpers in include/linux/resctrl.h that describes what needs to be
>> considered when calling them. That is, describe "architecture must do the
>> right thing" with some documentation about what needs to be considered.
>> Such documentation may help us to start putting some boundaries on how
>> these helpers can/should be used to help guide any future enhancements to
>> make this more robust.
> 
> Something like this:
> 
> /*
>  * For events that require per-domain allocation, this routine must be called

Since all events are per-domain and all domains need allocation this is not clear.
(nit: it is not necessary to say "this routine" when the context is clearly
associated with the function).

I think referring to it as "per-domain state" is more accurate.

>  * before CPU hot plug state begins allocating domain structures.

Architecture has some flexibility here if a resource is discovered after
initialization, like what AET does.

>  * For other events the requirement is that the file system must not be

"For other events" -> "For all events"?

>  * mounted when enabling events.
>  */
> bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
>                               unsigned int binary_bits, void *arch_priv)

Since resctrl has AET to thank for all the additional parameters and this is
another AET enhancement I think it would be appreciated if this be done properly.
Consider, for example:

/**
 * resctrl_enable_mon_event() - Enable monitoring event
 * @eventid:	ID of the event
 * @any_cpu:	True if event data can be read from any CPU.
 * @binary_bits:Number of binary places of the fixed-point value expected to
 *		back a floating point event. Can only be set for floating point
 *		events.
 * @arch_priv:	Architecture private data associated with event. Passed back to
 *		architecture when reading the event via resctrl_arch_rmid_read().
 *
 * The file system must not be mounted when enabling an event.
 *
 * Events that require per-domain (architectural and/or filesystem) state must
 * be enabled before the domain structures are allocated. For example before
 * CPU hotplug callbacks that allocate domain structures are registered. If the
 * architecture discovers a resource after initialization it should enable
 * events needing per-domain state before any domain structure allocation which
 * should be coordinated with the CPU hotplug callbacks.
 *
 * Return:
 * true if event was successfully enabled, false otherwise.
 */
bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
                              unsigned int binary_bits, void *arch_priv)


> 
> ...
> 
> /*
>  * This routine must not be called for events that require per-domain allcoation.
>  * For other events the requirement is that the file system must not be
>  * mounted when disabling events.
>  */
> void resctrl_disable_mon_event(enum resctrl_event_id eventid)
> 

Similar here. Please note the "allcoation"

Reinette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
  2026-04-04  0:00   ` Reinette Chatre
@ 2026-04-08  5:07   ` Christoph Hellwig
  2026-04-08 17:01     ` Luck, Tony
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2026-04-08  5:07 UTC (permalink / raw)
  To: Tony Luck
  Cc: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	David E Box, x86, linux-kernel, patches

On Mon, Mar 30, 2026 at 02:43:16PM -0700, Tony Luck wrote:
> The symbol_get() function requires that objects be GPL licensed.

Please don't add new uses of symbol_get, as we've been slowly trying
to kill it off.  If you have bidirectional dependencies just add an
registration module in the middle to resolve it.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  2026-04-08  5:07   ` Christoph Hellwig
@ 2026-04-08 17:01     ` Luck, Tony
  2026-04-09  5:41       ` Christoph Hellwig
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-08 17:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Fenghua Yu, Chatre, Reinette, Wieczor-Retman, Maciej,
	Peter Newman, James Morse, Babu Moger, Drew Fustini, Dave Martin,
	Chen, Yu C, Box, David E, x86@kernel.org,
	linux-kernel@vger.kernel.org, patches@lists.linux.dev

>> The symbol_get() function requires that objects be GPL licensed.
>
> Please don't add new uses of symbol_get, as we've been slowly trying
> to kill it off.  If you have bidirectional dependencies just add an
> registration module in the middle to resolve it.

In my case I have "resctrl" (which is built-in, and likely to stay that way) that
needs functions from the "pmt_discovery" module to enumerate which
event counters are present.

Should I add  EXPORT_SYMBOL_GPL(some registration function) to resctrl

and have the pmt_discovery code call that to provide pointers to resctrl?

Thanks

-Tony

P.S.

When killing of some function, it would be nice if you added it to:

    Documentation/process/deprecated.rst

to make this obvious to unwary developers.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL
  2026-04-08 17:01     ` Luck, Tony
@ 2026-04-09  5:41       ` Christoph Hellwig
  0 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2026-04-09  5:41 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Christoph Hellwig, Fenghua Yu, Chatre, Reinette,
	Wieczor-Retman, Maciej, Peter Newman, James Morse, Babu Moger,
	Drew Fustini, Dave Martin, Chen, Yu C, Box, David E,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	patches@lists.linux.dev

On Wed, Apr 08, 2026 at 05:01:54PM +0000, Luck, Tony wrote:
> >> The symbol_get() function requires that objects be GPL licensed.
> >
> > Please don't add new uses of symbol_get, as we've been slowly trying
> > to kill it off.  If you have bidirectional dependencies just add an
> > registration module in the middle to resolve it.
> 
> In my case I have "resctrl" (which is built-in, and likely to stay that way) that
> needs functions from the "pmt_discovery" module to enumerate which
> event counters are present.
> 
> Should I add  EXPORT_SYMBOL_GPL(some registration function) to resctrl
> 
> and have the pmt_discovery code call that to provide pointers to resctrl?

Yes, if your dependency only goes one-way against the normal flow,
that's by far the easiest.  Bonus point for bounding that registration
on an object fitting the natural scope through OF/fwnode/acpi_dev or
whatever fits here.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-06 21:16       ` Reinette Chatre
@ 2026-04-09 20:35         ` Luck, Tony
  2026-04-10 15:16           ` Reinette Chatre
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-09 20:35 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

On Mon, Apr 06, 2026 at 02:16:46PM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 4/6/26 1:35 PM, Luck, Tony wrote:
> > On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
> >> On 3/30/26 2:43 PM, Tony Luck wrote:
> >>> Add hooks for every mount/unmount of the resctrl file system so that
> >>> architecture code can allocate on mount and free on unmount.
> >>
> >> Please use the changelog to describe and motivate all the other things
> >> that this patch does.
> > 
> > OK. I will expand.
> > 
> >>>
> >>> Signed-off-by: Tony Luck <tony.luck@intel.com>
> >>> ---
> >>>
> >>> Note this patch disables enumeration of AET monitor events because the
> >>> new mount/unmount hooks do not call intel_aet_get_events() (which is
> >>> not ready for the change from "just on first mount" to "called on
> >>> every mount"). That is resolved in the next patch.
> >>
> >> This could be part of the proper changelog.
> >>
> >> Could patches be re-ordered to support incremental changes?
> > 
> > I'll look again because several things have changed since I ordered
> > the series this way. But some bits got overly complicated trying to
> > make AET ready to be called multiple times. If I can't solve elegantly
> > I'll move this into the proper changelog.
> 
> Please mark patches as RFC when still working out details. When reviewing it
> helps to know whether something is being submitted for inclusion or not.

I thought I was close enough for minor tweaks. I was wrong. Especially
so after learning from Christoph that symbol_get() and symbol_request()
are quietly being deprecated. I'm going to use Christoph's suggestion
of a registration function in the next version.

I have succesfully reordered these two patches to avoid breaking AET
enumeration for the next series. The minor caveat this time is that
architecture code is first updated to handle multiple mount/unmount
cycles, but file system code still does the DO_ONCE_SLEEPABLE() to only
call on first mount attempt. So the resctrl_arch_unmount() is defined
but not called until the next patch.

It seems nicer to have a "just architecture" patch, followed by a "just
file system" patch. But I could move the definition of resctrl_arch_unmount()
into the file system patch that first uses it.

> ...
> 
> >>> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
> >>>  	return ret;
> >>>  }
> >>>  
> >>> +static int rdt_get_tree_wrapper(struct fs_context *fc)
> >>> +{
> >>> +	int ret;
> >>> +
> >>> +	mutex_lock(&resctrl_mount_lock);
> >>> +
> >>> +	/*
> >>> +	 * resctrl file system can only be mounted once.
> >>> +	 */
> >>> +	if (resctrl_mounted) {
> >>> +		mutex_unlock(&resctrl_mount_lock);
> >>> +		return -EBUSY;
> >>> +	}
> >>> +
> >>
> >> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
> >> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
> >> but resctrl is not changed to respect this throughout resulting in unsafe access of
> >> resctrl_mounted.
> >>
> >> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
> >> needed synchronization belongs in the architecture. Could this instead be accomplished
> >> with a private mutex within the AET code?
> > 
> > If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
> > AET code. But there were some complications.
> > 
> > 1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
> > which is legal, but makes code more complex when call chains need to be compared to
> > check that the mutex is being released correctly.
> 
> Why was it needed to hold mutex for so long? I cannot find explanation here or in changelog
> of v3. I did not remember correctly and considered the AET code to be doing the domain
> addition. Even so, I do think a mutex internal to the arch code can be used to manage
> the synchronization. Could you please elaborate why this cannot be done?

I tried to move the locks into architecture code. But main problem is still
handling when a user tries to mount an already mounted resctrl file system
and gets -EBUSY.

In that case file system calls resctrl_arch_pre_mount() with the file system
mounted. You suggested that the AET code could detect and ignore a repeat
enumeration by noting that the event_group "(*peg)->pfg" is non-NULL, set by
the original enumeration. But that fails in this scenario:

	# rmmod pmt_telemetry
	# mount -t resctrl resctrl /sys/fs/resctrl
		... succeeds, but without AET present
	# modprobe pmt_telemetry
	# mount -t resctrl resctrl /sys/fs/resctrl
		... enumeration success, but now calls resctrl_enable_mon_event()
		... with the file system mounted

I think the bast solution for this is to change definition of resctrl_arch_pre_mount()
from "called on every mount attempt" to "called only when resctrl is NOT mounted".
This is because architecture code cannot tell whether the file system is mounted.

> 
> > 2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
> > in next patch may be redundant) because intel_aet_pre_mount() is called, but
> > needs to do nothing.
> 
> Right, I do not see need for extra state. In fact, since it is not clear to me that
> PMT enumeration will be complete when intel_pmt_get_regions_by_feature() is called it
> seemed worthwhile to only rely on event_group::pfg - if PMT enumeration was not complete
> during mount N it may be complete on mount N+1? This creates a poor user interface
> though since user would need an alternate way to know if AET is supported and then
> a "remount until it works" approach.

The race remains, and is lost if resctrl is auto-mounted at boot from /etc/fstab.

The user can tell if AET is supported with:

	$ grep ^ /sys/class/intel_pmt/*/guid

and checking if any of the RMID based event guids are present on the system.

Delta T for the race is small enough that delaying the mount to some other
startup script should be sufficient. Users are likely to have such a script
to create the CTRL_MON directories and configure schemata for their workload.
So annoying, but easily solved.

> > 
> > Adding resctrl_mount_lock to the file system code made things simpler. The
> 
> Adding complications to resctrl fs to make things simpler for x86?

I believe it is necessary, since architecture cannot tell if the file system
is mounted.

> > pre-mount code can't be called with rdtgroup_mutex held because it needs to
> > build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);
> 
> ack. Can an arch-specific mutex be used instead?

See above.

> > I need to add more comments on locking. resctrl_mounted is only modified when both
> > resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
> > read the value of resctrl_mounted with just rdtgroup_mutex held.
> 
> ...but not to read it with only resctrl_mount_lock held as in snippet above.

Holding either of resctrl_mount_lock or rdtgroup_mutex makes it safe to
read the value of resctrl_mounted as it can only be modified when both
mutexes are held.


> Reinette

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-09 20:35         ` Luck, Tony
@ 2026-04-10 15:16           ` Reinette Chatre
  2026-04-10 18:59             ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-10 15:16 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Tony,

On 4/9/26 1:35 PM, Luck, Tony wrote:
> On Mon, Apr 06, 2026 at 02:16:46PM -0700, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 4/6/26 1:35 PM, Luck, Tony wrote:
>>> On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
>>>> On 3/30/26 2:43 PM, Tony Luck wrote:

>>>>> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
>>>>>  	return ret;
>>>>>  }
>>>>>  
>>>>> +static int rdt_get_tree_wrapper(struct fs_context *fc)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	mutex_lock(&resctrl_mount_lock);
>>>>> +
>>>>> +	/*
>>>>> +	 * resctrl file system can only be mounted once.
>>>>> +	 */
>>>>> +	if (resctrl_mounted) {
>>>>> +		mutex_unlock(&resctrl_mount_lock);
>>>>> +		return -EBUSY;
>>>>> +	}
>>>>> +
>>>>
>>>> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
>>>> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
>>>> but resctrl is not changed to respect this throughout resulting in unsafe access of
>>>> resctrl_mounted.
>>>>
>>>> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
>>>> needed synchronization belongs in the architecture. Could this instead be accomplished
>>>> with a private mutex within the AET code?
>>>
>>> If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
>>> AET code. But there were some complications.
>>>
>>> 1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
>>> which is legal, but makes code more complex when call chains need to be compared to
>>> check that the mutex is being released correctly.
>>
>> Why was it needed to hold mutex for so long? I cannot find explanation here or in changelog
>> of v3. I did not remember correctly and considered the AET code to be doing the domain
>> addition. Even so, I do think a mutex internal to the arch code can be used to manage
>> the synchronization. Could you please elaborate why this cannot be done?
> 
> I tried to move the locks into architecture code. But main problem is still
> handling when a user tries to mount an already mounted resctrl file system
> and gets -EBUSY.
> 
> In that case file system calls resctrl_arch_pre_mount() with the file system
> mounted. You suggested that the AET code could detect and ignore a repeat
> enumeration by noting that the event_group "(*peg)->pfg" is non-NULL, set by

It would be great if resctrl only needs to enumerate once but I do not see how that
is possible since there is no clear indication from PMT driver whether (all) its 
data is ready or not.

> the original enumeration. But that fails in this scenario:
> 
> 	# rmmod pmt_telemetry
> 	# mount -t resctrl resctrl /sys/fs/resctrl
> 		... succeeds, but without AET present
> 	# modprobe pmt_telemetry
> 	# mount -t resctrl resctrl /sys/fs/resctrl
> 		... enumeration success, but now calls resctrl_enable_mon_event()
> 		... with the file system mounted

Thank you for catching this.

> 
> I think the bast solution for this is to change definition of resctrl_arch_pre_mount()
> from "called on every mount attempt" to "called only when resctrl is NOT mounted".
> This is because architecture code cannot tell whether the file system is mounted.

Does this mean the addition of the extra locking in resctrl fs? I am not comfortable with
that asymmetrical locking.

I think a problem here is that resctrl_enable_mon_event() gives the architecture code the
capability of manipulating internal resctrl fs state directly. This is a consequence
of the original design before the arch/fs split. I see the additional resctrl fs locking
as an attempt to make it easier for the arch to manipulate fs state but I do not think we
should go in that direction, instead I think it is better to improve the arch/fs boundaries
here.

To that end, what if resctrl uses its familiar "capable" vs "enable" separation here?
That is, the architecture informs resctrl fs whether it is *capable* of a feature and
resctrl fs, based on interactions with user space, determines whether the feature is
*enabled*?

For a simpler addition resctrl could obtain a new helper, for example,
resctrl_capable_mon_event() that only marks an event as "capable". resctrl_enable_mon_event()
could remain as the "early" call that marks events as capable
and enabled but may be deprecated. Both of these *have* to be called before any domains
are created since per-domain state would now depend on whether the system is "capable"
of an event or not. resctrl fs can assume that all "capable" events needs to be enabled
and it can mark them so at the beginning of each mount, resctrl will only expose
"enabled" events.

There are likely some simplifications on top of current implementation to not make
this too invasive.

What do you think?

>>> 2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
>>> in next patch may be redundant) because intel_aet_pre_mount() is called, but
>>> needs to do nothing.
>>
>> Right, I do not see need for extra state. In fact, since it is not clear to me that
>> PMT enumeration will be complete when intel_pmt_get_regions_by_feature() is called it
>> seemed worthwhile to only rely on event_group::pfg - if PMT enumeration was not complete
>> during mount N it may be complete on mount N+1? This creates a poor user interface
>> though since user would need an alternate way to know if AET is supported and then
>> a "remount until it works" approach.
> 
> The race remains, and is lost if resctrl is auto-mounted at boot from /etc/fstab.
> 
> The user can tell if AET is supported with:
> 
> 	$ grep ^ /sys/class/intel_pmt/*/guid
> 
> and checking if any of the RMID based event guids are present on the system.
> 
> Delta T for the race is small enough that delaying the mount to some other
> startup script should be sufficient. Users are likely to have such a script
> to create the CTRL_MON directories and configure schemata for their workload.
> So annoying, but easily solved.

I do not have a clear understanding of how the new implementation with registration
function will look to understand the races involved.

>>> Adding resctrl_mount_lock to the file system code made things simpler. The
>>
>> Adding complications to resctrl fs to make things simpler for x86?
> 
> I believe it is necessary, since architecture cannot tell if the file system
> is mounted.
> 
>>> pre-mount code can't be called with rdtgroup_mutex held because it needs to
>>> build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);
>>
>> ack. Can an arch-specific mutex be used instead?
> 
> See above.
> 
>>> I need to add more comments on locking. resctrl_mounted is only modified when both
>>> resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
>>> read the value of resctrl_mounted with just rdtgroup_mutex held.
>>
>> ...but not to read it with only resctrl_mount_lock held as in snippet above.
> 
> Holding either of resctrl_mount_lock or rdtgroup_mutex makes it safe to
> read the value of resctrl_mounted as it can only be modified when both
> mutexes are held.

I am concerned about this approach due to it not being symmetrical and how resctrl fs
now adds additional locking to accommodate drivers.

Reinette



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-10 15:16           ` Reinette Chatre
@ 2026-04-10 18:59             ` Luck, Tony
  2026-04-10 21:21               ` Reinette Chatre
  0 siblings, 1 reply; 31+ messages in thread
From: Luck, Tony @ 2026-04-10 18:59 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Reinette,


On Fri, Apr 10, 2026 at 08:16:50AM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 4/9/26 1:35 PM, Luck, Tony wrote:
> > On Mon, Apr 06, 2026 at 02:16:46PM -0700, Reinette Chatre wrote:
> >> Hi Tony,
> >>
> >> On 4/6/26 1:35 PM, Luck, Tony wrote:
> >>> On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
> >>>> On 3/30/26 2:43 PM, Tony Luck wrote:
> 
> >>>>> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
> >>>>>  	return ret;
> >>>>>  }
> >>>>>  
> >>>>> +static int rdt_get_tree_wrapper(struct fs_context *fc)
> >>>>> +{
> >>>>> +	int ret;
> >>>>> +
> >>>>> +	mutex_lock(&resctrl_mount_lock);
> >>>>> +
> >>>>> +	/*
> >>>>> +	 * resctrl file system can only be mounted once.
> >>>>> +	 */
> >>>>> +	if (resctrl_mounted) {
> >>>>> +		mutex_unlock(&resctrl_mount_lock);
> >>>>> +		return -EBUSY;
> >>>>> +	}
> >>>>> +
> >>>>
> >>>> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
> >>>> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
> >>>> but resctrl is not changed to respect this throughout resulting in unsafe access of
> >>>> resctrl_mounted.
> >>>>
> >>>> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
> >>>> needed synchronization belongs in the architecture. Could this instead be accomplished
> >>>> with a private mutex within the AET code?
> >>>
> >>> If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
> >>> AET code. But there were some complications.
> >>>
> >>> 1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
> >>> which is legal, but makes code more complex when call chains need to be compared to
> >>> check that the mutex is being released correctly.
> >>
> >> Why was it needed to hold mutex for so long? I cannot find explanation here or in changelog
> >> of v3. I did not remember correctly and considered the AET code to be doing the domain
> >> addition. Even so, I do think a mutex internal to the arch code can be used to manage
> >> the synchronization. Could you please elaborate why this cannot be done?
> > 
> > I tried to move the locks into architecture code. But main problem is still
> > handling when a user tries to mount an already mounted resctrl file system
> > and gets -EBUSY.
> > 
> > In that case file system calls resctrl_arch_pre_mount() with the file system
> > mounted. You suggested that the AET code could detect and ignore a repeat
> > enumeration by noting that the event_group "(*peg)->pfg" is non-NULL, set by
> 
> It would be great if resctrl only needs to enumerate once but I do not see how that
> is possible since there is no clear indication from PMT driver whether (all) its 
> data is ready or not.

That was the root problem. But allowing INTEL_PMT_TELEMETRY to be a
module adds the additional constraint that the module cannot be unloaded
while resctrl is mounted. If that happens, the mappings to the MMIO
space disappear. The existing intel_pmt_get_regions_by_feature() and
intel_pmt_put_feature_group() do not do module_{get,put}() to prevent
that. New series under development addresses this.

> > the original enumeration. But that fails in this scenario:
> > 
> > 	# rmmod pmt_telemetry
> > 	# mount -t resctrl resctrl /sys/fs/resctrl
> > 		... succeeds, but without AET present
> > 	# modprobe pmt_telemetry
> > 	# mount -t resctrl resctrl /sys/fs/resctrl
> > 		... enumeration success, but now calls resctrl_enable_mon_event()
> > 		... with the file system mounted
> 
> Thank you for catching this.
> 
> > 
> > I think the bast solution for this is to change definition of resctrl_arch_pre_mount()
> > from "called on every mount attempt" to "called only when resctrl is NOT mounted".
> > This is because architecture code cannot tell whether the file system is mounted.
> 
> Does this mean the addition of the extra locking in resctrl fs? I am not comfortable with
> that asymmetrical locking.

Conceptually the change here is from "architecture must do all
enumeration before file system code starts" to "architecture is allowed
to make changes when the file system is not mounted".

There are currently several limits to what is safe to do because file
system only does a partial cleanup on unmount. But general development
direction has been to move initialization code into mount path. For AET
things need minimal tweaks today. For some future feature more
refactoring from "init" to "mount" might be needed.

> I think a problem here is that resctrl_enable_mon_event() gives the architecture code the
> capability of manipulating internal resctrl fs state directly. This is a consequence
> of the original design before the arch/fs split. I see the additional resctrl fs locking
> as an attempt to make it easier for the arch to manipulate fs state but I do not think we
> should go in that direction, instead I think it is better to improve the arch/fs boundaries
> here.
> 
> To that end, what if resctrl uses its familiar "capable" vs "enable" separation here?
> That is, the architecture informs resctrl fs whether it is *capable* of a feature and
> resctrl fs, based on interactions with user space, determines whether the feature is
> *enabled*?

For the AET features user space could look at /sys/class/intel_pmt/*/guid to see
which events could be upgraded from "capable" to "enabled". But how can that
be conveyed to "resctrl fs"? Adding more mount options would seem to be
the only choice. Config files under "info/" are only available after
mount. But by then domains have been created and the top level mon_data
directory populated.

This also runs into problems if INTEL_PMT_TELEMETRY is unloaded between
telling the file system that some set of events are "capable" and the
file system asking to enable them.

> For a simpler addition resctrl could obtain a new helper, for example,
> resctrl_capable_mon_event() that only marks an event as "capable". resctrl_enable_mon_event()
> could remain as the "early" call that marks events as capable
> and enabled but may be deprecated. Both of these *have* to be called before any domains
> are created since per-domain state would now depend on whether the system is "capable"
> of an event or not. resctrl fs can assume that all "capable" events needs to be enabled
> and it can mark them so at the beginning of each mount, resctrl will only expose
> "enabled" events.
> 
> There are likely some simplifications on top of current implementation to not make
> this too invasive.
> 
> What do you think?

I think it opens a new can of worms. Maybe the most challenging is for
the file system to add a "hold" to the INTEL_PMT_TELEMETRY module when
enabling an event.

> 
> >>> 2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
> >>> in next patch may be redundant) because intel_aet_pre_mount() is called, but
> >>> needs to do nothing.
> >>
> >> Right, I do not see need for extra state. In fact, since it is not clear to me that
> >> PMT enumeration will be complete when intel_pmt_get_regions_by_feature() is called it
> >> seemed worthwhile to only rely on event_group::pfg - if PMT enumeration was not complete
> >> during mount N it may be complete on mount N+1? This creates a poor user interface
> >> though since user would need an alternate way to know if AET is supported and then
> >> a "remount until it works" approach.
> > 
> > The race remains, and is lost if resctrl is auto-mounted at boot from /etc/fstab.
> > 
> > The user can tell if AET is supported with:
> > 
> > 	$ grep ^ /sys/class/intel_pmt/*/guid
> > 
> > and checking if any of the RMID based event guids are present on the system.
> > 
> > Delta T for the race is small enough that delaying the mount to some other
> > startup script should be sufficient. Users are likely to have such a script
> > to create the CTRL_MON directories and configure schemata for their workload.
> > So annoying, but easily solved.
> 
> I do not have a clear understanding of how the new implementation with registration
> function will look to understand the races involved.

I think I'm ready with v5 of the series. I can post (as "RFC") so you
can see, and comment, on the details.

> >>> Adding resctrl_mount_lock to the file system code made things simpler. The
> >>
> >> Adding complications to resctrl fs to make things simpler for x86?
> > 
> > I believe it is necessary, since architecture cannot tell if the file system
> > is mounted.
> > 
> >>> pre-mount code can't be called with rdtgroup_mutex held because it needs to
> >>> build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);
> >>
> >> ack. Can an arch-specific mutex be used instead?
> > 
> > See above.
> > 
> >>> I need to add more comments on locking. resctrl_mounted is only modified when both
> >>> resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
> >>> read the value of resctrl_mounted with just rdtgroup_mutex held.
> >>
> >> ...but not to read it with only resctrl_mount_lock held as in snippet above.
> > 
> > Holding either of resctrl_mount_lock or rdtgroup_mutex makes it safe to
> > read the value of resctrl_mounted as it can only be modified when both
> > mutexes are held.
> 
> I am concerned about this approach due to it not being symmetrical and how resctrl fs
> now adds additional locking to accommodate drivers.

I'm not sure I see the asymmetry. File system code already calls architecture
code with some set of locks held. This just adds a new lock (at the top of
the locking hierarchy) that is held across calls to resctrl_arch_pre_mount()
and resctrl_arch_unmount().

> 
> Reinette
> 

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-10 18:59             ` Luck, Tony
@ 2026-04-10 21:21               ` Reinette Chatre
  2026-04-10 23:03                 ` Luck, Tony
  0 siblings, 1 reply; 31+ messages in thread
From: Reinette Chatre @ 2026-04-10 21:21 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, David E Box, x86,
	linux-kernel, patches

Hi Tony,

On 4/10/26 11:59 AM, Luck, Tony wrote:
> On Fri, Apr 10, 2026 at 08:16:50AM -0700, Reinette Chatre wrote:
>> On 4/9/26 1:35 PM, Luck, Tony wrote:
>>> On Mon, Apr 06, 2026 at 02:16:46PM -0700, Reinette Chatre wrote:
>>>> On 4/6/26 1:35 PM, Luck, Tony wrote:
>>>>> On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
>>>>>> On 3/30/26 2:43 PM, Tony Luck wrote:
>>
>>>>>>> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
>>>>>>>  	return ret;
>>>>>>>  }
>>>>>>>  
>>>>>>> +static int rdt_get_tree_wrapper(struct fs_context *fc)
>>>>>>> +{
>>>>>>> +	int ret;
>>>>>>> +
>>>>>>> +	mutex_lock(&resctrl_mount_lock);
>>>>>>> +
>>>>>>> +	/*
>>>>>>> +	 * resctrl file system can only be mounted once.
>>>>>>> +	 */
>>>>>>> +	if (resctrl_mounted) {
>>>>>>> +		mutex_unlock(&resctrl_mount_lock);
>>>>>>> +		return -EBUSY;
>>>>>>> +	}
>>>>>>> +
>>>>>>
>>>>>> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
>>>>>> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
>>>>>> but resctrl is not changed to respect this throughout resulting in unsafe access of
>>>>>> resctrl_mounted.
>>>>>>
>>>>>> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
>>>>>> needed synchronization belongs in the architecture. Could this instead be accomplished
>>>>>> with a private mutex within the AET code?
>>>>>
>>>>> If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
>>>>> AET code. But there were some complications.
>>>>>
>>>>> 1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
>>>>> which is legal, but makes code more complex when call chains need to be compared to
>>>>> check that the mutex is being released correctly.
>>>>
>>>> Why was it needed to hold mutex for so long? I cannot find explanation here or in changelog
>>>> of v3. I did not remember correctly and considered the AET code to be doing the domain
>>>> addition. Even so, I do think a mutex internal to the arch code can be used to manage
>>>> the synchronization. Could you please elaborate why this cannot be done?
>>>
>>> I tried to move the locks into architecture code. But main problem is still
>>> handling when a user tries to mount an already mounted resctrl file system
>>> and gets -EBUSY.
>>>
>>> In that case file system calls resctrl_arch_pre_mount() with the file system
>>> mounted. You suggested that the AET code could detect and ignore a repeat
>>> enumeration by noting that the event_group "(*peg)->pfg" is non-NULL, set by
>>
>> It would be great if resctrl only needs to enumerate once but I do not see how that
>> is possible since there is no clear indication from PMT driver whether (all) its 
>> data is ready or not.
> 
> That was the root problem. But allowing INTEL_PMT_TELEMETRY to be a
> module adds the additional constraint that the module cannot be unloaded
> while resctrl is mounted. If that happens, the mappings to the MMIO
> space disappear. The existing intel_pmt_get_regions_by_feature() and
> intel_pmt_put_feature_group() do not do module_{get,put}() to prevent
> that. New series under development addresses this.

ack.

> 
>>> the original enumeration. But that fails in this scenario:
>>>
>>> 	# rmmod pmt_telemetry
>>> 	# mount -t resctrl resctrl /sys/fs/resctrl
>>> 		... succeeds, but without AET present
>>> 	# modprobe pmt_telemetry
>>> 	# mount -t resctrl resctrl /sys/fs/resctrl
>>> 		... enumeration success, but now calls resctrl_enable_mon_event()
>>> 		... with the file system mounted
>>
>> Thank you for catching this.
>>
>>>
>>> I think the bast solution for this is to change definition of resctrl_arch_pre_mount()
>>> from "called on every mount attempt" to "called only when resctrl is NOT mounted".
>>> This is because architecture code cannot tell whether the file system is mounted.
>>
>> Does this mean the addition of the extra locking in resctrl fs? I am not comfortable with
>> that asymmetrical locking.
> 
> Conceptually the change here is from "architecture must do all
> enumeration before file system code starts" to "architecture is allowed
> to make changes when the file system is not mounted".

This does not seem to be something that resctrl should enforce in general. Architecture
could theoretically make changes any time it wants and stage them for resctrl fs to
consume when it is able to do so.

> There are currently several limits to what is safe to do because file
> system only does a partial cleanup on unmount. But general development
> direction has been to move initialization code into mount path. For AET
> things need minimal tweaks today. For some future feature more
> refactoring from "init" to "mount" might be needed.

Right ... and we need to consider how the architecture and filesystem boundaries are consistent
so that this is managed well. Just adding a lock for convenience does not seem appropriate to me.

> 
>> I think a problem here is that resctrl_enable_mon_event() gives the architecture code the
>> capability of manipulating internal resctrl fs state directly. This is a consequence
>> of the original design before the arch/fs split. I see the additional resctrl fs locking
>> as an attempt to make it easier for the arch to manipulate fs state but I do not think we
>> should go in that direction, instead I think it is better to improve the arch/fs boundaries
>> here.
>>
>> To that end, what if resctrl uses its familiar "capable" vs "enable" separation here?
>> That is, the architecture informs resctrl fs whether it is *capable* of a feature and
>> resctrl fs, based on interactions with user space, determines whether the feature is
>> *enabled*?
> 
> For the AET features user space could look at /sys/class/intel_pmt/*/guid to see
> which events could be upgraded from "capable" to "enabled". But how can that
> be conveyed to "resctrl fs"? Adding more mount options would seem to be

User space should not be involved here and I do not see any need to add new mount
options. It is the architecture that marks events as capable if it supports the events.

> the only choice. Config files under "info/" are only available after
> mount. But by then domains have been created and the top level mon_data
> directory populated.

Why would this be needed? 

Here is how I understand the relationship:

- arch enumerates a new feature at any time and calls resctrl_capable_mon_event()
  to mark any event discovered during enumeration as "capable".
- arch is responsible for domain creation and is trusted to do so *after*
  marking any events as "capable", any needed per-domain state is created for
  the "capable" events.

- when resctrl fs is mounted, before creating any files, resctrl fs automatically
  sets all "capable" events to "enabled". resctrl fs will create needed files only for
  "enabled" events. 

- if an arch discovers new events after resctrl is mounted then it can still
  enumerate the events and mark them as "capable" - resctrl fs will pick that up
  on remount.

- arch can only disable an event as part of the unmount handler, this will clear
  "capable" as well as "enabled". This can be enforced with a check in the
  callback where only rdtgroup_mutex should be needed to access resctrl_mounted.

> 
> This also runs into problems if INTEL_PMT_TELEMETRY is unloaded between
> telling the file system that some set of events are "capable" and the
> file system asking to enable them.

This is existing problem and sounds like you are solving this with the module_get()
and module_put(). 

> 
>> For a simpler addition resctrl could obtain a new helper, for example,
>> resctrl_capable_mon_event() that only marks an event as "capable". resctrl_enable_mon_event()
>> could remain as the "early" call that marks events as capable
>> and enabled but may be deprecated. Both of these *have* to be called before any domains
>> are created since per-domain state would now depend on whether the system is "capable"
>> of an event or not. resctrl fs can assume that all "capable" events needs to be enabled
>> and it can mark them so at the beginning of each mount, resctrl will only expose
>> "enabled" events.
>>
>> There are likely some simplifications on top of current implementation to not make
>> this too invasive.
>>
>> What do you think?
> 
> I think it opens a new can of worms. Maybe the most challenging is for
> the file system to add a "hold" to the INTEL_PMT_TELEMETRY module when
> enabling an event.

This is a different problem that needs to be solved also and it sounds as though you
have a solution for that.

>>>>> 2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
>>>>> in next patch may be redundant) because intel_aet_pre_mount() is called, but
>>>>> needs to do nothing.
>>>>
>>>> Right, I do not see need for extra state. In fact, since it is not clear to me that
>>>> PMT enumeration will be complete when intel_pmt_get_regions_by_feature() is called it
>>>> seemed worthwhile to only rely on event_group::pfg - if PMT enumeration was not complete
>>>> during mount N it may be complete on mount N+1? This creates a poor user interface
>>>> though since user would need an alternate way to know if AET is supported and then
>>>> a "remount until it works" approach.
>>>
>>> The race remains, and is lost if resctrl is auto-mounted at boot from /etc/fstab.
>>>
>>> The user can tell if AET is supported with:
>>>
>>> 	$ grep ^ /sys/class/intel_pmt/*/guid
>>>
>>> and checking if any of the RMID based event guids are present on the system.
>>>
>>> Delta T for the race is small enough that delaying the mount to some other
>>> startup script should be sufficient. Users are likely to have such a script
>>> to create the CTRL_MON directories and configure schemata for their workload.
>>> So annoying, but easily solved.
>>
>> I do not have a clear understanding of how the new implementation with registration
>> function will look to understand the races involved.
> 
> I think I'm ready with v5 of the series. I can post (as "RFC") so you

You consider it ready without completing this discussion? I interpret this to mean that
I am wasting my time trying to discuss. 

> can see, and comment, on the details.

... and it is already posted ... before this discussion is complete ... unless you feel
that the hour between your response here and the posting of the new version is sufficient
grace for me to respond.
The first four versions of this work were all sent within one week. I tried to
collaborate with you on v4 but now you sent v5 without completing the discussion.
I cannot collaborate with such rapid attempts at throwing stuff to see what sticks.

>>>>> Adding resctrl_mount_lock to the file system code made things simpler. The
>>>>
>>>> Adding complications to resctrl fs to make things simpler for x86?
>>>
>>> I believe it is necessary, since architecture cannot tell if the file system
>>> is mounted.
>>>
>>>>> pre-mount code can't be called with rdtgroup_mutex held because it needs to
>>>>> build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);
>>>>
>>>> ack. Can an arch-specific mutex be used instead?
>>>
>>> See above.
>>>
>>>>> I need to add more comments on locking. resctrl_mounted is only modified when both
>>>>> resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
>>>>> read the value of resctrl_mounted with just rdtgroup_mutex held.
>>>>
>>>> ...but not to read it with only resctrl_mount_lock held as in snippet above.
>>>
>>> Holding either of resctrl_mount_lock or rdtgroup_mutex makes it safe to
>>> read the value of resctrl_mounted as it can only be modified when both
>>> mutexes are held.
>>
>> I am concerned about this approach due to it not being symmetrical and how resctrl fs
>> now adds additional locking to accommodate drivers.
> 
> I'm not sure I see the asymmetry. File system code already calls architecture
> code with some set of locks held. This just adds a new lock (at the top of
> the locking hierarchy) that is held across calls to resctrl_arch_pre_mount()
> and resctrl_arch_unmount().

The asymmetry is in the inconsistency which locks are used to interact with
resctrl_mounted ... in this version at least.

Reinette


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
  2026-04-10 21:21               ` Reinette Chatre
@ 2026-04-10 23:03                 ` Luck, Tony
  0 siblings, 0 replies; 31+ messages in thread
From: Luck, Tony @ 2026-04-10 23:03 UTC (permalink / raw)
  To: Chatre, Reinette
  Cc: Fenghua Yu, Wieczor-Retman, Maciej, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen, Yu C, Box, David E,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	patches@lists.linux.dev

> >> I do not have a clear understanding of how the new implementation with registration
> >> function will look to understand the races involved.
> >
> > I think I'm ready with v5 of the series. I can post (as "RFC") so you
>
> You consider it ready without completing this discussion? I interpret this to mean that
> I am wasting my time trying to discuss.
>
> > can see, and comment, on the details.
>
> ... and it is already posted ... before this discussion is complete ... unless you feel
> that the hour between your response here and the posting of the new version is sufficient
> grace for me to respond.
> The first four versions of this work were all sent within one week. I tried to
> collaborate with you on v4 but now you sent v5 without completing the discussion.
> I cannot collaborate with such rapid attempts at throwing stuff to see what sticks.

My intent in posting (as RFC as suggested by you for code that isn't ready for upstream)
was simply to give you visibility to the registration mechanism that I'd implemented in
response to Christoph's comment that I should not use symbol_get() or symbol_request().

I'm looking at your suggestion for asynchronous update by architecture for use on next
mount.

-Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2026-04-10 23:03 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
2026-04-04  0:00   ` Reinette Chatre
2026-04-06 18:07     ` David Box
2026-04-08  5:07   ` Christoph Hellwig
2026-04-08 17:01     ` Luck, Tony
2026-04-09  5:41       ` Christoph Hellwig
2026-03-30 21:43 ` [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs Tony Luck
2026-04-04  0:01   ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event Tony Luck
2026-04-04  0:03   ` Reinette Chatre
2026-04-06 18:35     ` Luck, Tony
2026-04-06 21:13       ` Reinette Chatre
2026-04-07 18:40         ` Luck, Tony
2026-04-07 23:10           ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount Tony Luck
2026-04-04  0:52   ` Reinette Chatre
2026-04-06 20:35     ` Luck, Tony
2026-04-06 21:16       ` Reinette Chatre
2026-04-09 20:35         ` Luck, Tony
2026-04-10 15:16           ` Reinette Chatre
2026-04-10 18:59             ` Luck, Tony
2026-04-10 21:21               ` Reinette Chatre
2026-04-10 23:03                 ` Luck, Tony
2026-03-30 21:43 ` [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime Tony Luck
2026-04-04  0:56   ` Reinette Chatre
2026-04-07 18:13     ` Luck, Tony
2026-04-07 18:40       ` Reinette Chatre
2026-04-07 20:33         ` Luck, Tony
2026-03-30 21:43 ` [PATCH v4 6/7] x86/resctrl: Delete intel_aet_exit() Tony Luck
2026-03-30 21:43 ` [PATCH v4 7/7] x86/resctrl: Downgrade dependency of AET on INTEL_PMT Tony Luck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox