* [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
@ 2024-06-14 14:59 James Morse
  2024-06-14 14:59 ` [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
                   ` (38 more replies)
  0 siblings, 39 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 14:59 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

Changes since v2:
Patches 3+13 have been merged, then split into six patches that also bring
the format string and a few other parameters into the struct resctrl_schema.
This makes for a cleaner arch/fs split.

Dave's checkpatch fixes to existing code have been broken out into a separate
series to be posted shortly. The asm->linux changes got merged into the patch that
makes this possible.

~

This is the final series that allows other architectures to implement resctrl.
The final patch to move the code has been omitted, but can be generated using
the Python script at the end of the series.
The final move is a bit of a monster. I don't expect that to get merged as part
of this series - we should wait until it has less impact on other series.

Otherwise this series renames functions and moves code around. With the
exception of invalid configurations for the configurable-events, there should
be no changes in behaviour caused by this series.

The driving pattern is to make things like struct rdtgroup private to resctrl.
Features like pseudo-lock aren't going to work on arm64, so the ability to
disable them at compile time is added.

After this, I can start posting the MPAM driver to make use of resctrl on arm64.
(What's MPAM? See the cover letter of the first series. [1])

This series is based on v6.10-rc1 and can be retrieved from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/move_to_fs/v3

As ever - bugs welcome,
Thanks,

James

[v2] https://lore.kernel.org/r/20240426150537.8094-1-Dave.Martin@arm.com
[v1] https://lore.kernel.org/r/20240321165106.31602-1-james.morse@arm.com
[1] https://lore.kernel.org/lkml/20201030161120.227225-1-james.morse@arm.com/

James Morse (38):
  x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no
    monitors
  x86/resctrl: Add a helper to avoid reaching into the arch code
    resource list
  x86/resctrl: Add a schema format enum and use this for fflags
  x86/resctrl: Use schema type to determine how to parse schema values
  x86/resctrl: Use schema type to determine the schema format string
  x86/resctrl: Move data_width to be a schema property
  x86/resctrl: Add max_bw to struct resctrl_membw
  x86/resctrl: Generate default_ctrl instead of sharing it
  x86/resctrl: Add helper for setting CPU default properties
  x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
  x86/resctrl: Export resctrl fs's init function
  x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain()
  x86/resctrl: Move resctrl types to a separate header
  x86/resctrl: Add a resctrl helper to reset all the resources
  x86/resctrl: Move monitor exit work to a resctrl exit call
  x86/resctrl: Move monitor init work to a resctrl init call
  x86/resctrl: Stop using the for_each_*_rdt_resource() walkers
  x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h
  x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
  x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
  x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
  x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
  x86/resctrl: Allow an architecture to disable pseudo lock
  x86/resctrl: Make prefetch_disable_bits belong to the arch code
  x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr
  x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl
  x86/resctrl: Move get_config_index() to a header
  x86/resctrl: Claim get_domain_from_cpu() for resctrl
  x86/resctrl: Describe resctrl's bitmap size assumptions
  x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
  x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
  x86/resctrl: Drop __init/__exit on assorted symbols
  x86/resctrl: Move is_mba_sc() out of core.c
  x86/resctrl: Add end-marker to the resctrl_event_id enum
  x86/resctrl: Remove a newline to avoid confusing the code move script
  fs/resctrl: Add boiler plate for external resctrl code
  x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
  x86/resctrl: Add python script to move resctrl code to /fs/resctrl

 MAINTAINERS                               |   2 +
 arch/Kconfig                              |   8 +
 arch/x86/Kconfig                          |  12 +-
 arch/x86/include/asm/resctrl.h            |  45 +-
 arch/x86/kernel/cpu/resctrl/Makefile      |   5 +-
 arch/x86/kernel/cpu/resctrl/core.c        | 158 ++---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  67 +-
 arch/x86/kernel/cpu/resctrl/internal.h    | 185 ++----
 arch/x86/kernel/cpu/resctrl/monitor.c     |  88 +--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  65 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 288 +++++---
 arch/x86/kernel/process_32.c              |   2 +-
 arch/x86/kernel/process_64.c              |   2 +-
 fs/Kconfig                                |   1 +
 fs/Makefile                               |   1 +
 fs/resctrl/Kconfig                        |  36 +
 fs/resctrl/Makefile                       |   3 +
 fs/resctrl/ctrlmondata.c                  |   0
 fs/resctrl/internal.h                     |   0
 fs/resctrl/monitor.c                      |   0
 fs/resctrl/pseudo_lock.c                  |   0
 fs/resctrl/rdtgroup.c                     |   0
 fs/resctrl/trace.h                        |   0
 include/linux/resctrl.h                   | 193 +++++-
 include/linux/resctrl_types.h             |  59 ++
 resctrl_copy_pasta.py                     | 766 ++++++++++++++++++++++
 26 files changed, 1546 insertions(+), 440 deletions(-)
 create mode 100644 fs/resctrl/Kconfig
 create mode 100644 fs/resctrl/Makefile
 create mode 100644 fs/resctrl/ctrlmondata.c
 create mode 100644 fs/resctrl/internal.h
 create mode 100644 fs/resctrl/monitor.c
 create mode 100644 fs/resctrl/pseudo_lock.c
 create mode 100644 fs/resctrl/rdtgroup.c
 create mode 100644 fs/resctrl/trace.h
 create mode 100644 include/linux/resctrl_types.h
 create mode 100644 resctrl_copy_pasta.py

-- 
2.39.2



* [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
@ 2024-06-14 14:59 ` James Morse
  2024-06-28 16:41   ` Reinette Chatre
  2024-06-14 14:59 ` [PATCH v3 02/38] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
                   ` (37 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 14:59 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

commit 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by
searching closid_num_dirty_rmid") added logic that causes resctrl to
search for the CLOSID with the fewest dirty cache lines when creating a
new control group, if requested by the arch code. This depends on the
values read from the llc_occupancy counters. The logic is applicable to
architectures where the CLOSID effectively forms part of the monitoring
identifier, and which therefore do not allow complete freedom to choose
an unused monitoring identifier for a given CLOSID.

This support missed that some platforms may not have these counters.
This causes a NULL pointer dereference when creating a new control
group as the array was not allocated by dom_data_init().

As this feature isn't necessary on platforms that don't have cache
occupancy monitors, add this to the check that occurs when a new
control group is allocated.
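
As an illustration only, the guard can be modelled in plain userspace C.
This is a minimal sketch, not the kernel code: pick_closid(),
find_cleanest_closid() and the two bool flags are hypothetical stand-ins
for closid_alloc(), resctrl_find_cleanest_closid(), the Kconfig option
and is_llc_occupancy_enabled(), and the fallback of returning CLOSID 0
is a simplification of the real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CLOSID 4

/*
 * Hypothetical stand-in for the per-CLOSID dirty count array. It stays
 * NULL on a platform that has no cache occupancy counters.
 */
static int *closid_num_dirty_rmid;

/* Stand-ins for the Kconfig option and is_llc_occupancy_enabled(). */
static bool rmid_depends_on_closid = true;
static bool llc_occupancy_enabled;

/* Return the index with the fewest dirty entries. Needs the array. */
static int find_cleanest_closid(void)
{
	int best = 0;

	/* Without the guard in pick_closid(), this is the NULL dereference. */
	assert(closid_num_dirty_rmid != NULL);

	for (int i = 1; i < NUM_CLOSID; i++)
		if (closid_num_dirty_rmid[i] < closid_num_dirty_rmid[best])
			best = i;

	return best;
}

/*
 * Only search for the cleanest CLOSID when the counters exist.
 * Returning 0 otherwise is a simplification for this sketch.
 */
static int pick_closid(void)
{
	if (rmid_depends_on_closid && llc_occupancy_enabled)
		return find_cleanest_closid();

	return 0;
}
```

With both conditions checked, a platform without occupancy counters
never reaches the search, so the unallocated array is never touched.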

The existing code is not selected by any upstream platform, so it makes
no sense to backport this patch to stable.

Fixes: 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: David Hildenbrand <david@redhat.com>

---
Changes since v1:
 * [Commit message only] Reword the first paragraph to make it clear
   that the issue being fixed wasn't directly associated with addition
   of a Kconfig option.  (Actually, the option is not in Kconfig yet,
   and gets added later in this series.)
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 02f213f1c51c..d02f4c97e40f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -149,7 +149,8 @@ static int closid_alloc(void)
 
 	lockdep_assert_held(&rdtgroup_mutex);
 
-	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
+	    is_llc_occupancy_enabled()) {
 		cleanest_closid = resctrl_find_cleanest_closid();
 		if (cleanest_closid < 0)
 			return cleanest_closid;
-- 
2.39.2



* [PATCH v3 02/38] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
  2024-06-14 14:59 ` [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
@ 2024-06-14 14:59 ` James Morse
  2024-06-28 16:42   ` Reinette Chatre
  2024-06-14 14:59 ` [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags James Morse
                   ` (36 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 14:59 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

Resctrl occasionally wants to know something about a specific resource.
In these cases it reaches into the arch code's rdt_resources_all[]
array.

Once the filesystem parts of resctrl are moved to /fs/, this means it
will need visibility of the architecture specific struct
rdt_hw_resource definition, and the array of all resources.  All
architectures would also need a r_resctrl member in this struct.

Instead, abstract this via a helper to allow architectures to do
different things here. Move the level enum to the resctrl header and
add a helper to retrieve the struct rdt_resource by 'rid'.

resctrl_arch_get_resource() should not return NULL for any value in
the enum; it may instead return a dummy resource that is
!alloc_capable && !mon_capable.
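
To make the contract concrete, here is a small userspace sketch of such
a helper. All names (res_level, struct resource, struct hw_resource,
arch_get_resource(), hw_private) are hypothetical and only mirror the
shape of the real types. The point is that unsupported levels still get
a table entry that is simply not capable of anything, while only
out-of-range values yield NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum res_level { RES_L3, RES_L2, RES_MBA, RES_SMBA, NUM_RES };

/* The part the filesystem code is allowed to see. */
struct resource {
	enum res_level	rid;
	bool		alloc_capable;
	bool		mon_capable;
};

/* The arch-private wrapper, which stays hidden behind the helper. */
struct hw_resource {
	struct resource	r;
	unsigned int	hw_private;	/* placeholder for arch-only state */
};

static struct hw_resource resources_all[] = {
	[RES_L3]   = { .r = { .rid = RES_L3, .alloc_capable = true,
			      .mon_capable = true } },
	[RES_L2]   = { .r = { .rid = RES_L2 } },	/* dummy: not capable */
	[RES_MBA]  = { .r = { .rid = RES_MBA, .alloc_capable = true } },
	[RES_SMBA] = { .r = { .rid = RES_SMBA } },	/* dummy: not capable */
};

/* NULL only for out-of-range levels, never for a value in the enum. */
static struct resource *arch_get_resource(enum res_level l)
{
	if (l >= NUM_RES)
		return NULL;

	/* Sanity check for this sketch: the table index matches the rid. */
	assert(resources_all[l].r.rid == l);

	return &resources_all[l].r;
}
```

Callers can then test the capability flags instead of checking for
NULL, and never need the definition of struct hw_resource.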

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * Backed out non-functional renaming of "r" to "l3" in rdt_get_tree(),
   and unhoisted the assignment of r (as now is) back into the if ()
   where it started out.  There seem to be no uses of this variable
   outside this if().
 * [Commit message only] Typo fix:
   s/resctrl_hw_resource/rdt_hw_resource/g
---
 arch/x86/kernel/cpu/resctrl/core.c        | 10 +++++++++-
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  2 +-
 arch/x86/kernel/cpu/resctrl/internal.h    | 10 ----------
 arch/x86/kernel/cpu/resctrl/monitor.c     |  8 ++++----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 12 ++++++------
 include/linux/resctrl.h                   | 17 +++++++++++++++++
 6 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a113d9aba553..d04e157dd7f7 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -117,6 +117,14 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 };
 
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
+{
+	if (l >= RDT_NUM_RESOURCES)
+		return NULL;
+
+	return &rdt_resources_all[l].r_resctrl;
+}
+
 /*
  * cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
  * as they do not have CPUID enumeration support for Cache allocation.
@@ -164,7 +172,7 @@ static inline void cache_alloc_hsw_probe(void)
 bool is_mba_sc(struct rdt_resource *r)
 {
 	if (!r)
-		return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc;
+		r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 
 	/*
 	 * The software controller support is only applicable to MBA resource.
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index b7291f60399c..9f1ed26b9d83 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -575,7 +575,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 	domid = md.u.domid;
 	evtid = md.u.evtid;
 
-	r = &rdt_resources_all[resid].r_resctrl;
+	r = resctrl_arch_get_resource(resid);
 	d = rdt_find_domain(r, domid, NULL);
 	if (IS_ERR_OR_NULL(d)) {
 		ret = -ENOENT;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f1d926832ec8..e11abefcfd31 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -468,16 +468,6 @@ extern struct rdt_hw_resource rdt_resources_all[];
 extern struct rdtgroup rdtgroup_default;
 extern struct dentry *debugfs_resctrl;
 
-enum resctrl_res_level {
-	RDT_RESOURCE_L3,
-	RDT_RESOURCE_L2,
-	RDT_RESOURCE_MBA,
-	RDT_RESOURCE_SMBA,
-
-	/* Must be the last */
-	RDT_NUM_RESOURCES,
-};
-
 static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 2345e6836593..96aaaa87c82c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -322,7 +322,7 @@ static void limbo_release_entry(struct rmid_entry *entry)
  */
 void __check_limbo(struct rdt_domain *d, bool force_free)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	struct rmid_entry *entry;
 	u32 idx, cur_idx = 1;
@@ -478,7 +478,7 @@ int alloc_rmid(u32 closid)
 
 static void add_rmid_to_limbo(struct rmid_entry *entry)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
 	struct rdt_domain *d;
 	u32 idx;
 
@@ -680,7 +680,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	if (!is_mbm_local_enabled())
 		return;
 
-	r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+	r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 
 	closid = rgrp->closid;
 	rmid = rgrp->mon.rmid;
@@ -850,7 +850,7 @@ void mbm_handle_overflow(struct work_struct *work)
 	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		goto out_unlock;
 
-	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
 	d = container_of(work, struct rdt_domain, mbm_over.work);
 
 	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d02f4c97e40f..e3edc41882dc 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2253,7 +2253,7 @@ static void l2_qos_cfg_update(void *arg)
 
 static inline bool is_mba_linear(void)
 {
-	return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.delay_linear;
+	return resctrl_arch_get_resource(RDT_RESOURCE_MBA)->membw.delay_linear;
 }
 
 static int set_cache_qos_cfg(int level, bool enable)
@@ -2341,7 +2341,7 @@ static void mba_sc_domain_destroy(struct rdt_resource *r,
  */
 static bool supports_mba_mbps(void)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 
 	return (is_mbm_local_enabled() &&
 		r->alloc_capable && is_mba_linear());
@@ -2353,7 +2353,7 @@ static bool supports_mba_mbps(void)
  */
 static int set_mba_sc(bool mba_sc)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 	u32 num_closid = resctrl_arch_get_num_closid(r);
 	struct rdt_domain *d;
 	int i;
@@ -2701,7 +2701,7 @@ static int rdt_get_tree(struct fs_context *fc)
 		resctrl_mounted = true;
 
 	if (is_mbm_enabled()) {
-		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+		r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
 		list_for_each_entry(dom, &r->domains, list)
 			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
 						   RESCTRL_PICK_ANY_CPU);
@@ -3870,7 +3870,7 @@ static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
 	if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
 		seq_puts(seq, ",cdpl2");
 
-	if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
+	if (is_mba_sc(resctrl_arch_get_resource(RDT_RESOURCE_MBA)))
 		seq_puts(seq, ",mba_MBps");
 
 	if (resctrl_debug)
@@ -4060,7 +4060,7 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
 
 void resctrl_offline_cpu(unsigned int cpu)
 {
-	struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_resource *l3 = resctrl_arch_get_resource(RDT_RESOURCE_L3);
 	struct rdtgroup *rdtgrp;
 	struct rdt_domain *d;
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index a365f67131ec..168cc9510069 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -36,6 +36,16 @@ enum resctrl_conf_type {
 	CDP_DATA,
 };
 
+enum resctrl_res_level {
+	RDT_RESOURCE_L3,
+	RDT_RESOURCE_L2,
+	RDT_RESOURCE_MBA,
+	RDT_RESOURCE_SMBA,
+
+	/* Must be the last */
+	RDT_NUM_RESOURCES,
+};
+
 #define CDP_NUM_TYPES	(CDP_DATA + 1)
 
 /*
@@ -190,6 +200,13 @@ struct rdt_resource {
 	bool			cdp_capable;
 };
 
+/*
+ * Get the resource that exists at this level. If the level is not supported
+ * a dummy/not-capable resource can be returned. Levels >= RDT_NUM_RESOURCES
+ * will return NULL.
+ */
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
+
 /**
  * struct resctrl_schema - configuration abilities of a resource presented to
  *			   user-space
-- 
2.39.2



* [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
  2024-06-14 14:59 ` [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
  2024-06-14 14:59 ` [PATCH v3 02/38] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
@ 2024-06-14 14:59 ` James Morse
  2024-06-28 16:43   ` Reinette Chatre
  2024-06-14 14:59 ` [PATCH v3 04/38] x86/resctrl: Use schema type to determine how to parse schema values James Morse
                   ` (35 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 14:59 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

resctrl has three types of control. These emerge from the way the
architecture initialises a number of properties in struct rdt_resource.

A group of these properties needs to be set the same on all architectures.
It would be better to specify the format the schema entry should use, and
allow resctrl to generate all the other properties it needs. This avoids
architectures having divergent behaviour here.

Add a schema format enum, and as a first use, replace the fflags member
of struct rdt_resource.

The MBA schema has a different format between AMD and Intel systems.
The schema_fmt property is changed by __rdt_get_mem_config_amd() to
enable the MBPS format.
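
The idea can be sketched outside the kernel as follows. This is an
illustration under assumptions: the FMT_* and FILE_RES_* names,
struct resource and probe_amd_mem_config() are hypothetical stand-ins,
and the flag values are arbitrary. The architecture only states the
schema format, and the filesystem side derives the file flags from it.

```c
#include <assert.h>
#include <stdint.h>

enum schema_fmt { FMT_BITMAP, FMT_PERCENTAGE, FMT_MBPS };

/* Arbitrary flag values for this sketch. */
#define FILE_RES_CACHE	(1u << 0)
#define FILE_RES_MB	(1u << 1)

struct resource {
	const char	*name;
	enum schema_fmt	fmt;
};

/* Derive the file flags from the format instead of storing them. */
static uint32_t fflags_from_fmt(enum schema_fmt fmt)
{
	switch (fmt) {
	case FMT_BITMAP:
		return FILE_RES_CACHE;
	case FMT_PERCENTAGE:
	case FMT_MBPS:
		return FILE_RES_MB;
	}

	/* An unknown format is a bug in whoever set it. */
	assert(!"unknown schema format");
	return 0;
}

/* Models the AMD probe path overriding the default MBA format. */
static void probe_amd_mem_config(struct resource *mba)
{
	mba->fmt = FMT_MBPS;
}
```

Both bandwidth formats map to the same flags, so switching the MBA
resource from percentage to MBps changes how values are interpreted
without changing which files are created.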

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new
---
 arch/x86/kernel/cpu/resctrl/core.c     | 13 +++++++++----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++++--
 include/linux/resctrl.h                | 16 ++++++++++++++--
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index d04e157dd7f7..a72fd53e0ffe 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -68,11 +68,11 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_L3,
 			.name			= "L3",
+			.schema_fmt		= RESCTRL_SCHEMA_BITMAP,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_L3),
 			.parse_ctrlval		= parse_cbm,
 			.format_str		= "%d=%0*x",
-			.fflags			= RFTYPE_RES_CACHE,
 		},
 		.msr_base		= MSR_IA32_L3_CBM_BASE,
 		.msr_update		= cat_wrmsr,
@@ -82,11 +82,11 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_L2,
 			.name			= "L2",
+			.schema_fmt		= RESCTRL_SCHEMA_BITMAP,
 			.cache_level		= 2,
 			.domains		= domain_init(RDT_RESOURCE_L2),
 			.parse_ctrlval		= parse_cbm,
 			.format_str		= "%d=%0*x",
-			.fflags			= RFTYPE_RES_CACHE,
 		},
 		.msr_base		= MSR_IA32_L2_CBM_BASE,
 		.msr_update		= cat_wrmsr,
@@ -96,11 +96,15 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_MBA,
 			.name			= "MB",
+			/*
+			 * MBA schema_fmt is modified by
+			 * __rdt_get_mem_config_amd()
+			 */
+			.schema_fmt		= RESCTRL_SCHEMA_PERCENTAGE,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_MBA),
 			.parse_ctrlval		= parse_bw,
 			.format_str		= "%d=%*u",
-			.fflags			= RFTYPE_RES_MB,
 		},
 	},
 	[RDT_RESOURCE_SMBA] =
@@ -108,11 +112,11 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_SMBA,
 			.name			= "SMBA",
+			.schema_fmt		= RESCTRL_SCHEMA_MBPS,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_SMBA),
 			.parse_ctrlval		= parse_bw,
 			.format_str		= "%d=%*u",
-			.fflags			= RFTYPE_RES_MB,
 		},
 	},
 };
@@ -253,6 +257,7 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
 	cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
 	hw_res->num_closid = edx + 1;
 	r->default_ctrl = 1 << eax;
+	r->schema_fmt = RESCTRL_SCHEMA_MBPS;
 
 	/* AMD does not use delay */
 	r->membw.delay_linear = false;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e3edc41882dc..b12307d465bc 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2162,6 +2162,19 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 	return ret;
 }
 
+static u32 fflags_from_resource(struct rdt_resource *r)
+{
+	switch (r->schema_fmt) {
+	case RESCTRL_SCHEMA_BITMAP:
+		return RFTYPE_RES_CACHE;
+	case RESCTRL_SCHEMA_PERCENTAGE:
+	case RESCTRL_SCHEMA_MBPS:
+		return RFTYPE_RES_MB;
+	}
+
+	return WARN_ON_ONCE(1);
+}
+
 static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
 {
 	struct resctrl_schema *s;
@@ -2182,14 +2195,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
 	/* loop over enabled controls, these are all alloc_capable */
 	list_for_each_entry(s, &resctrl_schema_all, list) {
 		r = s->res;
-		fflags = r->fflags | RFTYPE_CTRL_INFO;
+		fflags =  fflags_from_resource(r) | RFTYPE_CTRL_INFO;
 		ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
 		if (ret)
 			goto out_destroy;
 	}
 
 	for_each_mon_capable_rdt_resource(r) {
-		fflags = r->fflags | RFTYPE_MON_INFO;
+		fflags =  fflags_from_resource(r) | RFTYPE_MON_INFO;
 		sprintf(name, "%s_MON", r->name);
 		ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
 		if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 168cc9510069..4822abbc08c8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -160,6 +160,18 @@ struct resctrl_membw {
 struct rdt_parse_data;
 struct resctrl_schema;
 
+/**
+ * enum resctrl_schema_fmt - The format user-space provides for a schema.
+ * @RESCTRL_SCHEMA_BITMAP:	The schema is a bitmap in hex.
+ * @RESCTRL_SCHEMA_PERCENTAGE:	The schema is a decimal percentage value.
+ * @RESCTRL_SCHEMA_MBPS:	The schema is a decimal MBps value.
+ */
+enum resctrl_schema_fmt {
+	RESCTRL_SCHEMA_BITMAP,
+	RESCTRL_SCHEMA_PERCENTAGE,
+	RESCTRL_SCHEMA_MBPS,
+};
+
 /**
  * struct rdt_resource - attributes of a resctrl resource
  * @rid:		The index of the resource
@@ -175,8 +187,8 @@ struct resctrl_schema;
  * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
  * @format_str:		Per resource format string to show domain value
  * @parse_ctrlval:	Per resource function pointer to parse control values
+ * @schema_fmt:	Which format string and parser is used for this schema.
  * @evt_list:		List of monitoring events
- * @fflags:		flags to choose base and info files
  * @cdp_capable:	Is the CDP feature available on this resource
  */
 struct rdt_resource {
@@ -195,8 +207,8 @@ struct rdt_resource {
 	int			(*parse_ctrlval)(struct rdt_parse_data *data,
 						 struct resctrl_schema *s,
 						 struct rdt_domain *d);
+	enum resctrl_schema_fmt	schema_fmt;
 	struct list_head	evt_list;
-	unsigned long		fflags;
 	bool			cdp_capable;
 };
 
-- 
2.39.2



* [PATCH v3 04/38] x86/resctrl: Use schema type to determine how to parse schema values
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (2 preceding siblings ...)
  2024-06-14 14:59 ` [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags James Morse
@ 2024-06-14 14:59 ` James Morse
  2024-06-28 16:43   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 05/38] x86/resctrl: Use schema type to determine the schema format string James Morse
                   ` (34 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 14:59 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

Resctrl's architecture code gets to specify a function pointer that is
used when parsing schema entries. This is expected to be one of two
helpers from the filesystem code.

Setting this function pointer allows the architecture code to change
the ABI resctrl presents to user-space, and forces resctrl to expose
these helpers.

Instead, use the schema format enum to choose which schema parser to
use. This allows the helpers to be made static and the structs used
for passing arguments moved out of shared headers.
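
A minimal userspace sketch of the dispatch, assuming simplified parsers:
parse_bitmap(), parse_decimal(), parse_base() and parse_value() are
hypothetical and take a plain string, unlike the real parse_cbm() and
parse_bw(), which also validate against the resource and stage the
result. What it shows is that the parsers can be static because only
the format enum crosses the arch/fs boundary.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum schema_fmt { FMT_BITMAP, FMT_PERCENTAGE, FMT_MBPS };

typedef int (ctrlval_parser_t)(const char *buf, unsigned long *out);

/* Parse the whole string as a number in the given base. */
static int parse_base(const char *buf, int base, unsigned long *out)
{
	unsigned long val;
	char *end;

	errno = 0;
	val = strtoul(buf, &end, base);
	if (errno || end == buf || *end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}

static int parse_bitmap(const char *buf, unsigned long *out)
{
	return parse_base(buf, 16, out);	/* bitmaps are written in hex */
}

static int parse_decimal(const char *buf, unsigned long *out)
{
	return parse_base(buf, 10, out);	/* percentage or MBps */
}

/* The format alone selects the parser. No function pointer is stored. */
static ctrlval_parser_t *get_parser(enum schema_fmt fmt)
{
	switch (fmt) {
	case FMT_BITMAP:
		return parse_bitmap;
	case FMT_PERCENTAGE:
	case FMT_MBPS:
		return parse_decimal;
	}

	return NULL;
}

static int parse_value(enum schema_fmt fmt, const char *buf,
		       unsigned long *out)
{
	ctrlval_parser_t *parse = get_parser(fmt);

	assert(buf && out);
	if (!parse)
		return -EINVAL;

	return parse(buf, out);
}
```

Since the caller never supplies a parser, an architecture cannot
accidentally introduce a new user-space format by pointing at its own
function.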

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new
---
 arch/x86/kernel/cpu/resctrl/core.c        |  4 ---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 33 +++++++++++++++++++----
 arch/x86/kernel/cpu/resctrl/internal.h    | 10 -------
 include/linux/resctrl.h                   |  5 ----
 4 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a72fd53e0ffe..02a51cce380f 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -71,7 +71,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_BITMAP,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_L3),
-			.parse_ctrlval		= parse_cbm,
 			.format_str		= "%d=%0*x",
 		},
 		.msr_base		= MSR_IA32_L3_CBM_BASE,
@@ -85,7 +84,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_BITMAP,
 			.cache_level		= 2,
 			.domains		= domain_init(RDT_RESOURCE_L2),
-			.parse_ctrlval		= parse_cbm,
 			.format_str		= "%d=%0*x",
 		},
 		.msr_base		= MSR_IA32_L2_CBM_BASE,
@@ -103,7 +101,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_PERCENTAGE,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_MBA),
-			.parse_ctrlval		= parse_bw,
 			.format_str		= "%d=%*u",
 		},
 	},
@@ -115,7 +112,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_MBPS,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_SMBA),
-			.parse_ctrlval		= parse_bw,
 			.format_str		= "%d=%*u",
 		},
 	},
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 9f1ed26b9d83..30a4ff2b6392 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -23,6 +23,15 @@
 
 #include "internal.h"
 
+struct rdt_parse_data {
+	struct rdtgroup		*rdtgrp;
+	char			*buf;
+};
+
+typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
+			       struct resctrl_schema *s,
+			       struct rdt_domain *d);
+
 /*
  * Check whether MBA bandwidth percentage value is correct. The value is
  * checked against the minimum and max bandwidth values specified by the
@@ -59,8 +68,8 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
 	return true;
 }
 
-int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
-	     struct rdt_domain *d)
+static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
+		    struct rdt_domain *d)
 {
 	struct resctrl_staged_config *cfg;
 	u32 closid = data->rdtgrp->closid;
@@ -138,8 +147,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
  * Read one cache bit mask (hex). Check that it is valid for the current
  * resource type.
  */
-int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
-	      struct rdt_domain *d)
+static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+		     struct rdt_domain *d)
 {
 	struct rdtgroup *rdtgrp = data->rdtgrp;
 	struct resctrl_staged_config *cfg;
@@ -195,6 +204,19 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 	return 0;
 }
 
+static ctrlval_parser_t *get_parser(struct rdt_resource *r)
+{
+	switch (r->schema_fmt) {
+	case RESCTRL_SCHEMA_BITMAP:
+		return &parse_cbm;
+	case RESCTRL_SCHEMA_PERCENTAGE:
+	case RESCTRL_SCHEMA_MBPS:
+		return &parse_bw;
+	}
+
+	return NULL;
+}
+
 /*
  * For each domain in this resource we expect to find a series of:
  *	id=mask
@@ -204,6 +226,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 static int parse_line(char *line, struct resctrl_schema *s,
 		      struct rdtgroup *rdtgrp)
 {
+	ctrlval_parser_t *parse_ctrlval = get_parser(s->res);
 	enum resctrl_conf_type t = s->conf_type;
 	struct resctrl_staged_config *cfg;
 	struct rdt_resource *r = s->res;
@@ -235,7 +258,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
 		if (d->id == dom_id) {
 			data.buf = dom;
 			data.rdtgrp = rdtgrp;
-			if (r->parse_ctrlval(&data, s, d))
+			if (parse_ctrlval(&data, s, d))
 				return -EINVAL;
 			if (rdtgrp->mode ==  RDT_MODE_PSEUDO_LOCKSETUP) {
 				cfg = &d->staged_config[t];
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e11abefcfd31..7e0b0b5f3530 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -416,11 +416,6 @@ static inline bool is_mbm_event(int e)
 		e <= QOS_L3_MBM_LOCAL_EVENT_ID);
 }
 
-struct rdt_parse_data {
-	struct rdtgroup		*rdtgrp;
-	char			*buf;
-};
-
 /**
  * struct rdt_hw_resource - arch private attributes of a resctrl resource
  * @r_resctrl:		Attributes of the resource used directly by resctrl.
@@ -457,11 +452,6 @@ static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r
 	return container_of(r, struct rdt_hw_resource, r_resctrl);
 }
 
-int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
-	      struct rdt_domain *d);
-int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
-	     struct rdt_domain *d);
-
 extern struct mutex rdtgroup_mutex;
 
 extern struct rdt_hw_resource rdt_resources_all[];
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 4822abbc08c8..975345f3cd0a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -157,7 +157,6 @@ struct resctrl_membw {
 	u32				*mb_map;
 };
 
-struct rdt_parse_data;
 struct resctrl_schema;
 
 /**
@@ -186,7 +185,6 @@ enum resctrl_schema_fmt {
  * @data_width:		Character width of data when displaying
  * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
  * @format_str:		Per resource format string to show domain value
- * @parse_ctrlval:	Per resource function pointer to parse control values
  * @schema_fmt:	Which format string and parser is used for this schema.
  * @evt_list:		List of monitoring events
  * @cdp_capable:	Is the CDP feature available on this resource
@@ -204,9 +202,6 @@ struct rdt_resource {
 	int			data_width;
 	u32			default_ctrl;
 	const char		*format_str;
-	int			(*parse_ctrlval)(struct rdt_parse_data *data,
-						 struct resctrl_schema *s,
-						 struct rdt_domain *d);
 	enum resctrl_schema_fmt	schema_fmt;
 	struct list_head	evt_list;
 	bool			cdp_capable;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 05/38] x86/resctrl: Use schema type to determine the schema format string
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (3 preceding siblings ...)
  2024-06-14 14:59 ` [PATCH v3 04/38] x86/resctrl: Use schema type to determine how to parse schema values James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:43   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 06/38] x86/resctrl: Move data_width to be a schema property James Morse
                   ` (33 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

Resctrl's architecture code gets to specify a format string that is
used when printing schema entries. This is expected to be one of two
values that the filesystem code supports.

Setting this format string allows the architecture code to change
the ABI resctrl presents to user-space.

Instead, use the schema format enum to choose which format string to
use.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
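
A minimal user-space sketch of the selection above, for illustration only.
The names enum schema_fmt, schema_fmt_str() and format_entry() are
hypothetical stand-ins, not kernel symbols; snprintf() stands in for
seq_printf(). The format strings are the ones added to schemata_list_add()
in this patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space mirror of enum resctrl_schema_fmt. */
enum schema_fmt { SCHEMA_BITMAP, SCHEMA_PERCENTAGE, SCHEMA_MBPS };

/* Pick the format string from the schema format alone. */
static const char *schema_fmt_str(enum schema_fmt fmt)
{
	switch (fmt) {
	case SCHEMA_BITMAP:
		return "%d=%0*x";	/* domain id, zero-padded hex mask */
	case SCHEMA_PERCENTAGE:
	case SCHEMA_MBPS:
		return "%d=%0*u";	/* domain id, zero-padded decimal */
	}
	return NULL;
}

/* Format one "id=value" entry; width plays the role of max_data_width. */
static int format_entry(char *buf, size_t len, enum schema_fmt fmt,
			int dom_id, int width, unsigned int val)
{
	const char *str = schema_fmt_str(fmt);

	if (!str)
		return -1;
	return snprintf(buf, len, str, dom_id, width, val);
}
```

With a width of 3, the '0' flag in "%0*u" pads a value of 50 as "050". The
"%d=%*u" strings removed from core.c have no '0' flag, so printf-style
formatting would pad the same value with spaces instead.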
---
 arch/x86/kernel/cpu/resctrl/core.c        |  4 ----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 12 ++++++++++++
 include/linux/resctrl.h                   |  4 ++--
 4 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 02a51cce380f..4a5216a13b46 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -71,7 +71,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_BITMAP,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_L3),
-			.format_str		= "%d=%0*x",
 		},
 		.msr_base		= MSR_IA32_L3_CBM_BASE,
 		.msr_update		= cat_wrmsr,
@@ -84,7 +83,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_BITMAP,
 			.cache_level		= 2,
 			.domains		= domain_init(RDT_RESOURCE_L2),
-			.format_str		= "%d=%0*x",
 		},
 		.msr_base		= MSR_IA32_L2_CBM_BASE,
 		.msr_update		= cat_wrmsr,
@@ -101,7 +99,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_PERCENTAGE,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_MBA),
-			.format_str		= "%d=%*u",
 		},
 	},
 	[RDT_RESOURCE_SMBA] =
@@ -112,7 +109,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.schema_fmt		= RESCTRL_SCHEMA_MBPS,
 			.cache_level		= 3,
 			.domains		= domain_init(RDT_RESOURCE_SMBA),
-			.format_str		= "%d=%*u",
 		},
 	},
 };
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 30a4ff2b6392..380b88b69c6e 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -483,7 +483,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
 			ctrl_val = resctrl_arch_get_config(r, dom, closid,
 							   schema->conf_type);
 
-		seq_printf(s, r->format_str, dom->id, max_data_width,
+		seq_printf(s, schema->fmt_str, dom->id, max_data_width,
 			   ctrl_val);
 		sep = true;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b12307d465bc..af9968328771 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2597,6 +2597,18 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
 	if (cl > max_name_width)
 		max_name_width = cl;
 
+	switch (r->schema_fmt) {
+	case RESCTRL_SCHEMA_BITMAP:
+		s->fmt_str = "%d=%0*x";
+		break;
+	case RESCTRL_SCHEMA_PERCENTAGE:
+		s->fmt_str = "%d=%0*u";
+		break;
+	case RESCTRL_SCHEMA_MBPS:
+		s->fmt_str = "%d=%0*u";
+		break;
+	}
+
 	INIT_LIST_HEAD(&s->list);
 	list_add(&s->list, &resctrl_schema_all);
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 975345f3cd0a..abecbd92ac93 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -184,7 +184,6 @@ enum resctrl_schema_fmt {
  * @name:		Name to use in "schemata" file.
  * @data_width:		Character width of data when displaying
  * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
- * @format_str:		Per resource format string to show domain value
  * @schema_fmt:	Which format string and parser is used for this schema.
  * @evt_list:		List of monitoring events
  * @cdp_capable:	Is the CDP feature available on this resource
@@ -201,7 +200,6 @@ struct rdt_resource {
 	char			*name;
 	int			data_width;
 	u32			default_ctrl;
-	const char		*format_str;
 	enum resctrl_schema_fmt	schema_fmt;
 	struct list_head	evt_list;
 	bool			cdp_capable;
@@ -219,6 +217,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
  *			   user-space
  * @list:	Member of resctrl_schema_all.
  * @name:	The name to use in the "schemata" file.
+ * @fmt_str:	Format string to show domain value
  * @conf_type:	Whether this schema is specific to code/data.
  * @res:	The resource structure exported by the architecture to describe
  *		the hardware that is configured by this schema.
@@ -229,6 +228,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
 struct resctrl_schema {
 	struct list_head		list;
 	char				name[8];
+	const char			*fmt_str;
 	enum resctrl_conf_type		conf_type;
 	struct rdt_resource		*res;
 	u32				num_closid;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 06/38] x86/resctrl: Move data_width to be a schema property
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (4 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 05/38] x86/resctrl: Use schema type to determine the schema format string James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:45   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 07/38] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
                   ` (32 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

The resctrl architecture code gets to specify the width of the schema
entries that are used by resctrl. These are determined by the schema
format, e.g. percentage or bitmap.

Move this property into struct resctrl_schema and get the filesystem
parts of resctrl to set it based on the schema format.

This allows rdt_init_padding() to be removed, as its work can be done
by schemata_list_add(). max_name_width and max_data_width can then be
moved out of core.c, which has no counterpart after the move to fs.

The logic for calculating max_name_width was moved in earlier patches,
but the definition was not moved.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
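
For illustration, the width calculation can be sketched in user-space C as
below. enum schema_fmt, schema_data_width() and widest() are hypothetical
names; the widths themselves are the values this patch assigns in
schemata_list_add().

```c
/* Hypothetical user-space mirror of enum resctrl_schema_fmt. */
enum schema_fmt { SCHEMA_BITMAP, SCHEMA_PERCENTAGE, SCHEMA_MBPS };

/* Characters needed to print one control value of this schema format. */
static int schema_data_width(enum schema_fmt fmt, unsigned int cbm_len)
{
	switch (fmt) {
	case SCHEMA_BITMAP:
		/* One hex digit per four mask bits, rounded up. */
		return (int)((cbm_len + 3) / 4);
	case SCHEMA_PERCENTAGE:
		return 3;	/* enough for "100" */
	case SCHEMA_MBPS:
		return 4;	/* enough for "2048" */
	}
	return 0;
}

/* Fold one schema's width into the running maximum used for alignment. */
static int widest(int max_so_far, int width)
{
	return width > max_so_far ? width : max_so_far;
}
```

A 20-bit cache bitmap needs five hex digits, so it would set the column
width when listed alongside a percentage schema of width three.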
---
 arch/x86/kernel/cpu/resctrl/core.c     | 26 --------------------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
 include/linux/resctrl.h                |  4 ++--
 3 files changed, 13 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 4a5216a13b46..4de7d20aa5aa 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -44,12 +44,6 @@ static DEFINE_MUTEX(domain_list_lock);
  */
 DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
-/*
- * Used to store the max resource name width and max resource data width
- * to display the schemata in a tabular format
- */
-int max_name_width, max_data_width;
-
 /*
  * Global boolean for rdt_alloc which is true if any
  * resource allocation is enabled.
@@ -222,7 +216,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
 			return false;
 		r->membw.arch_needs_linear = false;
 	}
-	r->data_width = 3;
 
 	if (boot_cpu_has(X86_FEATURE_PER_THREAD_MBA))
 		r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
@@ -262,8 +255,6 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
 	r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
 	r->membw.min_bw = 0;
 	r->membw.bw_gran = 1;
-	/* Max value is 2048, Data width should be 4 in decimal */
-	r->data_width = 4;
 
 	r->alloc_capable = true;
 
@@ -283,7 +274,6 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
 	r->cache.cbm_len = eax.split.cbm_len + 1;
 	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
 	r->cache.shareable_bits = ebx & r->default_ctrl;
-	r->data_width = (r->cache.cbm_len + 3) / 4;
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
 		r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
 	r->alloc_capable = true;
@@ -631,20 +621,6 @@ static int resctrl_arch_offline_cpu(unsigned int cpu)
 	return 0;
 }
 
-/*
- * Choose a width for the resource name and resource data based on the
- * resource that has widest name and cbm.
- */
-static __init void rdt_init_padding(void)
-{
-	struct rdt_resource *r;
-
-	for_each_alloc_capable_rdt_resource(r) {
-		if (r->data_width > max_data_width)
-			max_data_width = r->data_width;
-	}
-}
-
 enum {
 	RDT_FLAG_CMT,
 	RDT_FLAG_MBM_TOTAL,
@@ -942,8 +918,6 @@ static int __init resctrl_late_init(void)
 	if (!get_rdt_resources())
 		return -ENODEV;
 
-	rdt_init_padding();
-
 	state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 				  "x86/resctrl/cat:online:",
 				  resctrl_arch_online_cpu,
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index af9968328771..4f8e20cc06eb 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -58,6 +58,12 @@ static struct kernfs_node *kn_mongrp;
 /* Kernel fs node for "mon_data" directory under root */
 static struct kernfs_node *kn_mondata;
 
+/*
+ * Used to store the max resource name width and max resource data width
+ * to display the schemata in a tabular format
+ */
+int max_name_width, max_data_width;
+
 static struct seq_buf last_cmd_status;
 static char last_cmd_status_buf[512];
 
@@ -2600,15 +2606,20 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
 	switch (r->schema_fmt) {
 	case RESCTRL_SCHEMA_BITMAP:
 		s->fmt_str = "%d=%0*x";
+		s->data_width = (r->cache.cbm_len + 3) / 4;
 		break;
 	case RESCTRL_SCHEMA_PERCENTAGE:
 		s->fmt_str = "%d=%0*u";
+		s->data_width = 3;
 		break;
 	case RESCTRL_SCHEMA_MBPS:
 		s->fmt_str = "%d=%0*u";
+		s->data_width = 4;
 		break;
 	}
 
+	max_data_width = max(max_data_width, s->data_width);
+
 	INIT_LIST_HEAD(&s->list);
 	list_add(&s->list, &resctrl_schema_all);
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index abecbd92ac93..ddcd938972d2 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -182,7 +182,6 @@ enum resctrl_schema_fmt {
  * @membw:		If the component has bandwidth controls, their properties.
  * @domains:		RCU list of all domains for this resource
  * @name:		Name to use in "schemata" file.
- * @data_width:		Character width of data when displaying
  * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
  * @schema_fmt:	Which format string and parser is used for this schema.
  * @evt_list:		List of monitoring events
@@ -198,7 +197,6 @@ struct rdt_resource {
 	struct resctrl_membw	membw;
 	struct list_head	domains;
 	char			*name;
-	int			data_width;
 	u32			default_ctrl;
 	enum resctrl_schema_fmt	schema_fmt;
 	struct list_head	evt_list;
@@ -218,6 +216,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
  * @list:	Member of resctrl_schema_all.
  * @name:	The name to use in the "schemata" file.
  * @fmt_str:	Format string to show domain value
+ * @data_width:	Character width of data when displaying
  * @conf_type:	Whether this schema is specific to code/data.
  * @res:	The resource structure exported by the architecture to describe
  *		the hardware that is configured by this schema.
@@ -229,6 +228,7 @@ struct resctrl_schema {
 	struct list_head		list;
 	char				name[8];
 	const char			*fmt_str;
+	int				data_width;
 	enum resctrl_conf_type		conf_type;
 	struct rdt_resource		*res;
 	u32				num_closid;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 07/38] x86/resctrl: Add max_bw to struct resctrl_membw
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (5 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 06/38] x86/resctrl: Move data_width to be a schema property James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 08/38] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

__rdt_get_mem_config_amd() and __get_mem_config_intel() both use
the default_ctrl property as a maximum value. This is because the
MBA schema works differently between these platforms. Doing this
complicates determining whether the default_ctrl property belongs
to the arch code, or can be derived from the schema format.

Add a max_bw property for AMD platforms to specify their maximum
MBA bandwidth. This isn't needed for other schema formats.

This will allow the default_ctrl to be generated from the schema
properties when it is needed.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
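
A rough user-space sketch of the range check once it uses max_bw rather than
default_ctrl. struct membw is a hypothetical subset of struct resctrl_membw,
and strtoul() stands in for kstrtoul(). The is_mba_sc() exception and the
error reporting done by the kernel's bw_validate() are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical subset of struct resctrl_membw. */
struct membw {
	unsigned int min_bw;
	unsigned int max_bw;
};

/* Parse a decimal bandwidth value and check it against [min_bw, max_bw]. */
static bool bw_validate(const char *buf, unsigned long *data,
			const struct membw *m)
{
	unsigned long bw;
	char *end;

	errno = 0;
	bw = strtoul(buf, &end, 10);
	if (errno || end == buf || *end != '\0')
		return false;	/* not a clean decimal number */

	if (bw < m->min_bw || bw > m->max_bw)
		return false;	/* outside the range the resource supports */

	*data = bw;
	return true;
}
```

The upper bound now comes from a property that describes the hardware limit,
so the check no longer depends on what the reset value happens to be.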
---
 arch/x86/kernel/cpu/resctrl/core.c        | 2 ++
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 9 +++++----
 include/linux/resctrl.h                   | 2 ++
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 4de7d20aa5aa..c1dfc1466e53 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -206,6 +206,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
 	hw_res->num_closid = edx.split.cos_max + 1;
 	max_delay = eax.split.max_delay + 1;
 	r->default_ctrl = MAX_MBA_BW;
+	r->membw.max_bw = MAX_MBA_BW;
 	r->membw.arch_needs_linear = true;
 	if (ecx & MBA_IS_LINEAR) {
 		r->membw.delay_linear = true;
@@ -243,6 +244,7 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
 	hw_res->num_closid = edx + 1;
 	r->default_ctrl = 1 << eax;
 	r->schema_fmt = RESCTRL_SCHEMA_MBPS;
+	r->membw.max_bw = 1 << eax;
 
 	/* AMD does not use delay */
 	r->membw.delay_linear = false;
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 380b88b69c6e..2ef91e748325 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -57,10 +57,10 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
 		return false;
 	}
 
-	if ((bw < r->membw.min_bw || bw > r->default_ctrl) &&
+	if ((bw < r->membw.min_bw || bw > r->membw.max_bw) &&
 	    !is_mba_sc(r)) {
 		rdt_last_cmd_printf("MB value %ld out of range [%d,%d]\n", bw,
-				    r->membw.min_bw, r->default_ctrl);
+				    r->membw.min_bw, r->membw.max_bw);
 		return false;
 	}
 
@@ -108,8 +108,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
  */
 static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
 {
-	unsigned long first_bit, zero_bit, val;
+	u32 supported_bits = BIT_MASK(r->cache.cbm_len) - 1;
 	unsigned int cbm_len = r->cache.cbm_len;
+	unsigned long first_bit, zero_bit, val;
 	int ret;
 
 	ret = kstrtoul(buf, 16, &val);
@@ -118,7 +119,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
 		return false;
 	}
 
-	if ((r->cache.min_cbm_bits > 0 && val == 0) || val > r->default_ctrl) {
+	if ((r->cache.min_cbm_bits > 0 && val == 0) || val > supported_bits) {
 		rdt_last_cmd_puts("Mask out of range\n");
 		return false;
 	}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index ddcd938972d2..0dee50530847 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -139,6 +139,7 @@ enum membw_throttle_mode {
 /**
  * struct resctrl_membw - Memory bandwidth allocation related data
  * @min_bw:		Minimum memory bandwidth percentage user can request
+ * @max_bw:		Maximum memory bandwidth value, used as the reset value
  * @bw_gran:		Granularity at which the memory bandwidth is allocated
  * @delay_linear:	True if memory B/W delay is in linear scale
  * @arch_needs_linear:	True if we can't configure non-linear resources
@@ -149,6 +150,7 @@ enum membw_throttle_mode {
  */
 struct resctrl_membw {
 	u32				min_bw;
+	u32				max_bw;
 	u32				bw_gran;
 	u32				delay_linear;
 	bool				arch_needs_linear;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 08/38] x86/resctrl: Generate default_ctrl instead of sharing it
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (6 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 07/38] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 09/38] x86/resctrl: Add helper for setting CPU default properties James Morse
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

The struct rdt_resource default_ctrl is used by both the architecture
code for resetting the hardware controls, and by the filesystem parts
of resctrl to report to user-space.

This means the value has to be shared, but might not match the
properties of the control, e.g. a percentage greater than 100.

Instead, determine the default control value from a shared helper
resctrl_get_default_ctrl() that uses the schema properties to
determine the correct value.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
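
The helper's behaviour can be sketched in user-space C as follows. struct
resource and get_default_ctrl() are hypothetical stand-ins for struct
rdt_resource and resctrl_get_default_ctrl(). A plain 64-bit shift is used
here in place of the kernel's BIT_MASK(), on the assumption that cbm_len is
at most 32.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of enum resctrl_schema_fmt. */
enum schema_fmt { SCHEMA_BITMAP, SCHEMA_PERCENTAGE, SCHEMA_MBPS };

/* Hypothetical subset of struct rdt_resource. */
struct resource {
	enum schema_fmt	schema_fmt;
	unsigned int	cbm_len;	/* number of bits in the cache mask */
	uint32_t	max_bw;		/* maximum bandwidth value */
};

/* Derive the default (reset) control value from the schema properties. */
static uint32_t get_default_ctrl(const struct resource *r)
{
	switch (r->schema_fmt) {
	case SCHEMA_BITMAP:
		/* All cbm_len bits set; assumes cbm_len <= 32. */
		return (uint32_t)((UINT64_C(1) << r->cbm_len) - 1);
	case SCHEMA_PERCENTAGE:
		return 100u;
	case SCHEMA_MBPS:
		return r->max_bw;
	}
	return 0;
}
```

A 20-bit bitmap yields 0xfffff, which matches the max_cbm that
cache_alloc_hsw_probe() previously built by hand.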
---
 arch/x86/kernel/cpu/resctrl/core.c     | 16 +++++++---------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  6 +++---
 include/linux/resctrl.h                | 21 +++++++++++++++++++--
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index c1dfc1466e53..9241f3ff3870 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -137,7 +137,10 @@ static inline void cache_alloc_hsw_probe(void)
 {
 	struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
 	struct rdt_resource *r  = &hw_res->r_resctrl;
-	u64 max_cbm = BIT_ULL_MASK(20) - 1, l3_cbm_0;
+	u64 max_cbm, l3_cbm_0;
+
+	r->cache.cbm_len = 20;
+	max_cbm = resctrl_get_default_ctrl(r);
 
 	if (wrmsrl_safe(MSR_IA32_L3_CBM_BASE, max_cbm))
 		return;
@@ -149,8 +152,6 @@ static inline void cache_alloc_hsw_probe(void)
 		return;
 
 	hw_res->num_closid = 4;
-	r->default_ctrl = max_cbm;
-	r->cache.cbm_len = 20;
 	r->cache.shareable_bits = 0xc0000;
 	r->cache.min_cbm_bits = 2;
 	r->cache.arch_has_sparse_bitmasks = false;
@@ -205,7 +206,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
 	cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
 	hw_res->num_closid = edx.split.cos_max + 1;
 	max_delay = eax.split.max_delay + 1;
-	r->default_ctrl = MAX_MBA_BW;
 	r->membw.max_bw = MAX_MBA_BW;
 	r->membw.arch_needs_linear = true;
 	if (ecx & MBA_IS_LINEAR) {
@@ -242,7 +242,6 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
 
 	cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
 	hw_res->num_closid = edx + 1;
-	r->default_ctrl = 1 << eax;
 	r->schema_fmt = RESCTRL_SCHEMA_MBPS;
 	r->membw.max_bw = 1 << eax;
 
@@ -274,8 +273,7 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
 	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
 	hw_res->num_closid = edx.split.cos_max + 1;
 	r->cache.cbm_len = eax.split.cbm_len + 1;
-	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
-	r->cache.shareable_bits = ebx & r->default_ctrl;
+	r->cache.shareable_bits = ebx & resctrl_get_default_ctrl(r);
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
 		r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
 	r->alloc_capable = true;
@@ -322,7 +320,7 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
 		return MAX_MBA_BW - bw;
 
 	pr_warn_once("Non Linear delay-bw map not supported but queried\n");
-	return r->default_ctrl;
+	return resctrl_get_default_ctrl(r);
 }
 
 static void mba_wrmsr_intel(struct msr_param *m)
@@ -419,7 +417,7 @@ static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
 	 * For Memory Allocation: Set b/w requested to 100%
 	 */
 	for (i = 0; i < hw_res->num_closid; i++, dc++)
-		*dc = r->default_ctrl;
+		*dc = resctrl_get_default_ctrl(r);
 }
 
 static void domain_free(struct rdt_hw_domain *hw_dom)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 4f8e20cc06eb..ba43173d5b66 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -970,7 +970,7 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
 	struct resctrl_schema *s = of->kn->parent->priv;
 	struct rdt_resource *r = s->res;
 
-	seq_printf(seq, "%x\n", r->default_ctrl);
+	seq_printf(seq, "%x\n", resctrl_get_default_ctrl(r));
 	return 0;
 }
 
@@ -2869,7 +2869,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
 		hw_dom = resctrl_to_arch_dom(d);
 
 		for (i = 0; i < hw_res->num_closid; i++)
-			hw_dom->ctrl_val[i] = r->default_ctrl;
+			hw_dom->ctrl_val[i] = resctrl_get_default_ctrl(r);
 		msr_param.dom = d;
 		smp_call_function_any(&d->cpu_mask, rdt_ctrl_update, &msr_param, 1);
 	}
@@ -3340,7 +3340,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid)
 		}
 
 		cfg = &d->staged_config[CDP_NONE];
-		cfg->new_ctrl = r->default_ctrl;
+		cfg->new_ctrl = resctrl_get_default_ctrl(r);
 		cfg->have_new_ctrl = true;
 	}
 }
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0dee50530847..cc491a03def8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -184,7 +184,6 @@ enum resctrl_schema_fmt {
  * @membw:		If the component has bandwidth controls, their properties.
  * @domains:		RCU list of all domains for this resource
  * @name:		Name to use in "schemata" file.
- * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
  * @schema_fmt:	Which format string and parser is used for this schema.
  * @evt_list:		List of monitoring events
  * @cdp_capable:	Is the CDP feature available on this resource
@@ -199,7 +198,6 @@ struct rdt_resource {
 	struct resctrl_membw	membw;
 	struct list_head	domains;
 	char			*name;
-	u32			default_ctrl;
 	enum resctrl_schema_fmt	schema_fmt;
 	struct list_head	evt_list;
 	bool			cdp_capable;
@@ -236,6 +234,25 @@ struct resctrl_schema {
 	u32				num_closid;
 };
 
+/**
+ * resctrl_get_default_ctrl() - Return the default control value for this
+ *                              resource.
+ * @r:		The resource whose default control type is queried.
+ */
+static inline u32 resctrl_get_default_ctrl(struct rdt_resource *r)
+{
+	switch (r->schema_fmt) {
+	case RESCTRL_SCHEMA_BITMAP:
+		return BIT_MASK(r->cache.cbm_len) - 1;
+	case RESCTRL_SCHEMA_PERCENTAGE:
+		return 100u;
+	case RESCTRL_SCHEMA_MBPS:
+		return r->membw.max_bw;
+	}
+
+	return WARN_ON_ONCE(1);
+}
+
 /* The number of closid supported by this resource regardless of CDP */
 u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
 int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 09/38] x86/resctrl: Add helper for setting CPU default properties
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (7 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 08/38] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 10/38] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
                   ` (29 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

rdtgroup_rmdir_ctrl() and rdtgroup_rmdir_mon() set the per-CPU
pqr_state for CPUs that were part of the rmdir()'d group.

Another architecture might not have a 'pqr_state', as its hardware may
need the values in a different format. MPAM's equivalent of RMID values
are not unique, and always need the CLOSID to be provided too.

There is only one caller that modifies a single value
(rdtgroup_rmdir_mon()). MPAM always needs both CLOSID and RMID
for the hardware value as these are written to the same system
register.

As rdtgroup_rmdir_mon() has the CLOSID on hand, only provide a
helper to set both values. These values are read by
__resctrl_sched_in(), but may be written by a different CPU without
any locking. Add READ_ONCE()/WRITE_ONCE() to avoid torn values.

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * In rdtgroup_rmdir_mon(), (re)set CPU default closid based on the
   parent control group, to avoid the appearance of referencing
   something that we're in the process of destroying (even if it
   doesn't make a difference because the victim mon group necessarily
   has the same closid as the parent control group).

   Update comment to match.

   No (intentional) functional change.
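
A user-space analogue of the helper, as a sketch only. READ_ONCE() and
WRITE_ONCE() are kernel macros; C11 relaxed atomics are swapped in here as
the nearest portable equivalent. NR_CPUS, the pqr_state array and both
function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Stand-in for the per-CPU struct resctrl_pqr_state defaults. */
struct pqr_state {
	_Atomic uint32_t default_closid;
	_Atomic uint32_t default_rmid;
};

static struct pqr_state pqr_state[NR_CPUS];

/* Writer side: always set both values, as the arch helper does. */
static void set_cpu_default_closid_rmid(int cpu, uint32_t closid, uint32_t rmid)
{
	atomic_store_explicit(&pqr_state[cpu].default_closid, closid,
			      memory_order_relaxed);
	atomic_store_explicit(&pqr_state[cpu].default_rmid, rmid,
			      memory_order_relaxed);
}

/* Reader side: each value is loaded exactly once and cannot be torn. */
static void get_cpu_defaults(int cpu, uint32_t *closid, uint32_t *rmid)
{
	*closid = atomic_load_explicit(&pqr_state[cpu].default_closid,
				       memory_order_relaxed);
	*rmid = atomic_load_explicit(&pqr_state[cpu].default_rmid,
				     memory_order_relaxed);
}
```

Each value is read or written whole, but the pair is not updated as one
unit. In this sketch a reader racing with the writer may see the new closid
alongside the old rmid.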
---
 arch/x86/include/asm/resctrl.h         | 14 +++++++++++---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 20 ++++++++++++++------
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 12dbd2588ca7..f61382258743 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -4,8 +4,9 @@
 
 #ifdef CONFIG_X86_CPU_RESCTRL
 
-#include <linux/sched.h>
 #include <linux/jump_label.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
 
 /*
  * This value can never be a valid CLOSID, and is used when mapping a
@@ -96,8 +97,8 @@ static inline void resctrl_arch_disable_mon(void)
 static inline void __resctrl_sched_in(struct task_struct *tsk)
 {
 	struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
-	u32 closid = state->default_closid;
-	u32 rmid = state->default_rmid;
+	u32 closid = READ_ONCE(state->default_closid);
+	u32 rmid = READ_ONCE(state->default_rmid);
 	u32 tmp;
 
 	/*
@@ -132,6 +133,13 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
 	return val * scale;
 }
 
+static inline void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid,
+							    u32 rmid)
+{
+	WRITE_ONCE(per_cpu(pqr_state.default_closid, cpu), closid);
+	WRITE_ONCE(per_cpu(pqr_state.default_rmid, cpu), rmid);
+}
+
 static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk,
 						u32 closid, u32 rmid)
 {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ba43173d5b66..af83b833c523 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3652,14 +3652,21 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 {
 	struct rdtgroup *prdtgrp = rdtgrp->mon.parent;
+	u32 closid, rmid;
 	int cpu;
 
 	/* Give any tasks back to the parent group */
 	rdt_move_group_tasks(rdtgrp, prdtgrp, tmpmask);
 
-	/* Update per cpu rmid of the moved CPUs first */
+	/*
+	 * Update per cpu closid/rmid of the moved CPUs first.
+	 * Note: the closid will not change, but the arch code still needs it.
+	 */
+	closid = prdtgrp->closid;
+	rmid = prdtgrp->mon.rmid;
 	for_each_cpu(cpu, &rdtgrp->cpu_mask)
-		per_cpu(pqr_state.default_rmid, cpu) = prdtgrp->mon.rmid;
+		resctrl_arch_set_cpu_default_closid_rmid(cpu, closid, rmid);
+
 	/*
 	 * Update the MSR on moved CPUs and CPUs which have moved
 	 * task running on them.
@@ -3692,6 +3699,7 @@ static int rdtgroup_ctrl_remove(struct rdtgroup *rdtgrp)
 
 static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 {
+	u32 closid, rmid;
 	int cpu;
 
 	/* Give any tasks back to the default group */
@@ -3702,10 +3710,10 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 		   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
 	/* Update per cpu closid and rmid of the moved CPUs first */
-	for_each_cpu(cpu, &rdtgrp->cpu_mask) {
-		per_cpu(pqr_state.default_closid, cpu) = rdtgroup_default.closid;
-		per_cpu(pqr_state.default_rmid, cpu) = rdtgroup_default.mon.rmid;
-	}
+	closid = rdtgroup_default.closid;
+	rmid = rdtgroup_default.mon.rmid;
+	for_each_cpu(cpu, &rdtgrp->cpu_mask)
+		resctrl_arch_set_cpu_default_closid_rmid(cpu, closid, rmid);
 
 	/*
 	 * Update the MSR on moved CPUs and CPUs which have moved
-- 
2.39.2



* [PATCH v3 10/38] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (8 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 09/38] x86/resctrl: Add helper for setting CPU default properties James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 11/38] x86/resctrl: Export resctrl fs's init function James Morse
                   ` (28 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
it uses to update the local CPU's default pqr values. This is a problem
once the resctrl parts move out to /fs/, as the arch code cannot
poke around inside struct rdtgroup.

Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpu_closid_rmid()
to be used as the target of an IPI, and pass the effective CLOSID
and RMID in a new struct.

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Change the function name in the commit message to match.

Changes since v1:
 * To clarify the meanings of the new helper and struct:

   Rename resctrl_arch_sync_cpu_default() to
   resctrl_arch_sync_cpu_closid_rmid();

   Rename struct resctrl_cpu_sync to struct resctrl_cpu_defaults;

   Flesh out the comment block in <linux/resctrl.h>.

   No functional change.
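
Illustration only, not part of this patch: a userspace sketch of the
calling convention, assuming a single simulated CPU and a direct
function call in place of the IPI. The names cpu_defaults_sketch,
group_sketch, sync_cpu_closid_rmid(), update_closid_rmid_sketch() and
hw_sync_count are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* What crosses the fs/arch boundary: two plain values. */
struct cpu_defaults_sketch {
	uint32_t closid;
	uint32_t rmid;
};

/* Stand-in for struct rdtgroup, which stays private to the fs side. */
struct group_sketch {
	uint32_t closid;
	struct { uint32_t rmid; } mon;
};

/* Stand-in for one CPU's default values in pqr_state. */
static struct cpu_defaults_sketch this_cpu_defaults;
static unsigned int hw_sync_count;

/* Models the IPI callback: @info is NULL or a cpu_defaults_sketch. */
static void sync_cpu_closid_rmid(void *info)
{
	struct cpu_defaults_sketch *r = info;

	if (r) {
		this_cpu_defaults.closid = r->closid;
		this_cpu_defaults.rmid = r->rmid;
	}

	/* The real code reprograms the hardware here, as on task switch. */
	hw_sync_count++;
}

/* Models update_closid_rmid(): copy out the group's values, or pass NULL. */
static void update_closid_rmid_sketch(const struct group_sketch *g)
{
	struct cpu_defaults_sketch defaults, *p = NULL;

	if (g) {
		defaults.closid = g->closid;
		defaults.rmid = g->mon.rmid;
		p = &defaults;
	}

	sync_cpu_closid_rmid(p);	/* kernel: on_each_cpu_mask(..., p, 1) */
}
```

Passing a pointer to an on-stack struct is safe in the kernel version
because the final argument to on_each_cpu_mask() asks it to wait until
the callback has finished on every targeted CPU.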
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++----
 include/linux/resctrl.h                | 22 ++++++++++++++++++++++
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index af83b833c523..9143cc0d384e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -347,13 +347,13 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
  * from update_closid_rmid() is protected against __switch_to() because
  * preemption is disabled.
  */
-static void update_cpu_closid_rmid(void *info)
+void resctrl_arch_sync_cpu_closid_rmid(void *info)
 {
-	struct rdtgroup *r = info;
+	struct resctrl_cpu_defaults *r = info;
 
 	if (r) {
 		this_cpu_write(pqr_state.default_closid, r->closid);
-		this_cpu_write(pqr_state.default_rmid, r->mon.rmid);
+		this_cpu_write(pqr_state.default_rmid, r->rmid);
 	}
 
 	/*
@@ -368,11 +368,20 @@ static void update_cpu_closid_rmid(void *info)
  * Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
  *
  * Per task closids/rmids must have been set up before calling this function.
+ * @r may be NULL.
  */
 static void
 update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
 {
-	on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
+	struct resctrl_cpu_defaults defaults, *p = NULL;
+
+	if (r) {
+		defaults.closid = r->closid;
+		defaults.rmid = r->mon.rmid;
+		p = &defaults;
+	}
+
+	on_each_cpu_mask(cpu_mask, resctrl_arch_sync_cpu_closid_rmid, p, 1);
 }
 
 static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index cc491a03def8..03024681920b 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -234,6 +234,28 @@ struct resctrl_schema {
 	u32				num_closid;
 };
 
+struct resctrl_cpu_defaults {
+	u32 closid;
+	u32 rmid;
+};
+
+/**
+ * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
+ *					 Call via IPI.
+ * @info:	If non-NULL, a pointer to a struct resctrl_cpu_defaults
+ *		specifying the new CLOSID and RMID for tasks in the default
+ *		resctrl ctrl and mon group when running on this CPU.  If NULL,
+ *		this CPU is not re-assigned to a different default group.
+ *
+ * Propagates reassignment of CPUs and/or tasks to different resctrl groups
+ * when requested by the resctrl core code.
+ *
+ * This function records the per-cpu defaults specified by @info (if any),
+ * and then reconfigures the CPU's hardware CLOSID and RMID for subsequent
+ * execution based on @current, in the same way as during a task switch.
+ */
+void resctrl_arch_sync_cpu_closid_rmid(void *info);
+
 /**
  * resctrl_get_default_ctrl() - Return the default control value for this
  *                              resource.
-- 
2.39.2



* [PATCH v3 11/38] x86/resctrl: Export resctrl fs's init function
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (9 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 10/38] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 12/38] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

rdtgroup_init() needs exporting so that arch code can call it once
it lives in core code. As this is one of the few functions exported,
rename it to have "resctrl" in the name. The same goes for the exit
call.

x86's arch code init functions for RDT are renamed to have an arch
prefix to make it clear these are part of the architecture code.

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * Rename stale rdtgroup_init() to resctrl_init() in
   arch/x86/kernel/cpu/resctrl/monitor.c comments.

   No functional change.

 * [Commit message only] Minor rewording to avoid "impersonating code".

 * [Commit message only] Typo fix:
   s/to have the resctrl/to have resctrl/ in commit message.
---
 arch/x86/kernel/cpu/resctrl/core.c     | 12 ++++++------
 arch/x86/kernel/cpu/resctrl/internal.h |  3 ---
 arch/x86/kernel/cpu/resctrl/monitor.c  |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  8 ++++----
 include/linux/resctrl.h                |  3 +++
 5 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 9241f3ff3870..fe7b99e7f07e 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -902,7 +902,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
 	}
 }
 
-static int __init resctrl_late_init(void)
+static int __init resctrl_arch_late_init(void)
 {
 	struct rdt_resource *r;
 	int state, ret;
@@ -925,7 +925,7 @@ static int __init resctrl_late_init(void)
 	if (state < 0)
 		return state;
 
-	ret = rdtgroup_init();
+	ret = resctrl_init();
 	if (ret) {
 		cpuhp_remove_state(state);
 		return ret;
@@ -941,18 +941,18 @@ static int __init resctrl_late_init(void)
 	return 0;
 }
 
-late_initcall(resctrl_late_init);
+late_initcall(resctrl_arch_late_init);
 
-static void __exit resctrl_exit(void)
+static void __exit resctrl_arch_exit(void)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
 	cpuhp_remove_state(rdt_online);
 
-	rdtgroup_exit();
+	resctrl_exit();
 
 	if (r->mon_capable)
 		rdt_put_mon_l3_config();
 }
 
-__exitcall(resctrl_exit);
+__exitcall(resctrl_arch_exit);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 7e0b0b5f3530..73b44b684c52 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -301,9 +301,6 @@ extern struct list_head rdt_all_groups;
 
 extern int max_name_width, max_data_width;
 
-int __init rdtgroup_init(void);
-void __exit rdtgroup_exit(void);
-
 /**
  * struct rftype - describe each file in the resctrl file system
  * @name:	File name
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 96aaaa87c82c..3e5375c365e6 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -952,7 +952,7 @@ static int dom_data_init(struct rdt_resource *r)
 	/*
 	 * RESCTRL_RESERVED_CLOSID and RESCTRL_RESERVED_RMID are special and
 	 * are always allocated. These are used for the rdtgroup_default
-	 * control group, which will be setup later in rdtgroup_init().
+	 * control group, which will be setup later in resctrl_init().
 	 */
 	idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID,
 					   RESCTRL_RESERVED_RMID);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9143cc0d384e..1574f5afd0e8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4146,14 +4146,14 @@ void resctrl_offline_cpu(unsigned int cpu)
 }
 
 /*
- * rdtgroup_init - rdtgroup initialization
+ * resctrl_init - resctrl filesystem initialization
  *
  * Setup resctrl file system including set up root, create mount point,
- * register rdtgroup filesystem, and initialize files under root directory.
+ * register resctrl filesystem, and initialize files under root directory.
  *
  * Return: 0 on success or -errno
  */
-int __init rdtgroup_init(void)
+int __init resctrl_init(void)
 {
 	int ret = 0;
 
@@ -4201,7 +4201,7 @@ int __init rdtgroup_init(void)
 	return ret;
 }
 
-void __exit rdtgroup_exit(void)
+void __exit resctrl_exit(void)
 {
 	debugfs_remove_recursive(debugfs_resctrl);
 	unregister_filesystem(&rdt_fs_type);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 03024681920b..476d92ab0884 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -369,4 +369,7 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d);
 extern unsigned int resctrl_rmid_realloc_threshold;
 extern unsigned int resctrl_rmid_realloc_limit;
 
+int __init resctrl_init(void);
+void __exit resctrl_exit(void);
+
 #endif /* _RESCTRL_H */
-- 
2.39.2



* [PATCH v3 12/38] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain()
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (10 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 11/38] x86/resctrl: Export resctrl fs's init function James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header James Morse
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

rdt_find_domain() finds a domain given a resource and a cache-id.
It's not quite right for the resctrl arch API as it also returns the
position to insert a new domain, which is needed when bringing a
domain online in the arch code.

Wrap rdt_find_domain() in another function resctrl_arch_find_domain()
in order to avoid the unnecessary argument outside the arch code.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * [Commit message only] Minor rewording to avoid "impersonating code".

 * [Commit message only] Typo fix:
   s/in a another/in another/ in commit message.
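
Illustration only, not part of this patch: a userspace sketch of the
lookup-plus-insert-position pattern and the thin wrapper that hides
the extra argument. It assumes a simplified singly linked list rather
than the kernel's struct list_head. The names domain_sketch,
find_domain() and arch_find_domain() are hypothetical.

```c
#include <stddef.h>

struct domain_sketch {
	int id;
	struct domain_sketch *next;	/* list is sorted by ascending id */
};

/*
 * Return the domain with @id, or NULL if there is none. If @pos is
 * non-NULL and no match is found, report the link at which a new
 * domain with @id would have to be inserted to keep the list sorted.
 */
static struct domain_sketch *find_domain(struct domain_sketch **head, int id,
					 struct domain_sketch ***pos)
{
	struct domain_sketch **link;

	for (link = head; *link; link = &(*link)->next) {
		if ((*link)->id == id)
			return *link;
		if ((*link)->id > id)
			break;
	}

	if (pos)
		*pos = link;
	return NULL;
}

/* Wrapper for callers that only look up and never insert. */
static struct domain_sketch *arch_find_domain(struct domain_sketch **head,
					      int id)
{
	return find_domain(head, id, NULL);
}
```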
---
 arch/x86/kernel/cpu/resctrl/core.c        | 9 +++++++--
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
 arch/x86/kernel/cpu/resctrl/internal.h    | 2 --
 include/linux/resctrl.h                   | 2 ++
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index fe7b99e7f07e..9ad660b2b097 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -381,8 +381,8 @@ void rdt_ctrl_update(void *arg)
  * caller, return the first domain whose id is bigger than the input id.
  * The domain list is sorted by id in ascending order.
  */
-struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
-				   struct list_head **pos)
+static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
+					  struct list_head **pos)
 {
 	struct rdt_domain *d;
 	struct list_head *l;
@@ -406,6 +406,11 @@ struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
 	return NULL;
 }
 
+struct rdt_domain *resctrl_arch_find_domain(struct rdt_resource *r, int id)
+{
+	return rdt_find_domain(r, id, NULL);
+}
+
 static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 2ef91e748325..2100560dda6a 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -600,7 +600,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 	evtid = md.u.evtid;
 
 	r = resctrl_arch_get_resource(resid);
-	d = rdt_find_domain(r, domid, NULL);
+	d = resctrl_arch_find_domain(r, domid);
 	if (IS_ERR_OR_NULL(d)) {
 		ret = -ENOENT;
 		goto out;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 73b44b684c52..54aba0b6b7d2 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -535,8 +535,6 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn);
 int rdtgroup_kn_mode_restrict(struct rdtgroup *r, const char *name);
 int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
 			     umode_t mask);
-struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
-				   struct list_head **pos);
 ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
 				char *buf, size_t nbytes, loff_t off);
 int rdtgroup_schemata_show(struct kernfs_open_file *of,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 476d92ab0884..5f1d578371ab 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -277,6 +277,8 @@ static inline u32 resctrl_get_default_ctrl(struct rdt_resource *r)
 
 /* The number of closid supported by this resource regardless of CDP */
 u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
+
+struct rdt_domain *resctrl_arch_find_domain(struct rdt_resource *r, int id);
 int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
 /*
-- 
2.39.2



* [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (11 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 12/38] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:45   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 14/38] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
                   ` (25 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

When resctrl is fully factored into core and per-arch code, each arch
will need to use some resctrl common definitions in order to define its
own specializations and helpers.  Following conventional practice, it
would be desirable to put the dependent arch definitions in an
<asm/resctrl.h> header that is included by the common <linux/resctrl.h>
header.  However, this can make it awkward to avoid a circular
dependency between <linux/resctrl.h> and the arch header.

To avoid such dependencies, move the affected common types and
constants into a new header that does not need to depend on
<linux/resctrl.h> or on the arch headers.

The same logic applies to the monitor-configuration defines, so move
these too.

Some kind of enumeration for events is needed between the filesystem
and architecture code. Take the x86 definition as it's convenient for
x86.

The definition of enum resctrl_event_id is needed to allow the architecture
code to define resctrl_arch_event_is_free_running(),
resctrl_arch_set_cdp_enabled(), resctrl_arch_mon_ctx_alloc() and
resctrl_arch_mon_ctx_free().

The definition of enum resctrl_res_level is needed to allow the
architecture code to define resctrl_arch_set_cdp_enabled() and
resctrl_arch_get_cdp_enabled().

The bits for mbm_local_bytes_config et al are ABI, and must be the same
on all architectures. These are documented in
Documentation/arch/x86/resctrl.rst.

The maintainers entry for these headers was missed when resctrl.h was
created. Add a wildcard entry to match both resctrl.h and
resctrl_types.h.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Added to the commit message why each of these things is necessary.
 * Moved the enum resctrl_conf_type back to resctrl.h - this week arm's
   CDP emulation code gets away without this...

Changes since v1:
 * [Commit message only] Rewrite commit message to clarify the
   rationale for refactoring the headers in this way.
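
Illustration only, not part of this patch: a userspace sketch of how
code on either side of the split might check a configurable-event
value against the shared bit definitions. BIT() and GENMASK() are
re-defined locally as stand-ins for the kernel macros, only two of the
event bits are repeated, and evt_config_is_valid() is a hypothetical
helper.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's BIT() and GENMASK(). */
#define BITS_PER_LONG_SKETCH	(8 * sizeof(unsigned long))
#define BIT(n)			(1UL << (n))
#define GENMASK(h, l) \
	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG_SKETCH - 1 - (h))))

/* Same values as the definitions moved by this patch. */
#define READS_TO_LOCAL_MEM		BIT(0)
#define DIRTY_VICTIMS_TO_ALL_MEM	BIT(6)
#define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)

/* True if @val only sets bits that the shared definitions describe. */
static bool evt_config_is_valid(unsigned long val)
{
	return (val & ~MAX_EVT_CONFIG_BITS) == 0;
}
```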
---
 MAINTAINERS                            |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h | 24 ------------
 include/linux/resctrl.h                | 21 +---------
 include/linux/resctrl_types.h          | 54 ++++++++++++++++++++++++++
 4 files changed, 56 insertions(+), 44 deletions(-)
 create mode 100644 include/linux/resctrl_types.h

diff --git a/MAINTAINERS b/MAINTAINERS
index d6c90161c7bf..441b039068d8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18859,6 +18859,7 @@ S:	Supported
 F:	Documentation/arch/x86/resctrl*
 F:	arch/x86/include/asm/resctrl.h
 F:	arch/x86/kernel/cpu/resctrl/
+F:	include/linux/resctrl*.h
 F:	tools/testing/selftests/resctrl/
 
 READ-COPY UPDATE (RCU)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 54aba0b6b7d2..7ede340b1301 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -32,30 +32,6 @@
  */
 #define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
 
-/* Reads to Local DRAM Memory */
-#define READS_TO_LOCAL_MEM		BIT(0)
-
-/* Reads to Remote DRAM Memory */
-#define READS_TO_REMOTE_MEM		BIT(1)
-
-/* Non-Temporal Writes to Local Memory */
-#define NON_TEMP_WRITE_TO_LOCAL_MEM	BIT(2)
-
-/* Non-Temporal Writes to Remote Memory */
-#define NON_TEMP_WRITE_TO_REMOTE_MEM	BIT(3)
-
-/* Reads to Local Memory the system identifies as "Slow Memory" */
-#define READS_TO_LOCAL_S_MEM		BIT(4)
-
-/* Reads to Remote Memory the system identifies as "Slow Memory" */
-#define READS_TO_REMOTE_S_MEM		BIT(5)
-
-/* Dirty Victims to All Types of Memory */
-#define DIRTY_VICTIMS_TO_ALL_MEM	BIT(6)
-
-/* Max event bits supported */
-#define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
-
 /**
  * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
  *			        aren't marked nohz_full
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 5f1d578371ab..02b745f9c4c4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -5,6 +5,7 @@
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/pid.h>
+#include <linux/resctrl_types.h>
 
 /* CLOSID, RMID value used by the default control group */
 #define RESCTRL_RESERVED_CLOSID		0
@@ -36,28 +37,8 @@ enum resctrl_conf_type {
 	CDP_DATA,
 };
 
-enum resctrl_res_level {
-	RDT_RESOURCE_L3,
-	RDT_RESOURCE_L2,
-	RDT_RESOURCE_MBA,
-	RDT_RESOURCE_SMBA,
-
-	/* Must be the last */
-	RDT_NUM_RESOURCES,
-};
-
 #define CDP_NUM_TYPES	(CDP_DATA + 1)
 
-/*
- * Event IDs, the values match those used to program IA32_QM_EVTSEL before
- * reading IA32_QM_CTR on RDT systems.
- */
-enum resctrl_event_id {
-	QOS_L3_OCCUP_EVENT_ID		= 0x01,
-	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
-	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
-};
-
 /**
  * struct resctrl_staged_config - parsed configuration to be applied
  * @new_ctrl:		new ctrl value to be loaded
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
new file mode 100644
index 000000000000..51c51a1aabfb
--- /dev/null
+++ b/include/linux/resctrl_types.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2024 Arm Ltd.
+ * Based on arch/x86/kernel/cpu/resctrl/internal.h
+ */
+
+#ifndef __LINUX_RESCTRL_TYPES_H
+#define __LINUX_RESCTRL_TYPES_H
+
+/* Reads to Local DRAM Memory */
+#define READS_TO_LOCAL_MEM		BIT(0)
+
+/* Reads to Remote DRAM Memory */
+#define READS_TO_REMOTE_MEM		BIT(1)
+
+/* Non-Temporal Writes to Local Memory */
+#define NON_TEMP_WRITE_TO_LOCAL_MEM	BIT(2)
+
+/* Non-Temporal Writes to Remote Memory */
+#define NON_TEMP_WRITE_TO_REMOTE_MEM	BIT(3)
+
+/* Reads to Local Memory the system identifies as "Slow Memory" */
+#define READS_TO_LOCAL_S_MEM		BIT(4)
+
+/* Reads to Remote Memory the system identifies as "Slow Memory" */
+#define READS_TO_REMOTE_S_MEM		BIT(5)
+
+/* Dirty Victims to All Types of Memory */
+#define DIRTY_VICTIMS_TO_ALL_MEM	BIT(6)
+
+/* Max event bits supported */
+#define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
+
+enum resctrl_res_level {
+	RDT_RESOURCE_L3,
+	RDT_RESOURCE_L2,
+	RDT_RESOURCE_MBA,
+	RDT_RESOURCE_SMBA,
+
+	/* Must be the last */
+	RDT_NUM_RESOURCES,
+};
+
+/*
+ * Event IDs, the values match those used to program IA32_QM_EVTSEL before
+ * reading IA32_QM_CTR on RDT systems.
+ */
+enum resctrl_event_id {
+	QOS_L3_OCCUP_EVENT_ID		= 0x01,
+	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
+	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
+};
+
+#endif /* __LINUX_RESCTRL_TYPES_H */
-- 
2.39.2



* [PATCH v3 14/38] x86/resctrl: Add a resctrl helper to reset all the resources
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (12 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

On umount(), resctrl resets each resource back to its default
configuration. It only ever does this for all resources in one go.

reset_all_ctrls() is architecture specific as it works with struct
rdt_hw_resource.

Add an architecture helper to reset all resources.

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * Rename the for_each_capable_rdt_resource() introduced in the new
   function resctrl_arch_reset_resources(), back to
   for_each_alloc_capable_rdt_resource() as it was in the original code.

   The change looked unintentional; and presumably a resource that does
   not support resource allocation doesn't have any properties to
   reset...
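
Illustration only, not part of this patch: a userspace sketch of a
helper that resets every allocation-capable resource and skips the
rest. It assumes a fixed array in place of rdt_resources_all[] and an
explicit loop in place of for_each_alloc_capable_rdt_resource(). The
names resource_sketch, resources_sketch, reset_all_ctrls_sketch() and
arch_reset_resources_sketch() are hypothetical, and the control values
are made up.

```c
#include <stdbool.h>
#include <stddef.h>

struct resource_sketch {
	bool alloc_capable;
	unsigned int ctrl_val;		/* current control value */
	unsigned int default_ctrl;	/* value restored on reset */
};

static struct resource_sketch resources_sketch[] = {
	{ .alloc_capable = true,  .ctrl_val = 0x0f, .default_ctrl = 0xff },
	{ .alloc_capable = false, .ctrl_val = 0x03, .default_ctrl = 0xff },
	{ .alloc_capable = true,  .ctrl_val = 0x10, .default_ctrl = 0x64 },
};

/* Per-resource reset: stays on the architecture side of the split. */
static void reset_all_ctrls_sketch(struct resource_sketch *r)
{
	r->ctrl_val = r->default_ctrl;
}

/* The only entry point the filesystem side needs to call on umount. */
static void arch_reset_resources_sketch(void)
{
	size_t n = sizeof(resources_sketch) / sizeof(resources_sketch[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (!resources_sketch[i].alloc_capable)
			continue;
		reset_all_ctrls_sketch(&resources_sketch[i]);
	}
}
```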
---
 arch/x86/include/asm/resctrl.h         |  2 ++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 +++++++++++-----
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index f61382258743..5f6a5375bb4a 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -15,6 +15,8 @@
  */
 #define X86_RESCTRL_EMPTY_CLOSID         ((u32)~0)
 
+void resctrl_arch_reset_resources(void);
+
 /**
  * struct resctrl_pqr_state - State cache for the PQR MSR
  * @cur_rmid:		The cached Resource Monitoring ID
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1574f5afd0e8..82d64885c6c0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2886,6 +2886,14 @@ static int reset_all_ctrls(struct rdt_resource *r)
 	return 0;
 }
 
+void resctrl_arch_reset_resources(void)
+{
+	struct rdt_resource *r;
+
+	for_each_alloc_capable_rdt_resource(r)
+		reset_all_ctrls(r);
+}
+
 /*
  * Move tasks from one to the other group. If @from is NULL, then all tasks
  * in the systems are moved unconditionally (used for teardown).
@@ -2995,16 +3003,14 @@ static void rmdir_all_sub(void)
 
 static void rdt_kill_sb(struct super_block *sb)
 {
-	struct rdt_resource *r;
-
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
 
 	rdt_disable_ctx();
 
-	/*Put everything back to default values. */
-	for_each_alloc_capable_rdt_resource(r)
-		reset_all_ctrls(r);
+	/* Put everything back to default values. */
+	resctrl_arch_reset_resources();
+
 	rmdir_all_sub();
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
-- 
2.39.2



* [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a resctrl exit call
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (13 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 14/38] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:46   ` Reinette Chatre
  2024-07-11 21:12   ` Carl Worth
  2024-06-14 15:00 ` [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call James Morse
                   ` (23 subsequent siblings)
  38 siblings, 2 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

rdt_put_mon_l3_config() is called via the architecture's
resctrl_arch_exit() call, and appears to free the rmid_ptrs[]
and closid_num_dirty_rmid[] arrays. In reality this code is marked
__exit, and is removed by the linker as resctrl can't be built
as a module.

To separate the filesystem and architecture parts of resctrl,
this free()ing work needs to be triggered by the filesystem,
as these structures belong to the filesystem code.

Rename rdt_put_mon_l3_config() to resctrl_mon_resource_exit()
and call it from resctrl_exit(). The kfree() is currently
dependent on r->mon_capable. resctrl_mon_resource_init()
takes no arguments, so resctrl_mon_resource_exit() shouldn't
take any either. Add the check to dom_data_exit(), making it
take the resource as an argument. This makes it more symmetrical
with dom_data_init().

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Dropped __exit as needed in the next patch.

Changes since v1:
 * [Commit message only] Typo fixes:
   s/restrl/resctrl/g
   s/resctl/resctrl/g

 * [Commit message only] Reword second paragraph to remove reference to
   the MPAM error interrupt, which provides background rationale for a
   later patch rather than for this patch, and so it is not really
   relevant here.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  5 -----
 arch/x86/kernel/cpu/resctrl/internal.h |  2 +-
 arch/x86/kernel/cpu/resctrl/monitor.c  | 12 ++++++++----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 ++
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 9ad660b2b097..2540a7cb11b0 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -950,14 +950,9 @@ late_initcall(resctrl_arch_late_init);
 
 static void __exit resctrl_arch_exit(void)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
-
 	cpuhp_remove_state(rdt_online);
 
 	resctrl_exit();
-
-	if (r->mon_capable)
-		rdt_put_mon_l3_config();
 }
 
 __exitcall(resctrl_arch_exit);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 7ede340b1301..9aa7f587484c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -535,7 +535,7 @@ void closid_free(int closid);
 int alloc_rmid(u32 closid);
 void free_rmid(u32 closid, u32 rmid);
 int rdt_get_mon_l3_config(struct rdt_resource *r);
-void __exit rdt_put_mon_l3_config(void);
+void resctrl_mon_resource_exit(void);
 bool __init rdt_cpu_has(int flag);
 void mon_event_count(void *info);
 int rdtgroup_mondata_show(struct seq_file *m, void *arg);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 3e5375c365e6..7d6aebce75c1 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -965,10 +965,12 @@ static int dom_data_init(struct rdt_resource *r)
 	return err;
 }
 
-static void __exit dom_data_exit(void)
+static void dom_data_exit(struct rdt_resource *r)
 {
-	mutex_lock(&rdtgroup_mutex);
+	if (!r->mon_capable)
+		return;
 
+	mutex_lock(&rdtgroup_mutex);
 	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
 		kfree(closid_num_dirty_rmid);
 		closid_num_dirty_rmid = NULL;
@@ -1075,9 +1077,11 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	return 0;
 }
 
-void __exit rdt_put_mon_l3_config(void)
+void resctrl_mon_resource_exit(void)
 {
-	dom_data_exit();
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+	dom_data_exit(r);
 }
 
 void __init intel_rdt_mbm_apply_quirk(void)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 82d64885c6c0..8c380f389b93 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4212,4 +4212,6 @@ void __exit resctrl_exit(void)
 	debugfs_remove_recursive(debugfs_resctrl);
 	unregister_filesystem(&rdt_fs_type);
 	sysfs_remove_mount_point(fs_kobj, "resctrl");
+
+	resctrl_mon_resource_exit();
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (14 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:47   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers James Morse
                   ` (22 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

rdt_get_mon_l3_config() is called from the architecture's
resctrl_arch_late_init(), and initialises both architecture specific
fields, such as hw_res->mon_scale, and resctrl filesystem fields
by calling dom_data_init().

To separate the filesystem and architecture parts of resctrl, this
function needs splitting up.

Add resctrl_mon_resource_init() to do the filesystem specific work,
and call it from resctrl_init(). This runs later, but is still before
the filesystem is mounted and the rmid_ptrs[] array can be used.
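
The ordering and unwinding this implies can be sketched in plain C as
below. This is only a model under stated assumptions: the _model names
are hypothetical, and the fail_mountpoint parameter is an invented knob
to exercise the error path, not something the patch adds.

```c
#include <stdbool.h>

static bool mon_state_ready_model;   /* models rmid_ptrs[] being allocated */
static bool mountpoint_ready_model;  /* models the sysfs mount point */

static int fs_mon_init_model(void)
{
	mon_state_ready_model = true;
	return 0;
}

static void fs_mon_exit_model(void)
{
	mon_state_ready_model = false;
}

/* 'fail' is a hypothetical knob used to force the error path. */
static int create_mount_point_model(bool fail)
{
	if (fail)
		return -1;

	mountpoint_ready_model = true;
	return 0;
}

/* Monitor state is set up first and torn down if a later step fails. */
int fs_init_model(bool fail_mountpoint)
{
	int ret;

	ret = fs_mon_init_model();
	if (ret)
		return ret;

	ret = create_mount_point_model(fail_mountpoint);
	if (ret) {
		fs_mon_exit_model();
		return ret;
	}

	return 0;
}
```

Because the monitor state is allocated before any later init step, each
subsequent failure path has to release it again.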

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Added error handling for the case where sysfs files can't be created.
---
 arch/x86/kernel/cpu/resctrl/internal.h |  1 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 24 +++++++++++++++++-------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  9 ++++++++-
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 9aa7f587484c..eaf458967fa1 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -542,6 +542,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg);
 void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
 		    int evtid, int first);
+int resctrl_mon_resource_init(void);
 void mbm_setup_overflow_handler(struct rdt_domain *dom,
 				unsigned long delay_ms,
 				int exclude_cpu);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 7d6aebce75c1..527c0e9d7b2e 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1016,12 +1016,28 @@ static void l3_mon_evt_init(struct rdt_resource *r)
 		list_add_tail(&mbm_local_event.list, &r->evt_list);
 }
 
+int resctrl_mon_resource_init(void)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	int ret;
+
+	if (!r->mon_capable)
+		return 0;
+
+	ret = dom_data_init(r);
+	if (ret)
+		return ret;
+
+	l3_mon_evt_init(r);
+
+	return 0;
+}
+
 int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 {
 	unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	unsigned int threshold;
-	int ret;
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
 	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
@@ -1049,10 +1065,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	 */
 	resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
 
-	ret = dom_data_init(r);
-	if (ret)
-		return ret;
-
 	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
 		u32 eax, ebx, ecx, edx;
 
@@ -1070,8 +1082,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 		}
 	}
 
-	l3_mon_evt_init(r);
-
 	r->mon_capable = true;
 
 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8c380f389b93..9c03e973a5f6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4168,9 +4168,15 @@ int __init resctrl_init(void)
 
 	rdtgroup_setup_default();
 
+	ret = resctrl_mon_resource_init();
+	if (ret)
+		return ret;
+
 	ret = sysfs_create_mount_point(fs_kobj, "resctrl");
-	if (ret)
+	if (ret) {
+		resctrl_mon_resource_exit();
 		return ret;
+	}
 
 	ret = register_filesystem(&rdt_fs_type);
 	if (ret)
@@ -4203,6 +4209,7 @@ int __init resctrl_init(void)
 
 cleanup_mountpoint:
 	sysfs_remove_mount_point(fs_kobj, "resctrl");
+	resctrl_mon_resource_exit();
 
 	return ret;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (15 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:48   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 18/38] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
                   ` (21 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

The for_each_*_rdt_resource() helpers walk the architecture's array
of structures, using the resctrl visible part as an iterator. These
became over-complex when the structures were split into a
filesystem and architecture-specific struct. This approach avoided
the need to touch every call site.

Once the filesystem parts of resctrl are moved to /fs/, both the
architecture's resource array, and the definition of those structures
are no longer accessible. To support resctrl, each architecture would
have to provide equally complex macros.

Change the resctrl code that uses these to walk through the resource_level
enum and check the mon/alloc capable flags instead. Instances in core.c,
and resctrl_arch_reset_resources() remain part of x86's architecture
specific code.
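
A standalone sketch of the replacement pattern follows: walk the enum,
fetch each resource through an accessor, and skip those without the
capability. The table contents and all _model names are assumptions made
for illustration; only the loop shape mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical resource levels; the last entry is the count. */
enum res_level_model {
	RES_L3_MODEL,
	RES_L2_MODEL,
	RES_MBA_MODEL,
	RES_SMBA_MODEL,
	NUM_RES_MODEL,
};

struct res_walk_model {
	const char *name;
	bool alloc_capable;
	bool mon_capable;
};

/* Capabilities here are made up for the example. */
static struct res_walk_model res_table_model[NUM_RES_MODEL] = {
	[RES_L3_MODEL]   = { "L3",   true,  true  },
	[RES_L2_MODEL]   = { "L2",   true,  false },
	[RES_MBA_MODEL]  = { "MB",   true,  false },
	[RES_SMBA_MODEL] = { "SMBA", false, false },
};

/* The only view the filesystem side has of the architecture's array. */
struct res_walk_model *get_resource_model(enum res_level_model level)
{
	return &res_table_model[level];
}

/* Open-coded walk in place of a for_each_alloc_capable style macro. */
int count_alloc_capable_model(void)
{
	enum res_level_model i;
	int count = 0;

	for (i = 0; i < NUM_RES_MODEL; i++) {
		struct res_walk_model *r = get_resource_model(i);

		if (!r->alloc_capable)
			continue;
		count++;
	}

	return count;
}

/* Same walk, filtering on the monitoring flag instead. */
int count_mon_capable_model(void)
{
	enum res_level_model i;
	int count = 0;

	for (i = 0; i < NUM_RES_MODEL; i++) {
		if (!get_resource_model(i)->mon_capable)
			continue;
		count++;
	}

	return count;
}
```

The callers never see the array or its element type's private part; they
only need the enum and the accessor.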

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * [Whitespace only] Fix bogus whitespace introduced in
   rdtgroup_create_info_dir().

 * [Commit message only] Typo fix:
   s/architectures/architecture's/g
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  7 +++++-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 28 +++++++++++++++++++----
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index aacf236dfe3b..ad20822bb64e 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -840,6 +840,7 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm
 bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 {
 	cpumask_var_t cpu_with_psl;
+	enum resctrl_res_level i;
 	struct rdt_resource *r;
 	struct rdt_domain *d_i;
 	bool ret = false;
@@ -854,7 +855,11 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 	 * First determine which cpus have pseudo-locked regions
 	 * associated with them.
 	 */
-	for_each_alloc_capable_rdt_resource(r) {
+	for (i = 0; i < RDT_NUM_RESOURCES; i++) {
+		r = resctrl_arch_get_resource(i);
+		if (!r->alloc_capable)
+			continue;
+
 		list_for_each_entry(d_i, &r->domains, list) {
 			if (d_i->plr)
 				cpumask_or(cpu_with_psl, cpu_with_psl,
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9c03e973a5f6..d9513d7b5157 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -98,12 +98,17 @@ void rdt_last_cmd_printf(const char *fmt, ...)
 
 void rdt_staged_configs_clear(void)
 {
+	enum resctrl_res_level i;
 	struct rdt_resource *r;
 	struct rdt_domain *dom;
 
 	lockdep_assert_held(&rdtgroup_mutex);
 
-	for_each_alloc_capable_rdt_resource(r) {
+	for (i = 0; i < RDT_NUM_RESOURCES; i++) {
+		r = resctrl_arch_get_resource(i);
+		if (!r->alloc_capable)
+			continue;
+
 		list_for_each_entry(dom, &r->domains, list)
 			memset(dom->staged_config, 0, sizeof(dom->staged_config));
 	}
@@ -2192,6 +2197,7 @@ static u32 fflags_from_resource(struct rdt_resource *r)
 
 static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
 {
+	enum resctrl_res_level i;
 	struct resctrl_schema *s;
 	struct rdt_resource *r;
 	unsigned long fflags;
@@ -2216,7 +2222,11 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
 			goto out_destroy;
 	}
 
-	for_each_mon_capable_rdt_resource(r) {
+	for (i = 0; i < RDT_NUM_RESOURCES; i++) {
+		r = resctrl_arch_get_resource(i);
+		if (!r->mon_capable)
+			continue;
+
 		fflags =  fflags_from_resource(r) | RFTYPE_MON_INFO;
 		sprintf(name, "%s_MON", r->name);
 		ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
@@ -2637,10 +2647,15 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
 
 static int schemata_list_create(void)
 {
+	enum resctrl_res_level i;
 	struct rdt_resource *r;
 	int ret = 0;
 
-	for_each_alloc_capable_rdt_resource(r) {
+	for (i = 0; i < RDT_NUM_RESOURCES; i++) {
+		r = resctrl_arch_get_resource(i);
+		if (!r->alloc_capable)
+			continue;
+
 		if (resctrl_arch_get_cdp_enabled(r->rid)) {
 			ret = schemata_list_add(r, CDP_CODE);
 			if (ret)
@@ -3181,6 +3196,7 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
 			     struct rdtgroup *prgrp,
 			     struct kernfs_node **dest_kn)
 {
+	enum resctrl_res_level i;
 	struct rdt_resource *r;
 	struct kernfs_node *kn;
 	int ret;
@@ -3199,7 +3215,11 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
 	 * Create the subdirectories for each domain. Note that all events
 	 * in a domain like L3 are grouped into a resource whose domain is L3
 	 */
-	for_each_mon_capable_rdt_resource(r) {
+	for (i = 0; i < RDT_NUM_RESOURCES; i++) {
+		r = resctrl_arch_get_resource(i);
+		if (!r->mon_capable)
+			continue;
+
 		ret = mkdir_mondata_subdir_alldom(kn, r, prgrp);
 		if (ret)
 			goto out_destroy;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 18/38] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (16 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 19/38] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

The architecture specific parts of resctrl have helpers to hide accesses
to the rdt_mon_features bitmap.

Once the filesystem parts of resctrl are moved, these can no longer live
in internal.h. Once these are exposed to the wider kernel, they should
have a 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.

Move and rename the helpers that touch rdt_mon_features directly.
is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
so can be moved into that file.
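
The split can be pictured with the userspace sketch below: helpers that
read the feature bitmap directly form the arch-facing layer, while the
helpers built on top of them need no access to the bitmap. The _model
names and the event ID values are assumptions for the example.

```c
#include <stdbool.h>

/* Hypothetical event IDs; the two MBM events are assumed contiguous. */
enum evt_model {
	EVT_OCCUP_MODEL     = 1,
	EVT_MBM_TOTAL_MODEL = 2,
	EVT_MBM_LOCAL_MODEL = 3,
};

/* Stand-in for the rdt_mon_features bitmap owned by the architecture. */
static unsigned int mon_features_model;

/* Arch layer: the only helpers that touch the bitmap. */
static inline bool arch_is_llc_occupancy_enabled_model(void)
{
	return mon_features_model & (1U << EVT_OCCUP_MODEL);
}

static inline bool arch_is_mbm_total_enabled_model(void)
{
	return mon_features_model & (1U << EVT_MBM_TOTAL_MODEL);
}

static inline bool arch_is_mbm_local_enabled_model(void)
{
	return mon_features_model & (1U << EVT_MBM_LOCAL_MODEL);
}

/* Filesystem layer: built purely on the arch helpers. */
bool fs_is_mbm_enabled_model(void)
{
	return arch_is_mbm_total_enabled_model() ||
	       arch_is_mbm_local_enabled_model();
}

/* Pure range check on the event ID; no bitmap access at all. */
bool fs_is_mbm_event_model(int e)
{
	return e >= EVT_MBM_TOTAL_MODEL && e <= EVT_MBM_LOCAL_MODEL;
}
```

Only the first three helpers would need to be provided by another
architecture; the last two can stay private to the filesystem code.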

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/include/asm/resctrl.h         | 17 +++++++++++
 arch/x86/kernel/cpu/resctrl/core.c     |  4 +--
 arch/x86/kernel/cpu/resctrl/internal.h | 27 -----------------
 arch/x86/kernel/cpu/resctrl/monitor.c  | 18 ++++++------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 +++++++++++++++++---------
 5 files changed, 54 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 5f6a5375bb4a..50407e83d0ca 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -7,6 +7,7 @@
 #include <linux/jump_label.h>
 #include <linux/percpu.h>
 #include <linux/sched.h>
+#include <linux/resctrl_types.h>
 
 /*
  * This value can never be a valid CLOSID, and is used when mapping a
@@ -43,6 +44,7 @@ DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
 extern bool rdt_alloc_capable;
 extern bool rdt_mon_capable;
+extern unsigned int rdt_mon_features;
 
 DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
@@ -82,6 +84,21 @@ static inline void resctrl_arch_disable_mon(void)
 	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
+static inline bool resctrl_arch_is_llc_occupancy_enabled(void)
+{
+	return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
+}
+
+static inline bool resctrl_arch_is_mbm_total_enabled(void)
+{
+	return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
+}
+
+static inline bool resctrl_arch_is_mbm_local_enabled(void)
+{
+	return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
+}
+
 /*
  * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
  *
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 2540a7cb11b0..06521c124795 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -465,13 +465,13 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
 {
 	size_t tsize;
 
-	if (is_mbm_total_enabled()) {
+	if (resctrl_arch_is_mbm_total_enabled()) {
 		tsize = sizeof(*hw_dom->arch_mbm_total);
 		hw_dom->arch_mbm_total = kcalloc(num_rmid, tsize, GFP_KERNEL);
 		if (!hw_dom->arch_mbm_total)
 			return -ENOMEM;
 	}
-	if (is_mbm_local_enabled()) {
+	if (resctrl_arch_is_mbm_local_enabled()) {
 		tsize = sizeof(*hw_dom->arch_mbm_local);
 		hw_dom->arch_mbm_local = kcalloc(num_rmid, tsize, GFP_KERNEL);
 		if (!hw_dom->arch_mbm_local) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index eaf458967fa1..f4f48542447f 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -131,7 +131,6 @@ struct rmid_read {
 	void			*arch_mon_ctx;
 };
 
-extern unsigned int rdt_mon_features;
 extern struct list_head resctrl_schema_all;
 extern bool resctrl_mounted;
 
@@ -363,32 +362,6 @@ struct msr_param {
 	u32			high;
 };
 
-static inline bool is_llc_occupancy_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
-}
-
-static inline bool is_mbm_total_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
-}
-
-static inline bool is_mbm_local_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
-}
-
-static inline bool is_mbm_enabled(void)
-{
-	return (is_mbm_total_enabled() || is_mbm_local_enabled());
-}
-
-static inline bool is_mbm_event(int e)
-{
-	return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
-		e <= QOS_L3_MBM_LOCAL_EVENT_ID);
-}
-
 /**
  * struct rdt_hw_resource - arch private attributes of a resctrl resource
  * @r_resctrl:		Attributes of the resource used directly by resctrl.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 527c0e9d7b2e..66bd869f02a9 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -252,11 +252,11 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d)
 {
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 
-	if (is_mbm_total_enabled())
+	if (resctrl_arch_is_mbm_total_enabled())
 		memset(hw_dom->arch_mbm_total, 0,
 		       sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
 
-	if (is_mbm_local_enabled())
+	if (resctrl_arch_is_mbm_local_enabled())
 		memset(hw_dom->arch_mbm_local, 0,
 		       sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
 }
@@ -525,7 +525,7 @@ void free_rmid(u32 closid, u32 rmid)
 
 	entry = __rmid_entry(idx);
 
-	if (is_llc_occupancy_enabled())
+	if (resctrl_arch_is_llc_occupancy_enabled())
 		add_rmid_to_limbo(entry);
 	else
 		list_add_tail(&entry->list, &rmid_free_lru);
@@ -677,7 +677,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	struct list_head *head;
 	struct rdtgroup *entry;
 
-	if (!is_mbm_local_enabled())
+	if (!resctrl_arch_is_mbm_local_enabled())
 		return;
 
 	r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
@@ -746,7 +746,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 	 * This is protected from concurrent reads from user
 	 * as both the user and we hold the global mutex.
 	 */
-	if (is_mbm_total_enabled()) {
+	if (resctrl_arch_is_mbm_total_enabled()) {
 		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
 		rr.val = 0;
 		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
@@ -760,7 +760,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 
 		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
 	}
-	if (is_mbm_local_enabled()) {
+	if (resctrl_arch_is_mbm_local_enabled()) {
 		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
 		rr.val = 0;
 		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
@@ -1008,11 +1008,11 @@ static void l3_mon_evt_init(struct rdt_resource *r)
 {
 	INIT_LIST_HEAD(&r->evt_list);
 
-	if (is_llc_occupancy_enabled())
+	if (resctrl_arch_is_llc_occupancy_enabled())
 		list_add_tail(&llc_occupancy_event.list, &r->evt_list);
-	if (is_mbm_total_enabled())
+	if (resctrl_arch_is_mbm_total_enabled())
 		list_add_tail(&mbm_total_event.list, &r->evt_list);
-	if (is_mbm_local_enabled())
+	if (resctrl_arch_is_mbm_local_enabled())
 		list_add_tail(&mbm_local_event.list, &r->evt_list);
 }
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d9513d7b5157..8c4ec7df12d3 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -114,6 +114,18 @@ void rdt_staged_configs_clear(void)
 	}
 }
 
+static bool resctrl_is_mbm_enabled(void)
+{
+	return (resctrl_arch_is_mbm_total_enabled() ||
+		resctrl_arch_is_mbm_local_enabled());
+}
+
+static bool resctrl_is_mbm_event(int e)
+{
+	return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
+		e <= QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
 /*
  * Trivial allocator for CLOSIDs. Since h/w only supports a small number,
  * we can keep a bitmap of free CLOSIDs in a single integer.
@@ -161,7 +173,7 @@ static int closid_alloc(void)
 	lockdep_assert_held(&rdtgroup_mutex);
 
 	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
-	    is_llc_occupancy_enabled()) {
+	    resctrl_arch_is_llc_occupancy_enabled()) {
 		cleanest_closid = resctrl_find_cleanest_closid();
 		if (cleanest_closid < 0)
 			return cleanest_closid;
@@ -2381,7 +2393,7 @@ static bool supports_mba_mbps(void)
 {
 	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 
-	return (is_mbm_local_enabled() &&
+	return (resctrl_arch_is_mbm_local_enabled() &&
 		r->alloc_capable && is_mba_linear());
 }
 
@@ -2760,7 +2772,7 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (resctrl_arch_alloc_capable() || resctrl_arch_mon_capable())
 		resctrl_mounted = true;
 
-	if (is_mbm_enabled()) {
+	if (resctrl_is_mbm_enabled()) {
 		r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
 		list_for_each_entry(dom, &r->domains, list)
 			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
@@ -3122,7 +3134,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
 		if (ret)
 			goto out_destroy;
 
-		if (is_mbm_event(mevt->evtid))
+		if (resctrl_is_mbm_event(mevt->evtid))
 			mon_event_read(&rr, r, d, prgrp, mevt->evtid, true);
 	}
 	kernfs_activate(kn);
@@ -4024,9 +4036,9 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 	if (resctrl_mounted && resctrl_arch_mon_capable())
 		rmdir_mondata_subdir_allrdtgrp(r, d->id);
 
-	if (is_mbm_enabled())
+	if (resctrl_is_mbm_enabled())
 		cancel_delayed_work(&d->mbm_over);
-	if (is_llc_occupancy_enabled() && has_busy_rmid(d)) {
+	if (resctrl_arch_is_llc_occupancy_enabled() && has_busy_rmid(d)) {
 		/*
 		 * When a package is going down, forcefully
 		 * decrement rmid->ebusy. There is no way to know
@@ -4050,12 +4062,12 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	size_t tsize;
 
-	if (is_llc_occupancy_enabled()) {
+	if (resctrl_arch_is_llc_occupancy_enabled()) {
 		d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
 		if (!d->rmid_busy_llc)
 			return -ENOMEM;
 	}
-	if (is_mbm_total_enabled()) {
+	if (resctrl_arch_is_mbm_total_enabled()) {
 		tsize = sizeof(*d->mbm_total);
 		d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_total) {
@@ -4063,7 +4075,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 			return -ENOMEM;
 		}
 	}
-	if (is_mbm_local_enabled()) {
+	if (resctrl_arch_is_mbm_local_enabled()) {
 		tsize = sizeof(*d->mbm_local);
 		d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_local) {
@@ -4095,13 +4107,13 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 	if (err)
 		goto out_unlock;
 
-	if (is_mbm_enabled()) {
+	if (resctrl_is_mbm_enabled()) {
 		INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
 		mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL,
 					   RESCTRL_PICK_ANY_CPU);
 	}
 
-	if (is_llc_occupancy_enabled())
+	if (resctrl_arch_is_llc_occupancy_enabled())
 		INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);
 
 	/*
@@ -4156,12 +4168,12 @@ void resctrl_offline_cpu(unsigned int cpu)
 
 	d = get_domain_from_cpu(cpu, l3);
 	if (d) {
-		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
+		if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
 			cancel_delayed_work(&d->mbm_over);
 			mbm_setup_overflow_handler(d, 0, cpu);
 		}
-		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
-		    has_busy_rmid(d)) {
+		if (resctrl_arch_is_llc_occupancy_enabled() &&
+		    cpu == d->cqm_work_cpu && has_busy_rmid(d)) {
 			cancel_delayed_work(&d->cqm_limbo);
 			cqm_setup_limbo_handler(d, 0, cpu);
 		}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 19/38] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (17 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 18/38] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 20/38] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

When BMEC is supported the resctrl event can be configured in a number
of ways. This depends on architecture support. rdt_get_mon_l3_config()
modifies the struct mon_evt and calls mbm_config_rftype_init() to create
the files that allow the configuration.

Splitting this into separate architecture and filesystem parts would
require the struct mon_evt and mbm_config_rftype_init() to be exposed.

Instead, add resctrl_arch_is_evt_configurable(), and use this from
resctrl_mon_resource_init() to initialise struct mon_evt and call
mbm_config_rftype_init().
resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
obviously benefit from being inlined. Putting it in core.c will allow
rdt_cpu_has() to eventually become static.

resctrl_arch_is_evt_configurable() uses rdt_cpu_has() from
resctrl_mon_resource_init(), which isn't marked __init. In addition,
MPAM needs to initialise resctrl late. Drop the __init on the relevant
functions.
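
A minimal model of this arrangement, assuming a simple capability struct
in place of the CPU feature checks: the filesystem asks one arch question
per event and keeps ownership of its own event structures. All _model
names are hypothetical.

```c
#include <stdbool.h>

enum cfg_evt_model {
	CFG_EVT_OCCUP_MODEL = 1,
	CFG_EVT_MBM_TOTAL_MODEL,
	CFG_EVT_MBM_LOCAL_MODEL,
};

/* Stand-in for the feature bits the arch code would query. */
struct cpu_caps_model {
	bool bmec;
	bool mbm_total;
	bool mbm_local;
};

static struct cpu_caps_model caps_model;

/* Filesystem-owned event descriptors; the arch code never sees these. */
struct mon_evt_model {
	const char *name;
	bool configurable;
};

static struct mon_evt_model total_evt_model = { "mbm_total_bytes", false };
static struct mon_evt_model local_evt_model = { "mbm_local_bytes", false };

/* Arch side: answers per event, false for anything it does not know. */
bool arch_is_evt_configurable_model(enum cfg_evt_model evt)
{
	if (!caps_model.bmec)
		return false;

	switch (evt) {
	case CFG_EVT_MBM_TOTAL_MODEL:
		return caps_model.mbm_total;
	case CFG_EVT_MBM_LOCAL_MODEL:
		return caps_model.mbm_local;
	default:
		return false;
	}
}

/* Filesystem side: updates its own structures from the arch answer. */
void fs_mon_cfg_init_model(void)
{
	total_evt_model.configurable =
		arch_is_evt_configurable_model(CFG_EVT_MBM_TOTAL_MODEL);
	local_evt_model.configurable =
		arch_is_evt_configurable_model(CFG_EVT_MBM_LOCAL_MODEL);
}
```

The default case matters for other architectures: an event they cannot
configure simply reports false without any extra plumbing.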

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/core.c     | 19 +++++++++++++++++--
 arch/x86/kernel/cpu/resctrl/internal.h |  4 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c  | 18 +++++++++---------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
 include/linux/resctrl.h                |  2 ++
 5 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 06521c124795..252201eefdd0 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -651,7 +651,7 @@ struct rdt_options {
 	bool	force_off, force_on;
 };
 
-static struct rdt_options rdt_options[]  __initdata = {
+static struct rdt_options rdt_options[]  __ro_after_init = {
 	RDT_OPT(RDT_FLAG_CMT,	    "cmt",	X86_FEATURE_CQM_OCCUP_LLC),
 	RDT_OPT(RDT_FLAG_MBM_TOTAL, "mbmtotal", X86_FEATURE_CQM_MBM_TOTAL),
 	RDT_OPT(RDT_FLAG_MBM_LOCAL, "mbmlocal", X86_FEATURE_CQM_MBM_LOCAL),
@@ -691,7 +691,7 @@ static int __init set_rdt_options(char *str)
 }
 __setup("rdt", set_rdt_options);
 
-bool __init rdt_cpu_has(int flag)
+bool rdt_cpu_has(int flag)
 {
 	bool ret = boot_cpu_has(flag);
 	struct rdt_options *o;
@@ -711,6 +711,21 @@ bool __init rdt_cpu_has(int flag)
 	return ret;
 }
 
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+{
+	if (!rdt_cpu_has(X86_FEATURE_BMEC))
+		return false;
+
+	switch (evt) {
+	case QOS_L3_MBM_TOTAL_EVENT_ID:
+		return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);
+	case QOS_L3_MBM_LOCAL_EVENT_ID:
+		return rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL);
+	default:
+		return false;
+	}
+}
+
 static __init bool get_mem_config(void)
 {
 	struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_MBA];
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f4f48542447f..06e80356cdbb 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -509,7 +509,7 @@ int alloc_rmid(u32 closid);
 void free_rmid(u32 closid, u32 rmid);
 int rdt_get_mon_l3_config(struct rdt_resource *r);
 void resctrl_mon_resource_exit(void);
-bool __init rdt_cpu_has(int flag);
+bool rdt_cpu_has(int flag);
 void mon_event_count(void *info);
 int rdtgroup_mondata_show(struct seq_file *m, void *arg);
 void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
@@ -529,7 +529,7 @@ bool has_busy_rmid(struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void __init thread_throttle_mode_init(void);
-void __init mbm_config_rftype_init(const char *config);
+void mbm_config_rftype_init(const char *config);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 66bd869f02a9..5906dccfb247 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1030,6 +1030,15 @@ int resctrl_mon_resource_init(void)
 
 	l3_mon_evt_init(r);
 
+	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
+		mbm_total_event.configurable = true;
+		mbm_config_rftype_init("mbm_total_bytes_config");
+	}
+	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID)) {
+		mbm_local_event.configurable = true;
+		mbm_config_rftype_init("mbm_local_bytes_config");
+	}
+
 	return 0;
 }
 
@@ -1071,15 +1080,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
 		hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
-
-		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
-			mbm_total_event.configurable = true;
-			mbm_config_rftype_init("mbm_total_bytes_config");
-		}
-		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
-			mbm_local_event.configurable = true;
-			mbm_config_rftype_init("mbm_local_bytes_config");
-		}
 	}
 
 	r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8c4ec7df12d3..da71f1d80be4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2066,7 +2066,7 @@ void __init thread_throttle_mode_init(void)
 	rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB;
 }
 
-void __init mbm_config_rftype_init(const char *config)
+void mbm_config_rftype_init(const char *config)
 {
 	struct rftype *rft;
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 02b745f9c4c4..fbae9a907544 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -262,6 +262,8 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
 struct rdt_domain *resctrl_arch_find_domain(struct rdt_resource *r, int id);
 int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+
 /*
  * Update the ctrl_val and apply this config right now.
  * Must be called on one of the domain's CPUs.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 20/38] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (18 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 19/38] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:49   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 21/38] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
                   ` (18 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

mon_event_config_{read,write}() are called via IPI and access model
specific registers to do their work.

To support another architecture, this needs abstracting.

Rename mon_event_config_{read,write}() to have a "resctrl_arch_"
prefix, and move their struct mon_config_info parameter into
<linux/resctrl.h>.  This allows another architecture to supply an
implementation of these.

As struct mon_config_info is now exposed globally, give it a 'resctrl_'
prefix. MPAM systems need access to the domain to do this work, so add
the resource and domain to struct resctrl_mon_config_info.
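
As an illustration only (not part of this patch), the calling pattern
can be sketched in user-space C. The struct layout mirrors the one added
to <linux/resctrl.h> below; everything prefixed with example_, the
fake_cfg_reg field, and the direct function call that stands in for
smp_call_function_any() are assumptions made so the sketch builds
outside the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; only the shape matters. */
struct rdt_resource { int rid; };
struct rdt_domain { int id; uint32_t fake_cfg_reg; };

/* Same members as the struct this patch adds to <linux/resctrl.h>. */
struct resctrl_mon_config_info {
	struct rdt_resource	*r;
	struct rdt_domain	*d;
	uint32_t		evtid;
	uint32_t		mon_config;
};

/*
 * Hypothetical non-x86 implementation: it reads per-domain state via
 * mon_info->d instead of an MSR indexed by the event id.
 */
void example_arch_mon_event_config_read(void *info)
{
	struct resctrl_mon_config_info *mon_info = info;

	mon_info->mon_config = mon_info->d->fake_cfg_reg;
}

/* Stand-in for smp_call_function_any(): runs the callback locally. */
static void example_call_on_domain(void (*fn)(void *), void *info)
{
	fn(info);
}

/* Caller side: fill in r, d and evtid, then hand one pointer across. */
uint32_t example_read_config(struct rdt_resource *r, struct rdt_domain *d,
			     uint32_t evtid)
{
	struct resctrl_mon_config_info mon_info = {0};

	mon_info.r = r;
	mon_info.d = d;
	mon_info.evtid = evtid;
	example_call_on_domain(example_arch_mon_event_config_read, &mon_info);

	return mon_info.mon_config;
}
```

Carrying the domain inside the info struct keeps the IPI callback
signature at a single void pointer, which is what lets the filesystem
code call it without knowing what the architecture needs.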

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * [Whitespace only] Re-tabbed struct resctrl_mon_config_info in
   <linux/resctrl.h> to fit the prevailing style.

   Non-functional change.

 * [Commit message only] Reword to align with the actual naming of the
   definitions and destination header file.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 34 +++++++++++++-------------
 include/linux/resctrl.h                |  9 +++++++
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index da71f1d80be4..45c9b8b76cca 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1578,11 +1578,6 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
 	return ret;
 }
 
-struct mon_config_info {
-	u32 evtid;
-	u32 mon_config;
-};
-
 #define INVALID_CONFIG_INDEX   UINT_MAX
 
 /**
@@ -1607,9 +1602,9 @@ static inline unsigned int mon_event_config_index_get(u32 evtid)
 	}
 }
 
-static void mon_event_config_read(void *info)
+void resctrl_arch_mon_event_config_read(void *info)
 {
-	struct mon_config_info *mon_info = info;
+	struct resctrl_mon_config_info *mon_info = info;
 	unsigned int index;
 	u64 msrval;
 
@@ -1624,14 +1619,15 @@ static void mon_event_config_read(void *info)
 	mon_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
 }
 
-static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
+static void mondata_config_read(struct resctrl_mon_config_info *mon_info)
 {
-	smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
+	smp_call_function_any(&mon_info->d->cpu_mask,
+			      resctrl_arch_mon_event_config_read, mon_info, 1);
 }
 
 static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
 {
-	struct mon_config_info mon_info = {0};
+	struct resctrl_mon_config_info mon_info = {0};
 	struct rdt_domain *dom;
 	bool sep = false;
 
@@ -1642,9 +1638,11 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
 		if (sep)
 			seq_puts(s, ";");
 
-		memset(&mon_info, 0, sizeof(struct mon_config_info));
+		memset(&mon_info, 0, sizeof(struct resctrl_mon_config_info));
+		mon_info.r = r;
+		mon_info.d = dom;
 		mon_info.evtid = evtid;
-		mondata_config_read(dom, &mon_info);
+		mondata_config_read(&mon_info);
 
 		seq_printf(s, "%d=0x%02x", dom->id, mon_info.mon_config);
 		sep = true;
@@ -1677,9 +1675,9 @@ static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
 	return 0;
 }
 
-static void mon_event_config_write(void *info)
+void resctrl_arch_mon_event_config_write(void *info)
 {
-	struct mon_config_info *mon_info = info;
+	struct resctrl_mon_config_info *mon_info = info;
 	unsigned int index;
 
 	index = mon_event_config_index_get(mon_info->evtid);
@@ -1693,14 +1691,16 @@ static void mon_event_config_write(void *info)
 static void mbm_config_write_domain(struct rdt_resource *r,
 				    struct rdt_domain *d, u32 evtid, u32 val)
 {
-	struct mon_config_info mon_info = {0};
+	struct resctrl_mon_config_info mon_info = {0};
 
 	/*
 	 * Read the current config value first. If both are the same then
 	 * no need to write it again.
 	 */
+	mon_info.r = r;
+	mon_info.d = d;
 	mon_info.evtid = evtid;
-	mondata_config_read(d, &mon_info);
+	mondata_config_read(&mon_info);
 	if (mon_info.mon_config == val)
 		return;
 
@@ -1712,7 +1712,7 @@ static void mbm_config_write_domain(struct rdt_resource *r,
 	 * are scoped at the domain level. Writing any of these MSRs
 	 * on one CPU is observed by all the CPUs in the domain.
 	 */
-	smp_call_function_any(&d->cpu_mask, mon_event_config_write,
+	smp_call_function_any(&d->cpu_mask, resctrl_arch_mon_event_config_write,
 			      &mon_info, 1);
 
 	/*
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index fbae9a907544..65f0a2d17e4b 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -220,6 +220,13 @@ struct resctrl_cpu_defaults {
 	u32 rmid;
 };
 
+struct resctrl_mon_config_info {
+	struct rdt_resource	*r;
+	struct rdt_domain	*d;
+	u32			evtid;
+	u32			mon_config;
+};
+
 /**
  * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
  *					 Call via IPI.
@@ -263,6 +270,8 @@ struct rdt_domain *resctrl_arch_find_domain(struct rdt_resource *r, int id);
 int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
 bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+void resctrl_arch_mon_event_config_write(void *info);
+void resctrl_arch_mon_event_config_read(void *info);
 
 /*
  * Update the ctrl_val and apply this config right now.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 21/38] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (19 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 20/38] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:53   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 22/38] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
                   ` (17 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

The mbm_cfg_mask field lists the bits that user-space can set when
configuring an event. This value is output via the last_cmd_status
file.

Once the filesystem parts of resctrl are moved to live in /fs/, the
struct rdt_hw_resource is inaccessible to the filesystem code. Because
this value is output to user-space, it has to be accessible to the
filesystem code.

Move it to struct rdt_resource.
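
For reference, the check that consumes this mask can be sketched in
user-space C. This is a simplified illustration, not the kernel code:
example_check_event_config() is a hypothetical name, and the 0/-1 return
convention is an assumption chosen for the sketch (the kernel path
returns -EINVAL and prints the mask).

```c
#include <stdint.h>

/*
 * Returns 0 when every bit set in val is also set in mbm_cfg_mask,
 * and -1 when val asks for a bandwidth source that is not supported.
 */
int example_check_event_config(uint32_t mbm_cfg_mask, uint32_t val)
{
	if ((val & mbm_cfg_mask) != val)
		return -1;

	return 0;
}
```

Because the rejected value is reported together with the mask, the mask
has to be reachable from whichever side owns the error message.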

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Change since v1:
 * Reword comments to avoid being overly arch-specific.
---
 arch/x86/kernel/cpu/resctrl/internal.h | 3 ---
 arch/x86/kernel/cpu/resctrl/monitor.c  | 2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++---
 include/linux/resctrl.h                | 3 +++
 4 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 06e80356cdbb..85a5c6e83fad 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -374,8 +374,6 @@ struct msr_param {
  * @msr_update:		Function pointer to update QOS MSRs
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
  * @mbm_width:		Monitor width, to detect and correct for overflow.
- * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
- *			Monitoring Event Configuration (BMEC) is supported.
  * @cdp_enabled:	CDP state of this resource
  *
  * Members of this structure are either private to the architecture
@@ -389,7 +387,6 @@ struct rdt_hw_resource {
 	void			(*msr_update)(struct msr_param *m);
 	unsigned int		mon_scale;
 	unsigned int		mbm_width;
-	unsigned int		mbm_cfg_mask;
 	bool			cdp_enabled;
 };
 
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 5906dccfb247..a09f5ed929d3 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1079,7 +1079,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
-		hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+		r->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
 	}
 
 	r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 45c9b8b76cca..0446d30db4da 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1729,7 +1729,6 @@ static void mbm_config_write_domain(struct rdt_resource *r,
 
 static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
 {
-	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	char *dom_str = NULL, *id_str;
 	unsigned long dom_id, val;
 	struct rdt_domain *d;
@@ -1756,9 +1755,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
 	}
 
 	/* Value from user cannot be more than the supported set of events */
-	if ((val & hw_res->mbm_cfg_mask) != val) {
+	if ((val & r->mbm_cfg_mask) != val) {
 		rdt_last_cmd_printf("Invalid event configuration: max valid mask is 0x%02x\n",
-				    hw_res->mbm_cfg_mask);
+				    r->mbm_cfg_mask);
 		return -EINVAL;
 	}
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 65f0a2d17e4b..b04744b00f6f 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -167,6 +167,8 @@ enum resctrl_schema_fmt {
  * @name:		Name to use in "schemata" file.
  * @schema_fmt:	Which format string and parser is used for this schema.
  * @evt_list:		List of monitoring events
+ * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
+ *			Monitoring Event Configuration (BMEC) is supported.
  * @cdp_capable:	Is the CDP feature available on this resource
  */
 struct rdt_resource {
@@ -181,6 +183,7 @@ struct rdt_resource {
 	char			*name;
 	enum resctrl_schema_fmt	schema_fmt;
 	struct list_head	evt_list;
+	unsigned int		mbm_cfg_mask;
 	bool			cdp_capable;
 };
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 22/38] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (20 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 21/38] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

resctrl's pseudo lock has some copy-to-cache and measurement
functions that are micro-architecture specific. pseudo_lock_fn()
is not at all portable. Label these 'resctrl_arch_' so they stay
under /arch/x86.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/include/asm/resctrl.h            |  5 ++++
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 36 ++++++++++++-----------
 2 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 50407e83d0ca..a88af68f9fe2 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -211,6 +211,11 @@ static inline void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid
 static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
 					     void *ctx) { };
 
+u64 resctrl_arch_get_prefetch_disable_bits(void);
+int resctrl_arch_pseudo_lock_fn(void *_rdtgrp);
+int resctrl_arch_measure_cycles_lat_fn(void *_plr);
+int resctrl_arch_measure_l2_residency(void *_plr);
+int resctrl_arch_measure_l3_residency(void *_plr);
 void resctrl_cpu_detect(struct cpuinfo_x86 *c);
 
 #else
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index ad20822bb64e..f5d20a040a3d 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -62,7 +62,8 @@ static const struct class pseudo_lock_class = {
 };
 
 /**
- * get_prefetch_disable_bits - prefetch disable bits of supported platforms
+ * resctrl_arch_get_prefetch_disable_bits - prefetch disable bits of supported
+ *                                          platforms
  * @void: It takes no parameters.
  *
  * Capture the list of platforms that have been validated to support
@@ -76,13 +77,13 @@ static const struct class pseudo_lock_class = {
  * in the SDM.
  *
  * When adding a platform here also add support for its cache events to
- * measure_cycles_perf_fn()
+ * resctrl_arch_measure_l*_residency()
  *
  * Return:
  * If platform is supported, the bits to disable hardware prefetchers, 0
  * if platform is not supported.
  */
-static u64 get_prefetch_disable_bits(void)
+u64 resctrl_arch_get_prefetch_disable_bits(void)
 {
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
 	    boot_cpu_data.x86 != 6)
@@ -410,7 +411,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
 }
 
 /**
- * pseudo_lock_fn - Load kernel memory into cache
+ * resctrl_arch_pseudo_lock_fn - Load kernel memory into cache
  * @_rdtgrp: resource group to which pseudo-lock region belongs
  *
  * This is the core pseudo-locking flow.
@@ -428,7 +429,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
  *
  * Return: 0. Waiter on waitqueue will be woken on completion.
  */
-static int pseudo_lock_fn(void *_rdtgrp)
+int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
 {
 	struct rdtgroup *rdtgrp = _rdtgrp;
 	struct pseudo_lock_region *plr = rdtgrp->plr;
@@ -714,7 +715,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
 	 * Not knowing the bits to disable prefetching implies that this
 	 * platform does not support Cache Pseudo-Locking.
 	 */
-	prefetch_disable_bits = get_prefetch_disable_bits();
+	prefetch_disable_bits = resctrl_arch_get_prefetch_disable_bits();
 	if (prefetch_disable_bits == 0) {
 		rdt_last_cmd_puts("Pseudo-locking not supported\n");
 		return -EINVAL;
@@ -879,7 +880,8 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 }
 
 /**
- * measure_cycles_lat_fn - Measure cycle latency to read pseudo-locked memory
+ * resctrl_arch_measure_cycles_lat_fn - Measure cycle latency to read
+ *                                      pseudo-locked memory
  * @_plr: pseudo-lock region to measure
  *
  * There is no deterministic way to test if a memory region is cached. One
@@ -892,7 +894,7 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
  *
  * Return: 0. Waiter on waitqueue will be woken on completion.
  */
-static int measure_cycles_lat_fn(void *_plr)
+int resctrl_arch_measure_cycles_lat_fn(void *_plr)
 {
 	struct pseudo_lock_region *plr = _plr;
 	u32 saved_low, saved_high;
@@ -1076,7 +1078,7 @@ static int measure_residency_fn(struct perf_event_attr *miss_attr,
 	return 0;
 }
 
-static int measure_l2_residency(void *_plr)
+int resctrl_arch_measure_l2_residency(void *_plr)
 {
 	struct pseudo_lock_region *plr = _plr;
 	struct residency_counts counts = {0};
@@ -1114,7 +1116,7 @@ static int measure_l2_residency(void *_plr)
 	return 0;
 }
 
-static int measure_l3_residency(void *_plr)
+int resctrl_arch_measure_l3_residency(void *_plr)
 {
 	struct pseudo_lock_region *plr = _plr;
 	struct residency_counts counts = {0};
@@ -1212,18 +1214,18 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
 	plr->cpu = cpu;
 
 	if (sel == 1)
-		thread = kthread_create_on_node(measure_cycles_lat_fn, plr,
-						cpu_to_node(cpu),
+		thread = kthread_create_on_node(resctrl_arch_measure_cycles_lat_fn,
+						plr, cpu_to_node(cpu),
 						"pseudo_lock_measure/%u",
 						cpu);
 	else if (sel == 2)
-		thread = kthread_create_on_node(measure_l2_residency, plr,
-						cpu_to_node(cpu),
+		thread = kthread_create_on_node(resctrl_arch_measure_l2_residency,
+						plr, cpu_to_node(cpu),
 						"pseudo_lock_measure/%u",
 						cpu);
 	else if (sel == 3)
-		thread = kthread_create_on_node(measure_l3_residency, plr,
-						cpu_to_node(cpu),
+		thread = kthread_create_on_node(resctrl_arch_measure_l3_residency,
+						plr, cpu_to_node(cpu),
 						"pseudo_lock_measure/%u",
 						cpu);
 	else
@@ -1322,7 +1324,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 
 	plr->thread_done = 0;
 
-	thread = kthread_create_on_node(pseudo_lock_fn, rdtgrp,
+	thread = kthread_create_on_node(resctrl_arch_pseudo_lock_fn, rdtgrp,
 					cpu_to_node(plr->cpu),
 					"pseudo_lock/%u", plr->cpu);
 	if (IS_ERR(thread)) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (21 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 22/38] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-07-11 21:33   ` Carl Worth
  2024-06-14 15:00 ` [PATCH v3 24/38] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
                   ` (15 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

Pseudo-lock relies on knowledge of the micro-architecture to disable
prefetchers etc.

On arm64 these controls are typically secure only, meaning Linux can't
access them. Arm's cache-lockdown feature works in a very different
way. Resctrl's pseudo-lock isn't going to be used on arm64 platforms.

Add a Kconfig symbol that can be selected by the architecture. This
enables or disables building of the pseudo_lock.c file, and replaces
the functions with stubs when it is disabled. An additional
IS_ENABLED() check is needed in rdtgroup_mode_write() so that
attempting to enable pseudo-lock reports an "Unknown or unsupported
mode" to user-space via the last_cmd_status file.
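
The combination of stubs and the extra check can be sketched in
user-space C. This is an illustration under assumptions, not the kernel
code: EXAMPLE_PSEUDO_LOCK_ENABLED is a hypothetical stand-in for
IS_ENABLED(CONFIG_RESCTRL_FS_PSEUDO_LOCK), and the example_ functions
are simplified to take no rdtgroup.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_RESCTRL_FS_PSEUDO_LOCK). */
#define EXAMPLE_PSEUDO_LOCK_ENABLED 0

#if EXAMPLE_PSEUDO_LOCK_ENABLED
int example_locksetup_enter(void);	/* real version built elsewhere */
#else
/* Stub keeps callers compiling when the feature is not built. */
static inline int example_locksetup_enter(void)
{
	return -EOPNOTSUPP;
}
#endif

int example_mode_write(const char *buf)
{
	if (!strcmp(buf, "shareable"))
		return 0;

	/* Without the first test the stub's -EOPNOTSUPP would leak out. */
	if (EXAMPLE_PSEUDO_LOCK_ENABLED && !strcmp(buf, "pseudo-locksetup"))
		return example_locksetup_enter();

	/* Stands in for the "Unknown or unsupported mode" path. */
	return -EINVAL;
}
```

With the symbol off, the compiler can discard the pseudo-locksetup
branch entirely, and the string is treated like any other unknown mode.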

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Clarified the commit message as to where the error string is printed.

Changes since v1:
 * [Commit message only] Typo fix:
   s/psuedo/pseudo/g
---
 arch/x86/Kconfig                       |  7 ++++
 arch/x86/kernel/cpu/resctrl/Makefile   |  5 +--
 arch/x86/kernel/cpu/resctrl/internal.h | 48 +++++++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  3 +-
 4 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1d7122a1883e..446984277b45 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -492,6 +492,7 @@ config X86_CPU_RESCTRL
 	depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
 	select KERNFS
 	select PROC_CPU_RESCTRL		if PROC_FS
+	select RESCTRL_FS_PSEUDO_LOCK
 	help
 	  Enable x86 CPU resource control support.
 
@@ -508,6 +509,12 @@ config X86_CPU_RESCTRL
 
 	  Say N if unsure.
 
+config RESCTRL_FS_PSEUDO_LOCK
+	bool
+	help
+	  Software mechanism to pin data in a cache portion using
+	  micro-architecture specific knowledge.
+
 config X86_FRED
 	bool "Flexible Return and Event Delivery"
 	depends on X86_64
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index 4a06c37b9cf1..0c13b0befd8a 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_X86_CPU_RESCTRL)	+= core.o rdtgroup.o monitor.o
-obj-$(CONFIG_X86_CPU_RESCTRL)	+= ctrlmondata.o pseudo_lock.o
+obj-$(CONFIG_X86_CPU_RESCTRL)		+= core.o rdtgroup.o monitor.o
+obj-$(CONFIG_X86_CPU_RESCTRL)		+= ctrlmondata.o
+obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK)	+= pseudo_lock.o
 CFLAGS_pseudo_lock.o = -I$(src)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 85a5c6e83fad..1c2132e67df3 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -491,14 +491,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_domain *d,
 				  unsigned long cbm);
 enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
 int rdtgroup_tasks_assigned(struct rdtgroup *r);
-int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
-int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
-bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm);
-bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d);
-int rdt_pseudo_lock_init(void);
-void rdt_pseudo_lock_release(void);
-int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
-void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
 struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
 int closids_supported(void);
 void closid_free(int closid);
@@ -531,4 +523,44 @@ void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
 
+#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
+int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
+int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
+bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm);
+bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d);
+int rdt_pseudo_lock_init(void);
+void rdt_pseudo_lock_release(void);
+int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
+void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
+#else
+static inline int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm)
+{
+	return false;
+}
+
+static inline bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
+{
+	return false;
+}
+
+static inline int rdt_pseudo_lock_init(void) { return 0; }
+static inline void rdt_pseudo_lock_release(void) { }
+static inline int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp) { }
+#endif /* CONFIG_RESCTRL_FS_PSEUDO_LOCK */
+
 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0446d30db4da..7957edcfc97d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1450,7 +1450,8 @@ static ssize_t rdtgroup_mode_write(struct kernfs_open_file *of,
 				goto out;
 		}
 		rdtgrp->mode = RDT_MODE_EXCLUSIVE;
-	} else if (!strcmp(buf, "pseudo-locksetup")) {
+	} else if (IS_ENABLED(CONFIG_RESCTRL_FS_PSEUDO_LOCK) &&
+		   !strcmp(buf, "pseudo-locksetup")) {
 		ret = rdtgroup_locksetup_enter(rdtgrp);
 		if (ret)
 			goto out;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 24/38] x86/resctrl: Make prefetch_disable_bits belong to the arch code
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (22 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 25/38] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a
value provided by the architecture, but is largely read by other
architecture helpers.

Instead of exporting this value, make
resctrl_arch_get_prefetch_disable_bits() set it so that the other
arch-code helpers can use the cached value.
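
The resulting pattern, where the getter both returns and caches the
value, can be sketched in user-space C. The example_ names and the
example_model enum are hypothetical; the real function switches on the
x86 model of the boot CPU rather than taking a parameter.

```c
#include <stdint.h>

/* Private to the (hypothetical) arch code; never exported. */
static uint64_t example_prefetch_disable_bits;

/* Hypothetical model IDs standing in for the x86 model checks. */
enum example_model {
	EXAMPLE_MODEL_A,
	EXAMPLE_MODEL_B,
	EXAMPLE_MODEL_UNKNOWN,
};

uint64_t example_get_prefetch_disable_bits(enum example_model model)
{
	/* Reset first so an unsupported platform never sees a stale value. */
	example_prefetch_disable_bits = 0;

	switch (model) {
	case EXAMPLE_MODEL_A:
		example_prefetch_disable_bits = 0xF;
		break;
	case EXAMPLE_MODEL_B:
		example_prefetch_disable_bits = 0x5;
		break;
	default:
		break;
	}

	return example_prefetch_disable_bits;
}

/* Another arch helper reads the cached value directly. */
uint64_t example_arch_helper_sees(void)
{
	return example_prefetch_disable_bits;
}
```

The caller only needs to test the return value against zero; it no
longer has to store anything on behalf of the architecture.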

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index f5d20a040a3d..cfd40ffe9b72 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -85,6 +85,8 @@ static const struct class pseudo_lock_class = {
  */
 u64 resctrl_arch_get_prefetch_disable_bits(void)
 {
+	prefetch_disable_bits = 0;
+
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
 	    boot_cpu_data.x86 != 6)
 		return 0;
@@ -100,7 +102,8 @@ u64 resctrl_arch_get_prefetch_disable_bits(void)
 		 * 3    DCU IP Prefetcher Disable (R/W)
 		 * 63:4 Reserved
 		 */
-		return 0xF;
+		prefetch_disable_bits = 0xF;
+		break;
 	case INTEL_ATOM_GOLDMONT:
 	case INTEL_ATOM_GOLDMONT_PLUS:
 		/*
@@ -111,10 +114,11 @@ u64 resctrl_arch_get_prefetch_disable_bits(void)
 		 * 2     DCU Hardware Prefetcher Disable (R/W)
 		 * 63:3  Reserved
 		 */
-		return 0x5;
+		prefetch_disable_bits = 0x5;
+		break;
 	}
 
-	return 0;
+	return prefetch_disable_bits;
 }
 
 /**
@@ -715,8 +719,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
 	 * Not knowing the bits to disable prefetching implies that this
 	 * platform does not support Cache Pseudo-Locking.
 	 */
-	prefetch_disable_bits = resctrl_arch_get_prefetch_disable_bits();
-	if (prefetch_disable_bits == 0) {
+	if (resctrl_arch_get_prefetch_disable_bits() == 0) {
 		rdt_last_cmd_puts("Pseudo-locking not supported\n");
 		return -EINVAL;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 25/38] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (23 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 24/38] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 26/38] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

resctrl_arch_pseudo_lock_fn() has architecture specific behaviour,
and takes a struct rdtgroup as an argument.

After the filesystem code moves to /fs/, the definition of struct
rdtgroup will not be available to the architecture code.

The only reason resctrl_arch_pseudo_lock_fn() wants the rdtgroup is
for the CLOSID. Embed that in the pseudo_lock_region as a closid,
and move the definition of struct pseudo_lock_region to resctrl.h.
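
A rough user-space sketch of the idea follows. The struct members are
trimmed to the two that matter here, all example_ names are
hypothetical, and the point at which the closid is copied is an
assumption made for illustration rather than a description of where
this patch does it.

```c
#include <stdint.h>

/* Trimmed, hypothetical version of the shared region descriptor. */
struct example_pseudo_lock_region {
	uint32_t	closid;
	int		cpu;
};

/* Filesystem-private group; the arch code never sees this type. */
struct example_rdtgroup {
	uint32_t				closid;
	struct example_pseudo_lock_region	*plr;
};

/* Filesystem side: copy the CLOSID before handing the region over. */
void example_prepare_plr(struct example_rdtgroup *rdtgrp)
{
	rdtgrp->plr->closid = rdtgrp->closid;
}

/* Arch side: works from the region alone and returns the CLOSID used. */
uint32_t example_arch_pseudo_lock_fn(void *_plr)
{
	struct example_pseudo_lock_region *plr = _plr;

	return plr->closid;
}
```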

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Change since v1:
 * [Commit message only] Typo fix:
   s/hw_closid/closid/g
---
 arch/x86/include/asm/resctrl.h            |  2 +-
 arch/x86/kernel/cpu/resctrl/internal.h    | 37 ---------------------
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 13 ++++----
 include/linux/resctrl.h                   | 39 +++++++++++++++++++++++
 4 files changed, 47 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index a88af68f9fe2..9940398e367e 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -212,7 +212,7 @@ static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
 					     void *ctx) { };
 
 u64 resctrl_arch_get_prefetch_disable_bits(void);
-int resctrl_arch_pseudo_lock_fn(void *_rdtgrp);
+int resctrl_arch_pseudo_lock_fn(void *_plr);
 int resctrl_arch_measure_cycles_lat_fn(void *_plr);
 int resctrl_arch_measure_l2_residency(void *_plr);
 int resctrl_arch_measure_l3_residency(void *_plr);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 1c2132e67df3..38d3aab9b684 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -183,43 +183,6 @@ struct mongroup {
 	u32			rmid;
 };
 
-/**
- * struct pseudo_lock_region - pseudo-lock region information
- * @s:			Resctrl schema for the resource to which this
- *			pseudo-locked region belongs
- * @d:			RDT domain to which this pseudo-locked region
- *			belongs
- * @cbm:		bitmask of the pseudo-locked region
- * @lock_thread_wq:	waitqueue used to wait on the pseudo-locking thread
- *			completion
- * @thread_done:	variable used by waitqueue to test if pseudo-locking
- *			thread completed
- * @cpu:		core associated with the cache on which the setup code
- *			will be run
- * @line_size:		size of the cache lines
- * @size:		size of pseudo-locked region in bytes
- * @kmem:		the kernel memory associated with pseudo-locked region
- * @minor:		minor number of character device associated with this
- *			region
- * @debugfs_dir:	pointer to this region's directory in the debugfs
- *			filesystem
- * @pm_reqs:		Power management QoS requests related to this region
- */
-struct pseudo_lock_region {
-	struct resctrl_schema	*s;
-	struct rdt_domain	*d;
-	u32			cbm;
-	wait_queue_head_t	lock_thread_wq;
-	int			thread_done;
-	int			cpu;
-	unsigned int		line_size;
-	unsigned int		size;
-	void			*kmem;
-	unsigned int		minor;
-	struct dentry		*debugfs_dir;
-	struct list_head	pm_reqs;
-};
-
 /**
  * struct rdtgroup - store rdtgroup's data in resctrl file system.
  * @kn:				kernfs node
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index cfd40ffe9b72..c096fa106b80 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -416,7 +416,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
 
 /**
  * resctrl_arch_pseudo_lock_fn - Load kernel memory into cache
- * @_rdtgrp: resource group to which pseudo-lock region belongs
+ * @_plr: the pseudo-lock region descriptor
  *
  * This is the core pseudo-locking flow.
  *
@@ -433,10 +433,9 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
  *
  * Return: 0. Waiter on waitqueue will be woken on completion.
  */
-int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
+int resctrl_arch_pseudo_lock_fn(void *_plr)
 {
-	struct rdtgroup *rdtgrp = _rdtgrp;
-	struct pseudo_lock_region *plr = rdtgrp->plr;
+	struct pseudo_lock_region *plr = _plr;
 	u32 rmid_p, closid_p;
 	unsigned long i;
 	u64 saved_msr;
@@ -496,7 +495,8 @@ int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
 	 * pseudo-locked followed by reading of kernel memory to load it
 	 * into the cache.
 	 */
-	__wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, rdtgrp->closid);
+	__wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, plr->closid);
+
 	/*
 	 * Cache was flushed earlier. Now access kernel memory to read it
 	 * into cache region associated with just activated plr->closid.
@@ -1327,7 +1327,8 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 
 	plr->thread_done = 0;
 
-	thread = kthread_create_on_node(resctrl_arch_pseudo_lock_fn, rdtgrp,
+	plr->closid = rdtgrp->closid;
+	thread = kthread_create_on_node(resctrl_arch_pseudo_lock_fn, plr,
 					cpu_to_node(plr->cpu),
 					"pseudo_lock/%u", plr->cpu);
 	if (IS_ERR(thread)) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index b04744b00f6f..0359746d45f5 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -39,6 +39,45 @@ enum resctrl_conf_type {
 
 #define CDP_NUM_TYPES	(CDP_DATA + 1)
 
+/*
+ * struct pseudo_lock_region - pseudo-lock region information
+ * @s:			Resctrl schema for the resource to which this
+ *			pseudo-locked region belongs
+ * @closid:		The closid that this pseudo-locked region uses
+ * @d:			RDT domain to which this pseudo-locked region
+ *			belongs
+ * @cbm:		bitmask of the pseudo-locked region
+ * @lock_thread_wq:	waitqueue used to wait on the pseudo-locking thread
+ *			completion
+ * @thread_done:	variable used by waitqueue to test if pseudo-locking
+ *			thread completed
+ * @cpu:		core associated with the cache on which the setup code
+ *			will be run
+ * @line_size:		size of the cache lines
+ * @size:		size of pseudo-locked region in bytes
+ * @kmem:		the kernel memory associated with pseudo-locked region
+ * @minor:		minor number of character device associated with this
+ *			region
+ * @debugfs_dir:	pointer to this region's directory in the debugfs
+ *			filesystem
+ * @pm_reqs:		Power management QoS requests related to this region
+ */
+struct pseudo_lock_region {
+	struct resctrl_schema	*s;
+	u32			closid;
+	struct rdt_domain	*d;
+	u32			cbm;
+	wait_queue_head_t	lock_thread_wq;
+	int			thread_done;
+	int			cpu;
+	unsigned int		line_size;
+	unsigned int		size;
+	void			*kmem;
+	unsigned int		minor;
+	struct dentry		*debugfs_dir;
+	struct list_head	pm_reqs;
+};
+
 /**
  * struct resctrl_staged_config - parsed configuration to be applied
  * @new_ctrl:		new ctrl value to be loaded
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 26/38] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (24 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 25/38] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 27/38] x86/resctrl: Move get_config_index() to a header James Morse
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

thread_throttle_mode_init() is called from the architecture specific code
to make the 'thread_throttle_mode' file visible. The architecture specific
code has already set the membw.throttle_mode in the rdt_resource.

This doesn't need to be specific to the architecture: the throttle_mode
can be used by resctrl to determine if the 'thread_throttle_mode' file
should be visible.

Call thread_throttle_mode_init() from resctrl_init(), and check the
membw.throttle_mode on the MBA resource. This avoids publishing an
extra function between the architecture and filesystem code.
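
The visibility decision can be sketched in user-space C as below. This is a
sketch under assumptions, not the kernel code: struct mba_resource and
throttle_mode_file_visible() are hypothetical names, and the enum is a local
stand-in that only reuses the THREAD_THROTTLE_* names seen in the diff.

```c
#include <stdbool.h>

/* Local stand-in for the throttle modes named in the diff. */
enum membw_throttle_mode {
	THREAD_THROTTLE_UNDEFINED = 0,
	THREAD_THROTTLE_MAX,
	THREAD_THROTTLE_PER_THREAD,
};

/* Hypothetical reduction of the MBA rdt_resource to the two fields used. */
struct mba_resource {
	bool alloc_capable;
	enum membw_throttle_mode throttle_mode;
};

/*
 * The file is shown only if the arch code marked MBA as alloc_capable
 * and picked a throttle mode. No arch callback is needed for this.
 */
static bool throttle_mode_file_visible(const struct mba_resource *r)
{
	if (!r->alloc_capable ||
	    r->throttle_mode == THREAD_THROTTLE_UNDEFINED)
		return false;

	return true;
}
```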

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/core.c     | 1 -
 arch/x86/kernel/cpu/resctrl/internal.h | 1 -
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 ++++++++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 252201eefdd0..d932e03f129f 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -222,7 +222,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
 		r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
 	else
 		r->membw.throttle_mode = THREAD_THROTTLE_MAX;
-	thread_throttle_mode_init();
 
 	r->alloc_capable = true;
 
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 38d3aab9b684..fc837d144894 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -480,7 +480,6 @@ void cqm_handle_limbo(struct work_struct *work);
 bool has_busy_rmid(struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
-void __init thread_throttle_mode_init(void);
 void mbm_config_rftype_init(const char *config);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7957edcfc97d..8066d0e51a73 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2055,10 +2055,15 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
 	return NULL;
 }
 
-void __init thread_throttle_mode_init(void)
+static void __init thread_throttle_mode_init(void)
 {
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 	struct rftype *rft;
 
+	if (!r->alloc_capable ||
+	    r->membw.throttle_mode == THREAD_THROTTLE_UNDEFINED)
+		return;
+
 	rft = rdtgroup_get_rftype_by_name("thread_throttle_mode");
 	if (!rft)
 		return;
@@ -4200,6 +4205,8 @@ int __init resctrl_init(void)
 
 	rdtgroup_setup_default();
 
+	thread_throttle_mode_init();
+
 	ret = resctrl_mon_resource_init();
 	if (ret)
 		return ret;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 27/38] x86/resctrl: Move get_config_index() to a header
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (25 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 26/38] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 28/38] x86/resctrl: Claim get_domain_from_cpu() for resctrl James Morse
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

get_config_index() is used by the architecture specific code to map a
CLOSID+type pair to an index in the configuration arrays.

MPAM needs to do this too, to preserve the ABI to user-space. There is
no reason to do it differently.

Move the helper to a header file.
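
As a worked example of the mapping, here is a user-space copy of the helper.
The arithmetic is taken from the diff below; config_index() and the local enum
are stand-ins so the snippet builds outside the kernel, and the enumerator
order is an assumption that does not affect the result.

```c
#include <stdint.h>

/* Local stand-in for enum resctrl_conf_type. */
enum resctrl_conf_type {
	CDP_NONE,
	CDP_CODE,
	CDP_DATA,
};

/*
 * With CDP enabled each CLOSID owns two adjacent slots in the
 * configuration array: the even slot for data, the odd slot for code.
 * Without CDP the CLOSID is used as the index directly.
 */
static inline uint32_t config_index(uint32_t closid,
				    enum resctrl_conf_type type)
{
	switch (type) {
	default:
	case CDP_NONE:
		return closid;
	case CDP_CODE:
		return closid * 2 + 1;
	case CDP_DATA:
		return closid * 2;
	}
}
```

For CLOSID 3 this gives index 3 without CDP, 6 for CDP_DATA and 7 for
CDP_CODE.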

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
 * Reindent resctrl_get_config_index() as per coding-style.rst rules.

 * Remove redundant parentheses from arithmetic in
   resctrl_get_config_index(), so as to match the original source
   version of this moved code.

   No functional change.
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++----------------
 include/linux/resctrl.h                   | 15 +++++++++++++++
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 2100560dda6a..130583035d27 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -283,25 +283,12 @@ static int parse_line(char *line, struct resctrl_schema *s,
 	return -EINVAL;
 }
 
-static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
-{
-	switch (type) {
-	default:
-	case CDP_NONE:
-		return closid;
-	case CDP_CODE:
-		return closid * 2 + 1;
-	case CDP_DATA:
-		return closid * 2;
-	}
-}
-
 int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_domain *d,
 			    u32 closid, enum resctrl_conf_type t, u32 cfg_val)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
-	u32 idx = get_config_index(closid, t);
+	u32 idx = resctrl_get_config_index(closid, t);
 	struct msr_param msr_param;
 
 	if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
@@ -338,7 +325,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
 			if (!cfg->have_new_ctrl)
 				continue;
 
-			idx = get_config_index(closid, t);
+			idx = resctrl_get_config_index(closid, t);
 			if (cfg->new_ctrl == hw_dom->ctrl_val[idx])
 				continue;
 			hw_dom->ctrl_val[idx] = cfg->new_ctrl;
@@ -458,7 +445,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
 			    u32 closid, enum resctrl_conf_type type)
 {
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
-	u32 idx = get_config_index(closid, type);
+	u32 idx = resctrl_get_config_index(closid, type);
 
 	return hw_dom->ctrl_val[idx];
 }
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0359746d45f5..e07d719ace33 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -315,6 +315,21 @@ bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
 void resctrl_arch_mon_event_config_write(void *info);
 void resctrl_arch_mon_event_config_read(void *info);
 
+/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
+static inline u32 resctrl_get_config_index(u32 closid,
+					   enum resctrl_conf_type type)
+{
+	switch (type) {
+	default:
+	case CDP_NONE:
+		return closid;
+	case CDP_CODE:
+		return closid * 2 + 1;
+	case CDP_DATA:
+		return closid * 2;
+	}
+}
+
 /*
  * Update the ctrl_val and apply this config right now.
  * Must be called on one of the domain's CPUs.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 28/38] x86/resctrl: Claim get_domain_from_cpu() for resctrl
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (26 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 27/38] x86/resctrl: Move get_config_index() to a header James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 29/38] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

get_domain_from_cpu() is a handy helper that both the arch code and
resctrl need to use. Rename it resctrl_get_domain_from_cpu() so it
gets moved out to /fs, and exported back to the arch code.
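
The lookup itself is a linear walk over the resource's domains. A user-space
sketch follows, with assumptions stated up front: struct domain and
domain_from_cpu() are hypothetical, a 64-bit word replaces the kernel's cpumask
and a singly linked list replaces list_head. The locking that the real helper
asserts with lockdep_assert_cpus_held() is not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rdt_domain. */
struct domain {
	uint64_t cpu_mask;	/* bit N set: CPU N belongs to this domain */
	struct domain *next;
};

/* Return the first domain whose mask contains this CPU, or NULL. */
static struct domain *domain_from_cpu(int cpu, struct domain *head)
{
	struct domain *d;

	/* The 64-bit mask in this sketch can only describe CPUs 0..63. */
	if (cpu < 0 || cpu >= 64)
		return NULL;

	for (d = head; d; d = d->next) {
		if (d->cpu_mask & (UINT64_C(1) << cpu))
			return d;
	}

	return NULL;
}
```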

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/core.c     | 15 ---------------
 arch/x86/kernel/cpu/resctrl/internal.h |  1 -
 arch/x86/kernel/cpu/resctrl/monitor.c  |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
 include/linux/resctrl.h                | 21 +++++++++++++++++++++
 5 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index d932e03f129f..258e0a945f87 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -343,21 +343,6 @@ static void cat_wrmsr(struct msr_param *m)
 		wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
 }
 
-struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
-{
-	struct rdt_domain *d;
-
-	lockdep_assert_cpus_held();
-
-	list_for_each_entry(d, &r->domains, list) {
-		/* Find the domain that contains this CPU */
-		if (cpumask_test_cpu(cpu, &d->cpu_mask))
-			return d;
-	}
-
-	return NULL;
-}
-
 u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
 {
 	return resctrl_to_arch_res(r)->num_closid;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index fc837d144894..bad103f20663 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -454,7 +454,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_domain *d,
 				  unsigned long cbm);
 enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
 int rdtgroup_tasks_assigned(struct rdtgroup *r);
-struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
 int closids_supported(void);
 void closid_free(int closid);
 int alloc_rmid(u32 closid);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index a09f5ed929d3..145bd05eafa5 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -687,7 +687,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	idx = resctrl_arch_rmid_idx_encode(closid, rmid);
 	pmbm_data = &dom_mbm->mbm_local[idx];
 
-	dom_mba = get_domain_from_cpu(smp_processor_id(), r_mba);
+	dom_mba = resctrl_get_domain_from_cpu(smp_processor_id(), r_mba);
 	if (!dom_mba) {
 		pr_warn_once("Failure to get domain for MBA update\n");
 		return;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8066d0e51a73..74edf83a3eec 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4171,7 +4171,7 @@ void resctrl_offline_cpu(unsigned int cpu)
 	if (!l3->mon_capable)
 		goto out_unlock;
 
-	d = get_domain_from_cpu(cpu, l3);
+	d = resctrl_get_domain_from_cpu(cpu, l3);
 	if (d) {
 		if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
 			cancel_delayed_work(&d->mbm_over);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index e07d719ace33..d67225f95ee1 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -2,6 +2,7 @@
 #ifndef _RESCTRL_H
 #define _RESCTRL_H
 
+#include <linux/cpu.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/pid.h>
@@ -330,6 +331,26 @@ static inline u32 resctrl_get_config_index(u32 closid,
 	}
 }
 
+/*
+ * Caller must hold the cpuhp read lock to prevent the struct rdt_domain being
+ * freed.
+ */
+static inline struct rdt_domain *
+resctrl_get_domain_from_cpu(int cpu, struct rdt_resource *r)
+{
+	struct rdt_domain *d;
+
+	lockdep_assert_cpus_held();
+
+	list_for_each_entry(d, &r->domains, list) {
+		/* Find the domain that contains this CPU */
+		if (cpumask_test_cpu(cpu, &d->cpu_mask))
+			return d;
+	}
+
+	return NULL;
+}
+
 /*
  * Update the ctrl_val and apply this config right now.
  * Must be called on one of the domain's CPUs.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 29/38] x86/resctrl: Describe resctrl's bitmap size assumptions
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (27 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 28/38] x86/resctrl: Claim get_domain_from_cpu() for resctrl James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 30/38] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

resctrl operates on configuration bitmaps and a bitmap of allocated
CLOSIDs. Both are stored in a u32.

MPAM supports configuration/portion bitmaps and PARTIDs larger
than will fit in a u32.

Add some preprocessor values that make it clear why MPAM clamps
some of these values. This will make it easier to find code related
to these values if this resctrl behaviour ever changes.
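
How an architecture might use these limits can be sketched as below. This is
an illustration only: the MPAM driver is not part of this series, so
clamp_num_closid(), clamp_cbm_len() and cbm_full_mask() are hypothetical
helpers, and only the two macro names and values come from this patch.

```c
#include <stdint.h>

#define RESCTRL_MAX_CLOSID	32
#define RESCTRL_MAX_CBM		32

/* Limit the number of hardware partition ids to what resctrl can track. */
static inline uint32_t clamp_num_closid(uint32_t hw_ids)
{
	return hw_ids < RESCTRL_MAX_CLOSID ? hw_ids : RESCTRL_MAX_CLOSID;
}

/* Limit the hardware bitmap width to what fits in resctrl's u32. */
static inline uint32_t clamp_cbm_len(uint32_t hw_bits)
{
	return hw_bits < RESCTRL_MAX_CBM ? hw_bits : RESCTRL_MAX_CBM;
}

/* Build an all-ones mask of cbm_len bits without shifting by 32. */
static inline uint32_t cbm_full_mask(uint32_t cbm_len)
{
	if (cbm_len >= 32)
		return UINT32_MAX;

	return (UINT32_C(1) << cbm_len) - 1;
}
```

The explicit check in cbm_full_mask() is there because shifting a 32-bit value
by 32 is undefined behaviour in C.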

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 include/linux/resctrl.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index d67225f95ee1..3fc5f760e041 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -26,6 +26,17 @@ int proc_resctrl_show(struct seq_file *m,
 /* max value for struct rdt_domain's mbps_val */
 #define MBA_MAX_MBPS   U32_MAX
 
+/*
+ * Resctrl uses a u32 as a closid bitmap. The maximum closid is 32.
+ */
+#define RESCTRL_MAX_CLOSID		32
+
+/*
+ * Resctrl uses u32 to hold the user-space config. The maximum bitmap size is
+ * 32.
+ */
+#define RESCTRL_MAX_CBM			32
+
 /**
  * enum resctrl_conf_type - The type of configuration.
  * @CDP_NONE:	No prioritisation, both code and data are controlled or monitored.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 30/38] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (28 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 29/38] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

resctrl_sched_in() loads the architecture specific CPU MSRs with the
CLOSID and RMID values. This function was named before resctrl was
split to have architecture specific code, and generic filesystem code.

This function is obviously architecture specific, but does not begin
with 'resctrl_arch_', making it the odd one out in the functions an
architecture needs to support to enable resctrl.

Rename it for consistency. This is purely cosmetic.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/include/asm/resctrl.h         |  4 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 ++++++------
 arch/x86/kernel/process_32.c           |  2 +-
 arch/x86/kernel/process_64.c           |  2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 9940398e367e..491342f56811 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -177,7 +177,7 @@ static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored,
 	return READ_ONCE(tsk->rmid) == rmid;
 }
 
-static inline void resctrl_sched_in(struct task_struct *tsk)
+static inline void resctrl_arch_sched_in(struct task_struct *tsk)
 {
 	if (static_branch_likely(&rdt_enable_key))
 		__resctrl_sched_in(tsk);
@@ -220,7 +220,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c);
 
 #else
 
-static inline void resctrl_sched_in(struct task_struct *tsk) {}
+static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
 static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
 
 #endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 74edf83a3eec..403118fdabd4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -359,7 +359,7 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 }
 
 /*
- * This is safe against resctrl_sched_in() called from __switch_to()
+ * This is safe against resctrl_arch_sched_in() called from __switch_to()
  * because __switch_to() is executed with interrupts disabled. A local call
  * from update_closid_rmid() is protected against __switch_to() because
  * preemption is disabled.
@@ -378,7 +378,7 @@ void resctrl_arch_sync_cpu_closid_rmid(void *info)
 	 * executing task might have its own closid selected. Just reuse
 	 * the context switch code.
 	 */
-	resctrl_sched_in(current);
+	resctrl_arch_sched_in(current);
 }
 
 /*
@@ -603,7 +603,7 @@ static void _update_task_closid_rmid(void *task)
 	 * Otherwise, the MSR is updated when the task is scheduled in.
 	 */
 	if (task == current)
-		resctrl_sched_in(task);
+		resctrl_arch_sched_in(task);
 }
 
 static void update_task_closid_rmid(struct task_struct *t)
@@ -661,7 +661,7 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 	 * Ensure the task's closid and rmid are written before determining if
 	 * the task is current that will decide if it will be interrupted.
 	 * This pairs with the full barrier between the rq->curr update and
-	 * resctrl_sched_in() during context switch.
+	 * resctrl_arch_sched_in() during context switch.
 	 */
 	smp_mb();
 
@@ -2949,8 +2949,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
 			/*
 			 * Order the closid/rmid stores above before the loads
 			 * in task_curr(). This pairs with the full barrier
-			 * between the rq->curr update and resctrl_sched_in()
-			 * during context switch.
+			 * between the rq->curr update and
+			 * resctrl_arch_sched_in() during context switch.
 			 */
 			smp_mb();
 
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0917c7f25720..8697b02dabf1 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -211,7 +211,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	switch_fpu_finish(next_p);
 
 	/* Load the Intel cache allocation PQR MSR. */
-	resctrl_sched_in(next_p);
+	resctrl_arch_sched_in(next_p);
 
 	return prev_p;
 }
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6d3d20e3e43a..162b11b824ee 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -707,7 +707,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	}
 
 	/* Load the Intel cache allocation PQR MSR. */
-	resctrl_sched_in(next_p);
+	resctrl_arch_sched_in(next_p);
 
 	return prev_p;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (29 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 30/38] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:53   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 32/38] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
                   ` (7 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
resctrl can't be built as a module, and the kernfs helpers are not exported
so this is unlikely to change. MPAM has an error interrupt which indicates
the MPAM driver has gone haywire. Should this occur, tasks could run with
the wrong control values, leading to bad performance for important tasks.
The MPAM driver needs a way to tell resctrl that no further configuration
should be attempted.

Using resctrl_exit() for this leaves the system in a funny state as
resctrl is still mounted, but cannot be un-mounted because the sysfs
directory that is typically used has been removed. Dave Martin suggests
this may cause systemd trouble in the future as not all filesystems
can be unmounted.

Add calls to remove all the files and directories in resctrl, and
remove the sysfs_remove_mount_point() call that leaves the system
in a funny state. When triggered, this causes all the resctrl files
to disappear. resctrl can be unmounted, but not mounted again.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 403118fdabd4..ddf3d0a26517 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4255,9 +4255,9 @@ int __init resctrl_init(void)
 
 void __exit resctrl_exit(void)
 {
+	rdtgroup_destroy_root();
 	debugfs_remove_recursive(debugfs_resctrl);
 	unregister_filesystem(&rdt_fs_type);
-	sysfs_remove_mount_point(fs_kobj, "resctrl");
 
 	resctrl_mon_resource_exit();
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 32/38] x86/resctrl: Drop __init/__exit on assorted symbols
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (30 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 33/38] x86/resctrl: Move is_mba_sc() out of core.c James Morse
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

Because ARM's MPAM controls are probed using MMIO, resctrl can't be
initialised until enough CPUs are online to have determined the
system-wide supported num_closid. Arm64 also supports 'late onlined
secondaries', where only a subset of CPUs are online during boot.

These two combine to mean the MPAM driver may not be able to initialise
resctrl until user-space has brought 'enough' CPUs online.

To allow MPAM to initialise resctrl after __init text has been freed,
remove all the __init markings from resctrl.

The existing __exit markings cause these functions to be removed by the
linker as it has never been possible to build resctrl as a module. MPAM
has an error interrupt which causes the driver to reset and disable
itself. Remove the __exit markings to allow the MPAM driver to tear down
resctrl when an error occurs.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++----
 include/linux/resctrl.h                | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ddf3d0a26517..0936a4cddc9e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2055,7 +2055,7 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
 	return NULL;
 }
 
-static void __init thread_throttle_mode_init(void)
+static void thread_throttle_mode_init(void)
 {
 	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
 	struct rftype *rft;
@@ -4003,7 +4003,7 @@ static void rdtgroup_destroy_root(void)
 	rdtgroup_default.kn = NULL;
 }
 
-static void __init rdtgroup_setup_default(void)
+static void rdtgroup_setup_default(void)
 {
 	mutex_lock(&rdtgroup_mutex);
 
@@ -4196,7 +4196,7 @@ void resctrl_offline_cpu(unsigned int cpu)
  *
  * Return: 0 on success or -errno
  */
-int __init resctrl_init(void)
+int resctrl_init(void)
 {
 	int ret = 0;
 
@@ -4253,7 +4253,7 @@ int __init resctrl_init(void)
 	return ret;
 }
 
-void __exit resctrl_exit(void)
+void resctrl_exit(void)
 {
 	rdtgroup_destroy_root();
 	debugfs_remove_recursive(debugfs_resctrl);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 3fc5f760e041..31ae6d9a224e 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -452,7 +452,7 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d);
 extern unsigned int resctrl_rmid_realloc_threshold;
 extern unsigned int resctrl_rmid_realloc_limit;
 
-int __init resctrl_init(void);
-void __exit resctrl_exit(void);
+int resctrl_init(void);
+void resctrl_exit(void);
 
 #endif /* _RESCTRL_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 33/38] x86/resctrl: Move is_mba_sc() out of core.c
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (31 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 32/38] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 34/38] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

is_mba_sc() is defined in core.c, but has no callers there. It does
not access any architecture private structures.

Move this to rdtgroup.c where the majority of callers are. This makes
the move of the filesystem code to /fs/ cleaner.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
---
 arch/x86/kernel/cpu/resctrl/core.c     | 15 ---------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 258e0a945f87..cb8119d58c30 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -160,21 +160,6 @@ static inline void cache_alloc_hsw_probe(void)
 	rdt_alloc_capable = true;
 }
 
-bool is_mba_sc(struct rdt_resource *r)
-{
-	if (!r)
-		r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
-
-	/*
-	 * The software controller support is only applicable to MBA resource.
-	 * Make sure to check for resource type.
-	 */
-	if (r->rid != RDT_RESOURCE_MBA)
-		return false;
-
-	return r->membw.mba_sc;
-}
-
 /*
  * rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
  * exposed to user interface and the h/w understandable delay values.
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0936a4cddc9e..f13b5b0404e4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1500,6 +1500,21 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
 	return size;
 }
 
+bool is_mba_sc(struct rdt_resource *r)
+{
+	if (!r)
+		r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
+
+	/*
+	 * The software controller support is only applicable to MBA resource.
+	 * Make sure to check for resource type.
+	 */
+	if (r->rid != RDT_RESOURCE_MBA)
+		return false;
+
+	return r->membw.mba_sc;
+}
+
 /*
  * rdtgroup_size_show - Display size in bytes of allocated regions
  *
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 34/38] x86/resctrl: Add end-marker to the resctrl_event_id enum
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (32 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 33/38] x86/resctrl: Move is_mba_sc() out of core.c James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 35/38] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

The resctrl_event_id enum gives names to the counter event numbers on x86.
These are used directly by resctrl.

To allow the MPAM driver to keep an array of these, the size of the enum
needs to be known.

Add a QOS_NUM_EVENTS define which can be used to size an array. This isn't
a member of the enum to avoid updating switch statements that would
otherwise be missing a case.
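
As an illustration only (not part of the patch), the sketch below shows how
a define derived from the last enumerator can size an array indexed by event
ID, while a switch over the enum still covers every enumerator. The enum
values match include/linux/resctrl_types.h. demo_event_count,
demo_event_add() and demo_event_name() are hypothetical names invented for
this example; they are not resctrl or MPAM driver symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Event IDs as in include/linux/resctrl_types.h (x86 event numbers). */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

/* The define added by the patch: one more than the largest event ID. */
#define QOS_NUM_EVENTS		(QOS_L3_MBM_LOCAL_EVENT_ID + 1)

/*
 * Hypothetical per-event state, indexed directly by event ID.
 * Event IDs start at 1, so slot 0 is simply unused.
 */
static uint64_t demo_event_count[QOS_NUM_EVENTS];

/* Hypothetical helper: accumulate a delta for one event. */
static void demo_event_add(enum resctrl_event_id evt, uint64_t delta)
{
	assert(evt < QOS_NUM_EVENTS);
	demo_event_count[evt] += delta;
}

/*
 * Hypothetical helper: every enumerator has a case and there is no
 * 'default', so nothing needs to change here when the size is a define.
 */
static const char *demo_event_name(enum resctrl_event_id evt)
{
	switch (evt) {
	case QOS_L3_OCCUP_EVENT_ID:
		return "llc_occupancy";
	case QOS_L3_MBM_TOTAL_EVENT_ID:
		return "mbm_total_bytes";
	case QOS_L3_MBM_LOCAL_EVENT_ID:
		return "mbm_local_bytes";
	}
	return "unknown";
}
```

Had the end marker been an enumerator instead, a switch like the one in
demo_event_name() would be reported by compilers that warn about unhandled
enumerators (for example with -Wswitch) until a case was added for it.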

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
---
 include/linux/resctrl_types.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 51c51a1aabfb..70226f5ab3e3 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -51,4 +51,6 @@ enum resctrl_event_id {
 	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
 };
 
+#define QOS_NUM_EVENTS		(QOS_L3_MBM_LOCAL_EVENT_ID + 1)
+
 #endif /* __LINUX_RESCTRL_TYPES_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 35/38] x86/resctrl: Remove a newline to avoid confusing the code move script
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (33 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 34/38] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-14 15:00 ` [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code James Morse
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

The resctrl filesystem code will shortly be moved to /fs/. This involves
splitting all the existing files, with some functions remaining under
arch/x86, and others moving to fs/resctrl.

To make this reproducible, a python script does the heavy lif^W
copy-and-paste. This involves some clunky parsing of C code.

The parser gets confused by the newline after this #ifdef.
Just remove it.

Signed-off-by: James Morse <james.morse@arm.com>
---
This patch, and the post-move cleanup could all be merged together.
It's included like this to make the move script easier to work with.

Changes since v2:
 * This patch is new.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f13b5b0404e4..969c454b67f1 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -880,7 +880,6 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
 }
 
 #ifdef CONFIG_PROC_CPU_RESCTRL
-
 /*
  * A task can only be part of one resctrl control group and of one monitor
  * group which is associated to that control group.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (34 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 35/38] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 16:54   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 37/38] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
                   ` (2 subsequent siblings)
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
for the common parts of the resctrl interface and make X86_CPU_RESCTRL
depend on this.

Adding an include of asm/resctrl.h to linux/resctrl.h allows some
of the files to switch over to using this header instead.

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Dropped KERNFS dependency from arch side Kconfig.
 * Added empty trace.h file.
 * Merged asm->linux includes from Dave's patch to decouple those
   patches from this series.

Changes since v1:
 * Rename new file psuedo_lock.c to pseudo_lock.c, to match the name
   of the original file (and to be less surprising).

 * [Whitespace only] Under RESCTRL_FS in fs/resctrl/Kconfig, delete
   alignment space in orphaned select ... if (which has nothing to line
   up with any more).

 * [Whitespace only] Reflow and re-tab Kconfig additions.
---
 MAINTAINERS                               |  1 +
 arch/Kconfig                              |  8 +++++
 arch/x86/Kconfig                          |  5 ++--
 arch/x86/kernel/cpu/resctrl/internal.h    |  3 +-
 arch/x86/kernel/cpu/resctrl/monitor.c     |  2 +-
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
 fs/Kconfig                                |  1 +
 fs/Makefile                               |  1 +
 fs/resctrl/Kconfig                        | 36 +++++++++++++++++++++++
 fs/resctrl/Makefile                       |  3 ++
 fs/resctrl/ctrlmondata.c                  |  0
 fs/resctrl/internal.h                     |  0
 fs/resctrl/monitor.c                      |  0
 fs/resctrl/pseudo_lock.c                  |  0
 fs/resctrl/rdtgroup.c                     |  0
 fs/resctrl/trace.h                        |  0
 include/linux/resctrl.h                   |  4 +++
 18 files changed, 61 insertions(+), 7 deletions(-)
 create mode 100644 fs/resctrl/Kconfig
 create mode 100644 fs/resctrl/Makefile
 create mode 100644 fs/resctrl/ctrlmondata.c
 create mode 100644 fs/resctrl/internal.h
 create mode 100644 fs/resctrl/monitor.c
 create mode 100644 fs/resctrl/pseudo_lock.c
 create mode 100644 fs/resctrl/rdtgroup.c
 create mode 100644 fs/resctrl/trace.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 441b039068d8..64195c298baf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18859,6 +18859,7 @@ S:	Supported
 F:	Documentation/arch/x86/resctrl*
 F:	arch/x86/include/asm/resctrl.h
 F:	arch/x86/kernel/cpu/resctrl/
+F:	fs/resctrl/
 F:	include/linux/resctrl*.h
 F:	tools/testing/selftests/resctrl/
 
diff --git a/arch/Kconfig b/arch/Kconfig
index 975dd22a2dbd..4156604dd926 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1431,6 +1431,14 @@ config STRICT_MODULE_RWX
 config ARCH_HAS_PHYS_TO_DMA
 	bool
 
+config ARCH_HAS_CPU_RESCTRL
+	bool
+	help
+	  An architecture selects this option to indicate that the necessary
+	  hooks are provided to support the common memory system usage
+	  monitoring and control interfaces provided by the 'resctrl'
+	  filesystem (see RESCTRL_FS).
+
 config HAVE_ARCH_COMPILER_H
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 446984277b45..e4dd4097e10f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -490,8 +490,9 @@ config X86_MPPARSE
 config X86_CPU_RESCTRL
 	bool "x86 CPU resource control support"
 	depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
-	select KERNFS
-	select PROC_CPU_RESCTRL		if PROC_FS
+	depends on MISC_FILESYSTEMS
+	select ARCH_HAS_CPU_RESCTRL
+	select RESCTRL_FS
 	select RESCTRL_FS_PSEUDO_LOCK
 	help
 	  Enable x86 CPU resource control support.
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index bad103f20663..6f6785a31efe 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,10 +7,9 @@
 #include <linux/kernfs.h>
 #include <linux/fs_context.h>
 #include <linux/jump_label.h>
+#include <linux/resctrl.h>
 #include <linux/tick.h>
 
-#include <asm/resctrl.h>
-
 #define L3_QOS_CDP_ENABLE		0x01ULL
 
 #define L2_QOS_CDP_ENABLE		0x01ULL
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 145bd05eafa5..a1539edb25fa 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -17,11 +17,11 @@
 
 #include <linux/cpu.h>
 #include <linux/module.h>
+#include <linux/resctrl.h>
 #include <linux/sizes.h>
 #include <linux/slab.h>
 
 #include <asm/cpu_device_id.h>
-#include <asm/resctrl.h>
 
 #include "internal.h"
 #include "trace.h"
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index c096fa106b80..97e901009c91 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -19,12 +19,12 @@
 #include <linux/mman.h>
 #include <linux/perf_event.h>
 #include <linux/pm_qos.h>
+#include <linux/resctrl.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 
 #include <asm/cacheflush.h>
 #include <asm/cpu_device_id.h>
-#include <asm/resctrl.h>
 #include <asm/perf_event.h>
 
 #include "../../events/perf_event.h" /* For X86_CONFIG() */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 969c454b67f1..c7cbd30ac0f2 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -19,6 +19,7 @@
 #include <linux/fs_parser.h>
 #include <linux/sysfs.h>
 #include <linux/kernfs.h>
+#include <linux/resctrl.h>
 #include <linux/seq_buf.h>
 #include <linux/seq_file.h>
 #include <linux/sched/signal.h>
@@ -29,7 +30,6 @@
 
 #include <uapi/linux/magic.h>
 
-#include <asm/resctrl.h>
 #include "internal.h"
 
 DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
diff --git a/fs/Kconfig b/fs/Kconfig
index a46b0cbc4d8f..d8a36383b6dc 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -331,6 +331,7 @@ source "fs/omfs/Kconfig"
 source "fs/hpfs/Kconfig"
 source "fs/qnx4/Kconfig"
 source "fs/qnx6/Kconfig"
+source "fs/resctrl/Kconfig"
 source "fs/romfs/Kconfig"
 source "fs/pstore/Kconfig"
 source "fs/sysv/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 6ecc9b0a53f2..da6e2d028722 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -129,3 +129,4 @@ obj-$(CONFIG_EFIVAR_FS)		+= efivarfs/
 obj-$(CONFIG_EROFS_FS)		+= erofs/
 obj-$(CONFIG_VBOXSF_FS)		+= vboxsf/
 obj-$(CONFIG_ZONEFS_FS)		+= zonefs/
+obj-$(CONFIG_RESCTRL_FS)	+= resctrl/
diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
new file mode 100644
index 000000000000..a5fbda54d32f
--- /dev/null
+++ b/fs/resctrl/Kconfig
@@ -0,0 +1,36 @@
+config RESCTRL_FS
+	bool "CPU Resource Control Filesystem (resctrl)"
+	depends on ARCH_HAS_CPU_RESCTRL
+	select KERNFS
+	select PROC_CPU_RESCTRL if PROC_FS
+	help
+	  Some architectures provide hardware facilities to group tasks and
+	  monitor and control their usage of memory system resources such as
+	  caches and memory bandwidth.  Examples of such facilities include
+	  Intel's Resource Director Technology (Intel(R) RDT) and AMD's
+	  Platform Quality of Service (AMD QoS).
+
+	  If your system has the necessary support and you want to be able to
+	  assign tasks to groups and manipulate the associated resource
+	  monitors and controls from userspace, say Y here to get a mountable
+	  'resctrl' filesystem that lets you do just that.
+
+	  If nothing mounts or prods the 'resctrl' filesystem, resource
+	  controls and monitors are left in a quiescent, permissive state.
+
+	  If unsure, it is safe to say N.
+
+	  See <file:Documentation/arch/x86/resctrl.rst> for more information.
+
+config RESCTRL_FS_PSEUDO_LOCK
+	bool
+	help
+	  Software mechanism to pin data in a cache portion using
+	  micro-architecture specific knowledge.
+
+config RESCTRL_RMID_DEPENDS_ON_CLOSID
+	bool
+	help
+	  Enable by the architecture when the RMID values depend on the CLOSID.
+	  This causes the closid allocator to search for CLOSID with clean
+	  RMID.
diff --git a/fs/resctrl/Makefile b/fs/resctrl/Makefile
new file mode 100644
index 000000000000..ee8c4463317a
--- /dev/null
+++ b/fs/resctrl/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_RESCTRL_FS)		+= rdtgroup.o ctrlmondata.o monitor.o
+obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK)	+= pseudo_lock.o
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/trace.h b/fs/resctrl/trace.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 31ae6d9a224e..268e17276412 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -8,6 +8,10 @@
 #include <linux/pid.h>
 #include <linux/resctrl_types.h>
 
+#ifdef CONFIG_ARCH_HAS_CPU_RESCTRL
+#include <asm/resctrl.h>
+#endif
+
 /* CLOSID, RMID value used by the default control group */
 #define RESCTRL_RESERVED_CLOSID		0
 #define RESCTRL_RESERVED_RMID		0
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 37/38] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (35 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-06-28 17:04   ` Reinette Chatre
  2024-06-14 15:00 ` [PATCH v3 38/38] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
  2024-07-11 22:00 ` [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem " Carl Worth
  38 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Dave Martin, Shaopeng Tan

Once the filesystem parts of resctrl move to fs/resctrl, they cannot rely
on definitions in x86's internal.h.

Move definitions in internal.h that need to be shared between the
filesystem and architecture code to header files that fs/resctrl can
include.

Doing this separately means the filesystem code only moves between files
of the same name, instead of having these changes mixed in too.

Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
 * Dropped the rfflags and some other defines from being moved.

Changes since v1:
 * Revert apparently unintentional duplication of a couple of variable
   declarations in <linux/resctrl.h>.

   No functional change.
---
 arch/x86/include/asm/resctrl.h         | 3 +++
 arch/x86/kernel/cpu/resctrl/core.c     | 5 +++++
 arch/x86/kernel/cpu/resctrl/internal.h | 9 ---------
 include/linux/resctrl_types.h          | 3 +++
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 491342f56811..746431c66fc4 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -218,6 +218,9 @@ int resctrl_arch_measure_l2_residency(void *_plr);
 int resctrl_arch_measure_l3_residency(void *_plr);
 void resctrl_cpu_detect(struct cpuinfo_x86 *c);
 
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l);
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+
 #else
 
 static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index cb8119d58c30..b8e4a7a2ebb8 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -283,6 +283,11 @@ static void rdt_get_cdp_l2_config(void)
 	rdt_get_cdp_config(RDT_RESOURCE_L2);
 }
 
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
+{
+	return rdt_resources_all[l].cdp_enabled;
+}
+
 static void mba_wrmsr_amd(struct msr_param *m)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(m->res);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6f6785a31efe..40704445ac38 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -17,8 +17,6 @@
 #define CQM_LIMBOCHECK_INTERVAL	1000
 
 #define MBM_CNTR_WIDTH_BASE		24
-#define MBM_OVERFLOW_INTERVAL		1000
-#define MAX_MBA_BW			100u
 #define MBA_IS_LINEAR			0x4
 #define MBM_CNTR_WIDTH_OFFSET_AMD	20
 
@@ -371,13 +369,6 @@ static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
 	return &hw_res->r_resctrl;
 }
 
-static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
-{
-	return rdt_resources_all[l].cdp_enabled;
-}
-
-int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
-
 /*
  * To return the common struct rdt_resource, which is contained in struct
  * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 70226f5ab3e3..b84a6e0834a7 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -7,6 +7,9 @@
 #ifndef __LINUX_RESCTRL_TYPES_H
 #define __LINUX_RESCTRL_TYPES_H
 
+#define MAX_MBA_BW			100u
+#define MBM_OVERFLOW_INTERVAL		1000
+
 /* Reads to Local DRAM Memory */
 #define READS_TO_LOCAL_MEM		BIT(0)
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v3 38/38] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (36 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 37/38] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
@ 2024-06-14 15:00 ` James Morse
  2024-07-11 22:00 ` [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem " Carl Worth
  38 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-06-14 15:00 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

To support more than one architecture, resctrl needs to move from arch/x86
to live under fs. Moving all the code breaks any series on the mailing
list, so it needs scheduling carefully.

Maintaining the patch that moves all this code has proved labour intensive.
It's also near-impossible to review that no inadvertent changes have
crept in.

To solve these problems, temporarily add a hacky python program that
lists all the functions that should move, and those that should stay.

No attempt to parse C code is made; this thing tries to name 'blocks'
based on heuristics about the kernel coding style. It's fragile, but
good enough for its single use here.

This causes the original files to be regenerated, which will add
newlines that are not present in the original file.

I don't suggest this gets merged.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * This patch is new.
---
 resctrl_copy_pasta.py | 766 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 766 insertions(+)
 create mode 100644 resctrl_copy_pasta.py

diff --git a/resctrl_copy_pasta.py b/resctrl_copy_pasta.py
new file mode 100644
index 000000000000..5e6089b9c57b
--- /dev/null
+++ b/resctrl_copy_pasta.py
@@ -0,0 +1,766 @@
+#!/usr/bin/python
+import sys;
+import os;
+import re;
+
+############
+
+SRC_DIR = "arch/x86/kernel/cpu/resctrl";
+DST_DIR = "fs/resctrl";
+
+resctrl_files = [
+	"ctrlmondata.c",
+	"internal.h",
+	"monitor.c",
+	"pseudo_lock.c",
+	"rdtgroup.c",
+	"trace.h",
+];
+
+functions_to_keep = [
+    # common
+    "pr_fmt",
+
+	# core.c
+	"domain_list_lock",
+	"resctrl_arch_late_init",
+	"resctrl_arch_exit",
+	"resctrl_cpu_detect",
+	"rdt_cpu_has",
+	"resctrl_arch_is_evt_configurable",
+	"get_mem_config",
+	"get_slow_mem_config",
+	"get_rdt_alloc_resources",
+	"get_rdt_mon_resources",
+	"__check_quirks_intel",
+	"check_quirks",
+	"get_rdt_resources",
+	"rdt_init_res_defs_intel",
+	"rdt_init_res_defs_amd",
+	"rdt_init_res_defs",
+	"resctrl_cpu_detect",
+	"resctrl_arch_late_init",
+	"resctrl_arch_exit",
+	"setup_default_ctrlval",
+	"domain_free",
+	"domain_setup_ctrlval",
+	"arch_domain_mbm_alloc",
+	"domain_add_cpu",
+	"domain_remove_cpu",
+	"clear_closid_rmid",
+	"resctrl_arch_online_cpu",
+	"resctrl_arch_offline_cpu",
+	"resctrl_arch_find_domain",
+	"resctrl_arch_get_num_closid",
+	"rdt_ctrl_update",
+	"domain_init",
+	"resctrl_arch_get_resource",
+	"cache_alloc_hsw_probe",
+	"rdt_get_mb_table",
+	"__get_mem_config_intel",
+	"__rdt_get_mem_config_amd",
+	"rdt_get_cache_alloc_cfg",
+	"rdt_get_cdp_config",
+	"rdt_get_cdp_l3_config",
+	"rdt_get_cdp_l2_config",
+	"resctrl_arch_get_cdp_enabled",
+	"set_rdt_options",
+	"pqr_state",
+	"rdt_resources_all",
+	"delay_bw_map",
+	"rdt_options",
+	"cat_wrmsr",
+	"mba_wrmsr_amd",
+	"mba_wrmsr_intel",
+	"anonymous-enum",
+	"rdt_find_domain",
+	"rdt_alloc_capable",
+	"rdt_online",
+	"RDT_OPT",
+
+	# ctrlmon.c
+	"apply_config",
+	"resctrl_arch_update_one",
+	"resctrl_arch_update_domains",
+	"resctrl_arch_get_config",
+
+	# internal.h
+	"L3_QOS_CDP_ENABLE",
+	"L2_QOS_CDP_ENABLE",
+	"MBM_CNTR_WIDTH_OFFSET_AMD",
+	"arch_mbm_state",
+	"rdt_hw_domain",
+	"resctrl_to_arch_dom",
+	"msr_param",
+	"rdt_hw_resource",
+	"resctrl_to_arch_res",
+	"rdt_resources_all",
+	"resctrl_inc",
+	"for_each_rdt_resource",
+	"for_each_capable_rdt_resource",
+	"for_each_alloc_capable_rdt_resource",
+	"for_each_mon_capable_rdt_resource",
+	"cpuid_0x10_1_eax",
+	"cpuid_0x10_3_eax",
+	"cpuid_0x10_x_ecx",
+	"cpuid_0x10_x_edx",
+	"rdt_ctrl_update",
+	"rdt_get_mon_l3_config",
+	"rdt_cpu_has",
+	"intel_rdt_mbm_apply_quirk",
+	"rdt_domain_reconfigure_cdp",
+
+	# monitor.c
+	"rdt_mon_capable",
+	"rdt_mon_features",
+	"CF",
+	"mbm_cf_table",
+	"mbm_cf_rmidthreshold",
+	"mbm_cf",
+	"get_corrected_mbm_count",
+	"__rmid_read",
+	"get_arch_mbm_state",
+	"resctrl_arch_reset_rmid",
+	"resctrl_arch_reset_rmid_all",
+	"mbm_overflow_count",
+	"resctrl_arch_rmid_read",
+	"rdt_get_mon_l3_config",
+	"intel_rdt_mbm_apply_quirk",
+
+	# pseudo_lock.c
+	"prefetch_disable_bits",
+	"resctrl_arch_get_prefetch_disable_bits",
+	"resctrl_arch_pseudo_lock_fn",
+	"resctrl_arch_measure_cycles_lat_fn",
+	"perf_miss_attr",
+	"perf_hit_attr",
+	"residency_counts",
+	"measure_residency_fn",
+	"resctrl_arch_measure_l2_residency",
+	"resctrl_arch_measure_l3_residency",
+
+	# rdtgroup.c
+	"rdt_enable_key",
+	"rdt_mon_enable_key",
+	"rdt_alloc_enable_key",
+	"resctrl_arch_sync_cpu_closid_rmid",
+	"INVALID_CONFIG_INDEX",
+	"mon_event_config_index_get",
+	"resctrl_arch_mon_event_config_read",
+	"resctrl_arch_mon_event_config_write",
+	"l3_qos_cfg_update",
+	"l2_qos_cfg_update",
+	"set_cache_qos_cfg",
+	"rdt_domain_reconfigure_cdp",
+	"cdp_enable",
+	"cdp_disable",
+	"resctrl_arch_set_cdp_enabled",
+	"reset_all_ctrls",
+	"resctrl_arch_reset_resources",
+
+	# trace.h
+	"TRACE_SYSTEM",
+	"pseudo_lock_mem_latency",
+	"pseudo_lock_l2",
+	"pseudo_lock_l3",
+];
+
+functions_to_move = [
+    # common
+    "pr_fmt",
+
+	# ctrlmon.c
+	"rdt_parse_data",
+	"(ctrlval_parser_t)",
+	"bw_validate",
+	"parse_bw",
+	"cbm_validate",
+	"parse_cbm",
+	"get_parser",
+	"parse_line",
+	"rdtgroup_parse_resource",
+	"rdtgroup_schemata_write",
+	"show_doms",
+	"rdtgroup_schemata_show",
+	"smp_mon_event_count",
+	"mon_event_read",
+	"rdtgroup_mondata_show",
+
+	# internal.h
+	"MBM_OVERFLOW_INTERVAL",
+	"CQM_LIMBOCHECK_INTERVAL",
+	"cpumask_any_housekeeping",
+	"rdt_fs_context",
+	"rdt_fc2context",
+	"mon_evt",
+	"mon_data_bits",
+	"rmid_read",
+	"resctrl_schema_all",
+	"resctrl_mounted",
+	"rdt_group_type",
+	"rdtgrp_mode",
+	"mongroup",
+	"rdtgroup",
+	"RFTYPE_FLAGS_CPUS_LIST",
+	"rdt_all_groups",
+	"rftype",
+	"mbm_state",
+	"is_mba_sc",
+
+	# monitor.c
+	"rmid_entry",
+	"rmid_free_lru",
+	"closid_num_dirty_rmid",
+	"rmid_limbo_count",
+	"rmid_ptrs",
+	"resctrl_rmid_realloc_threshold",
+	"resctrl_rmid_realloc_limit",
+	"__rmid_entry",
+	"limbo_release_entry",
+	"__check_limbo",
+	"has_busy_rmid",
+	"resctrl_find_free_rmid",
+	"resctrl_find_cleanest_closid",
+	"alloc_rmid",
+	"add_rmid_to_limbo",
+	"free_rmid",
+	"get_mbm_state",
+	"__mon_event_count",
+	"mbm_bw_count",
+	"mon_event_count",
+	"update_mba_bw",
+	"mbm_update",
+	"cqm_handle_limbo",
+	"cqm_setup_limbo_handler",
+	"mbm_handle_overflow",
+	"mbm_setup_overflow_handler",
+	"dom_data_init",
+	"dom_data_exit",
+	"llc_occupancy_event",
+	"mbm_total_event",
+	"mbm_local_event",
+	"l3_mon_evt_init",
+	"resctrl_mon_resource_init",
+	"resctrl_mon_resource_exit",
+
+	# pseudo_lock.c
+	"pseudo_lock_major",
+	"pseudo_lock_minor_avail",
+	"pseudo_lock_devnode",
+	"pseudo_lock_class",
+	"pseudo_lock_minor_get",
+	"pseudo_lock_minor_release",
+	"region_find_by_minor",
+	"pseudo_lock_pm_req",
+	"pseudo_lock_cstates_relax",
+	"pseudo_lock_cstates_constrain",
+	"pseudo_lock_region_clear",
+	"pseudo_lock_region_init",
+	"pseudo_lock_init",
+	"pseudo_lock_region_alloc",
+	"pseudo_lock_free",
+	"rdtgroup_monitor_in_progress",
+	"rdtgroup_locksetup_user_restrict",
+	"rdtgroup_locksetup_user_restore",
+	"rdtgroup_locksetup_enter",
+	"rdtgroup_locksetup_exit",
+	"rdtgroup_cbm_overlaps_pseudo_locked",
+	"rdtgroup_pseudo_locked_in_hierarchy",
+	"pseudo_lock_measure_cycles",
+	"pseudo_lock_measure_trigger",
+	"pseudo_measure_fops",
+	"rdtgroup_pseudo_lock_create",
+	"rdtgroup_pseudo_lock_remove",
+	"pseudo_lock_dev_open",
+	"pseudo_lock_dev_release",
+	"pseudo_lock_dev_mremap",
+	"pseudo_mmap_ops",
+	"pseudo_lock_dev_mmap",
+	"pseudo_lock_dev_fops",
+	"rdt_pseudo_lock_init",
+	"rdt_pseudo_lock_release",
+
+	# rdtgroup.c
+	"rdtgroup_mutex",
+	"rdt_root",
+	"rdtgroup_default",
+	"rdt_all_groups",
+	"resctrl_schema_all",
+	"resctrl_mounted",
+	"kn_info",
+	"kn_mongrp",
+	"kn_mondata",
+	"max_name_width",
+	"last_cmd_status",
+	"last_cmd_status_buf",
+	"rdtgroup_setup_root",
+	"rdtgroup_destroy_root",
+	"debugfs_resctrl",
+	"resctrl_debug",
+	"rdt_last_cmd_clear",
+	"rdt_last_cmd_puts",
+	"rdt_last_cmd_printf",
+	"rdt_staged_configs_clear",
+	"resctrl_is_mbm_enabled",
+	"resctrl_is_mbm_event",
+	"closid_free_map",
+	"closid_free_map_len",
+	"closids_supported",
+	"closid_init",
+	"closid_alloc",
+	"closid_free",
+	"closid_allocated",
+	"rdtgroup_mode_by_closid",
+	"rdt_mode_str",
+	"rdtgroup_mode_str",
+	"rdtgroup_kn_set_ugid",
+	"rdtgroup_add_file",
+	"rdtgroup_seqfile_show",
+	"rdtgroup_file_write",
+	"rdtgroup_kf_single_ops",
+	"kf_mondata_ops",
+	"is_cpu_list",
+	"rdtgroup_cpus_show",
+	"update_closid_rmid",
+	"cpus_mon_write",
+	"cpumask_rdtgrp_clear",
+	"cpus_ctrl_write",
+	"rdtgroup_cpus_write",
+	"rdtgroup_remove",
+	"_update_task_closid_rmid",
+	"update_task_closid_rmid",
+	"task_in_rdtgroup",
+	"__rdtgroup_move_task",
+	"is_closid_match",
+	"is_rmid_match",
+	"rdtgroup_tasks_assigned",
+	"rdtgroup_task_write_permission",
+	"rdtgroup_move_task",
+	"rdtgroup_tasks_write",
+	"show_rdt_tasks",
+	"rdtgroup_tasks_show",
+	"rdtgroup_closid_show",
+	"rdtgroup_rmid_show",
+	"proc_resctrl_show",
+	"rdt_last_cmd_status_show",
+	"rdt_num_closids_show",
+	"rdt_default_ctrl_show",
+	"rdt_min_cbm_bits_show",
+	"rdt_shareable_bits_show",
+	"rdt_bit_usage_show",
+	"rdt_min_bw_show",
+	"rdt_num_rmids_show",
+	"rdt_mon_features_show",
+	"rdt_bw_gran_show",
+	"rdt_delay_linear_show",
+	"max_threshold_occ_show",
+	"rdt_thread_throttle_mode_show",
+	"max_threshold_occ_write",
+	"rdtgroup_mode_show",
+	"resctrl_peer_type",
+	"rdt_has_sparse_bitmasks_show",
+	"__rdtgroup_cbm_overlaps",
+	"rdtgroup_cbm_overlaps",
+	"rdtgroup_mode_test_exclusive",
+	"rdtgroup_mode_write",
+	"rdtgroup_cbm_to_size",
+	"rdtgroup_size_show",
+	"mondata_config_read",
+	"mbm_config_show",
+	"mbm_total_bytes_config_show",
+	"mbm_local_bytes_config_show",
+	"mbm_config_write_domain",
+	"mon_config_write",
+	"mbm_total_bytes_config_write",
+	"mbm_local_bytes_config_write",
+	"res_common_files",
+	"rdtgroup_add_files",
+	"rdtgroup_get_rftype_by_name",
+	"thread_throttle_mode_init",
+	"mbm_config_rftype_init",
+	"rdtgroup_kn_mode_restrict",
+	"rdtgroup_kn_mode_restore",
+	"rdtgroup_mkdir_info_resdir",
+	"fflags_from_resource",
+	"rdtgroup_create_info_dir",
+	"mongroup_create_dir",
+	"is_mba_linear",
+	"mba_sc_domain_allocate",
+	"mba_sc_domain_destroy",
+	"supports_mba_mbps",
+	"set_mba_sc",
+	"kernfs_to_rdtgroup",
+	"rdtgroup_kn_get",
+	"rdtgroup_kn_put",
+	"rdtgroup_kn_lock_live",
+	"rdtgroup_kn_unlock",
+	"rdt_disable_ctx",
+	"rdt_enable_ctx",
+	"schemata_list_add",
+	"schemata_list_create",
+	"schemata_list_destroy",
+	"rdt_get_tree",
+	"rdt_param",
+	"rdt_fs_parameters",
+	"rdt_parse_param",
+	"rdt_fs_context_free",
+	"rdt_fs_context_ops",
+	"rdt_init_fs_context",
+	"rdt_move_group_tasks",
+	"free_all_child_rdtgrp",
+	"rmdir_all_sub",
+	"rdt_kill_sb",
+	"rdt_fs_type",
+	"mon_addfile",
+	"rmdir_mondata_subdir_allrdtgrp",
+	"mkdir_mondata_subdir",
+	"mkdir_mondata_subdir_allrdtgrp",
+	"mkdir_mondata_subdir_alldom",
+	"mkdir_mondata_all",
+	"cbm_ensure_valid",
+	"__init_one_rdt_domain",
+	"rdtgroup_init_cat",
+	"rdtgroup_init_mba",
+	"rdtgroup_init_alloc",
+	"mkdir_rdt_prepare_rmid_alloc",
+	"mkdir_rdt_prepare_rmid_free",
+	"mkdir_rdt_prepare",
+	"mkdir_rdt_prepare_clean",
+	"rdtgroup_mkdir_mon",
+	"rdtgroup_mkdir_ctrl_mon",
+	"is_mon_groups",
+	"rdtgroup_mkdir",
+	"rdtgroup_rmdir_mon",
+	"rdtgroup_ctrl_remove",
+	"rdtgroup_rmdir_ctrl",
+	"rdtgroup_rmdir",
+	"mongrp_reparent",
+	"rdtgroup_rename",
+	"rdtgroup_show_options",
+	"rdtgroup_kf_syscall_ops",
+	"rdtgroup_setup_root",
+	"rdtgroup_destroy_root",
+	"rdtgroup_setup_default",
+	"domain_destroy_mon_state",
+	"resctrl_offline_domain",
+	"domain_setup_mon_state",
+	"resctrl_online_domain",
+	"resctrl_online_cpu",
+	"clear_childcpus",
+	"resctrl_offline_cpu",
+	"resctrl_init",
+	"resctrl_exit",
+
+	# trace.h
+	"TRACE_SYSTEM",
+	"mon_llc_occupancy_limbo",
+];
+
+############
+
+builtin_non_functions = ["__setup", "__exitcall", "__printf"];
+builtin_one_arg_macros = ["LIST_HEAD", "DEFINE_MUTEX", "DEFINE_STATIC_KEY_FALSE"];
+types = ["bool",  "char", "int", "u32", "long", "u64"];
+
+def get_array_name(line):
+  tok = re.search(r'([^\s]+?)\[\]', line)
+  if (tok is None):
+    return None;
+  return tok.group(1);
+
+
+def get_struct_name(line):
+  tok = re.search(r'struct ([^\s]+?) {', line)
+  if (tok is None):
+    return None;
+  return tok.group(1);
+
+def get_enum_name(line):
+  tok = re.search(r'enum ([^\s]+?) {', line)
+  if (tok is None):
+    return None;
+  return tok.group(1);
+
+def get_union_name(line):
+  tok = re.search(r'union ([^\s]+?) {', line)
+  if (tok is None):
+    return None;
+  return tok.group(1);
+
+
+def get_macro_name(line):
+  tok = re.search(r'#define[\s]+([^\s]+?)\(', line)
+  if (tok):
+    return tok.group(1);
+
+  tok = re.search(r'#define[\s]+([^\s]+?)[\s]+[^\s]+?\n', line)
+  if (tok):
+    return tok.group(1);
+
+  return None;
+
+
+def get_macro_target(line):
+  tok = re.search(r'[^\s]+?\(([^\s]+?)\);\n', line)
+  if (tok):
+    return tok.group(1);
+
+  return None;
+
+
+# Things like 'bool my_bool;'
+def get_object_name(line):
+  # remove things that don't change the meaning of the name
+  if line.startswith("static "):
+    line = line[len("static "):];
+  if line.startswith("extern "):
+    line = line[len("extern "):];
+  if line.startswith("unsigned "):
+    line = line[len("unsigned "):];
+
+  # Note the trailing semicolon..
+  tok = re.search(r'([^\s]+)\s[\*]*([^\s\[\],;]+)', line)
+  if tok:
+    if tok.group(1) in types:
+      return tok.group(2);
+
+  tok = re.search(r'struct\s[^\s]+\s[\*]*([^\s;]+)', line)
+  if tok:
+    return tok.group(1);
+
+  tok = re.search(r'enum\s[^\s]+\s([^\s;]+)', line)
+  if tok:
+    return tok.group(1);
+
+  return None;
+
+
+# Is there a name for this block of code?
+#
+# Function names are the token before '(' ... assuming there is only one '('.
+# This also handles structs and arrays.
+def get_block_name(line):
+  # remove things that don't change the meaning of the name
+  if (" __read_mostly" in line):
+    line = line.replace(" __read_mostly", "");
+  if (" __initconst" in line):
+    line = line.replace(" __initconst", "");
+
+  if line == "enum {\n":
+    return "anonymous-enum";
+  if (line.startswith("#define ")):
+    return get_macro_name(line);
+
+  if ("=" in line):
+    tok = re.search(r'[\*]*([^\s\[\]]+?)[\s\[\]]*=', line)
+  else:
+    tok = re.search(r'[\*]*([^\s]+?)\(.+?', line)
+
+  if (tok is None):
+    if ("[]" in line):
+      return get_array_name(line);
+    if (line.startswith("struct") and line.endswith("{\n")):
+      return get_struct_name(line);
+    if (line.startswith("enum") and line.endswith("{\n")):
+      return get_enum_name(line);
+    if (line.startswith("union") and line.endswith("{\n")):
+      return get_union_name(line);
+    if (line.endswith(";\n") and '(' not in line):
+      return get_object_name(line);
+    if (line.endswith("= {\n") and '(' not in line):
+      return get_object_name(line);
+    return None;
+
+  func_name = tok.group(1);
+  if (func_name in builtin_one_arg_macros):
+    tok = re.search(r'[^\(]+\(([^\s]+?)\)', line)
+    if (tok is None):
+      return None;
+    return tok.group(1);
+  elif (func_name == "DEFINE_PER_CPU"):
+    tok = re.search(r'DEFINE_PER_CPU\(.+?, ([^\s]+?)\)', line)
+    if (tok is None):
+      return None;
+    return tok.group(1);
+  elif (func_name == "TRACE_EVENT"):
+    tok = re.search(r'TRACE_EVENT\((.+?),', line)
+    if (tok is None):
+      return None;
+    return tok.group(1);
+  elif (func_name == "late_initcall"):
+    return get_macro_target(line);
+  else:
+    return func_name;
+
+def output_function_body(body, file):
+  # Mandatory whitespace between blocks
+  if os.lseek(file.fileno(), 0, os.SEEK_CUR) > 0:
+    file.write("\n".encode());
+
+  for out_line in body:
+    file.write(out_line.encode());
+
+# Where should we put this block of code?
+def output_function(name, body, files):
+  output = False;
+  (new_src, new_dst) = files;
+
+  if (len(body)) == 0:
+    return;
+
+  # Output to both files
+  if (name is None):
+    output_function_body(body, new_src);
+    output_function_body(body, new_dst);
+    output = True;
+  if (name in functions_to_keep):
+    output_function_body(body, new_src);
+    output = True;
+  if (name in functions_to_move):
+    output_function_body(body, new_dst);
+    output = True;
+
+  if not output:
+    print("Missing function name: "+name);
+    #print(body);
+
+def reset_parser():
+  global function_name;
+  global define_name;
+  global function_body;
+  global in_define;
+
+  function_name = None;
+  define_name = None;
+  function_body = [];
+  in_define = False;
+
+############
+
+for file in resctrl_files:
+  function_name = None;
+  # function_names take priority over defines, this is only used when
+  # no function_name was found
+  define_name = None;
+  function_body = [];
+  # Nothing clever - this is just to detect newlines between functions
+  in_function = False;
+  in_define = False;
+
+  src_path = SRC_DIR + "/" + str(file);
+  if (not os.path.isfile(src_path)):
+    continue;
+  dst_path = DST_DIR + "/" + str(file);
+
+  orig_file = open(src_path, "r");
+  lines = orig_file.readlines();
+
+  # Now unlink the original file, so it can be re-created with new
+  # contents.
+  try:
+    os.unlink(src_path);
+  except Exception as err:
+    print(f"Failed to unlink source file: {err}");
+    sys.exit(1);
+
+  # non-buffering is so we can snoop the fd offset to avoid trailing newlines
+  new_src = open(src_path, "wb", buffering=0);
+  new_dst = open(dst_path, "wb", buffering=0);
+
+  for line in lines:
+    # Empty lines outside a function - reset the function tracking
+    if (line == "\n" and not in_function):
+      if function_name is None and define_name is not None:
+        function_name = define_name;
+      output_function(function_name, function_body, (new_src, new_dst));
+      reset_parser();
+
+    # Function prototypes are a funny C thing - reset the function tracking
+    elif (line[0].isspace() and not in_function and line.endswith(");\n")):
+      function_body += [line];
+      output_function(function_name, function_body, (new_src, new_dst));
+      reset_parser();
+
+    # Lines that begin with whitespace are part of the current function.
+    elif (line[0].isspace()):
+      function_body += [line];
+
+    # Next, try to find the kind of line that contains a function name
+
+    # Ignore lines with comment markers, braces
+    elif (line.startswith("/*")):
+      function_body += [line];
+    elif (line.startswith("*/")):
+      function_body += [line];
+    elif (line.startswith("//")):
+      function_body += [line];
+    elif (line == "{\n"):
+      function_body += [line];
+      in_function = True;
+    elif (line == "}\n"):
+      function_body += [line];
+      in_function = False;
+    elif (line == "};\n"):
+      function_body += [line];
+      in_function = False;
+
+    elif (line.startswith("#include")):
+      function_body += [line];
+    elif (line.startswith("#if ")):
+      function_body += [line];
+    elif (line.startswith("#ifdef ")):
+      function_body += [line];
+    elif (line.startswith("#ifndef ")):
+      function_body += [line];
+    elif (line.startswith("#else")):
+      function_body += [line];
+    elif (line.startswith("#endif")):
+      function_body += [line];
+    elif (line.startswith("#undef ")):
+      function_body += [line];
+    elif (line.startswith("#define")):
+      function_body += [line];
+      define_name = get_block_name(line);
+      if line.endswith("\\\n"):
+        in_define = True;
+    elif in_define and line.endswith("\\\n"):
+      function_body += [line];
+
+    # goto was always a crime
+    elif (' ' not in line and line.endswith(":\n")):
+      function_body += [line];
+
+    # Try and parse a function/array name
+
+    # Things like late_initcall() aren't function names, but belong to
+    # the previous function.
+    elif (get_block_name(line) in builtin_non_functions):
+      function_body += [line];
+
+    # Start a new block if we can get a block name for this line
+    elif (get_block_name(line) != None and function_name is None):
+      _name = get_block_name(line);
+
+      if (line.endswith("{\n")):
+        in_function = True;
+
+      # Is this a function prototype? Output it now
+      if (line.endswith(";\n")):
+        function_body += [line];
+        output_function(_name, function_body, (new_src, new_dst));
+        reset_parser();
+      else:
+        function_name = _name;
+        function_body += [line];
+
+    # Failed to parse a function name ... did it get split up?
+    elif (line.startswith("static")):
+      function_body += [line];
+
+    else:
+      print("Unknown: '" + line + "'");
+
+  # Output whatever is left in the buffer
+  output_function(function_name, function_body, (new_src, new_dst));
+
+  orig_file.close();
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors
  2024-06-14 14:59 ` [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
@ 2024-06-28 16:41   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:41 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 7:59 AM, James Morse wrote:
> commit 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by
> searching closid_num_dirty_rmid") added logic that causes resctrl to

x86 has a custom for how commit references in changelogs are formatted.
For reference, you can see how the recent fix
739c9765793e ("x86/resctrl: Don't try to free nonexistent RMIDs") was
reformatted before merge. Applied to this patch, it may look something
like:

	Commit

	  6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")

	added logic ...
	
Looking at, for example, commit e3ca96e479c9 ("x86/resctrl: Pass domain to
target CPU"), it does seem ok to split the name of the commit.


> search for the CLOSID with the fewest dirty cache lines when creating a
> new control group, if requested by the arch code. This depends on the
> values read from the llc_occupancy counters. The logic is applicable to
> architectures where the CLOSID effectively forms part of the monitoring
> identifier and so do not allow complete freedom to choose an unused
> monitoring identifier for a given CLOSID.
> 
> This support missed that some platforms may not have these counters.
> This causes a NULL pointer dereference when creating a new control
> group as the array was not allocated by dom_data_init().
> 
> As this feature isn't necessary on platforms that don't have cache
> occupancy monitors, add this to the check that occurs when a new
> control group is allocated.
> 

The snippet below does not belong in the changelog and can be moved to
the maintainer notes.

> The existing code is not selected by any upstream platform, it makes
> no sense to backport this patch to stable.
> 
> Fixes: 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> ---
> Changes since v1:
>   * [Commit message only] Reword the first paragraph to make it clear
>     that the issue being fixed wasn't directly associated with addition
>     of a Kconfig option.  (Actually, the option is not in Kconfig yet,
>     and gets added later in this series.)
> ---
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 02f213f1c51c..d02f4c97e40f 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -149,7 +149,8 @@ static int closid_alloc(void)
>   
>   	lockdep_assert_held(&rdtgroup_mutex);
>   
> -	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
> +	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
> +	    is_llc_occupancy_enabled()) {
>   		cleanest_closid = resctrl_find_cleanest_closid();
>   		if (cleanest_closid < 0)
>   			return cleanest_closid;

With changelog updated:

| Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette




* Re: [PATCH v3 02/38] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
  2024-06-14 14:59 ` [PATCH v3 02/38] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
@ 2024-06-28 16:42   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:42 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 7:59 AM, James Morse wrote:
> Resctrl occasionally wants to know something about a specific resource,
> in these cases it reaches into the arch code's rdt_resources_all[]
> array.
> 
> Once the filesystem parts of resctrl are moved to /fs/, this means it
> will need visibility of the architecture specific struct
> rdt_hw_resource definition, and the array of all resources.  All
> architectures would also need a r_resctrl member in this struct.
> 
> Instead, abstract this via a helper to allow architectures to do
> different things here. Move the level enum to the resctrl header and
> add a helper to retrieve the struct rdt_resource by 'rid'.
> 
> resctrl_arch_get_resource() should not return NULL for any value in
> the enum, it may instead return a dummy resource that is
> !alloc_enabled && !mon_enabled.
> 
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


* Re: [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags
  2024-06-14 14:59 ` [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags James Morse
@ 2024-06-28 16:43   ` Reinette Chatre
  2024-07-01 18:17     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:43 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi James,

On 6/14/24 7:59 AM, James Morse wrote:
> resctrl has three types of control, these emerge from the way the
> architecture initialises a number of properties in struct rdt_resource.
> 
> A group of these properties need to be set the same on all architectures,
> it would be better to specify the format the schema entry should use, and
> allow resctrl to generate all the other properties it needs. This avoids
> architectures having divergant behaviour here.

divergant -> divergent ?

> 
> Add a schema format enum, and as a first use, replace the fflags member
> of struct rdt_resource.
> 
> The MBA schema has a different format between AMD and Intel systems.
> The schema_fmt property is changed by __rdt_get_mem_config_amd() to
> enable the MBPS format.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---

...

> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index e3edc41882dc..b12307d465bc 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2162,6 +2162,19 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>   	return ret;
>   }
>   
> +static u32 fflags_from_resource(struct rdt_resource *r)
> +{
> +	switch (r->schema_fmt) {
> +	case RESCTRL_SCHEMA_BITMAP:
> +		return RFTYPE_RES_CACHE;
> +	case RESCTRL_SCHEMA_PERCENTAGE:
> +	case RESCTRL_SCHEMA_MBPS:
> +		return RFTYPE_RES_MB;
> +	}
> +
> +	return WARN_ON_ONCE(1);
> +}
> +

The fflags returned specify which files will be associated with the resource
in the "info" directory. Basing this on a property of the schema does not look
right to me. I understand that many of the info files relate to, for example,
the bitmap used by the cache, but that is not the same for
info files related to the MBA resource (all info files related to the MBA resource
are not about the schema property format).

I do not think the type of values of a schema should dictate which files
appear in the info directory. Doesn't MPAM support percentage for cache resources
and bitmaps for memory resources?

Can the fflags rather depend on the resource type itself, by using the rid?
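
As a rough illustration of that question (a sketch only, not a tested
patch), the switch could key off the resource id instead of the schema
format. The resource ids are assumed to match the existing
enum resctrl_res_level; the enum values, flag bits and the cut-down
struct rdt_resource below are placeholder stand-ins so the snippet
compiles by itself.

```c
/* Stand-ins for the kernel definitions; the values are placeholders. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,
};

#define RFTYPE_RES_CACHE	(1u << 8)
#define RFTYPE_RES_MB		(1u << 9)

/* Cut-down stand-in: only the member this sketch needs. */
struct rdt_resource {
	enum resctrl_res_level rid;
};

/* Pick the info files from the resource type, not the schema format. */
static unsigned int fflags_from_resource(const struct rdt_resource *r)
{
	switch (r->rid) {
	case RDT_RESOURCE_L3:
	case RDT_RESOURCE_L2:
		return RFTYPE_RES_CACHE;
	case RDT_RESOURCE_MBA:
	case RDT_RESOURCE_SMBA:
		return RFTYPE_RES_MB;
	}

	/* Unknown resource: kernel code would likely WARN_ON_ONCE() here. */
	return 0;
}
```

With something like this, a resource whose controls use a different
value format would still get the info files that match its type.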

>   static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>   {
>   	struct resctrl_schema *s;
> @@ -2182,14 +2195,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>   	/* loop over enabled controls, these are all alloc_capable */
>   	list_for_each_entry(s, &resctrl_schema_all, list) {
>   		r = s->res;
> -		fflags = r->fflags | RFTYPE_CTRL_INFO;
> +		fflags =  fflags_from_resource(r) | RFTYPE_CTRL_INFO;

(please watch for extra spaces)

>   		ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
>   		if (ret)
>   			goto out_destroy;
>   	}
>   
>   	for_each_mon_capable_rdt_resource(r) {
> -		fflags = r->fflags | RFTYPE_MON_INFO;
> +		fflags =  fflags_from_resource(r) | RFTYPE_MON_INFO;

(please watch for extra spaces)

Reinette


* Re: [PATCH v3 04/38] x86/resctrl: Use schema type to determine how to parse schema values
  2024-06-14 14:59 ` [PATCH v3 04/38] x86/resctrl: Use schema type to determine how to parse schema values James Morse
@ 2024-06-28 16:43   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:43 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi James,

On 6/14/24 7:59 AM, James Morse wrote:
> Resctrl's architecture code gets to specify a function pointer that is
> used when parsing schema entries. This is expected to be one of two
> helpers from the filesystem code.
> 
> Setting this function pointer allows the architecture code to change
> the ABI resctrl presents to user-space, and forces resctrl to expose
> these helpers.
> 
> Instead, use the schema format enum to choose which schema parser to
> use. This allows the helpers to be made static and the structs used
> for passing arguments moved out of shared headers.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


* Re: [PATCH v3 05/38] x86/resctrl: Use schema type to determine the schema format string
  2024-06-14 15:00 ` [PATCH v3 05/38] x86/resctrl: Use schema type to determine the schema format string James Morse
@ 2024-06-28 16:43   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:43 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> Resctrl's architecture code gets to specify a format string that is
> used when printing schema entries. This is expected to be one of two
> values that the filesystem code supports.
> 
> Setting this format string allows the architecture code to change
> the ABI resctrl presents to user-space.
> 
> Instead, use the schema format enum to choose which format string to
> use.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette
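
For readers following along, the effect of choosing the format string
from the schema format can be sketched in user-space C. The two format
strings are the ones that appear in the schemata_list_add() hunk quoted
in this thread; the helper name format_schema_entry() and its
parameters are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

enum resctrl_schema_fmt {
	RESCTRL_SCHEMA_BITMAP,
	RESCTRL_SCHEMA_PERCENTAGE,
	RESCTRL_SCHEMA_MBPS,
};

/*
 * Hypothetical helper: format one "<domain id>=<value>" schemata entry.
 * 'width' is the zero-padded field width for the value.
 */
static int format_schema_entry(char *buf, size_t len,
			       enum resctrl_schema_fmt fmt,
			       int dom_id, int width, unsigned int val)
{
	switch (fmt) {
	case RESCTRL_SCHEMA_BITMAP:
		/* Bitmaps are printed as hex. */
		return snprintf(buf, len, "%d=%0*x", dom_id, width, val);
	case RESCTRL_SCHEMA_PERCENTAGE:
	case RESCTRL_SCHEMA_MBPS:
		/* Percentages and MBps values are printed as decimal. */
		return snprintf(buf, len, "%d=%0*u", dom_id, width, val);
	}

	return -1;
}
```

Because the filesystem code owns the mapping from enum to format
string, the architecture code can only select among the supported
formats and cannot introduce a new user-visible one.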


* Re: [PATCH v3 06/38] x86/resctrl: Move data_width to be a schema property
  2024-06-14 15:00 ` [PATCH v3 06/38] x86/resctrl: Move data_width to be a schema property James Morse
@ 2024-06-28 16:45   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:45 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> The resctrl architecture code gets to specify the width of the schema
> entries that are used by resctrl. These are determined by the schema
> format, e.g. percentage or bitmap.
> 
> Move this property into struct resctrl_schema and get the filesystem
> parts of resctrl to set it based on the schema format.
> 
> This allows rdt_init_padding() to be removed, its work can be done
> by schemata_list_add(), allowing max_name_width and max_data_width
> to be moved out of core.c which has no counterpart after the
> move to fs.

Please do write commit messages in imperative mood.

> 
> The logic for calculating max_name_width was moved in earlier patches,
> but the definition was not moved.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v2:
>   * This patch is new.
> ---
>   arch/x86/kernel/cpu/resctrl/core.c     | 26 --------------------------
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
>   include/linux/resctrl.h                |  4 ++--
>   3 files changed, 13 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 4a5216a13b46..4de7d20aa5aa 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -44,12 +44,6 @@ static DEFINE_MUTEX(domain_list_lock);
>    */
>   DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state);
>   
> -/*
> - * Used to store the max resource name width and max resource data width
> - * to display the schemata in a tabular format
> - */
> -int max_name_width, max_data_width;
> -
>   /*
>    * Global boolean for rdt_alloc which is true if any
>    * resource allocation is enabled.
> @@ -222,7 +216,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
>   			return false;
>   		r->membw.arch_needs_linear = false;
>   	}
> -	r->data_width = 3;
>   
>   	if (boot_cpu_has(X86_FEATURE_PER_THREAD_MBA))
>   		r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
> @@ -262,8 +255,6 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
>   	r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
>   	r->membw.min_bw = 0;
>   	r->membw.bw_gran = 1;
> -	/* Max value is 2048, Data width should be 4 in decimal */
> -	r->data_width = 4;
>   
>   	r->alloc_capable = true;
>   
> @@ -283,7 +274,6 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>   	r->cache.cbm_len = eax.split.cbm_len + 1;
>   	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>   	r->cache.shareable_bits = ebx & r->default_ctrl;
> -	r->data_width = (r->cache.cbm_len + 3) / 4;
>   	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>   		r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
>   	r->alloc_capable = true;
> @@ -631,20 +621,6 @@ static int resctrl_arch_offline_cpu(unsigned int cpu)
>   	return 0;
>   }
>   
> -/*
> - * Choose a width for the resource name and resource data based on the
> - * resource that has widest name and cbm.
> - */
> -static __init void rdt_init_padding(void)
> -{
> -	struct rdt_resource *r;
> -
> -	for_each_alloc_capable_rdt_resource(r) {
> -		if (r->data_width > max_data_width)
> -			max_data_width = r->data_width;
> -	}
> -}
> -
>   enum {
>   	RDT_FLAG_CMT,
>   	RDT_FLAG_MBM_TOTAL,
> @@ -942,8 +918,6 @@ static int __init resctrl_late_init(void)
>   	if (!get_rdt_resources())
>   		return -ENODEV;
>   
> -	rdt_init_padding();
> -
>   	state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>   				  "x86/resctrl/cat:online:",
>   				  resctrl_arch_online_cpu,
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index af9968328771..4f8e20cc06eb 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -58,6 +58,12 @@ static struct kernfs_node *kn_mongrp;
>   /* Kernel fs node for "mon_data" directory under root */
>   static struct kernfs_node *kn_mondata;
>   
> +/*
> + * Used to store the max resource name width and max resource data width
> + * to display the schemata in a tabular format
> + */

I understand that you just copied existing text here, but since you touch this line
could you please have this sentence end with a period?

> +int max_name_width, max_data_width;
> +
>   static struct seq_buf last_cmd_status;
>   static char last_cmd_status_buf[512];
>   
> @@ -2600,15 +2606,20 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
>   	switch (r->schema_fmt) {
>   	case RESCTRL_SCHEMA_BITMAP:
>   		s->fmt_str = "%d=%0*x";
> +		s->data_width = (r->cache.cbm_len + 3) / 4;
>   		break;
>   	case RESCTRL_SCHEMA_PERCENTAGE:
>   		s->fmt_str = "%d=%0*u";
> +		s->data_width = 3;
>   		break;
>   	case RESCTRL_SCHEMA_MBPS:
>   		s->fmt_str = "%d=%0*u";
> +		s->data_width = 4;
>   		break;
>   	}
>   
> +	max_data_width = max(max_data_width, s->data_width);
> +
>   	INIT_LIST_HEAD(&s->list);
>   	list_add(&s->list, &resctrl_schema_all);
>   
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index abecbd92ac93..ddcd938972d2 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -182,7 +182,6 @@ enum resctrl_schema_fmt {
>    * @membw:		If the component has bandwidth controls, their properties.
>    * @domains:		RCU list of all domains for this resource
>    * @name:		Name to use in "schemata" file.
> - * @data_width:		Character width of data when displaying
>    * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
>    * @schema_fmt:	Which format string and parser is used for this schema.
>    * @evt_list:		List of monitoring events
> @@ -198,7 +197,6 @@ struct rdt_resource {
>   	struct resctrl_membw	membw;
>   	struct list_head	domains;
>   	char			*name;
> -	int			data_width;
>   	u32			default_ctrl;
>   	enum resctrl_schema_fmt	schema_fmt;
>   	struct list_head	evt_list;
> @@ -218,6 +216,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
>    * @list:	Member of resctrl_schema_all.
>    * @name:	The name to use in the "schemata" file.
>    * @fmt_str:	Format string to show domain value
> + * @data_width:	Character width of data when displaying
>    * @conf_type:	Whether this schema is specific to code/data.
>    * @res:	The resource structure exported by the architecture to describe
>    *		the hardware that is configured by this schema.
> @@ -229,6 +228,7 @@ struct resctrl_schema {
>   	struct list_head		list;
>   	char				name[8];
>   	const char			*fmt_str;
> +	int				data_width;
>   	enum resctrl_conf_type		conf_type;
>   	struct rdt_resource		*res;
>   	u32				num_closid;

Reinette

* Re: [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header
  2024-06-14 15:00 ` [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header James Morse
@ 2024-06-28 16:45   ` Reinette Chatre
  2024-07-01 18:16     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:45 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> When resctrl is fully factored into core and per-arch code, each arch
> will need to use some resctrl common definitions in order to define its
> own specializations and helpers.  Following conventional practice, it
> would be desirable to put the dependent arch definitions in an
> <asm/resctrl.h> header that is included by the common <linux/resctrl.h>
> header.  However, this can make it awkward to avoid a circular
> dependency between <linux/resctrl.h> and the arch header.
> 
> To avoid such dependencies, move the affected common types and
> constants into a new header that does not need to depend on
> <linux/resctrl.h> or on the arch headers.
> 
> The same logic applies to the monitor-configuration defines, move these
> too.
> 
> Some kind of enumeration for events is needed between the filesystem
> and architecture code. Take the x86 definition as its convenient for
> x86.
> 
> The definition of enum resctrl_event_id is need to allow the architecture

"is need" -> "is needed" ?

> code to define resctrl_arch_event_is_free_running(),

Cannot find resctrl_arch_event_is_free_running()

> resctrl_arch_set_cdp_enabled(), resctrl_arch_mon_ctx_alloc() and

resctrl_arch_set_cdp_enabled() should not need enum resctrl_event_id


> resctrl_arch_mon_ctx_free().
> 
> The definition of enum resctrl_res_level is needed to allow the
> architecture code to define resctrl_arch_set_cdp_enabled() and
> resctrl_arch_get_cdp_enabled().
> 
> The bits for mbm_local_bytes_config et al are ABI, and must be the same
> on all architectures. These are documented in
> Documentation/arch/x86/resctrl.rst
> 
> The maintainers entry for these headers was missed when resctrl.h was
> created. Add a wildcard entry to match both resctrl.h and
> resctrl_types.h.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v2:
>   * Added to the commit message why each of these things is necessary.
>   * Moved the enum resctrl_conf_type back to resctrl.h - this week arm's
>     CDP emulation code gets away without this...
> 
> Changes since v1:
>   * [Commit message only] Rewrite commit message to clarify the the
>     rationale for refactoring the headers in this way.
> ---
>   MAINTAINERS                            |  1 +
>   arch/x86/kernel/cpu/resctrl/internal.h | 24 ------------
>   include/linux/resctrl.h                | 21 +---------
>   include/linux/resctrl_types.h          | 54 ++++++++++++++++++++++++++

Considering the motivation I also expected to see a change in
arch/x86/include/asm/resctrl.h that adds the #include of the new file.

Reinette

* Re: [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call
  2024-06-14 15:00 ` [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call James Morse
@ 2024-06-28 16:46   ` Reinette Chatre
  2024-07-01 18:17     ` James Morse
  2024-07-11 21:12   ` Carl Worth
  1 sibling, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:46 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> rdt_put_mon_l3_config() is called via the architecture's
> resctrl_arch_exit() call, and appears to free the rmid_ptrs[]
> and closid_num_dirty_rmid[] arrays. In reality this code is marked
> __exit, and is removed by the linker as resctl can't be built
> as a module.
> 
> To separate the filesystem and architecture parts of resctrl,
> this free()ing work needs to be triggered by the filesystem,
> as these structures belong to the filesystem code.
> 
> Rename rdt_put_mon_l3_config() resctrl_mon_resource_exit()
> and call it from resctrl_exit(). The kfree() is currently
> dependent on r->mon_capable. resctrl_mon_resource_init()

resctrl_mon_resource_init() does not exist at this point, which makes
this motivation difficult to follow.

> takes no arguments, so resctrl_mon_resource_exit() shouldn't
> take any either. Add the check to dom_data_exit(), making it
> take the resource as an argument. This makes it more symmetrical
> with dom_data_init().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v2:
>   * Dropped __exit as needed in the next patch.
> 
> Change since v1:
>   * [Commit message only] Typo fixes:
>     s/restrl/resctrl/g
>     s/resctl/resctrl/g

Something went wrong here since the subject and changelog still contain
the terms that were intended to be replaced.

> 
>   * [Commit message only] Reword second paragraph to remove reference to
>     the MPAM error interrupt, which provides background rationale for a
>     later patch rather than for this patch, and so it is not really
>     relevant here.
> ---
>   arch/x86/kernel/cpu/resctrl/core.c     |  5 -----
>   arch/x86/kernel/cpu/resctrl/internal.h |  2 +-
>   arch/x86/kernel/cpu/resctrl/monitor.c  | 12 ++++++++----
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 ++
>   4 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 9ad660b2b097..2540a7cb11b0 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -950,14 +950,9 @@ late_initcall(resctrl_arch_late_init);
>   
>   static void __exit resctrl_arch_exit(void)
>   {
> -	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> -
>   	cpuhp_remove_state(rdt_online);
>   
>   	resctrl_exit();
> -
> -	if (r->mon_capable)
> -		rdt_put_mon_l3_config();
>   }
>   
>   __exitcall(resctrl_arch_exit);
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 7ede340b1301..9aa7f587484c 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -535,7 +535,7 @@ void closid_free(int closid);
>   int alloc_rmid(u32 closid);
>   void free_rmid(u32 closid, u32 rmid);
>   int rdt_get_mon_l3_config(struct rdt_resource *r);
> -void __exit rdt_put_mon_l3_config(void);
> +void resctrl_mon_resource_exit(void);
>   bool __init rdt_cpu_has(int flag);
>   void mon_event_count(void *info);
>   int rdtgroup_mondata_show(struct seq_file *m, void *arg);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 3e5375c365e6..7d6aebce75c1 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -965,10 +965,12 @@ static int dom_data_init(struct rdt_resource *r)
>   	return err;
>   }
>   
> -static void __exit dom_data_exit(void)
> +static void dom_data_exit(struct rdt_resource *r)
>   {
> -	mutex_lock(&rdtgroup_mutex);
> +	if (!r->mon_capable)
> +		return;
>   
> +	mutex_lock(&rdtgroup_mutex);

I know there has been a bit of back and forth on whether the mutex is needed
here. With this change moving dom_data_exit() out from __exit, I think
the locking should aim to be consistent with the existing runtime code,
and thus the check of r->mon_capable should be done with the mutex held.
Having this little snippet outside the mutex will just cause confusion. Do
you have a motivation for needing this to be done outside of the mutex? I
think it ended up this way because this patch aimed to keep the existing
flow exactly, but that flow was a convenience in a context where the mutex
was not really needed at all.
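
To illustrate the ordering being asked for, here is a minimal userspace
sketch with the r->mon_capable check taken under the mutex. The struct mutex
stub, the lock helpers and the plain free() calls are stand-ins assumed for
illustration only; they are not the kernel implementation, and the body of
the real dom_data_exit() may differ.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel objects; hypothetical, illustration only. */
struct mutex { int depth; };
static struct mutex rdtgroup_mutex;
static void mutex_lock(struct mutex *m)   { m->depth++; }
static void mutex_unlock(struct mutex *m) { m->depth--; }

struct rdt_resource { bool mon_capable; };

/* Stand-ins for the arrays named in the commit message. */
static int *rmid_ptrs;
static int *closid_num_dirty_rmid;

static void dom_data_exit(struct rdt_resource *r)
{
	mutex_lock(&rdtgroup_mutex);

	/* The capability check happens with the mutex held. */
	if (!r->mon_capable)
		goto out_unlock;

	free(closid_num_dirty_rmid);
	closid_num_dirty_rmid = NULL;

	free(rmid_ptrs);
	rmid_ptrs = NULL;

out_unlock:
	mutex_unlock(&rdtgroup_mutex);
}
```

With this shape every access in the function sits between the lock and the
unlock, so a reader does not have to reason about why one check is exempt.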

Reinette

* Re: [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call
  2024-06-14 15:00 ` [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call James Morse
@ 2024-06-28 16:47   ` Reinette Chatre
  2024-07-01 18:17     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:47 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> rdt_get_mon_l3_config() is called from the architecture's
> resctrl_arch_late_init(), and initialises both architecture specific
> fields, such as hw_res->mon_scale and resctrl filesystem fields
> by calling dom_data_init().
> 
> To separate the filesystem and architecture parts of resctrl, this
> function needs splitting up.
> 
> Add resctrl_mon_resource_init() to do the filesystem specific work,
> and call it from resctrl_init(). This runs later, but is still before
> the filesystem is mounted and the rmid_ptrs[] array can be used.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v2:
>   * Added error handling for the case sysfs files can't be created.
> ---
>   arch/x86/kernel/cpu/resctrl/internal.h |  1 +
>   arch/x86/kernel/cpu/resctrl/monitor.c  | 24 +++++++++++++++++-------
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c |  9 ++++++++-
>   3 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 9aa7f587484c..eaf458967fa1 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -542,6 +542,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg);
>   void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>   		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
>   		    int evtid, int first);
> +int resctrl_mon_resource_init(void);
>   void mbm_setup_overflow_handler(struct rdt_domain *dom,
>   				unsigned long delay_ms,
>   				int exclude_cpu);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 7d6aebce75c1..527c0e9d7b2e 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1016,12 +1016,28 @@ static void l3_mon_evt_init(struct rdt_resource *r)
>   		list_add_tail(&mbm_local_event.list, &r->evt_list);
>   }
>   
> +int resctrl_mon_resource_init(void)

(The lack of an __init is unexpected, but I assume it was done since that will
be removed in a later patch anyway?)

This function needs a big warning to deter anybody from considering this to
be the place where any and all monitor-related allocations happen. It needs
to warn developers that only resources that can only be touched after fs mount
may be allocated here.

Reinette

* Re: [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers
  2024-06-14 15:00 ` [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers James Morse
@ 2024-06-28 16:48   ` Reinette Chatre
  2024-07-01 18:16     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:48 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> The for_each_*_rdt_resource() helpers walk the architecture's array
> of structures, using the resctrl visible part as an iterator. These
> became over-complex when the structures were split into a
> filesystem and architecture-specific struct. This approach avoided
> the need to touch every call site.
> 
> Once the filesystem parts of resctrl are moved to /fs/, both the
> architecture's resource array, and the definition of those structures
> is no longer accessible. To support resctrl, each architecture would
> have to provide equally complex macros.
> 
> Change the resctrl code that uses these to walk through the resource_level
> enum and check the mon/alloc capable flags instead. Instances in core.c,
> and resctrl_arch_reset_resources() remain part of x86's architecture
> specific code.
> 
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v1:
>   * [Whitespace only] Fix bogus whitespace introduced in
>     rdtgroup_create_info_dir().
> 
>   * [Commit message only] Typo fix:
>     s/architectures/architecture's/g
> ---
>   arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  7 +++++-
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 28 +++++++++++++++++++----
>   2 files changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
> index aacf236dfe3b..ad20822bb64e 100644
> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
> +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
> @@ -840,6 +840,7 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm
>   bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
>   {
>   	cpumask_var_t cpu_with_psl;
> +	enum resctrl_res_level i;
>   	struct rdt_resource *r;
>   	struct rdt_domain *d_i;
>   	bool ret = false;
> @@ -854,7 +855,11 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
>   	 * First determine which cpus have pseudo-locked regions
>   	 * associated with them.
>   	 */
> -	for_each_alloc_capable_rdt_resource(r) {
> +	for (i = 0; i < RDT_NUM_RESOURCES; i++) {
> +		r = resctrl_arch_get_resource(i);
> +		if (!r->alloc_capable)
> +			continue;
> +

This looks like enough duplicate boilerplate for a new macro. For simplicity the
macro could require two arguments with enum resctrl_res_level also provided?
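
A rough userspace sketch of what such a two-argument macro could look like
follows. The macro name for_each_alloc_capable_resource(), the mock resource
table and the count helper are hypothetical names made up for this example;
only resctrl_arch_get_resource(), RDT_NUM_RESOURCES and the alloc_capable
flag come from the patch, and they are mocked here so the snippet builds on
its own.

```c
#include <stdbool.h>

/* Stand-in for the kernel enum; the values are illustrative. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_NUM_RESOURCES,
};

/* Reduced stand-in for struct rdt_resource. */
struct rdt_resource {
	const char *name;
	bool alloc_capable;
};

/* Mock of the architecture's resource array (hypothetical contents). */
static struct rdt_resource mock_resources[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3]  = { .name = "L3", .alloc_capable = true },
	[RDT_RESOURCE_L2]  = { .name = "L2", .alloc_capable = false },
	[RDT_RESOURCE_MBA] = { .name = "MB", .alloc_capable = true },
};

static struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
{
	return &mock_resources[l];
}

/* Hypothetical walker: the caller supplies both the resource and the index. */
#define for_each_alloc_capable_resource(r, i)				\
	for ((i) = 0; (i) < RDT_NUM_RESOURCES; (i)++)			\
		if (((r) = resctrl_arch_get_resource(i))->alloc_capable)

/* Example call site: count the resources the walker visits. */
static int count_alloc_capable(void)
{
	enum resctrl_res_level i;
	struct rdt_resource *r;
	int n = 0;

	for_each_alloc_capable_resource(r, i)
		n++;

	return n;
}
```

Requiring the index as an argument keeps the macro free of hidden
declarations, at the cost of one extra local variable per call site.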

Reinette

* Re: [PATCH v3 20/38] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
  2024-06-14 15:00 ` [PATCH v3 20/38] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
@ 2024-06-28 16:49   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:49 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:

...

> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index fbae9a907544..65f0a2d17e4b 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -220,6 +220,13 @@ struct resctrl_cpu_defaults {
>   	u32 rmid;
>   };
>   
> +struct resctrl_mon_config_info {
> +	struct rdt_resource	*r;
> +	struct rdt_domain	*d;
> +	u32			evtid;
> +	u32			mon_config;
> +};
> +
>   /**
>    * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
>    *					 Call via IPI.
> @@ -263,6 +270,8 @@ struct rdt_domain *resctrl_arch_find_domain(struct rdt_resource *r, int id);
>   int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
>   
>   bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
> +void resctrl_arch_mon_event_config_write(void *info);
> +void resctrl_arch_mon_event_config_read(void *info);
>  

Considering this is the API doc I think it will be useful to have a comment that
connects the void pointers in these functions to struct resctrl_mon_config_info.
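
As a sketch of the kind of comment meant here, the snippet below documents
the void pointer as a struct resctrl_mon_config_info. The comment wording,
the mock_event_config[] table and MOCK_NUM_EVENTS are assumptions made up for
illustration; the function bodies are userspace mocks, not the x86
implementation, and the remark about cross-call callbacks is my reading of
why the parameter is a void pointer.

```c
#include <stdint.h>

/* Opaque in this sketch; the real definitions live in the kernel headers. */
struct rdt_resource;
struct rdt_domain;

struct resctrl_mon_config_info {
	struct rdt_resource	*r;
	struct rdt_domain	*d;
	uint32_t		evtid;		/* u32 in the kernel */
	uint32_t		mon_config;	/* u32 in the kernel */
};

/* Hypothetical backing store standing in for the hardware registers. */
#define MOCK_NUM_EVENTS 4
static uint32_t mock_event_config[MOCK_NUM_EVENTS];

/**
 * resctrl_arch_mon_event_config_write() - Write the config for an event.
 * @info:	Pointer to a struct resctrl_mon_config_info describing the
 *		resource, domain, event and the value to write. Passed as
 *		void * so the function can be used as a cross-call callback
 *		(assumed rationale).
 */
void resctrl_arch_mon_event_config_write(void *info)
{
	struct resctrl_mon_config_info *mon_info = info;

	if (mon_info->evtid < MOCK_NUM_EVENTS)
		mock_event_config[mon_info->evtid] = mon_info->mon_config;
}

/**
 * resctrl_arch_mon_event_config_read() - Read the config for an event.
 * @info:	Pointer to a struct resctrl_mon_config_info. The value read
 *		is returned in the mon_config member.
 */
void resctrl_arch_mon_event_config_read(void *info)
{
	struct resctrl_mon_config_info *mon_info = info;

	mon_info->mon_config = mon_info->evtid < MOCK_NUM_EVENTS ?
			       mock_event_config[mon_info->evtid] : 0;
}
```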
  
>   /*
>    * Update the ctrl_val and apply this config right now.

Reinette

* Re: [PATCH v3 21/38] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
  2024-06-14 15:00 ` [PATCH v3 21/38] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
@ 2024-06-28 16:53   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:53 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> The mbm_cfg_mask field lists the bits that user-space can set when
> configuring an event. This value is output via the last_cmd_status
> file.
> 
> Once the filesystem parts of resctrl are moved to live in /fs/, the
> struct rdt_hw_resource is inaccessible to the filesystem code. Because
> this value is output to user-space, it has to be accessible to the
> filesystem code.
> 
> Move it to struct rdt_resource.

Change looks good. Please do note that there is work in progress to
consolidate the monitoring related data within a new struct that will
impact this change:
https://lore.kernel.org/lkml/8f73c9ec4c9999c262d9297d46a03209a8affe3f.1716552602.git.babu.moger@amd.com/

Reinette

* Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
  2024-06-14 15:00 ` [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
@ 2024-06-28 16:53   ` Reinette Chatre
  2024-07-04 16:41     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:53 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
> resctrl can't be built as a module, and the kernfs helpers are not exported
> so this is unlikely to change. MPAM has an error interrupt which indicates
> the MPAM driver has gone haywire. Should this occur tasks could run with
> the wrong control values, leading to bad performance for impoartant tasks.

impoartant -> important

> The MPAM driver needs a way to tell resctrl that no further configuration
> should be attempted.
> 
> Using resctrl_exit() for this leaves the system in a funny state as
> resctrl is still mounted, but cannot be un-mounted because the sysfs
> directory that is typically used has been removed. Dave Martin suggests
> this may cause systemd trouble in the future as not all filesystems
> can be unmounted.
> 
> Add calls to remove all the files and directories in resctrl, and
> remove the sysfs_remove_mount_point() call that leaves the system
> in a funny state. When triggered, this causes all the resctrl files
> to disappear. resctrl can be unmounted, but not mounted again.

I am not familiar with these flows so I would like to confirm ...
In this scenario the resctrl filesystem will be unregistered. Are
you saying that it is possible to unmount a filesystem after it has
been unregistered?

Reinette

* Re: [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code
  2024-06-14 15:00 ` [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code James Morse
@ 2024-06-28 16:54   ` Reinette Chatre
  2024-07-04 16:40     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 16:54 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
> for the common parts of the resctrl interface and make X86_CPU_RESCTRL
> depend on this.
> 
> Adding an include of asm/resctrl.h to linux/resctrl.h allows some
> of the files to switch over to using this header instead.
> 
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v2:
>   * Dropped KERNFS dependency from arch side Kconfig.
>   * Added empty trace.h file.
>   * Merged asm->linux includes from Dave's patch to decouple those
>     patches from this series.
> 
> Changes since v1:
>   * Rename new file psuedo_lock.c to pseudo_lock.c, to match the name
>     of the original file (and to be less surprising).
> 
>   * [Whitespace only] Under RESCTRL_FS in fs/resctrl/Kconfig, delete
>     alignment space in orphaned select ... if (which has nothing to line
>     up with any more).
> 
>   * [Whitespace only] Reflow and re-tab Kconfig additions.
> ---
>   MAINTAINERS                               |  1 +
>   arch/Kconfig                              |  8 +++++
>   arch/x86/Kconfig                          |  5 ++--
>   arch/x86/kernel/cpu/resctrl/internal.h    |  3 +-
>   arch/x86/kernel/cpu/resctrl/monitor.c     |  2 +-
>   arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  2 +-
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
>   fs/Kconfig                                |  1 +
>   fs/Makefile                               |  1 +
>   fs/resctrl/Kconfig                        | 36 +++++++++++++++++++++++
>   fs/resctrl/Makefile                       |  3 ++
>   fs/resctrl/ctrlmondata.c                  |  0
>   fs/resctrl/internal.h                     |  0
>   fs/resctrl/monitor.c                      |  0
>   fs/resctrl/pseudo_lock.c                  |  0
>   fs/resctrl/rdtgroup.c                     |  0
>   fs/resctrl/trace.h                        |  0
>   include/linux/resctrl.h                   |  4 +++
>   18 files changed, 61 insertions(+), 7 deletions(-)
>   create mode 100644 fs/resctrl/Kconfig
>   create mode 100644 fs/resctrl/Makefile
>   create mode 100644 fs/resctrl/ctrlmondata.c
>   create mode 100644 fs/resctrl/internal.h
>   create mode 100644 fs/resctrl/monitor.c
>   create mode 100644 fs/resctrl/pseudo_lock.c
>   create mode 100644 fs/resctrl/rdtgroup.c
>   create mode 100644 fs/resctrl/trace.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 441b039068d8..64195c298baf 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -18859,6 +18859,7 @@ S:	Supported
>   F:	Documentation/arch/x86/resctrl*
>   F:	arch/x86/include/asm/resctrl.h
>   F:	arch/x86/kernel/cpu/resctrl/
> +F:	fs/resctrl/
>   F:	include/linux/resctrl*.h
>   F:	tools/testing/selftests/resctrl/
>   
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 975dd22a2dbd..4156604dd926 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1431,6 +1431,14 @@ config STRICT_MODULE_RWX
>   config ARCH_HAS_PHYS_TO_DMA
>   	bool
>   
> +config ARCH_HAS_CPU_RESCTRL
> +	bool
> +	help
> +	  An architecture selects this option to indicate that the necessary
> +	  hooks are provided to support the common memory system usage
> +	  monitoring and control interfaces provided by the 'resctrl'
> +	  filesystem (see RESCTRL_FS).
> +
>   config HAVE_ARCH_COMPILER_H
>   	bool
>   	help
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 446984277b45..e4dd4097e10f 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -490,8 +490,9 @@ config X86_MPPARSE
>   config X86_CPU_RESCTRL
>   	bool "x86 CPU resource control support"
>   	depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
> -	select KERNFS
> -	select PROC_CPU_RESCTRL		if PROC_FS
> +	depends on MISC_FILESYSTEMS
> +	select ARCH_HAS_CPU_RESCTRL
> +	select RESCTRL_FS
>   	select RESCTRL_FS_PSEUDO_LOCK
>   	help
>   	  Enable x86 CPU resource control support.
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index bad103f20663..6f6785a31efe 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -7,10 +7,9 @@
>   #include <linux/kernfs.h>
>   #include <linux/fs_context.h>
>   #include <linux/jump_label.h>
> +#include <linux/resctrl.h>
>   #include <linux/tick.h>
>   
> -#include <asm/resctrl.h>
> -
>   #define L3_QOS_CDP_ENABLE		0x01ULL
>   
>   #define L2_QOS_CDP_ENABLE		0x01ULL
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 145bd05eafa5..a1539edb25fa 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -17,11 +17,11 @@
>   
>   #include <linux/cpu.h>
>   #include <linux/module.h>
> +#include <linux/resctrl.h>
>   #include <linux/sizes.h>
>   #include <linux/slab.h>
>   
>   #include <asm/cpu_device_id.h>
> -#include <asm/resctrl.h>
>   
>   #include "internal.h"
>   #include "trace.h"
> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
> index c096fa106b80..97e901009c91 100644
> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
> +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
> @@ -19,12 +19,12 @@
>   #include <linux/mman.h>
>   #include <linux/perf_event.h>
>   #include <linux/pm_qos.h>
> +#include <linux/resctrl.h>
>   #include <linux/slab.h>
>   #include <linux/uaccess.h>
>   
>   #include <asm/cacheflush.h>
>   #include <asm/cpu_device_id.h>
> -#include <asm/resctrl.h>
>   #include <asm/perf_event.h>
>   
>   #include "../../events/perf_event.h" /* For X86_CONFIG() */
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 969c454b67f1..c7cbd30ac0f2 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -19,6 +19,7 @@
>   #include <linux/fs_parser.h>
>   #include <linux/sysfs.h>
>   #include <linux/kernfs.h>
> +#include <linux/resctrl.h>
>   #include <linux/seq_buf.h>
>   #include <linux/seq_file.h>
>   #include <linux/sched/signal.h>
> @@ -29,7 +30,6 @@
>   
>   #include <uapi/linux/magic.h>
>   
> -#include <asm/resctrl.h>
>   #include "internal.h"
>   
>   DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
> diff --git a/fs/Kconfig b/fs/Kconfig
> index a46b0cbc4d8f..d8a36383b6dc 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -331,6 +331,7 @@ source "fs/omfs/Kconfig"
>   source "fs/hpfs/Kconfig"
>   source "fs/qnx4/Kconfig"
>   source "fs/qnx6/Kconfig"
> +source "fs/resctrl/Kconfig"
>   source "fs/romfs/Kconfig"
>   source "fs/pstore/Kconfig"
>   source "fs/sysv/Kconfig"
> diff --git a/fs/Makefile b/fs/Makefile
> index 6ecc9b0a53f2..da6e2d028722 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -129,3 +129,4 @@ obj-$(CONFIG_EFIVAR_FS)		+= efivarfs/
>   obj-$(CONFIG_EROFS_FS)		+= erofs/
>   obj-$(CONFIG_VBOXSF_FS)		+= vboxsf/
>   obj-$(CONFIG_ZONEFS_FS)		+= zonefs/
> +obj-$(CONFIG_RESCTRL_FS)	+= resctrl/
> diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
> new file mode 100644
> index 000000000000..a5fbda54d32f
> --- /dev/null
> +++ b/fs/resctrl/Kconfig
> @@ -0,0 +1,36 @@
> +config RESCTRL_FS
> +	bool "CPU Resource Control Filesystem (resctrl)"
> +	depends on ARCH_HAS_CPU_RESCTRL
> +	select KERNFS
> +	select PROC_CPU_RESCTRL if PROC_FS
> +	help
> +	  Some architectures provide hardware facilities to group tasks and
> +	  monitor and control their usage of memory system resources such as
> +	  caches and memory bandwidth.  Examples of such facilities include
> +	  Intel's Resource Director Technology (Intel(R) RDT) and AMD's
> +	  Platform Quality of Service (AMD QoS).
> +
> +	  If your system has the necessary support and you want to be able to
> +	  assign tasks to groups and manipulate the associated resource
> +	  monitors and controls from userspace, say Y here to get a mountable
> +	  'resctrl' filesystem that lets you do just that.
> +
> +	  If nothing mounts or prods the 'resctrl' filesystem, resource
> +	  controls and monitors are left in a quiescent, permissive state.
> +
> +	  If unsure, it is safe to say N.
> +

Will the user ever get the opportunity to say "Y" or "N"? It looks to me that
RESCTRL_FS will be "forced" on the user as it is selected by the arch specific
config X86_CPU_RESCTRL, and be invisible otherwise because of the dependency
on ARCH_HAS_CPU_RESCTRL. The text about when to select "Y" or "N" thus does
not look practical to me and it may be helpful to instead provide
information about when it is selected? I do not know the customs for this
text, or whether it is also intended to document any future usages.


Reinette

* Re: [PATCH v3 37/38] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
  2024-06-14 15:00 ` [PATCH v3 37/38] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
@ 2024-06-28 17:04   ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-06-28 17:04 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> Once the filesystem parts of resctrl move to fs/resctrl, it cannot rely
> on definitions in x86's internal.h.
> 
> Move definitions in internal.h that need to be shared between the
> filesystem and architecture code to header files that fs/resctrl can
> include.
> 
> Doing this separately means the filesystem code only moves between files
> of the same name, instead of having these changes mixed in too.
> 
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ----

"----" -> "---" to prevent following text from being included in changelog.

Reinette

* Re: [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers
  2024-06-28 16:48   ` Reinette Chatre
@ 2024-07-01 18:16     ` James Morse
  2024-07-01 21:10       ` Reinette Chatre
  0 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-07-01 18:16 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 28/06/2024 17:48, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> The for_each_*_rdt_resource() helpers walk the architecture's array
>> of structures, using the resctrl visible part as an iterator. These
>> became over-complex when the structures were split into a
>> filesystem and architecture-specific struct. This approach avoided
>> the need to touch every call site.
>>
>> Once the filesystem parts of resctrl are moved to /fs/, both the
>> architecture's resource array, and the definition of those structures
>> is no longer accessible. To support resctrl, each architecture would
>> have to provide equally complex macros.
>>
>> Change the resctrl code that uses these to walk through the resource_level
>> enum and check the mon/alloc capable flags instead. Instances in core.c,
>> and resctrl_arch_reset_resources() remain part of x86's architecture
>> specific code.

>> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>> b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>> index aacf236dfe3b..ad20822bb64e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>> +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>> @@ -854,7 +855,11 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
>>        * First determine which cpus have pseudo-locked regions
>>        * associated with them.
>>        */
>> -    for_each_alloc_capable_rdt_resource(r) {
>> +    for (i = 0; i < RDT_NUM_RESOURCES; i++) {
>> +        r = resctrl_arch_get_resource(i);
>> +        if (!r->alloc_capable)
>> +            continue;
>> +
> 
> This looks like enough duplicate boilerplate for a new macro. For simplicity the
> macro could require two arguments with enum resctrl_res_level also provided?

I was hoping to escape from these clever macros! If you think this is too much:
- we'd need to come up with another name, as the arch code keeps the existing definition.
- to avoid touching every caller, it needs doing without an explicit iterator variable.

I guess the cleanest thing is to redefine the existing macros to use
resctrl_arch_get_resource(). Putting this in include/linux/resctrl.h at least avoids each
architecture needing to define these, or forcing it to use an array.

The result is slightly more readable than the current version:
| #define for_each_rdt_resource(_r)                              \
|        for (_r = resctrl_arch_get_resource(0);                 \
|             _r->rid < RDT_NUM_RESOURCES;                       \
|             _r = resctrl_arch_get_resource(_r->rid + 1))

This leans heavily on resctrl_arch_get_resource() not being able to return NULL, and
having to return a dummy resource that is neither alloc nor mon capable. We may need to
revisit that if it becomes a burden for the arch code.
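
For illustration only, the dummy-resource idea can be modelled in a few lines
of user-space C. This is a sketch, not the kernel code: the struct layout, the
three resource ids, the 'dummy' sentinel and the alloc-capable wrapper are
assumptions made for the example. Only resctrl_arch_get_resource(),
RDT_NUM_RESOURCES and the shape of for_each_rdt_resource() come from this
thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resource ids; the real enum resctrl_res_level differs. */
enum { RES_L3, RES_L2, RES_MBA, RDT_NUM_RESOURCES };

struct rdt_resource {
	int	rid;
	bool	alloc_capable;
	bool	mon_capable;
};

static struct rdt_resource resources[RDT_NUM_RESOURCES] = {
	{ .rid = RES_L3,  .alloc_capable = true,  .mon_capable = true  },
	{ .rid = RES_L2,  .alloc_capable = false, .mon_capable = false },
	{ .rid = RES_MBA, .alloc_capable = true,  .mon_capable = false },
};

/* Sentinel: neither alloc nor mon capable, and its rid is out of range. */
static struct rdt_resource dummy = { .rid = RDT_NUM_RESOURCES };

/* Never returns NULL: out-of-range ids get the sentinel instead. */
static struct rdt_resource *resctrl_arch_get_resource(int rid)
{
	if (rid < 0 || rid >= RDT_NUM_RESOURCES)
		return &dummy;
	return &resources[rid];
}

#define for_each_rdt_resource(_r)					\
	for (_r = resctrl_arch_get_resource(0);				\
	     _r->rid < RDT_NUM_RESOURCES;				\
	     _r = resctrl_arch_get_resource(_r->rid + 1))

/* Assumed wrapper: filter the walk on the alloc_capable flag. */
#define for_each_alloc_capable_rdt_resource(_r)				\
	for_each_rdt_resource(_r)					\
		if (_r->alloc_capable)

/* Example caller: count the alloc capable resources. */
static int count_alloc_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_alloc_capable_rdt_resource(r)
		n++;

	return n;
}
```

The walk terminates because the sentinel's rid is RDT_NUM_RESOURCES. If
resctrl_arch_get_resource() could return NULL, the loop condition would
dereference it.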


Thanks,

James

* Re: [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header
  2024-06-28 16:45   ` Reinette Chatre
@ 2024-07-01 18:16     ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-07-01 18:16 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 28/06/2024 17:45, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> When resctrl is fully factored into core and per-arch code, each arch
>> will need to use some resctrl common definitions in order to define its
>> own specializations and helpers.  Following conventional practice, it
>> would be desirable to put the dependent arch definitions in an
>> <asm/resctrl.h> header that is included by the common <linux/resctrl.h>
>> header.  However, this can make it awkward to avoid a circular
>> dependency between <linux/resctrl.h> and the arch header.
>>
>> To avoid such dependencies, move the affected common types and
>> constants into a new header that does not need to depend on
>> <linux/resctrl.h> or on the arch headers.
>>
>> The same logic applies to the monitor-configuration defines, move these
>> too.
>>
>> Some kind of enumeration for events is needed between the filesystem
>> and architecture code. Take the x86 definition as it's convenient for
>> x86.
>>
>> The definition of enum resctrl_event_id is need to allow the architecture
> 
> "is need" -> "is needed" ?


>> code to define resctrl_arch_event_is_free_running(),
> 
> Cannot find resctrl_arch_event_is_free_running()

Sorry - this will show up after the MPAM driver:
{
	MPAM has an additional piece of hardware that needs to be allocated to read the
	memory bandwidth counters. resctrl expects these things to be pre-allocated and
	free running from the start of time.
	I have some patches to explicitly tell resctrl this, so that the resctrl interface
	to these things can be used by perf to query the 'mbm' counters, even if the files
	are not exposed.
}

>> resctrl_arch_set_cdp_enabled(), resctrl_arch_mon_ctx_alloc() and
> 
> resctrl_arch_set_cdp_enabled() should not need enum resctrl_event_id

Sorry, wrong list.


>> resctrl_arch_mon_ctx_free().
>>
>> The definition of enum resctrl_res_level is needed to allow the
>> architecture code to define resctrl_arch_set_cdp_enabled() and
>> resctrl_arch_get_cdp_enabled().
>>
>> The bits for mbm_local_bytes_config et al are ABI, and must be the same
>> on all architectures. These are documented in
>> Documentation/arch/x86/resctrl.rst
>>
>> The maintainers entry for these headers was missed when resctrl.h was
>> created. Add a wildcard entry to match both resctrl.h and
>> resctrl_types.h.

>> ---
>>   MAINTAINERS                            |  1 +
>>   arch/x86/kernel/cpu/resctrl/internal.h | 24 ------------
>>   include/linux/resctrl.h                | 21 +---------
>>   include/linux/resctrl_types.h          | 54 ++++++++++++++++++++++++++
> 
> Considering the motivation I also expected to see a change in
> arch/x86/include/asm/resctrl.h that adds the #include of the new file.

It gets added in a later patch - but I'll move it here.


Thanks,

James

* Re: [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call
  2024-06-28 16:47   ` Reinette Chatre
@ 2024-07-01 18:17     ` James Morse
  2024-07-01 21:11       ` Reinette Chatre
  0 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-07-01 18:17 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 28/06/2024 17:47, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> rdt_get_mon_l3_config() is called from the architecture's
>> resctrl_arch_late_init(), and initialises both architecture specific
>> fields, such as hw_res->mon_scale and resctrl filesystem fields
>> by calling dom_data_init().
>>
>> To separate the filesystem and architecture parts of resctrl, this
>> function needs splitting up.
>>
>> Add resctrl_mon_resource_init() to do the filesystem specific work,
>> and call it from resctrl_init(). This runs later, but is still before
>> the filesystem is mounted and the rmid_ptrs[] array can be used.

>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 7d6aebce75c1..527c0e9d7b2e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1016,12 +1016,28 @@ static void l3_mon_evt_init(struct rdt_resource *r)
>>           list_add_tail(&mbm_local_event.list, &r->evt_list);
>>   }
>>   +int resctrl_mon_resource_init(void)
> 
> (Lack of an __init is unexpected but I assume it was done since that will be removed
> in later patch anyway?)

Yup - I'll add and remove that if you find it surprising.


> This function needs a big warning to deter anybody from considering this to
> be the place where any and all monitor related allocations happen. It needs
> to warn developers that only resources that can only be touched after fs mount
> may be allocated here.

I'm afraid I don't follow. Can you give an example of the scenario you are worried about?

This is called from resctrl_init(), which is called once the architecture code has done
its setup, and reckons resctrl is something that can be supported on this platform. It
would be safe for the limbo/overflow callbacks to start ticking after this point - but
there is no point if the filesystem isn't mounted yet.
Filesystem mount is triggered through rdt_get_tree(). The filesystem can't be mounted
until resctrl_init() goes on to call register_filesystem().
These allocations could be made later (at mount time), but they're allocated once up-front
today.
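
As a toy model only, the ordering described above can be written as
user-space C. Everything here is a stand-in: 'fs_registered' replaces
register_filesystem(), mount_resctrl() replaces the mount path that ends in
rdt_get_tree(), and the array size is arbitrary. It is meant to show the
ordering, not the real allocations.

```c
#include <stdbool.h>
#include <stdlib.h>

static int *rmid_ptrs;		/* stand-in for the filesystem's rmid_ptrs[] */
static bool fs_registered;	/* stand-in for register_filesystem() state */

/* Filesystem side: allocate once, up front. */
static int resctrl_mon_resource_init(void)
{
	rmid_ptrs = calloc(16, sizeof(*rmid_ptrs));	/* 16 is arbitrary */

	return rmid_ptrs ? 0 : -1;
}

/* Called once the architecture code has finished its own setup. */
static int resctrl_init(void)
{
	int ret;

	ret = resctrl_mon_resource_init();
	if (ret)
		return ret;

	fs_registered = true;	/* only now can a mount be attempted */

	return 0;
}

/* Hypothetical mount entry point: fails until resctrl_init() has run. */
static int mount_resctrl(void)
{
	if (!fs_registered)
		return -1;

	return rmid_ptrs ? 0 : -1;
}
```

In this model a mount can never observe a missing rmid_ptrs[], because
registration happens after the allocation has succeeded.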


I've added:
/**
 * resctrl_mon_resource_init() - Initialise monitoring structures.
 *
 * Allocate and initialise the rmid_ptrs[] used for the limbo and free lists.
 * Called once during boot after the struct rdt_resource's have been configured
 * but before the filesystem is mounted.
 */


Thanks,

James

* Re: [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags
  2024-06-28 16:43   ` Reinette Chatre
@ 2024-07-01 18:17     ` James Morse
  2024-07-01 21:09       ` Reinette Chatre
  0 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-07-01 18:17 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi Reinette,

On 28/06/2024 17:43, Reinette Chatre wrote:
> On 6/14/24 7:59 AM, James Morse wrote:
>> resctrl has three types of control, these emerge from the way the
>> architecture initialises a number of properties in struct rdt_resource.
>>
>> A group of these properties need to be set the same on all architectures,
>> it would be better to specify the format the schema entry should use, and
>> allow resctrl to generate all the other properties it needs. This avoids
>> architectures having divergant behaviour here.
> 
> divergant -> divergent ?
> 
>>
>> Add a schema format enum, and as a first use, replace the fflags member
>> of struct rdt_resource.
>>
>> The MBA schema has a different format between AMD and Intel systems.
>> The schema_fmt property is changed by __rdt_get_mem_config_amd() to
>> enable the MBPS format.

>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index e3edc41882dc..b12307d465bc 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2162,6 +2162,19 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>>       return ret;
>>   }
>>   +static u32 fflags_from_resource(struct rdt_resource *r)
>> +{
>> +    switch (r->schema_fmt) {
>> +    case RESCTRL_SCHEMA_BITMAP:
>> +        return RFTYPE_RES_CACHE;
>> +    case RESCTRL_SCHEMA_PERCENTAGE:
>> +    case RESCTRL_SCHEMA_MBPS:
>> +        return RFTYPE_RES_MB;
>> +    }
>> +
>> +    return WARN_ON_ONCE(1);
>> +}
>> +
> 
> The fflags returned specifies which files will be associated with the resource
> in the "info" directory. Basing this on a property of the schema does not look
> right to me. I understand that many of the info files relate to, for example,
> information related to the bitmap used by the cache,

Do we agree that some of them are?

One reason for doing this is it decouples the parsing and management of bitmaps from "this
is the L3 cache", which will make it much easier to support bitmaps on some other kind of
resource.

Ultimately I'd like to expose these to user-space, so that user-space can work out how to
configure resources it doesn't recognise. Today '100' could be a percentage, a bitmap, or
a value in MB/s, and some knowledge of the control type is needed to work out which.


> but that is not the same for
> info files related to the MBA resource (all info files related to MBA resource
> are not about the schema property format).

Hmmm, because the files min_bandwidth and bandwidth_gran both have bandwidth in their name?

I agree 'delay_linear' and 'thread_throttle_mode' are a bit strange.


> I do not think the type of values of a schema should dictate which files
> appear in the info directory.

Longer term I think this will be a problem. We probably only have 3 types of control:
percentage, bitmap and MB/s... but if each resource on each architecture adds files here
the list will quickly grow. User-space won't be able to work out how to configure a
resource type it hadn't seen before.

This may not be the time - but I think eventually resctrl shouldn't have to care about
what resources the architecture is presenting.
For these files, we may need to duplicate 'min_bandwidth' as 'min_percentage'. MBA would
have both, but any new controls using percentage wouldn't expose them.


> Doesn't MPAM support percentage for cache resources
> and bitmaps for memory resources?

It can have fixed-point-fractions and bitmaps for both caches and memory. Unfortunately
everything in MPAM is optional - the driver converts whatever it finds for memory
bandwidth to a percentage as that is what resctrl and user-space expect.
I can't do the same for cache controls as bitmaps implicitly isolate portions, something
that can't be done with the fractional control. So far everyone has built the bitmaps
because it's the easiest implementation - but I have had requests to support the cache
fixed-point-fraction. Doing it as a percentage is least invasive to resctrl...


> Can the fflags rather depend on the resource type itself, by using the rid?

Sure.



Thanks,

James

>> @@ -2182,14 +2195,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>>       /* loop over enabled controls, these are all alloc_capable */
>>       list_for_each_entry(s, &resctrl_schema_all, list) {
>>           r = s->res;
>> -        fflags = r->fflags | RFTYPE_CTRL_INFO;
>> +        fflags =  fflags_from_resource(r) | RFTYPE_CTRL_INFO;
> 
> (please watch for extra spaces)
> 
>>           ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
>>           if (ret)
>>               goto out_destroy;
>>       }
>>         for_each_mon_capable_rdt_resource(r) {
>> -        fflags = r->fflags | RFTYPE_MON_INFO;
>> +        fflags =  fflags_from_resource(r) | RFTYPE_MON_INFO;
> 
> (please watch for extra spaces)
> 
> Reinette


* Re: [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call
  2024-06-28 16:46   ` Reinette Chatre
@ 2024-07-01 18:17     ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-07-01 18:17 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 28/06/2024 17:46, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> rdt_put_mon_l3_config() is called via the architecture's
>> resctrl_arch_exit() call, and appears to free the rmid_ptrs[]
>> and closid_num_dirty_rmid[] arrays. In reality this code is marked
>> __exit, and is removed by the linker as resctrl can't be built
>> as a module.
>>
>> To separate the filesystem and architecture parts of resctrl,
>> this free()ing work needs to be triggered by the filesystem,
>> as these structures belong to the filesystem code.
>>
>> Rename rdt_put_mon_l3_config() resctrl_mon_resource_exit()
>> and call it from resctrl_exit(). The kfree() is currently
>> dependent on r->mon_capable. resctrl_mon_resource_init()

> resctrl_mon_resource_init() does not exist at this point making
> this motivation difficult to follow.

Right - I re-ordered the patches to make the diffs simpler. I'll drop this paragraph.


>> takes no arguments, so resctrl_mon_resource_exit() shouldn't
>> take any either. Add the check to dom_data_exit(), making it
>> take the resource as an argument. This makes it more symmetrical
>> with dom_data_init().

>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 3e5375c365e6..7d6aebce75c1 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -965,10 +965,12 @@ static int dom_data_init(struct rdt_resource *r)
>>       return err;
>>   }
>>   -static void __exit dom_data_exit(void)
>> +static void dom_data_exit(struct rdt_resource *r)
>>   {
>> -    mutex_lock(&rdtgroup_mutex);
>> +    if (!r->mon_capable)
>> +        return;
>>   +    mutex_lock(&rdtgroup_mutex);
> 
> I know there has been a bit of back&forth on whether the mutex is needed
> here. With this change moving dom_data_exit() out from __exit I think
> the locking should aim to be consistent with existing runtime
> and thus the check of r->mon_capable should be with mutex held. Having
> this little snippet outside mutex will just cause confusion.

> Do you have motivation for needing this be done outside of mutex?

Just to avoid sleeping - then returning having done nothing.


> I think it
> ended up this way with this patch aiming to keep existing flow exactly,
> but that ended up as convenience in a flow where mutex was not really
> needed at all.

Sure - I don't think it matters either way. It's not a path where performance matters.
The property is read-only from when resctrl_init() is called. I'll make it look like
resctrl_offline_domain().
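
A rough user-space sketch of the shape being agreed on, with the
r->mon_capable check moved under the mutex. The 'struct mutex' here is a toy
that only records lock state, and the single free() stands in for whatever
dom_data_exit() really releases; both are assumptions for the example.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for the kernel mutex: it only tracks lock state. */
struct mutex {
	bool locked;
};

static struct mutex rdtgroup_mutex;

static void mutex_lock(struct mutex *m)   { m->locked = true; }
static void mutex_unlock(struct mutex *m) { m->locked = false; }

/* Reduced to the one field this example needs. */
struct rdt_resource {
	bool mon_capable;
};

static int *rmid_ptrs;	/* stand-in for the filesystem's rmid_ptrs[] */

static void dom_data_exit(struct rdt_resource *r)
{
	mutex_lock(&rdtgroup_mutex);

	/* Checked with the mutex held, consistent with the runtime paths. */
	if (!r->mon_capable)
		goto out_unlock;

	free(rmid_ptrs);
	rmid_ptrs = NULL;

out_unlock:
	mutex_unlock(&rdtgroup_mutex);
}
```

The cost is taking the lock on a path that then does nothing, which should
not matter as this is not a path where performance matters.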


Thanks,

James

* Re: [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags
  2024-07-01 18:17     ` James Morse
@ 2024-07-01 21:09       ` Reinette Chatre
  2024-08-02 17:24         ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-07-01 21:09 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi James,

On 7/1/24 11:17 AM, James Morse wrote:
> Hi Reinette,
> 
> On 28/06/2024 17:43, Reinette Chatre wrote:
>> On 6/14/24 7:59 AM, James Morse wrote:
>>> resctrl has three types of control, these emerge from the way the
>>> architecture initialises a number of properties in struct rdt_resource.
>>>
>>> A group of these properties need to be set the same on all architectures,
>>> it would be better to specify the format the schema entry should use, and
>>> allow resctrl to generate all the other properties it needs. This avoids
>>> architectures having divergant behaviour here.
>>
>> divergant -> divergent ?
>>
>>>
>>> Add a schema format enum, and as a first use, replace the fflags member
>>> of struct rdt_resource.
>>>
>>> The MBA schema has a different format between AMD and Intel systems.
>>> The schema_fmt property is changed by __rdt_get_mem_config_amd() to
>>> enable the MBPS format.
> 
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index e3edc41882dc..b12307d465bc 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -2162,6 +2162,19 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>>>        return ret;
>>>    }
>>>    +static u32 fflags_from_resource(struct rdt_resource *r)
>>> +{
>>> +    switch (r->schema_fmt) {
>>> +    case RESCTRL_SCHEMA_BITMAP:
>>> +        return RFTYPE_RES_CACHE;
>>> +    case RESCTRL_SCHEMA_PERCENTAGE:
>>> +    case RESCTRL_SCHEMA_MBPS:
>>> +        return RFTYPE_RES_MB;
>>> +    }
>>> +
>>> +    return WARN_ON_ONCE(1);
>>> +}
>>> +
>>
>> The fflags returned specifies which files will be associated with the resource
>> in the "info" directory. Basing this on a property of the schema does not look
>> right to me. I understand that many of the info files relate to, for example,
>> information related to the bitmap used by the cache,
> 
> Do we agree that some of them are?
> 
> One reason for doing this is it decouples the parsing and management of bitmaps from "this
> is the L3 cache", which will make it much easier to support bitmaps on some other kind of
> resource.

The way I see it is that it changes the meaning of the RFTYPE_RES_CACHE flag from "this is a
file related to the cache resource" to "this is a file containing a bitmap property".
It prevents us from easily adding a file related to the cache resource, which
the info directory is intended to contain.

> 
> Ultimately I'd like to expose these to user-space, so that user-space can work out how to
> configure resources it doesn't recognise. Today '100' could be a percentage, a bitmap, or
> a value in MB/s. Today some knowledge of the control type is needed to work this out.
> 
> 
>> but that is not the same for
>> info files related to the MBA resource (all info files related to MBA resource
>> are not about the schema property format).
> 
> Hmmm, because the files min_bandwidth and bandwidth_gran both have bandwidth in their name?
> 
> I agree 'delay_linear' and 'thread_throttle_mode' are a bit strange.

Right. This is not a clean association.

> 
> 
>> I do not think the type of values of a schema should dictate which files
>> appear in the info directory.
> 
> Longer term I think this will be a problem. We probably only have 3 types of control:
> percentage, bitmap and MB/s... but if each resource on each architecture adds files here
> the list will quickly grow. User-space won't be able to work out how to configure a
> resource type it hadn't seen before.

That is fair. This makes the type of control a property of the resource as is done in this
series. Perhaps this can be exposed to user space via the info directory?

Possibly the files related to control can have new flags that reflect the control type
instead of the resource. For example, "bit_usage" currently has
"RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE" and that could be (for lack of better
term) "RFTYPE_CTRL_INFO | RFTYPE_CTRL_BITMAP" to disconnect the control type from the
resource. Doing so may then map nicely to the fflags_from_resource() in this patch that
connects the schema format to the _control_ type flag. As we have found there is not
a clear mapping between the control type and the resource type so I expect RFTYPE_RES_CACHE
and RFTYPE_RES_MB to remain and be associated with files that contain information
specific to that resource. This enables future additions of files containing cache specific
(non-bitmap) properties to still be added (with RFTYPE_RES_CACHE flag) without impacting
everything that uses a bitmap.
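
To make the idea concrete, here is a minimal sketch in plain C. The bit
values are placeholders, RFTYPE_CTRL_BITMAP and its two siblings are
hypothetical names, the resource list is reduced, and file_visible() assumes
a file is created when all of its flags are set on the directory. None of
this is taken from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for this example; not the kernel's. */
#define RFTYPE_CTRL_INFO	(1u << 0)
#define RFTYPE_RES_CACHE	(1u << 1)	/* file is about the cache resource */
#define RFTYPE_RES_MB		(1u << 2)	/* file is about the MB resource */
#define RFTYPE_CTRL_BITMAP	(1u << 3)	/* hypothetical control type flags */
#define RFTYPE_CTRL_PERCENTAGE	(1u << 4)
#define RFTYPE_CTRL_MBPS	(1u << 5)

enum resctrl_schema_fmt {
	RESCTRL_SCHEMA_BITMAP,
	RESCTRL_SCHEMA_PERCENTAGE,
	RESCTRL_SCHEMA_MBPS,
};

/* Reduced list of resource ids for the example. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
};

struct rdt_resource {
	enum resctrl_res_level	rid;
	enum resctrl_schema_fmt	schema_fmt;
};

/* Control type flag: follows the schema format. */
static uint32_t fflags_from_schema_fmt(enum resctrl_schema_fmt fmt)
{
	switch (fmt) {
	case RESCTRL_SCHEMA_BITMAP:
		return RFTYPE_CTRL_BITMAP;
	case RESCTRL_SCHEMA_PERCENTAGE:
		return RFTYPE_CTRL_PERCENTAGE;
	case RESCTRL_SCHEMA_MBPS:
		return RFTYPE_CTRL_MBPS;
	}

	return 0;
}

/* Resource flag: follows the resource itself, via its rid. */
static uint32_t fflags_from_rid(enum resctrl_res_level rid)
{
	switch (rid) {
	case RDT_RESOURCE_L3:
	case RDT_RESOURCE_L2:
		return RFTYPE_RES_CACHE;
	case RDT_RESOURCE_MBA:
		return RFTYPE_RES_MB;
	}

	return 0;
}

/* Flags for a resource's control directory under "info". */
static uint32_t ctrl_info_fflags(const struct rdt_resource *r)
{
	return RFTYPE_CTRL_INFO | fflags_from_rid(r->rid) |
	       fflags_from_schema_fmt(r->schema_fmt);
}

/* Assumed rule: a file appears when all of its flags are set. */
static bool file_visible(uint32_t dir_fflags, uint32_t file_fflags)
{
	return (dir_fflags & file_fflags) == file_fflags;
}
```

With this split, a file tagged "RFTYPE_CTRL_INFO | RFTYPE_CTRL_BITMAP" would
show up for any resource controlled by a bitmap, while a file tagged
"RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE" stays cache specific.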

What do you think?

> 
> This may not be the time - but I think eventually resctrl shouldn't have to care about
> what resources the architecture is presenting.
> For these files, we may need to duplicate 'min_bandwidth' as 'min_percentage'. MBA would
> have both, but any new controls using percentage wouldn't expose them.
> 
> 
>> Doesn't MPAM support percentage for cache resources
>> and bitmaps for memory resources?
> 
> It can have fixed-point-fractions and bitmaps for both caches and memory. Unfortunately
> everything in MPAM is optional - the driver converts whatever it finds for memory
> bandwidth to a percentage as that is what resctrl and user-space expect.
> I can't do the same for cache controls as bitmaps implicitly isolate portions, something
> that can't be done with the fractional control. So far everyone has built the bitmaps
> because it's the easiest implementation - but I have had requests to support the cache
> fixed-point-fraction. Doing it as a percentage is least invasive to resctrl...
> 
> 
>> Can the fflags rather depend on the resource type itself, by using the rid?
> 
> Sure.
> 

Reinette



* Re: [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers
  2024-07-01 18:16     ` James Morse
@ 2024-07-01 21:10       ` Reinette Chatre
  2024-08-02 17:22         ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-07-01 21:10 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 7/1/24 11:16 AM, James Morse wrote:
> Hi Reinette,
> 
> On 28/06/2024 17:48, Reinette Chatre wrote:
>> On 6/14/24 8:00 AM, James Morse wrote:
>>> The for_each_*_rdt_resource() helpers walk the architecture's array
>>> of structures, using the resctrl visible part as an iterator. These
>>> became over-complex when the structures were split into a
>>> filesystem and architecture-specific struct. This approach avoided
>>> the need to touch every call site.
>>>
>>> Once the filesystem parts of resctrl are moved to /fs/, both the
>>> architecture's resource array, and the definition of those structures
>>> is no longer accessible. To support resctrl, each architecture would
>>> have to provide equally complex macros.
>>>
>>> Change the resctrl code that uses these to walk through the resource_level
>>> enum and check the mon/alloc capable flags instead. Instances in core.c,
>>> and resctrl_arch_reset_resources() remain part of x86's architecture
>>> specific code.
> 
>>> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>> b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>> index aacf236dfe3b..ad20822bb64e 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>> @@ -854,7 +855,11 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
>>>         * First determine which cpus have pseudo-locked regions
>>>         * associated with them.
>>>         */
>>> -    for_each_alloc_capable_rdt_resource(r) {
>>> +    for (i = 0; i < RDT_NUM_RESOURCES; i++) {
>>> +        r = resctrl_arch_get_resource(i);
>>> +        if (!r->alloc_capable)
>>> +            continue;
>>> +
>>
>> This looks like enough duplicate boilerplate for a new macro. For simplicity the
>> macro could require two arguments with enum resctrl_res_level also provided?
> 
> I was hoping to escape from these clever macros! If you think this is too much:
> - we'd need to come up with another name, as the arch code keeps the existing definition.
> - to avoid touching every caller, it needs doing without an explicit iterator variable.
> 
> I guess the cleanest thing is to redefine the existing macros to use
> resctrl_arch_get_resource(). Putting this in include/linux/resctrl.h at least avoids each
> architecture needing to define these, or forcing it to use an array.
> 
> The result is slightly more readable than the current version:
> | #define for_each_rdt_resource(_r)                              \
> |        for (_r = resctrl_arch_get_resource(0);                 \
> |             _r->rid < RDT_NUM_RESOURCES;                       \
> |             _r = resctrl_arch_get_resource(_r->rid + 1))
> 
> This leans heavily on resctrl_arch_get_resource() not being able to return NULL, and
> having to return a dummy resource that is neither alloc nor mon capable. We may need to
> revisit that if it becomes a burden for the arch code.

Replacing the repetitive four lines of code with a single line seems good to me.

resctrl_arch_get_resource() being able to return NULL is introduced in this series but
I am not seeing any handling of a possible NULL value. Not being able to return NULL thus
already seems a requirement?
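
For readers following along, here is a minimal userspace sketch of the sentinel approach quoted above. It assumes a cut-down struct rdt_resource, a three-entry resource table and a dummy_resource terminator; those definitions are illustrative stand-ins, not the kernel's. Only the for_each_rdt_resource() shape is taken from the quoted proposal.

```c
#include <stdbool.h>

/* Hypothetical, cut-down stand-ins for the kernel definitions. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_NUM_RESOURCES,
};

struct rdt_resource {
	int rid;
	bool alloc_capable;
	bool mon_capable;
};

/* Assumed example platform: L3 and MBA can allocate, L2 is absent. */
static struct rdt_resource resources[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3]  = { .rid = RDT_RESOURCE_L3,
			       .alloc_capable = true, .mon_capable = true },
	[RDT_RESOURCE_L2]  = { .rid = RDT_RESOURCE_L2 },
	[RDT_RESOURCE_MBA] = { .rid = RDT_RESOURCE_MBA, .alloc_capable = true },
};

/* Dummy: neither alloc nor mon capable. Its rid terminates the walk. */
static struct rdt_resource dummy_resource = { .rid = RDT_NUM_RESOURCES };

/* Never returns NULL: out-of-range levels get the dummy resource. */
static struct rdt_resource *resctrl_arch_get_resource(int l)
{
	if (l < 0 || l >= RDT_NUM_RESOURCES)
		return &dummy_resource;
	return &resources[l];
}

/* Same shape as the macro quoted in the mail: no index variable needed. */
#define for_each_rdt_resource(_r)				\
	for (_r = resctrl_arch_get_resource(0);			\
	     _r->rid < RDT_NUM_RESOURCES;			\
	     _r = resctrl_arch_get_resource(_r->rid + 1))

#define for_each_alloc_capable_rdt_resource(_r)			\
	for_each_rdt_resource(_r)				\
		if (_r->alloc_capable)

/* Count every resource the walker visits. */
static int count_resources(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_rdt_resource(r)
		n++;
	return n;
}

/* Count only the resources that support allocation. */
static int count_alloc_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_alloc_capable_rdt_resource(r)
		n++;
	return n;
}
```

The walk ends only because the helper hands back a resource whose rid is RDT_NUM_RESOURCES, which is why the no-NULL guarantee matters to this form of the macro.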

Reinette



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call
  2024-07-01 18:17     ` James Morse
@ 2024-07-01 21:11       ` Reinette Chatre
  2024-08-02 17:23         ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-07-01 21:11 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 7/1/24 11:17 AM, James Morse wrote:
> Hi Reinette,
> 
> On 28/06/2024 17:47, Reinette Chatre wrote:
>> On 6/14/24 8:00 AM, James Morse wrote:
>>> rdt_get_mon_l3_config() is called from the architecture's
>>> resctrl_arch_late_init(), and initialises both architecture specific
>>> fields, such as hw_res->mon_scale and resctrl filesystem fields
>>> by calling dom_data_init().
>>>
>>> To separate the filesystem and architecture parts of resctrl, this
>>> function needs splitting up.
>>>
>>> Add resctrl_mon_resource_init() to do the filesystem specific work,
>>> and call it from resctrl_init(). This runs later, but is still before
>>> the filesystem is mounted and the rmid_ptrs[] array can be used.
> 
>>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> index 7d6aebce75c1..527c0e9d7b2e 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> @@ -1016,12 +1016,28 @@ static void l3_mon_evt_init(struct rdt_resource *r)
>>>            list_add_tail(&mbm_local_event.list, &r->evt_list);
>>>    }
>>>    +int resctrl_mon_resource_init(void)
>>
>> (Lack of an __init is unexpected but I assume it was done since that will be removed
>> in later patch anyway?)
> 
> Yup - I'll add and remove that if you find it surprising.
> 
> 
>> This function needs a big warning to deter anybody from considering this to
>> be the place where any and all monitor related allocations happen. It needs
>> to warn developers that only resources that can only be touched after fs mount
>> may be allocated here.
> 
> I'm afraid I don't follow. Can you give an example of the scenario you are worried about?

My concern is not a scenario with current code flow but a request for informational
comments to prevent future mistakes. Specifically, as I understand it, the CPU online/offline
handlers can run before this function is called. Those handlers do a lot of setup, getting
resctrl and the system ready. It can be reasonable that some future action may need to touch
a new monitoring structure and with a name like resctrl_mon_resource_init() it seems appropriate
to allocate this new monitoring structure there. I am hoping that resctrl_mon_resource_init()
will have sufficient comments to deter that.

> This is called from resctrl_init(), which is called once the architecture code has done
> its setup, and reckons resctrl is something that can be supported on this platform. It
> would be safe for the limbo/overflow callbacks to start ticking after this point - but
> there is no point if the filesystem isn't mounted yet.
> Filesystem mount is triggered through rdt_get_tree(). The filesystem can't be mounted
> until resctrl_init() goes on to call register_filesystem().
> These allocations could be made later (at mount time), but they're allocated once up-front
> today.
> 
> 
> I've added:
> /**
>   * resctrl_mon_resource_init() - Initialise monitoring structures.

How about a more specific "Initialise monitoring structures used after filesystem mount"?

>   *
>   * Allocate and initialise the rmid_ptrs[] used for the limbo and free lists.
>   * Called once during boot after the struct rdt_resource's have been configured
>   * but before the filesystem is mounted.

Can there be a warning (please feel free to improve):
	"Only for resources used after filesystem mount. For example, do not allocate resources
	 needed by the CPU online/offline handlers since these handlers may run before this
	 function."

>   */
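
Putting the proposed comment and the two suggestions above together, the result might read roughly as follows. This is only a sketch: the function body is a placeholder stub so the fragment compiles on its own, and the "Return:" line is an assumption that is not part of either mail.

```c
/**
 * resctrl_mon_resource_init() - Initialise monitoring structures used after
 *				 filesystem mount.
 *
 * Allocate and initialise the rmid_ptrs[] used for the limbo and free lists.
 * Called once during boot after the struct rdt_resource's have been configured
 * but before the filesystem is mounted.
 *
 * Only for resources used after filesystem mount. For example, do not
 * allocate resources needed by the CPU online/offline handlers since these
 * handlers may run before this function.
 *
 * Return: 0 on success (assumed convention; this stub always succeeds).
 */
int resctrl_mon_resource_init(void)
{
	/* Placeholder: the real function does the filesystem-side setup. */
	return 0;
}
```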
> 
> 
> Thanks,
> 
> James

Reinette

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code
  2024-06-28 16:54   ` Reinette Chatre
@ 2024-07-04 16:40     ` James Morse
  2024-07-08 17:47       ` Reinette Chatre
  0 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-07-04 16:40 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 28/06/2024 17:54, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
>> for the common parts of the resctrl interface and make X86_CPU_RESCTRL
>> depend on this.
>>
>> Adding an include of asm/resctrl.h to linux/resctrl.h allows some
>> of the files to switch over to using this header instead.


>> diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
>> new file mode 100644
>> index 000000000000..a5fbda54d32f
>> --- /dev/null
>> +++ b/fs/resctrl/Kconfig
>> @@ -0,0 +1,36 @@
>> +config RESCTRL_FS
>> +    bool "CPU Resource Control Filesystem (resctrl)"
>> +    depends on ARCH_HAS_CPU_RESCTRL
>> +    select KERNFS
>> +    select PROC_CPU_RESCTRL if PROC_FS
>> +    help
>> +      Some architectures provide hardware facilities to group tasks and
>> +      monitor and control their usage of memory system resources such as
>> +      caches and memory bandwidth.  Examples of such facilities include
>> +      Intel's Resource Director Technology (Intel(R) RDT) and AMD's
>> +      Platform Quality of Service (AMD QoS).
>> +
>> +      If your system has the necessary support and you want to be able to
>> +      assign tasks to groups and manipulate the associated resource
>> +      monitors and controls from userspace, say Y here to get a mountable
>> +      'resctrl' filesystem that lets you do just that.
>> +
>> +      If nothing mounts or prods the 'resctrl' filesystem, resource
>> +      controls and monitors are left in a quiescent, permissive state.
>> +
>> +      If unsure, it is safe to say N.
>> +
> 
> Will user ever get opportunity to say "Y" or "N"?
> It looks to me that
> RESCTRL_FS will be "forced" on user as it is selected by the arch specific
> config X86_CPU_RESCTRL and be invisible otherwise because of the dependency
> on ARCH_HAS_CPU_RESCTRL.

I did it like this so that this change is invisible for x86 config files on the principle
of 'least noise'. Users can't enable RDT but disable resctrl today.
It isn't actually possible to enable RDT and disable resctrl until after the code has been
split from the architecture code.

I have ended up supporting this for MPAM - you can enable the architecture's MPAM code and
the driver, but not resctrl. This will eventually be for in-kernel users of resources that
resctrl doesn't understand.


> The text about when to select "Y" or "N" thus does
> not look practical to me and it may be helpful to instead provide
> information about when it is selected? I do not know the customs for this
> text and if it is intended to document any future usages also.

I think Dave wrote this text because it's traditional for Kconfig options to say this.

Describing when it is selected gets messy as this varies by architecture, and Kconfig can
already tell you this:
| Selected by [y]:
|   - X86_CPU_RESCTRL [=y] && X86 [=y] && (CPU_SUP_INTEL [=y] || CPU_SUP_AMD [=y]) &&
|     MISC_FILESYSTEMS [=y]

I don't think it makes sense for resctrl to be enabled/disabled independently on x86.
If you want to support this, we need a few more IS_ENABLED() checks and stubs to make it
build. The only reason I can see to do it is to ensure the architecture code is self
contained.

I'll reword this as "On architectures where this can be disabled independently, it is safe
to say N".


Thanks,

James

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
  2024-06-28 16:53   ` Reinette Chatre
@ 2024-07-04 16:41     ` James Morse
  2024-07-08 17:47       ` Reinette Chatre
  0 siblings, 1 reply; 75+ messages in thread
From: James Morse @ 2024-07-04 16:41 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 28/06/2024 17:53, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>> resctrl can't be built as a module, and the kernfs helpers are not exported
>> so this is unlikely to change. MPAM has an error interrupt which indicates
>> the MPAM driver has gone haywire. Should this occur tasks could run with
>> the wrong control values, leading to bad performance for impoartant tasks.
> 
> impoartant -> important
> 
>> The MPAM driver needs a way to tell resctrl that no further configuration
>> should be attempted.
>>
>> Using resctrl_exit() for this leaves the system in a funny state as
>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>> directory that is typically used has been removed. Dave Martin suggests
>> this may cause systemd trouble in the future as not all filesystems
>> can be unmounted.
>>
>> Add calls to remove all the files and directories in resctrl, and
>> remove the sysfs_remove_mount_point() call that leaves the system
>> in a funny state. When triggered, this causes all the resctrl files
>> to disappear. resctrl can be unmounted, but not mounted again.

> I am not familiar with these flows so I would like to confirm ...
> In this scenario the resctrl filesystem will be unregistered, are
> you saying that it is possible to unmount a filesystem after it has
> been unregistered?

Counter-intuitively: yes.

The rules are described in fs/filesystems.c: We can access the members of the struct
file_system_type if the list lock is held, or a reference is held to the module. This is
how /proc/mounts is able to print the filesystem name from struct file_system_type without
taking the lock - it holds a reference to any module to prevent the structure from being
freed. Because resctrl can't be built as a module, we can say there is always a reference
held, and we can never free struct file_system_type.
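
As a rough illustration of the "always a reference held" argument, the sketch below models in userspace how module reference helpers treat built-in code, where the owner pointer is NULL. The struct layouts and function bodies are simplified assumptions that only mirror the kernel names; the sketch does not show what unregister_filesystem() permits callers to do with the structure afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in: a loadable module with a reference count. */
struct module {
	int refcnt;
	bool going;	/* set once unload has started */
};

/* Simplified stand-in: only the fields this sketch needs. */
struct file_system_type {
	const char *name;
	struct module *owner;	/* NULL for code built into the kernel image */
};

/* Built-in code (owner == NULL) can always be "pinned": nothing can unload it. */
static bool try_module_get(struct module *m)
{
	if (!m)
		return true;
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

/* Dropping a reference on built-in code is a no-op. */
static void module_put(struct module *m)
{
	if (m)
		m->refcnt--;
}

/* A filesystem that cannot be built as a module has no owner. */
static struct file_system_type builtin_fs = { .name = "resctrl", .owner = NULL };

/* For contrast: a filesystem provided by a loadable module. */
static struct module example_mod = { .refcnt = 0, .going = false };
static struct file_system_type modular_fs = { .name = "examplefs",
					      .owner = &example_mod };
```

For the built-in case the get always succeeds and the put does nothing, which is the sense in which a statically allocated struct file_system_type in built-in code cannot disappear through module unload.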


Thanks,

James

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code
  2024-07-04 16:40     ` James Morse
@ 2024-07-08 17:47       ` Reinette Chatre
  0 siblings, 0 replies; 75+ messages in thread
From: Reinette Chatre @ 2024-07-08 17:47 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 7/4/24 9:40 AM, James Morse wrote:
> Hi Reinette,
> 
> On 28/06/2024 17:54, Reinette Chatre wrote:
>> On 6/14/24 8:00 AM, James Morse wrote:
>>> Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
>>> for the common parts of the resctrl interface and make X86_CPU_RESCTRL
>>> depend on this.
>>>
>>> Adding an include of asm/resctrl.h to linux/resctrl.h allows some
>>> of the files to switch over to using this header instead.
> 
> 
>>> diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
>>> new file mode 100644
>>> index 000000000000..a5fbda54d32f
>>> --- /dev/null
>>> +++ b/fs/resctrl/Kconfig
>>> @@ -0,0 +1,36 @@
>>> +config RESCTRL_FS
>>> +    bool "CPU Resource Control Filesystem (resctrl)"
>>> +    depends on ARCH_HAS_CPU_RESCTRL
>>> +    select KERNFS
>>> +    select PROC_CPU_RESCTRL if PROC_FS
>>> +    help
>>> +      Some architectures provide hardware facilities to group tasks and
>>> +      monitor and control their usage of memory system resources such as
>>> +      caches and memory bandwidth.  Examples of such facilities include
>>> +      Intel's Resource Director Technology (Intel(R) RDT) and AMD's
>>> +      Platform Quality of Service (AMD QoS).
>>> +
>>> +      If your system has the necessary support and you want to be able to
>>> +      assign tasks to groups and manipulate the associated resource
>>> +      monitors and controls from userspace, say Y here to get a mountable
>>> +      'resctrl' filesystem that lets you do just that.
>>> +
>>> +      If nothing mounts or prods the 'resctrl' filesystem, resource
>>> +      controls and monitors are left in a quiescent, permissive state.
>>> +
>>> +      If unsure, it is safe to say N.
>>> +
>>
>> Will user ever get opportunity to say "Y" or "N"?
>> It looks to me that
>> RESCTRL_FS will be "forced" on user as it is selected by the arch specific
>> config X86_CPU_RESCTRL and be invisible otherwise because of the dependency
>> on ARCH_HAS_CPU_RESCTRL.
> 
> I did it like this so that this change is invisible for x86 config files on the principle
> of 'least noise'. Users can't enable RDT but disable resctrl today.
> It isn't actually possible to enable RDT and disable resctrl until after the code has been
> split from the architecture code.
> 
> I have ended up supporting this for MPAM - you can enable the architecture's MPAM code and
> the driver, but not resctrl. This will eventually be for in-kernel users of resources that
> resctrl doesn't understand.
> 
> 
>> The text about when to select "Y" or "N" thus does
>> not look practical to me and it may be helpful to instead provide
>> information about when it is selected? I do not know the customs for this
>> text and if it is intended to document any future usages also.
> 
> I think Dave wrote this text because it's traditional for Kconfig options to say this.
> 
> Describing when it is selected gets messy as this varies by architecture, and Kconfig can
> already tell you this:
> | Selected by [y]:
> |   - X86_CPU_RESCTRL [=y] && X86 [=y] && (CPU_SUP_INTEL [=y] || CPU_SUP_AMD [=y]) &&
> |     MISC_FILESYSTEMS [=y]

Right.

> 
> I don't think it makes sense for resctrl to be enabled/disabled independently on x86.

I was not asking for resctrl to be enabled/disabled independently on x86. I commented on this
patch that adds text to guide user for options that the user is never able to select.

> If you want to support this, we need a few more IS_ENABLED() checks and stubs to make it

I did not intend to suggest this at all.

> build. The only reason I can see to do it is to ensure the architecture code is self
> contained.
> 
> I'll reword this as "On architectures where this can be disabled independently, it is safe
> to say N".

ok

Reinette

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
  2024-07-04 16:41     ` James Morse
@ 2024-07-08 17:47       ` Reinette Chatre
  2024-08-02 17:28         ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Reinette Chatre @ 2024-07-08 17:47 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi James,

On 7/4/24 9:41 AM, James Morse wrote:
> Hi Reinette,
> 
> On 28/06/2024 17:53, Reinette Chatre wrote:
>> On 6/14/24 8:00 AM, James Morse wrote:
>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>> the wrong control values, leading to bad performance for impoartant tasks.
>>
>> impoartant -> important
>>
>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>> should be attempted.
>>>
>>> Using resctrl_exit() for this leaves the system in a funny state as
>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>> directory that is typically used has been removed. Dave Martin suggests
>>> this may cause systemd trouble in the future as not all filesystems
>>> can be unmounted.
>>>
>>> Add calls to remove all the files and directories in resctrl, and
>>> remove the sysfs_remove_mount_point() call that leaves the system
>>> in a funny state. When triggered, this causes all the resctrl files
>>> to disappear. resctrl can be unmounted, but not mounted again.
> 
>> I am not familiar with these flows so I would like to confirm ...
>> In this scenario the resctrl filesystem will be unregistered, are
>> you saying that it is possible to unmount a filesystem after it has
>> been unregistered?
> 
> Counter-intuitively: yes.
> 
> The rules are described in fs/filesystems.c: We can access the members of the struct
> file_system_type if the list lock is held, or a reference is held to the module. This is
> how /proc/mounts is able to print the filesystem name from struct file_system_type without
> taking the lock - it holds a reference to any module to prevent the structure from being

hmmm ... does this mean I am supposed to find calls to try_module_get() in the flow from
mounts_open_common()?

> freed. Because resctrl can't be built as a module, we can say there is always a reference
> held, and we can never free struct file_system_type.

unregister_filesystem() continues to be called and as I understand in new MPAM usages will be
called during runtime. unregister_filesystem() comments state "Once this function has returned
the &struct file_system_type structure may be freed or reused.". Could you please highlight to me
what gives the confidence of "we can say there is always a reference held"? Could you please
point to me where that reference is obtained that will prevent the structure from being
freed?

Thank you

Reinette


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call
  2024-06-14 15:00 ` [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call James Morse
  2024-06-28 16:46   ` Reinette Chatre
@ 2024-07-11 21:12   ` Carl Worth
  1 sibling, 0 replies; 75+ messages in thread
From: Carl Worth @ 2024-07-11 21:12 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

James Morse <james.morse@arm.com> writes:
> __exit, and is removed by the linker as resctl can't be built

Typo fix: resctl -> resctrl

Also in the commit message subject line:

Typo fix: restrl -> resctrl

> Change since v1:
>  * [Commit message only] Typo fixes:
>    s/restrl/resctrl/g
>    s/resctl/resctrl/g

Oh? I'm not sure how convinced I am. :-)

-Carl

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock
  2024-06-14 15:00 ` [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
@ 2024-07-11 21:33   ` Carl Worth
  2024-08-02 17:22     ` James Morse
  0 siblings, 1 reply; 75+ messages in thread
From: Carl Worth @ 2024-07-11 21:33 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

James Morse <james.morse@arm.com> writes:
> diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
> index 4a06c37b9cf1..0c13b0befd8a 100644
> --- a/arch/x86/kernel/cpu/resctrl/Makefile
> +++ b/arch/x86/kernel/cpu/resctrl/Makefile
> @@ -1,4 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
> -obj-$(CONFIG_X86_CPU_RESCTRL)	+= core.o rdtgroup.o monitor.o
> -obj-$(CONFIG_X86_CPU_RESCTRL)	+= ctrlmondata.o pseudo_lock.o
> +obj-$(CONFIG_X86_CPU_RESCTRL)		+= core.o rdtgroup.o monitor.o
> +obj-$(CONFIG_X86_CPU_RESCTRL)		+= ctrlmondata.o
> +obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK)	+= pseudo_lock.o
>  CFLAGS_pseudo_lock.o = -I$(src)

Now that pseudo_lock.c is only conditionally compiled, the work it's
doing to define tracepoints, (that is, #define CREATE_TRACE_POINTS),
should be moved to monitor.c which is unconditionally compiled.

And then, the CFLAGS line above should be adjusted to apply to
the compilation of monitor.c, that is:

CFLAGS_monitor.o = -I$(src)

Without these changes, compiling without CONFIG_RESCTRL_FS_PSEUDO_LOCK
will fail due to undefined tracepoint functions.

-Carl

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
  2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
                   ` (37 preceding siblings ...)
  2024-06-14 15:00 ` [PATCH v3 38/38] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
@ 2024-07-11 22:00 ` Carl Worth
  2024-08-02 17:22   ` James Morse
  38 siblings, 1 reply; 75+ messages in thread
From: Carl Worth @ 2024-07-11 22:00 UTC (permalink / raw)
  To: James Morse, x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

James Morse <james.morse@arm.com> writes:
> This is the final series that allows other architectures to implement resctrl.
> The final patch to move the code has been omitted, but can be generated using
> the python script at the end of the series.
> The final move is a bit of a monster. I don't expect that to get merged as part
> of this series - we should wait for it to make less impact on other
> series.

Thanks, again, James.

As with previous versions, I've tested this code (along with additional
MPAM code from you and other code we've written), to test MPAM
functionality on an Ampere implementation.

I replied to the patch in the series which introduces
CONFIG_RESCTRL_FS_PSEUDO_LOCK to point out how that commit will actually
break compilation if that option is not selected, (and I described the
minor change needed to fix that).

With that fixed, for the series:

Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64

-Carl

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
  2024-07-11 22:00 ` [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem " Carl Worth
@ 2024-08-02 17:22   ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-08-02 17:22 UTC (permalink / raw)
  To: Carl Worth, x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger,
	shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin

Hi Carl,

On 11/07/2024 23:00, Carl Worth wrote:
> James Morse <james.morse@arm.com> writes:
>> This is the final series that allows other architectures to implement resctrl.
>> The final patch to move the code has been omitted, but can be generated using
>> the python script at the end of the series.
>> The final move is a bit of a monster. I don't expect that to get merged as part
>> of this series - we should wait for it to make less impact on other
>> series.
> 
> Thanks, again, James.
> 
> As with previous versions, I've tested this code (along with additional
> MPAM code from you and other code we've written), to test MPAM
> functionality on an Ampere implementation.
> 
> I replied to the patch in the series which introduces
> CONFIG_RESCTRL_FS_PSEUDO_LOCK to point out how that commit will actually
> break compilation if that option is not selected, (and I described the
> minor change needed to fix that).
> 
> With that fixed, for the series:
> 
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64

Great - thanks!
(I assume you didn't test the python script that generates the move-to-fs patch)


Thanks,

James

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock
  2024-07-11 21:33   ` Carl Worth
@ 2024-08-02 17:22     ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-08-02 17:22 UTC (permalink / raw)
  To: Carl Worth, x86, linux-kernel
  Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger,
	shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
	bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
	peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
	Dave Martin, Shaopeng Tan

Hi Carl,

On 11/07/2024 22:33, Carl Worth wrote:
> James Morse <james.morse@arm.com> writes:
>> diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
>> index 4a06c37b9cf1..0c13b0befd8a 100644
>> --- a/arch/x86/kernel/cpu/resctrl/Makefile
>> +++ b/arch/x86/kernel/cpu/resctrl/Makefile
>> @@ -1,4 +1,5 @@
>>  # SPDX-License-Identifier: GPL-2.0
>> -obj-$(CONFIG_X86_CPU_RESCTRL)	+= core.o rdtgroup.o monitor.o
>> -obj-$(CONFIG_X86_CPU_RESCTRL)	+= ctrlmondata.o pseudo_lock.o
>> +obj-$(CONFIG_X86_CPU_RESCTRL)		+= core.o rdtgroup.o monitor.o
>> +obj-$(CONFIG_X86_CPU_RESCTRL)		+= ctrlmondata.o
>> +obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK)	+= pseudo_lock.o
>>  CFLAGS_pseudo_lock.o = -I$(src)
> 
> Now that pseudo_lock.c is only conditionally compiled, the work it's
> doing to define tracepoints, (that is, #define CREATE_TRACE_POINTS),
> should be moved to monitor.c which is unconditionally compiled.
> 
> And then, the CFLAGS line above should be adjusted to apply to
> the compilation of monitor.c, that is:
> 
> CFLAGS_monitor.o = -I$(src)

> Without these changes, compiling without CONFIG_RESCTRL_FS_PSEUDO_LOCK
> will fail due to undefined tracepoint functions.

Thanks for catching this - I bashed my head against it for a while.

I've split trace.h in two: monitor_trace.h and pseudo_lock_trace.h. When the code gets
moved to /fs/ only one of those files moves.


Thanks,

James

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers
  2024-07-01 21:10       ` Reinette Chatre
@ 2024-08-02 17:22         ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-08-02 17:22 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 01/07/2024 22:10, Reinette Chatre wrote:
> On 7/1/24 11:16 AM, James Morse wrote:
>> On 28/06/2024 17:48, Reinette Chatre wrote:
>>> On 6/14/24 8:00 AM, James Morse wrote:
>>>> The for_each_*_rdt_resource() helpers walk the architecture's array
>>>> of structures, using the resctrl visible part as an iterator. These
>>>> became over-complex when the structures were split into a
>>>> filesystem and architecture-specific struct. This approach avoided
>>>> the need to touch every call site.
>>>>
>>>> Once the filesystem parts of resctrl are moved to /fs/, both the
>>>> architecture's resource array, and the definition of those structures
>>>> is no longer accessible. To support resctrl, each architecture would
>>>> have to provide equally complex macros.
>>>>
>>>> Change the resctrl code that uses these to walk through the resource_level
>>>> enum and check the mon/alloc capable flags instead. Instances in core.c,
>>>> and resctrl_arch_reset_resources() remain part of x86's architecture
>>>> specific code.
>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>>> b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>>> index aacf236dfe3b..ad20822bb64e 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
>>>> @@ -854,7 +855,11 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
>>>>         * First determine which cpus have pseudo-locked regions
>>>>         * associated with them.
>>>>         */
>>>> -    for_each_alloc_capable_rdt_resource(r) {
>>>> +    for (i = 0; i < RDT_NUM_RESOURCES; i++) {
>>>> +        r = resctrl_arch_get_resource(i);
>>>> +        if (!r->alloc_capable)
>>>> +            continue;
>>>> +
>>>
>>> This looks like enough duplicate boilerplate for a new macro. For simplicity the
>>> macro could require two arguments with enum resctrl_res_level also provided?
>>
>> I was hoping to escape from these clever macros! If you think this is too much:
>> - we'd need to come up with another name, as the arch code keeps the existing definition.
>> - to avoid touching every caller, it needs doing without an explicit iterator variable.
>>
>> I guess the cleanest thing is to redefine the existing macros to use
>> resctrl_arch_get_resource(). Putting this in include/linux/resctrl.h at least avoids each
>> architecture needing to define these, or forcing it to use an array.
>>
>> The result is slightly more readable than the current version:
>> | #define for_each_rdt_resource(_r)                              \
>> |        for (_r = resctrl_arch_get_resource(0);                 \
>> |             _r->rid < RDT_NUM_RESOURCES;                       \
>> |             _r = resctrl_arch_get_resource(_r->rid + 1))
>>
>> This leans heavily on resctrl_arch_get_resource() not being able to return NULL, and
>> having to return a dummy resource that is neither alloc nor mon capable. We may need to
>> revisit that if it becomes a burden for the arch code.
> 
> Replacing the repetitive four lines of code with a single line seems good to me.

> resctrl_arch_get_resource() being able to return NULL is introduced in this series but
> I am not seeing any handling of a possible NULL value. Not being able to return NULL thus
> already seems a requirement?

It's currently implicit because until this point resctrl has just reached into the
rdt_resources_all[] array - and can never get a NULL pointer. The helper that replaces
that needed to preserve the no-NULLs behaviour.
Changing this created too much churn, so the resctrl idiom is to check
alloc_capable/mon_capable to see if the resource actually exists.

If we wanted to change this, that for_each_rdt_resource() would need an index variable as
_r could be NULL.
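
For illustration, here is a minimal user-space sketch of the redefined walker
described above. It assumes resctrl_arch_get_resource() hands back a dummy
resource (neither alloc nor mon capable) with rid == RDT_NUM_RESOURCES for any
out-of-range index. The dummy, the three-entry resources[] table and
count_alloc_capable() are made up for the example and are not the kernel's
definitions.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_NUM_RESOURCES,
};

struct rdt_resource {
	int rid;
	bool alloc_capable;
	bool mon_capable;
};

/* Hypothetical platform: L3 and MBA can allocate, L2 cannot. */
static struct rdt_resource resources[RDT_NUM_RESOURCES] = {
	{ .rid = RDT_RESOURCE_L3, .alloc_capable = true, .mon_capable = true },
	{ .rid = RDT_RESOURCE_L2 },
	{ .rid = RDT_RESOURCE_MBA, .alloc_capable = true },
};

/* Assumed sentinel: its rid terminates the walk, and it is capable of nothing. */
static struct rdt_resource dummy = { .rid = RDT_NUM_RESOURCES };

/* Never returns NULL: out-of-range indexes get the dummy resource. */
static struct rdt_resource *resctrl_arch_get_resource(int l)
{
	if (l < 0 || l >= RDT_NUM_RESOURCES)
		return &dummy;
	return &resources[l];
}

/* No explicit iterator variable: the next index comes from _r->rid. */
#define for_each_rdt_resource(_r)					\
	for (_r = resctrl_arch_get_resource(0);				\
	     _r->rid < RDT_NUM_RESOURCES;				\
	     _r = resctrl_arch_get_resource(_r->rid + 1))

#define for_each_alloc_capable_rdt_resource(_r)				\
	for_each_rdt_resource(_r)					\
		if (_r->alloc_capable)

/* Example caller: counts the resources that support allocation. */
static int count_alloc_capable(void)
{
	struct rdt_resource *r;
	int n = 0;

	for_each_alloc_capable_rdt_resource(r)
		n++;

	return n;
}
```

If the helper were allowed to return NULL, the _r->rid dereference in the loop
condition is what would break, which is why the walker would then need a
separate index variable.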


Thanks,

James


* Re: [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call
  2024-07-01 21:11       ` Reinette Chatre
@ 2024-08-02 17:23         ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-08-02 17:23 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 01/07/2024 22:11, Reinette Chatre wrote:
> On 7/1/24 11:17 AM, James Morse wrote:
>> On 28/06/2024 17:47, Reinette Chatre wrote:
>>> On 6/14/24 8:00 AM, James Morse wrote:
>>>> rdt_get_mon_l3_config() is called from the architecture's
>>>> resctrl_arch_late_init(), and initialises both architecture specific
>>>> fields, such as hw_res->mon_scale and resctrl filesystem fields
>>>> by calling dom_data_init().
>>>>
>>>> To separate the filesystem and architecture parts of resctrl, this
>>>> function needs splitting up.
>>>>
>>>> Add resctrl_mon_resource_init() to do the filesystem specific work,
>>>> and call it from resctrl_init(). This runs later, but is still before
>>>> the filesystem is mounted and the rmid_ptrs[] array can be used.
>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c
>>>> b/arch/x86/kernel/cpu/resctrl/monitor.c
>>>> index 7d6aebce75c1..527c0e9d7b2e 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>>>> @@ -1016,12 +1016,28 @@ static void l3_mon_evt_init(struct rdt_resource *r)
>>>>            list_add_tail(&mbm_local_event.list, &r->evt_list);
>>>>    }
>>>>    +int resctrl_mon_resource_init(void)
>>>
>>> (Lack of an __init is unexpected but I assume it was done since that will be removed
>>> in later patch anyway?)
>>
>> Yup - I'll add and remove that if you find it surprising.
>>
>>
>>> This function needs a big warning to deter anybody from considering this to
>>> be the place where any and all monitor related allocations happen. It needs
>>> to warn developers that only resources that can only be touched after fs mount
>>> may be allocated here.
>>
>> I'm afraid I don't follow. Can you give an example of the scenario you are worried about?

> My concern is not a scenario with current code flow but a request for informational
> comments to prevent future mistakes. Specifically, as I understand the CPU online/offline
> handlers can run before this function is called. Those handlers do a lot of setup, getting
> resctrl and the system ready. It can be reasonable that some future action may need to touch
> a new monitoring structure and with a name like resctrl_mon_resource_init() it seems
> appropriate to allocate this new monitoring structure there. I am hoping that
> resctrl_mon_resource_init() will have sufficient comments to deter that.

Ah, of course! ... this is about 'global' allocations that don't belong to a specific domain.

I've reworded the comment above the function as:
| * Allocate and initialise global monitor resources that do not belong to a
| * specific domain. i.e. the rmid_ptrs[] used for the limbo and free lists.
| * Called once during boot after the struct rdt_resource's have been configured
| * but before the filesystem is mounted.
| * Resctrl's cpuhp callbacks may be called before this point to bring a domain
| * online.

and a similar comment above domain_setup_mon_state:
| * Allocate monitor resources that belong to this domain.
| * Called when the first CPU of a domain comes online, regardless of whether
| * the filesystem is mounted.
| * During boot this may be called before global allocations have been made by
| * resctrl_mon_resource_init().



>> This is called from resctrl_init(), which is called once the architecture code has done
>> its setup, and reckons resctrl is something that can be supported on this platform. It
>> would be safe for the limbo/overflow callbacks to start ticking after this point - but
>> there is no point if the filesystem isn't mounted yet.
>> Filesystem mount is triggered through rdt_get_tree(). The filesystem can't be mounted
>> until resctrl_init() goes on to call register_filesystem().
>> These allocations could be made later (at mount time), but they're allocated once up-front
>> today.
>>
>>
>> I've added:
>> /**
>>   * resctrl_mon_resource_init() - Initialise monitoring structures.
> 
> How about a more specific "Initialise monitoring structures used after filesystem mount"?

Sure, this has become:
| * resctrl_mon_resource_init() - Initialise global monitoring structures used
| *				  after filesystem mount.


>>   *
>>   * Allocate and initialise the rmid_ptrs[] used for the limbo and free lists.
>>   * Called once during boot after the struct rdt_resource's have been configured
>>   * but before the filesystem is mounted.
> 
> Can there be a warning (please feel free to improve):
>     "Only for resources used after filesystem mount. For example, do not allocate resources
>      needed by the CPU online/offline handlers since these handlers may run before this
>      function."

Enumerating what not to do feels like the beginning of a never ending story!
I think describing these as global/specific-to-a-domain makes it clear what kind of
allocation should go here.
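
As a rough user-space sketch of the global versus per-domain split (not the
kernel code): the function names are borrowed from the comments above, but the
signatures, struct mon_domain, its mbm_total field and the calloc() calls are
simplified assumptions made for the example.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; the kernel structures differ. */
struct rmid_entry {
	unsigned int rmid;
	bool busy;
};

struct mon_domain {
	unsigned long *mbm_total;	/* state owned by one domain */
};

/* Global: does not belong to any specific domain. */
static struct rmid_entry *rmid_ptrs;

/*
 * Per-domain setup. This can run from a cpuhp callback before the global
 * init below has happened, so it only touches state owned by the domain.
 */
static int domain_setup_mon_state(struct mon_domain *d, size_t num_rmid)
{
	d->mbm_total = calloc(num_rmid, sizeof(*d->mbm_total));

	return d->mbm_total ? 0 : -1;
}

/*
 * Global setup. Runs once, after the resources have been configured but
 * before the filesystem can be mounted.
 */
static int resctrl_mon_resource_init(size_t num_rmid)
{
	size_t i;

	rmid_ptrs = calloc(num_rmid, sizeof(*rmid_ptrs));
	if (!rmid_ptrs)
		return -1;

	for (i = 0; i < num_rmid; i++)
		rmid_ptrs[i].rmid = (unsigned int)i;

	return 0;
}
```

Bringing a domain online first and only then calling the global init mirrors
the boot ordering the comments describe: the per-domain path has to succeed
while rmid_ptrs is still NULL.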


Thanks,

James


* Re: [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags
  2024-07-01 21:09       ` Reinette Chatre
@ 2024-08-02 17:24         ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-08-02 17:24 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin

Hi Reinette,

On 01/07/2024 22:09, Reinette Chatre wrote:
> On 7/1/24 11:17 AM, James Morse wrote:
>> On 28/06/2024 17:43, Reinette Chatre wrote:
>>> On 6/14/24 7:59 AM, James Morse wrote:
>>>> resctrl has three types of control, these emerge from the way the
>>>> architecture initialises a number of properties in struct rdt_resource.
>>>>
>>>> A group of these properties need to be set the same on all architectures,
>>>> it would be better to specify the format the schema entry should use, and
>>>> allow resctrl to generate all the other properties it needs. This avoids
>>>> architectures having divergant behaviour here.
>>>
>>> divergant -> divergent ?
>>>
>>>>
>>>> Add a schema format enum, and as a first use, replace the fflags member
>>>> of struct rdt_resource.
>>>>
>>>> The MBA schema has a different format between AMD and Intel systems.
>>>> The schema_fmt property is changed by __rdt_get_mem_config_amd() to
>>>> enable the MBPS format.
>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index e3edc41882dc..b12307d465bc 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -2162,6 +2162,19 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>>>>        return ret;
>>>>    }
>>>>    +static u32 fflags_from_resource(struct rdt_resource *r)
>>>> +{
>>>> +    switch (r->schema_fmt) {
>>>> +    case RESCTRL_SCHEMA_BITMAP:
>>>> +        return RFTYPE_RES_CACHE;
>>>> +    case RESCTRL_SCHEMA_PERCENTAGE:
>>>> +    case RESCTRL_SCHEMA_MBPS:
>>>> +        return RFTYPE_RES_MB;
>>>> +    }
>>>> +
>>>> +    return WARN_ON_ONCE(1);
>>>> +}
>>>> +
>>>
>>> The fflags returned specifies which files will be associated with the resource
>>> in the "info" directory. Basing this on a property of the schema does not look
>>> right to me. I understand that many of the info files relate to, for example,
>>> information related to the bitmap used by the cache,
>>
>> Do we agree that some of them are?
>>
>> One reason for doing this is it decouples the parsing and management of bitmaps from "this
>> is the L3 cache", which will make it much easier to support bitmaps on some other kind of
>> resource.

> The way I see it is that it changes the meaning of the RFTYPE_RES_CACHE flag from "this is a
> file related to the cache resource" to "this is a file containing a bitmap property".
> It prevents us from easily adding a file related to the cache resource, which
> the info directory is intended to contain.

I struggled to find something that is a property of a "cache control", but is neither a
property of the control (e.g. bitmap size) nor of the cache. I guess the 'bit_usage' stuff is
the best example.

Maybe we end up with two sets of flags - this will be for the distant future. Currently I'm
taking your 'base fflags on resource id' suggestion.


>> Ultimately I'd like to expose these to user-space, so that user-space can work out how to
>> configure resources it doesn't recognise. Today '100' could be a percentage, a bitmap, or
>> a value in MB/s. Today some knowledge of the control type is needed to work this out.
>>
>>
>>> but that is not the same for
>>> info files related to the MBA resource (all info files related to MBA resource
>>> are not about the schema property format).
>>
>> Hmmm, because the files min_bandwidth and bandwidth_gran both have bandwidth in their name?
>>
>> I agree 'delay_linear' and 'thread_throttle_mode' are a bit strange.
> 
> Right. This is not a clean association.
> 
>>
>>
>>> I do not think the type of values of a schema should dictate which files
>>> appear in the info directory.
>>
>> Longer term I think this will be a problem. We probably only have 3 types of control:
>> percentage, bitmap and MB/s... but if each resource on each architecture adds files here
>> the list will quickly grow. User-space won't be able to work out how to configure a
>> resource type it hadn't seen before.
> 
> That is fair. This makes the type of control a property of the resource as is done in this
> series. Perhaps this can be exposed to user space via the info directory?

Yes, that is something I intend to look at. I eventually need to get MPAM's "cache
capacity" controls working as there are a number of hardware platforms that have it. This
would probably be a percentage control for 'L2' or 'L3', exposing an "info/schema_format"
file makes the most sense. I can't convert the existing bitmap as it implies isolation,
which this control format can't do, so it does need to be separate.
But! - to prevent confusing existing software, I don't think the L2/L3 should be touched -
those will forever have to be implicitly a bitmap, so anything in this area would have to
be an additional schema.


> Possibly the files related to control can have new flags that reflect the control type
> instead of the resource. For example, "bit_usage" currently has
> "RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE" and that could be (for lack of better
> term) "RFTYPE_CTRL_INFO | RFTYPE_CTRL_BITMAP" to disconnect the control type from the
> resource. Doing so may then map nicely to the fflags_from_resource() in this patch that
> connects the schema format to the _control_ type flag. As we have found there is not
> a clear mapping between the control type and the resource type so I expect RFTYPE_RES_CACHE
> and RFTYPE_RES_MB to remain and be associated with files that contain information
> specific to that resource. This enables future additions of files containing cache specific
> (non-bitmap) properties to still be added (with RFTYPE_RES_CACHE flag) without impacting
> everything that uses a bitmap.
> 
> What do you think?

Makes sense!
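
A small sketch of how the two sets of flags could compose, assuming the
resource-type flag is derived from the resource id and a separate control-type
flag from the schema format. RFTYPE_CTRL_BITMAP is only the placeholder name
floated above, the bit positions are invented, and the three helper functions
are hypothetical.

```c
#include <stdint.h>

/* Stand-ins for the kernel enums; the values are illustrative. */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,
	RDT_NUM_RESOURCES,
};

enum resctrl_schema_fmt {
	RESCTRL_SCHEMA_BITMAP,
	RESCTRL_SCHEMA_PERCENTAGE,
	RESCTRL_SCHEMA_MBPS,
};

/* Resource-type flags: names from the thread, bit positions assumed. */
#define RFTYPE_RES_CACHE	(1u << 0)
#define RFTYPE_RES_MB		(1u << 1)
/* Hypothetical control-type flag, as suggested in the discussion. */
#define RFTYPE_CTRL_BITMAP	(1u << 2)

/* "This file is about the cache / memory bandwidth resource." */
static uint32_t fflags_from_rid(enum resctrl_res_level rid)
{
	switch (rid) {
	case RDT_RESOURCE_L3:
	case RDT_RESOURCE_L2:
		return RFTYPE_RES_CACHE;
	case RDT_RESOURCE_MBA:
	case RDT_RESOURCE_SMBA:
		return RFTYPE_RES_MB;
	default:
		return 0;
	}
}

/* "This file describes a property of a bitmap control." */
static uint32_t fflags_from_schema_fmt(enum resctrl_schema_fmt fmt)
{
	return fmt == RESCTRL_SCHEMA_BITMAP ? RFTYPE_CTRL_BITMAP : 0;
}

/* Files for a resource are selected by the union of both flag sets. */
static uint32_t fflags_for(enum resctrl_res_level rid,
			   enum resctrl_schema_fmt fmt)
{
	return fflags_from_rid(rid) | fflags_from_schema_fmt(fmt);
}
```

With this split, a file such as bit_usage would be keyed on the control-type
flag, while a cache-specific, non-bitmap file would carry RFTYPE_RES_CACHE only.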


Thanks,

James


* Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
  2024-07-08 17:47       ` Reinette Chatre
@ 2024-08-02 17:28         ` James Morse
  0 siblings, 0 replies; 75+ messages in thread
From: James Morse @ 2024-08-02 17:28 UTC (permalink / raw)
  To: Reinette Chatre, x86, linux-kernel
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
	D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
	tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
	dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
	Shaopeng Tan

Hi Reinette,

On 08/07/2024 18:47, Reinette Chatre wrote:
> On 7/4/24 9:41 AM, James Morse wrote:
>> On 28/06/2024 17:53, Reinette Chatre wrote:
>>> On 6/14/24 8:00 AM, James Morse wrote:
>>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>>> the wrong control values, leading to bad performance for impoartant tasks.
>>>
>>> impoartant -> important
>>>
>>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>>> should be attempted.
>>>>
>>>> Using resctrl_exit() for this leaves the system in a funny state as
>>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>>> directory that is typically used has been removed. Dave Martin suggests
>>>> this may cause systemd trouble in the future as not all filesystems
>>>> can be unmounted.
>>>>
>>>> Add calls to remove all the files and directories in resctrl, and
>>>> remove the sysfs_remove_mount_point() call that leaves the system
>>>> in a funny state. When triggered, this causes all the resctrl files
>>>> to disappear. resctrl can be unmounted, but not mounted again.
>>
>>> I am not familiar with these flows so I would like to confirm ...
>>> In this scenario the resctrl filesystem will be unregistered, are
>>> you saying that it is possible to unmount a filesystem after it has
>>> been unregistered?
>>
>> Counter-intuitively: yes.
>>
>> The rules are described in fs/filesystems.c: We can access the members of the struct
>> file_system_type if the list lock is held, or a reference is held to the module. This is
>> how /proc/mounts is able to print the filesystem name from struct file_system_type without
>> taking the lock - it holds a reference to any module to prevent the structure from being
> 
> hmmm ... does this mean I am supposed to find calls to try_module_get() in the flow from
> mounts_open_common()?

There may be, but when a filesystem is mounted the code in super.c holds a reference to
the filesystem - which translates to a reference on the module/filesystem->owner.

My point was only that it's possible to unregister a filesystem while it's mounted. The
reference counting takes care of this - and is unnecessary in our case.


>> freed. Because resctrl can't be built as a module, we can say there is always a reference
>> held, and we can never free struct file_system_type.
> 
> unregister_filesystem() continues to be called and as I understand in new MPAM usages will be
> called during runtime. unregister_filesystem() comments state "Once this function has
> returned the &struct file_system_type structure may be freed or reused.". Could you please
> highlight to me what gives the confidence of "we can say there is always a reference held"?
> Could you please point to me where that reference is obtained that will prevent the structure
> from being freed?

I think we are rat-holing on something that doesn't matter:
 * resctrl can't be built as a module - it is always built in.
 * rdt_fs_type is therefore part of the kernel data section - it can't be freed.
 * likewise the code that is part of resctrl can't be freed either.


Thanks,

James


Thread overview: 75+ messages
2024-06-14 14:59 [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
2024-06-14 14:59 ` [PATCH v3 01/38] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
2024-06-28 16:41   ` Reinette Chatre
2024-06-14 14:59 ` [PATCH v3 02/38] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
2024-06-28 16:42   ` Reinette Chatre
2024-06-14 14:59 ` [PATCH v3 03/38] x86/resctrl: Add a schema format enum and use this for fflags James Morse
2024-06-28 16:43   ` Reinette Chatre
2024-07-01 18:17     ` James Morse
2024-07-01 21:09       ` Reinette Chatre
2024-08-02 17:24         ` James Morse
2024-06-14 14:59 ` [PATCH v3 04/38] x86/resctrl: Use schema type to determine how to parse schema values James Morse
2024-06-28 16:43   ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 05/38] x86/resctrl: Use schema type to determine the schema format string James Morse
2024-06-28 16:43   ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 06/38] x86/resctrl: Move data_width to be a schema property James Morse
2024-06-28 16:45   ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 07/38] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
2024-06-14 15:00 ` [PATCH v3 08/38] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
2024-06-14 15:00 ` [PATCH v3 09/38] x86/resctrl: Add helper for setting CPU default properties James Morse
2024-06-14 15:00 ` [PATCH v3 10/38] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
2024-06-14 15:00 ` [PATCH v3 11/38] x86/resctrl: Export resctrl fs's init function James Morse
2024-06-14 15:00 ` [PATCH v3 12/38] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
2024-06-14 15:00 ` [PATCH v3 13/38] x86/resctrl: Move resctrl types to a separate header James Morse
2024-06-28 16:45   ` Reinette Chatre
2024-07-01 18:16     ` James Morse
2024-06-14 15:00 ` [PATCH v3 14/38] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
2024-06-14 15:00 ` [PATCH v3 15/38] x86/resctrl: Move monitor exit work to a restrl exit call James Morse
2024-06-28 16:46   ` Reinette Chatre
2024-07-01 18:17     ` James Morse
2024-07-11 21:12   ` Carl Worth
2024-06-14 15:00 ` [PATCH v3 16/38] x86/resctrl: Move monitor init work to a resctrl init call James Morse
2024-06-28 16:47   ` Reinette Chatre
2024-07-01 18:17     ` James Morse
2024-07-01 21:11       ` Reinette Chatre
2024-08-02 17:23         ` James Morse
2024-06-14 15:00 ` [PATCH v3 17/38] x86/resctrl: Stop using the for_each_*_rdt_resource() walkers James Morse
2024-06-28 16:48   ` Reinette Chatre
2024-07-01 18:16     ` James Morse
2024-07-01 21:10       ` Reinette Chatre
2024-08-02 17:22         ` James Morse
2024-06-14 15:00 ` [PATCH v3 18/38] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
2024-06-14 15:00 ` [PATCH v3 19/38] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
2024-06-14 15:00 ` [PATCH v3 20/38] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
2024-06-28 16:49   ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 21/38] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
2024-06-28 16:53   ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 22/38] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
2024-06-14 15:00 ` [PATCH v3 23/38] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
2024-07-11 21:33   ` Carl Worth
2024-08-02 17:22     ` James Morse
2024-06-14 15:00 ` [PATCH v3 24/38] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
2024-06-14 15:00 ` [PATCH v3 25/38] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
2024-06-14 15:00 ` [PATCH v3 26/38] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
2024-06-14 15:00 ` [PATCH v3 27/38] x86/resctrl: Move get_config_index() to a header James Morse
2024-06-14 15:00 ` [PATCH v3 28/38] x86/resctrl: Claim get_domain_from_cpu() for resctrl James Morse
2024-06-14 15:00 ` [PATCH v3 29/38] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
2024-06-14 15:00 ` [PATCH v3 30/38] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
2024-06-14 15:00 ` [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
2024-06-28 16:53   ` Reinette Chatre
2024-07-04 16:41     ` James Morse
2024-07-08 17:47       ` Reinette Chatre
2024-08-02 17:28         ` James Morse
2024-06-14 15:00 ` [PATCH v3 32/38] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
2024-06-14 15:00 ` [PATCH v3 33/38] x86/resctrl: Move is_mba_sc() out of core.c James Morse
2024-06-14 15:00 ` [PATCH v3 34/38] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
2024-06-14 15:00 ` [PATCH v3 35/38] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
2024-06-14 15:00 ` [PATCH v3 36/38] fs/resctrl: Add boiler plate for external resctrl code James Morse
2024-06-28 16:54   ` Reinette Chatre
2024-07-04 16:40     ` James Morse
2024-07-08 17:47       ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 37/38] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
2024-06-28 17:04   ` Reinette Chatre
2024-06-14 15:00 ` [PATCH v3 38/38] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
2024-07-11 22:00 ` [PATCH v3 00/38] x86/resctrl: Move the resctrl filesystem " Carl Worth
2024-08-02 17:22   ` James Morse
