* [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
From: James Morse @ 2024-10-04 18:03 UTC
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin
Changes since v4?:
* Dropped the percentage/MBps distinction. This can be a future cleanup, as I
think the difference matters to user-space. Both are treated as a
'range'.
* Picked a pre-requisite cleanup patch from Christophe to make merging
easier.
* More of the __init/__exit stuff has been consolidated into the patch that
removes them from filesystem code.
Regardless, changes are noted on each patch.
~
This is the final series that allows other architectures to implement resctrl.
The final patch to move the code has been omitted, but can be generated using
the python script at the end of the series.
The final move is a bit of a monster. I don't expect that to get merged as part
of this series - we should wait until it has less impact on other series.
Otherwise this series renames functions and moves code around. With the
exception of invalid configurations for the configurable-events, there should
be no changes in behaviour caused by this series.
The driving pattern is to make things like struct rdtgroup private to resctrl.
Features like pseudo-lock aren't going to work on arm64, so the ability to
disable them at compile time is added.
After this, I can start posting the MPAM driver to make use of resctrl on arm64.
(What's MPAM? See the cover letter of the first series. [1])
This series is based on v6.12-rc1 and can be retrieved from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/move_to_fs/v5
As ever - bugs welcome,
Thanks,
James
[v4] https://lore.kernel.org/all/20240802172853.22529-1-james.morse@arm.com/
[v3] https://lore.kernel.org/r/20240614150033.10454-1-james.morse@arm.com
[v2] https://lore.kernel.org/r/20240426150537.8094-1-Dave.Martin@arm.com
[v1] https://lore.kernel.org/r/20240321165106.31602-1-james.morse@arm.com
[1] https://lore.kernel.org/lkml/20201030161120.227225-1-james.morse@arm.com/
Christophe JAILLET (1):
x86/resctrl: Slightly clean-up mbm_config_show()
James Morse (39):
x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no
monitors
x86/resctrl: Add a helper to avoid reaching into the arch code
resource list
x86/resctrl: Remove fflags from struct rdt_resource
x86/resctrl: Use schema type to determine how to parse schema values
x86/resctrl: Use schema type to determine the schema format string
x86/resctrl: Remove data_width and the tabular format
x86/resctrl: Add max_bw to struct resctrl_membw
x86/resctrl: Generate default_ctrl instead of sharing it
x86/resctrl: Add helper for setting CPU default properties
x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
x86/resctrl: Export resctrl fs's init function
x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain()
x86/resctrl: Move resctrl types to a separate header
x86/resctrl: Add a resctrl helper to reset all the resources
x86/resctrl: Move monitor exit work to a resctrl exit call
x86/resctrl: Move monitor init work to a resctrl init call
x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h
x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
x86/resctrl: Allow an architecture to disable pseudo lock
x86/resctrl: Make prefetch_disable_bits belong to the arch code
x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr
x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl
x86/resctrl: Move get_config_index() to a header
x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for
resctrl
x86/resctrl: Describe resctrl's bitmap size assumptions
x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
x86/resctrl: Drop __init/__exit on assorted symbols
x86/resctrl: Move is_mba_sc() out of core.c
x86/resctrl: Add end-marker to the resctrl_event_id enum
x86/resctrl: Remove a newline to avoid confusing the code move script
x86/resctrl: Split trace.h
fs/resctrl: Add boiler plate for external resctrl code
x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
x86/resctrl: Add python script to move resctrl code to /fs/resctrl
MAINTAINERS | 2 +
arch/Kconfig | 8 +
arch/x86/Kconfig | 12 +-
arch/x86/include/asm/resctrl.h | 45 +-
arch/x86/kernel/cpu/resctrl/Makefile | 8 +-
arch/x86/kernel/cpu/resctrl/core.c | 170 ++--
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 67 +-
arch/x86/kernel/cpu/resctrl/internal.h | 217 ++---
arch/x86/kernel/cpu/resctrl/monitor.c | 104 ++-
arch/x86/kernel/cpu/resctrl/monitor_trace.h | 31 +
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 60 +-
.../resctrl/{trace.h => pseudo_lock_trace.h} | 24 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 268 ++++--
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
fs/Kconfig | 1 +
fs/Makefile | 1 +
fs/resctrl/Kconfig | 37 +
fs/resctrl/Makefile | 6 +
fs/resctrl/ctrlmondata.c | 0
fs/resctrl/internal.h | 0
fs/resctrl/monitor.c | 0
fs/resctrl/monitor_trace.h | 0
fs/resctrl/pseudo_lock.c | 0
fs/resctrl/pseudo_lock_trace.h | 0
fs/resctrl/rdtgroup.c | 0
include/linux/resctrl.h | 239 +++++-
include/linux/resctrl_types.h | 59 ++
resctrl_copy_pasta.py | 779 ++++++++++++++++++
29 files changed, 1638 insertions(+), 504 deletions(-)
create mode 100644 arch/x86/kernel/cpu/resctrl/monitor_trace.h
rename arch/x86/kernel/cpu/resctrl/{trace.h => pseudo_lock_trace.h} (56%)
create mode 100644 fs/resctrl/Kconfig
create mode 100644 fs/resctrl/Makefile
create mode 100644 fs/resctrl/ctrlmondata.c
create mode 100644 fs/resctrl/internal.h
create mode 100644 fs/resctrl/monitor.c
create mode 100644 fs/resctrl/monitor_trace.h
create mode 100644 fs/resctrl/pseudo_lock.c
create mode 100644 fs/resctrl/pseudo_lock_trace.h
create mode 100644 fs/resctrl/rdtgroup.c
create mode 100644 include/linux/resctrl_types.h
create mode 100644 resctrl_copy_pasta.py
--
2.39.2
* [PATCH v5 01/40] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors
From: James Morse @ 2024-10-04 18:03 UTC
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
commit 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by
searching closid_num_dirty_rmid") added logic that causes resctrl to
search for the CLOSID with the fewest dirty cache lines when creating a
new control group, if requested by the arch code. This depends on the
values read from the llc_occupancy counters. The logic is applicable to
architectures where the CLOSID effectively forms part of the monitoring
identifier and so do not allow complete freedom to choose an unused
monitoring identifier for a given CLOSID.
This support missed that some platforms may not have these counters.
This causes a NULL pointer dereference when creating a new control
group as the array was not allocated by dom_data_init().
As this feature isn't necessary on platforms that don't have cache
occupancy monitors, add this to the check that occurs when a new
control group is allocated.
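
The effect of the extra check can be modelled in user-space. The sketch
below is illustrative only: struct model_state and the model_*() helpers are
hypothetical stand-ins, not kernel symbols. It assumes the dirty-count array
is left NULL when there are no occupancy monitors, and that the fallback is
a plain first-free search.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state closid_alloc() depends on. */
struct model_state {
	bool rmid_depends_on_closid;	/* models CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID */
	bool llc_occupancy_enabled;	/* models is_llc_occupancy_enabled() */
	const int *dirty_rmid_count;	/* per-CLOSID counts, NULL without monitors */
	unsigned int free_map;		/* bit n set means CLOSID n is free */
	int num_closid;
};

/* Free CLOSID with the fewest dirty RMIDs. Dereferences dirty_rmid_count. */
static int model_find_cleanest(const struct model_state *s)
{
	int best = -1;

	for (int i = 0; i < s->num_closid; i++) {
		if (!(s->free_map & (1u << i)))
			continue;
		if (best < 0 || s->dirty_rmid_count[i] < s->dirty_rmid_count[best])
			best = i;
	}
	return best;
}

/* Lowest-numbered free CLOSID. Never touches dirty_rmid_count. */
static int model_first_free(const struct model_state *s)
{
	for (int i = 0; i < s->num_closid; i++) {
		if (s->free_map & (1u << i))
			return i;
	}
	return -1;
}

/* Returns the allocated CLOSID, or -1 if none is free. */
int model_closid_alloc(struct model_state *s)
{
	int closid;

	/* Only search by dirtiness when the counters actually exist. */
	if (s->rmid_depends_on_closid && s->llc_occupancy_enabled)
		closid = model_find_cleanest(s);
	else
		closid = model_first_free(s);

	if (closid < 0)
		return -1;

	s->free_map &= ~(1u << closid);
	return closid;
}
```

Without the second half of the condition, the model would call
model_find_cleanest() with a NULL array, which is the dereference the patch
avoids.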
Fixes: 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid")
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
The existing code is not selected by any upstream platform, so it makes
no sense to backport this patch to stable.
Changes since v1:
* [Commit message only] Reword the first paragraph to make it clear
that the issue being fixed wasn't directly associated with addition
of a Kconfig option. (Actually, the option is not in Kconfig yet,
and gets added later in this series.)
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d7163b764c62..2d48db66fca8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -148,7 +148,8 @@ static int closid_alloc(void)
lockdep_assert_held(&rdtgroup_mutex);
- if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+ if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
+ is_llc_occupancy_enabled()) {
cleanest_closid = resctrl_find_cleanest_closid();
if (cleanest_closid < 0)
return cleanest_closid;
--
2.39.2
* [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
From: James Morse @ 2024-10-04 18:03 UTC
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
Resctrl occasionally wants to know something about a specific resource.
In these cases it reaches into the arch code's rdt_resources_all[]
array.
Once the filesystem parts of resctrl are moved to /fs/, this means it
will need visibility of the architecture specific struct
rdt_hw_resource definition, and the array of all resources. All
architectures would also need a r_resctrl member in this struct.
Instead, abstract this via a helper to allow architectures to do
different things here. Move the level enum to the resctrl header and
add a helper to retrieve the struct rdt_resource by 'rid'.
resctrl_arch_get_resource() should not return NULL for any value in
the enum. It may instead return a dummy resource that is
!alloc_enabled && !mon_enabled.
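
As a rough illustration of that contract, here is a user-space sketch of how
a hypothetical architecture that only implements L3 and MBA might back the
helper. struct sketch_resource, sketch_resources[] and sketch_get_resource()
are invented for this example. The enum mirrors the one moved by this patch,
and the alloc_capable/mon_capable field names are assumed to be what the
"enabled" wording above refers to.

```c
#include <stdbool.h>
#include <stddef.h>

enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,

	/* Must be the last */
	RDT_NUM_RESOURCES,
};

/* Hypothetical, heavily trimmed stand-in for struct rdt_resource. */
struct sketch_resource {
	int rid;
	bool alloc_capable;
	bool mon_capable;
};

/*
 * Every level in the enum gets an entry. Levels this imaginary
 * architecture does not implement are dummies with both capabilities
 * left false.
 */
static struct sketch_resource sketch_resources[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3]   = { .rid = RDT_RESOURCE_L3,
				.alloc_capable = true, .mon_capable = true },
	[RDT_RESOURCE_L2]   = { .rid = RDT_RESOURCE_L2 },
	[RDT_RESOURCE_MBA]  = { .rid = RDT_RESOURCE_MBA,
				.alloc_capable = true },
	[RDT_RESOURCE_SMBA] = { .rid = RDT_RESOURCE_SMBA },
};

/* NULL only for out-of-range levels, never for a value in the enum. */
struct sketch_resource *sketch_get_resource(enum resctrl_res_level l)
{
	if ((unsigned int)l >= RDT_NUM_RESOURCES)
		return NULL;

	return &sketch_resources[l];
}
```

Callers can then test the capability flags rather than checking for NULL on
every lookup.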
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since v1:
* Backed out non-functional renaming of "r" to "l3" in rdt_get_tree(),
and unhoisted the assignment of r (as now is) back into the if ()
where it started out. There seem to be no uses of this variable
outside this if().
* [Commit message only] Typo fix:
s/resctrl_hw_resource/rdt_hw_resource/g
---
arch/x86/kernel/cpu/resctrl/core.c | 10 +++++++++-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 10 ----------
arch/x86/kernel/cpu/resctrl/monitor.c | 8 ++++----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
include/linux/resctrl.h | 17 +++++++++++++++++
6 files changed, 38 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 8591d53c144b..12af2adf371c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -127,6 +127,14 @@ u32 resctrl_arch_system_num_rmid_idx(void)
return r->num_rmid;
}
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
+{
+ if (l >= RDT_NUM_RESOURCES)
+ return NULL;
+
+ return &rdt_resources_all[l].r_resctrl;
+}
+
/*
* cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
* as they do not have CPUID enumeration support for Cache allocation.
@@ -174,7 +182,7 @@ static inline void cache_alloc_hsw_probe(void)
bool is_mba_sc(struct rdt_resource *r)
{
if (!r)
- return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc;
+ r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
/*
* The software controller support is only applicable to MBA resource.
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 50fa1fe9a073..e078bfe3840d 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -574,7 +574,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
resid = md.u.rid;
domid = md.u.domid;
evtid = md.u.evtid;
- r = &rdt_resources_all[resid].r_resctrl;
+ r = resctrl_arch_get_resource(resid);
if (md.u.sum) {
/*
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 955999aecfca..b5a34a3fa599 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -509,16 +509,6 @@ extern struct rdt_hw_resource rdt_resources_all[];
extern struct rdtgroup rdtgroup_default;
extern struct dentry *debugfs_resctrl;
-enum resctrl_res_level {
- RDT_RESOURCE_L3,
- RDT_RESOURCE_L2,
- RDT_RESOURCE_MBA,
- RDT_RESOURCE_SMBA,
-
- /* Must be the last */
- RDT_NUM_RESOURCES,
-};
-
static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 851b561850e0..00d906a1f51c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -365,7 +365,7 @@ static void limbo_release_entry(struct rmid_entry *entry)
*/
void __check_limbo(struct rdt_mon_domain *d, bool force_free)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
struct rmid_entry *entry;
u32 idx, cur_idx = 1;
@@ -521,7 +521,7 @@ int alloc_rmid(u32 closid)
static void add_rmid_to_limbo(struct rmid_entry *entry)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
struct rdt_mon_domain *d;
u32 idx;
@@ -760,7 +760,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
if (!is_mbm_local_enabled())
return;
- r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
closid = rgrp->closid;
rmid = rgrp->mon.rmid;
@@ -929,7 +929,7 @@ void mbm_handle_overflow(struct work_struct *work)
if (!resctrl_mounted || !resctrl_arch_mon_capable())
goto out_unlock;
- r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
d = container_of(work, struct rdt_mon_domain, mbm_over.work);
list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2d48db66fca8..6225d0b7e9ee 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2251,7 +2251,7 @@ static void l2_qos_cfg_update(void *arg)
static inline bool is_mba_linear(void)
{
- return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.delay_linear;
+ return resctrl_arch_get_resource(RDT_RESOURCE_MBA)->membw.delay_linear;
}
static int set_cache_qos_cfg(int level, bool enable)
@@ -2341,8 +2341,8 @@ static void mba_sc_domain_destroy(struct rdt_resource *r,
*/
static bool supports_mba_mbps(void)
{
- struct rdt_resource *rmbm = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ struct rdt_resource *rmbm = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
return (is_mbm_local_enabled() &&
r->alloc_capable && is_mba_linear() &&
@@ -2355,7 +2355,7 @@ static bool supports_mba_mbps(void)
*/
static int set_mba_sc(bool mba_sc)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
u32 num_closid = resctrl_arch_get_num_closid(r);
struct rdt_ctrl_domain *d;
int i;
@@ -2703,7 +2703,7 @@ static int rdt_get_tree(struct fs_context *fc)
resctrl_mounted = true;
if (is_mbm_enabled()) {
- r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
list_for_each_entry(dom, &r->mon_domains, hdr.list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
RESCTRL_PICK_ANY_CPU);
@@ -3938,7 +3938,7 @@ static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
seq_puts(seq, ",cdpl2");
- if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
+ if (is_mba_sc(resctrl_arch_get_resource(RDT_RESOURCE_MBA)))
seq_puts(seq, ",mba_MBps");
if (resctrl_debug)
@@ -4138,7 +4138,7 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
void resctrl_offline_cpu(unsigned int cpu)
{
- struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ struct rdt_resource *l3 = resctrl_arch_get_resource(RDT_RESOURCE_L3);
struct rdt_mon_domain *d;
struct rdtgroup *rdtgrp;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index d94abba1c716..37279e2a89da 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -37,6 +37,16 @@ enum resctrl_conf_type {
CDP_DATA,
};
+enum resctrl_res_level {
+ RDT_RESOURCE_L3,
+ RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
+ RDT_RESOURCE_SMBA,
+
+ /* Must be the last */
+ RDT_NUM_RESOURCES,
+};
+
#define CDP_NUM_TYPES (CDP_DATA + 1)
/*
@@ -226,6 +236,13 @@ struct rdt_resource {
bool cdp_capable;
};
+/*
+ * Get the resource that exists at this level. If the level is not supported
+ * a dummy/not-capable resource can be returned. Levels >= RDT_NUM_RESOURCES
+ * will return NULL.
+ */
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
+
/**
* struct resctrl_schema - configuration abilities of a resource presented to
* user-space
--
2.39.2
* [PATCH v5 03/40] x86/resctrl: Remove fflags from struct rdt_resource
From: James Morse @ 2024-10-04 18:03 UTC
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The resctrl arch code specifies whether a resource controls a cache or
memory using the fflags field. This field is then used by resctrl to
determine which files should be exposed in the filesystem.
Allowing the architecture to pick this value means the RFTYPE_
flags have to be in a shared header, and allows an architecture
to create a combination that resctrl does not support.
Remove the fflags field, and pick the value based on the resource
id.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Removed an extra space
* Fixed a typo
---
arch/x86/kernel/cpu/resctrl/core.c | 4 ----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 ++++++++++++++++--
include/linux/resctrl.h | 2 --
3 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 12af2adf371c..a508433ff354 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -74,7 +74,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.mon_domains = mon_domain_init(RDT_RESOURCE_L3),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -88,7 +87,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L2),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -102,7 +100,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_MBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
},
},
[RDT_RESOURCE_SMBA] =
@@ -114,7 +111,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_SMBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
},
},
};
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6225d0b7e9ee..2abe17574407 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2160,6 +2160,20 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
return ret;
}
+static u32 fflags_from_resource(struct rdt_resource *r)
+{
+ switch (r->rid) {
+ case RDT_RESOURCE_L3:
+ case RDT_RESOURCE_L2:
+ return RFTYPE_RES_CACHE;
+ case RDT_RESOURCE_MBA:
+ case RDT_RESOURCE_SMBA:
+ return RFTYPE_RES_MB;
+ }
+
+ return WARN_ON_ONCE(1);
+}
+
static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
{
struct resctrl_schema *s;
@@ -2180,14 +2194,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
/* loop over enabled controls, these are all alloc_capable */
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- fflags = r->fflags | RFTYPE_CTRL_INFO;
+ fflags = fflags_from_resource(r) | RFTYPE_CTRL_INFO;
ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
if (ret)
goto out_destroy;
}
for_each_mon_capable_rdt_resource(r) {
- fflags = r->fflags | RFTYPE_MON_INFO;
+ fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
sprintf(name, "%s_MON", r->name);
ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 37279e2a89da..496ddcaa4ecf 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -210,7 +210,6 @@ enum resctrl_scope {
* @format_str: Per resource format string to show domain value
* @parse_ctrlval: Per resource function pointer to parse control values
* @evt_list: List of monitoring events
- * @fflags: flags to choose base and info files
* @cdp_capable: Is the CDP feature available on this resource
*/
struct rdt_resource {
@@ -232,7 +231,6 @@ struct rdt_resource {
struct resctrl_schema *s,
struct rdt_ctrl_domain *d);
struct list_head evt_list;
- unsigned long fflags;
bool cdp_capable;
};
--
2.39.2
* [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values
From: James Morse @ 2024-10-04 18:03 UTC
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Resctrl's architecture code gets to specify a function pointer that is
used when parsing schema entries. This is expected to be one of two
helpers from the filesystem code.
Setting this function pointer allows the architecture code to change
the ABI resctrl presents to user-space, and forces resctrl to expose
these helpers.
Instead, add a schema format enum to choose which schema parser to
use. This allows the helpers to be made static and the structs used
for passing arguments moved out of shared headers.
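
The dispatch pattern can be tried outside the kernel. The following is a
minimal user-space sketch, not the kernel parsers: every sketch_* name is
hypothetical, and the parsing is reduced to "hex for a bitmap, decimal for a
range" with none of the validation that parse_cbm() and parse_bw() perform.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of the schema format enum added by this patch. */
enum sketch_schema_fmt {
	SKETCH_SCHEMA_BITMAP,
	SKETCH_SCHEMA_RANGE,
};

typedef int (sketch_parser_t)(const char *buf, unsigned long *out);

/* Parse the whole string in the given base, rejecting trailing junk. */
static int sketch_parse_base(const char *buf, int base, unsigned long *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(buf, &end, base);
	if (errno || end == buf || *end != '\0')
		return -1;

	*out = v;
	return 0;
}

static int sketch_parse_bitmap(const char *buf, unsigned long *out)
{
	return sketch_parse_base(buf, 16, out);
}

static int sketch_parse_range(const char *buf, unsigned long *out)
{
	return sketch_parse_base(buf, 10, out);
}

/* The format enum, not an arch-provided pointer, selects the parser. */
static sketch_parser_t *sketch_get_parser(enum sketch_schema_fmt fmt)
{
	switch (fmt) {
	case SKETCH_SCHEMA_BITMAP:
		return sketch_parse_bitmap;
	case SKETCH_SCHEMA_RANGE:
		return sketch_parse_range;
	}

	return NULL;
}

/* Returns 0 on success, -1 on a bad value or unknown format. */
int sketch_parse_value(enum sketch_schema_fmt fmt, const char *buf,
		       unsigned long *out)
{
	sketch_parser_t *parse = sketch_get_parser(fmt);

	return parse ? parse(buf, out) : -1;
}
```

Because only sketch_parse_value() has external linkage, the individual
parsers stay static, which is the property the commit message is after.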
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Creation of the enum moves into this patch - review tags not picked up.
* Removed some whitespace.
Changes since v3:
* Removed a spurious semicolon
Changes since v2:
* This patch is new
---
arch/x86/kernel/cpu/resctrl/core.c | 8 +++---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 32 +++++++++++++++++++----
arch/x86/kernel/cpu/resctrl/internal.h | 10 -------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
include/linux/resctrl.h | 18 +++++++++----
5 files changed, 45 insertions(+), 25 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a508433ff354..0a05df02d2ed 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -72,7 +72,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.mon_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L3),
.mon_domains = mon_domain_init(RDT_RESOURCE_L3),
- .parse_ctrlval = parse_cbm,
+ .schema_fmt = RESCTRL_SCHEMA_BITMAP,
.format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L3_CBM_BASE,
@@ -85,7 +85,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.name = "L2",
.ctrl_scope = RESCTRL_L2_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L2),
- .parse_ctrlval = parse_cbm,
+ .schema_fmt = RESCTRL_SCHEMA_BITMAP,
.format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L2_CBM_BASE,
@@ -98,7 +98,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.name = "MB",
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_MBA),
- .parse_ctrlval = parse_bw,
+ .schema_fmt = RESCTRL_SCHEMA_RANGE,
.format_str = "%d=%*u",
},
},
@@ -109,7 +109,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.name = "SMBA",
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_SMBA),
- .parse_ctrlval = parse_bw,
+ .schema_fmt = RESCTRL_SCHEMA_RANGE,
.format_str = "%d=%*u",
},
},
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index e078bfe3840d..a042e234f4f8 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -23,6 +23,15 @@
#include "internal.h"
+struct rdt_parse_data {
+ struct rdtgroup *rdtgrp;
+ char *buf;
+};
+
+typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
+ struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d);
+
/*
* Check whether MBA bandwidth percentage value is correct. The value is
* checked against the minimum and max bandwidth values specified by the
@@ -59,8 +68,8 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return true;
}
-int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d)
+static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d)
{
struct resctrl_staged_config *cfg;
u32 closid = data->rdtgrp->closid;
@@ -138,8 +147,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d)
+static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d)
{
struct rdtgroup *rdtgrp = data->rdtgrp;
struct resctrl_staged_config *cfg;
@@ -195,6 +204,18 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
return 0;
}
+static ctrlval_parser_t *get_parser(struct rdt_resource *r)
+{
+ switch (r->schema_fmt) {
+ case RESCTRL_SCHEMA_BITMAP:
+ return &parse_cbm;
+ case RESCTRL_SCHEMA_RANGE:
+ return &parse_bw;
+ }
+
+ return NULL;
+}
+
/*
* For each domain in this resource we expect to find a series of:
* id=mask
@@ -204,6 +225,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
static int parse_line(char *line, struct resctrl_schema *s,
struct rdtgroup *rdtgrp)
{
+ ctrlval_parser_t *parse_ctrlval = get_parser(s->res);
enum resctrl_conf_type t = s->conf_type;
struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
@@ -235,7 +257,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
if (d->hdr.id == dom_id) {
data.buf = dom;
data.rdtgrp = rdtgrp;
- if (r->parse_ctrlval(&data, s, d))
+ if (parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
cfg = &d->staged_config[t];
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index b5a34a3fa599..ffcade365070 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -457,11 +457,6 @@ static inline bool is_mbm_event(int e)
e <= QOS_L3_MBM_LOCAL_EVENT_ID);
}
-struct rdt_parse_data {
- struct rdtgroup *rdtgrp;
- char *buf;
-};
-
/**
* struct rdt_hw_resource - arch private attributes of a resctrl resource
* @r_resctrl: Attributes of the resource used directly by resctrl.
@@ -498,11 +493,6 @@ static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r
return container_of(r, struct rdt_hw_resource, r_resctrl);
}
-int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d);
-int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d);
-
extern struct mutex rdtgroup_mutex;
extern struct rdt_hw_resource rdt_resources_all[];
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2abe17574407..11153271cbdc 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2201,7 +2201,7 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
}
for_each_mon_capable_rdt_resource(r) {
- fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
+ fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
sprintf(name, "%s_MON", r->name);
ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 496ddcaa4ecf..54ec87339038 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -183,7 +183,6 @@ struct resctrl_membw {
u32 *mb_map;
};
-struct rdt_parse_data;
struct resctrl_schema;
enum resctrl_scope {
@@ -192,6 +191,17 @@ enum resctrl_scope {
RESCTRL_L3_NODE,
};
+/**
+ * enum resctrl_schema_fmt - The format user-space provides for a schema.
+ * @RESCTRL_SCHEMA_BITMAP: The schema is a bitmap in hex.
+ * @RESCTRL_SCHEMA_RANGE: The schema is a number, either a percentage
+ * or a MBps value.
+ */
+enum resctrl_schema_fmt {
+ RESCTRL_SCHEMA_BITMAP,
+ RESCTRL_SCHEMA_RANGE,
+};
+
/**
* struct rdt_resource - attributes of a resctrl resource
* @rid: The index of the resource
@@ -208,7 +218,7 @@ enum resctrl_scope {
* @data_width: Character width of data when displaying
* @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @format_str: Per resource format string to show domain value
- * @parse_ctrlval: Per resource function pointer to parse control values
+ * @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
* @cdp_capable: Is the CDP feature available on this resource
*/
@@ -227,9 +237,7 @@ struct rdt_resource {
int data_width;
u32 default_ctrl;
const char *format_str;
- int (*parse_ctrlval)(struct rdt_parse_data *data,
- struct resctrl_schema *s,
- struct rdt_ctrl_domain *d);
+ enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
bool cdp_capable;
};
--
2.39.2
* [PATCH v5 05/40] x86/resctrl: Use schema type to determine the schema format string
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (3 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-21 17:39 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format James Morse
` (38 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Resctrl's architecture code gets to specify a format string that is
used when printing schema entries. This is expected to be one of two
values that the filesystem code supports.
Setting this format string allows the architecture code to change
the ABI resctrl presents to user-space.
Instead, use the schema format enum to choose which format string to
use.
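As a rough userland illustration of the idea (not the kernel code), the
format string can be derived from the schema format alone. The enum and
function names below are hypothetical stand-ins; the two format strings
are the ones set in the schemata_list_add() hunk of this patch.

```c
#include <stdio.h>

/* Hypothetical userland mirror of enum resctrl_schema_fmt. */
enum schema_fmt {
	SCHEMA_BITMAP,
	SCHEMA_RANGE,
};

/* Pick the printf() format for one "id=value" entry from the schema format. */
static const char *schema_fmt_str(enum schema_fmt fmt)
{
	switch (fmt) {
	case SCHEMA_BITMAP:
		return "%d=%0*x";	/* zero-padded hex bitmap */
	case SCHEMA_RANGE:
		return "%d=%0*u";	/* zero-padded decimal number */
	}

	return NULL;
}

/* Format one domain entry into buf. Returns snprintf()'s result, or -1. */
static int format_entry(char *buf, size_t len, enum schema_fmt fmt,
			int dom_id, int width, unsigned int val)
{
	const char *f = schema_fmt_str(fmt);

	if (!f)
		return -1;

	return snprintf(buf, len, f, dom_id, width, val);
}
```

With this shape the caller only supplies the schema format, so it has no
way to hand in a format string of its own.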
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Change since v4:
* Added a stop to a struct comment.
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 4 ----
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 +++++++++
include/linux/resctrl.h | 4 ++--
4 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 0a05df02d2ed..2a7f0f92c632 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -73,7 +73,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L3),
.mon_domains = mon_domain_init(RDT_RESOURCE_L3),
.schema_fmt = RESCTRL_SCHEMA_BITMAP,
- .format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -86,7 +85,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_scope = RESCTRL_L2_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_L2),
.schema_fmt = RESCTRL_SCHEMA_BITMAP,
- .format_str = "%d=%0*x",
},
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -99,7 +97,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_MBA),
.schema_fmt = RESCTRL_SCHEMA_RANGE,
- .format_str = "%d=%*u",
},
},
[RDT_RESOURCE_SMBA] =
@@ -110,7 +107,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.ctrl_scope = RESCTRL_L3_CACHE,
.ctrl_domains = ctrl_domain_init(RDT_RESOURCE_SMBA),
.schema_fmt = RESCTRL_SCHEMA_RANGE,
- .format_str = "%d=%*u",
},
},
};
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index a042e234f4f8..71881f902728 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -482,7 +482,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
ctrl_val = resctrl_arch_get_config(r, dom, closid,
schema->conf_type);
- seq_printf(s, r->format_str, dom->hdr.id, max_data_width,
+ seq_printf(s, schema->fmt_str, dom->hdr.id, max_data_width,
ctrl_val);
sep = true;
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 11153271cbdc..896350e9fb32 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2600,6 +2600,15 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
if (cl > max_name_width)
max_name_width = cl;
+ switch (r->schema_fmt) {
+ case RESCTRL_SCHEMA_BITMAP:
+ s->fmt_str = "%d=%0*x";
+ break;
+ case RESCTRL_SCHEMA_RANGE:
+ s->fmt_str = "%d=%0*u";
+ break;
+ }
+
INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_schema_all);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 54ec87339038..8a7f58d67ed6 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -217,7 +217,6 @@ enum resctrl_schema_fmt {
* @name: Name to use in "schemata" file.
* @data_width: Character width of data when displaying
* @default_ctrl: Specifies default cache cbm or memory B/W percent.
- * @format_str: Per resource format string to show domain value
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
* @cdp_capable: Is the CDP feature available on this resource
@@ -236,7 +235,6 @@ struct rdt_resource {
char *name;
int data_width;
u32 default_ctrl;
- const char *format_str;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
bool cdp_capable;
@@ -254,6 +252,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
* user-space
* @list: Member of resctrl_schema_all.
* @name: The name to use in the "schemata" file.
+ * @fmt_str: Format string to show domain value.
* @conf_type: Whether this schema is specific to code/data.
* @res: The resource structure exported by the architecture to describe
* the hardware that is configured by this schema.
@@ -264,6 +263,7 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l);
struct resctrl_schema {
struct list_head list;
char name[8];
+ const char *fmt_str;
enum resctrl_conf_type conf_type;
struct rdt_resource *res;
u32 num_closid;
--
2.39.2
* [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (4 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 05/40] x86/resctrl: Use schema type to determine the schema format string James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-15 23:29 ` Tony Luck
2024-10-04 18:03 ` [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
` (37 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The resctrl architecture code provides a data_width for the controls of
each resource. This is used to zero pad all control values in the schemata
file so they appear in columns. The same is done with the resource names
to complete the visual effect. e.g.
| SMBA:0=2048
| L3:0=00ff
AMD platforms discover their maximum bandwidth for the MB resource from
firmware, but hard-code the data_width to 4. If the maximum bandwidth
requires more digits, the tabular format is silently broken.
If new schemas are added, resctrl will need to be able to determine the
maximum width. The benefit of this pretty-printing is questionable.
Instead of handling runtime discovery of the data_width for AMD platforms,
remove the feature. These fields are always zero padded so should be
harmless to remove if the whole field has been treated as a number.
In the above example, this would now look like this:
| SMBA:0=2048
| L3:0=ff
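The difference between the two output styles, and why a numeric reader
should not notice it, can be sketched in userland C. This is only an
illustration: the helper names are hypothetical, and the width of 4 is
just the value needed to reproduce the example above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Old style: the value is zero padded to a shared column width. */
static int show_padded(char *buf, size_t len, int dom_id, int width,
		       unsigned int cbm)
{
	return snprintf(buf, len, "%d=%0*x", dom_id, width, cbm);
}

/* New style: the value is printed without any padding. */
static int show_plain(char *buf, size_t len, int dom_id, unsigned int cbm)
{
	return snprintf(buf, len, "%d=%x", dom_id, cbm);
}

/* Parse the part after '=' as a number, as a numeric consumer would. */
static unsigned long entry_value(const char *entry, int base)
{
	const char *eq = strchr(entry, '=');

	return eq ? strtoul(eq + 1, NULL, base) : 0;
}
```

Both "0=00ff" and "0=ff" parse to the same value here. Only a consumer
that matches on the exact text, or on column positions, would see a
change.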
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since
---
arch/x86/kernel/cpu/resctrl/core.c | 26 -----------------------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 3 +--
arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++++--
include/linux/resctrl.h | 2 --
5 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 2a7f0f92c632..4c16e58c4a1b 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -43,12 +43,6 @@ static DEFINE_MUTEX(domain_list_lock);
*/
DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state);
-/*
- * Used to store the max resource name width and max resource data width
- * to display the schemata in a tabular format
- */
-int max_name_width, max_data_width;
-
/*
* Global boolean for rdt_alloc which is true if any
* resource allocation is enabled.
@@ -228,7 +222,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
return false;
r->membw.arch_needs_linear = false;
}
- r->data_width = 3;
if (boot_cpu_has(X86_FEATURE_PER_THREAD_MBA))
r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
@@ -267,8 +260,6 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
r->membw.min_bw = 0;
r->membw.bw_gran = 1;
- /* Max value is 2048, Data width should be 4 in decimal */
- r->data_width = 4;
r->alloc_capable = true;
@@ -288,7 +279,6 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
r->cache.cbm_len = eax.split.cbm_len + 1;
r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->cache.shareable_bits = ebx & r->default_ctrl;
- r->data_width = (r->cache.cbm_len + 3) / 4;
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
r->alloc_capable = true;
@@ -784,20 +774,6 @@ static int resctrl_arch_offline_cpu(unsigned int cpu)
return 0;
}
-/*
- * Choose a width for the resource name and resource data based on the
- * resource that has widest name and cbm.
- */
-static __init void rdt_init_padding(void)
-{
- struct rdt_resource *r;
-
- for_each_alloc_capable_rdt_resource(r) {
- if (r->data_width > max_data_width)
- max_data_width = r->data_width;
- }
-}
-
enum {
RDT_FLAG_CMT,
RDT_FLAG_MBM_TOTAL,
@@ -1095,8 +1071,6 @@ static int __init resctrl_late_init(void)
if (!get_rdt_resources())
return -ENODEV;
- rdt_init_padding();
-
state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
"x86/resctrl/cat:online:",
resctrl_arch_online_cpu,
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 71881f902728..8d1bdfe89692 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -482,8 +482,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
ctrl_val = resctrl_arch_get_config(r, dom, closid,
schema->conf_type);
- seq_printf(s, schema->fmt_str, dom->hdr.id, max_data_width,
- ctrl_val);
+ seq_printf(s, schema->fmt_str, dom->hdr.id, ctrl_val);
sep = true;
}
seq_puts(s, "\n");
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index ffcade365070..b69722faa703 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -324,7 +324,7 @@ struct rdtgroup {
/* List of all resource groups */
extern struct list_head rdt_all_groups;
-extern int max_name_width, max_data_width;
+extern int max_name_width;
int __init rdtgroup_init(void);
void __exit rdtgroup_exit(void);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 896350e9fb32..1707b04e901e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -57,6 +57,12 @@ static struct kernfs_node *kn_mongrp;
/* Kernel fs node for "mon_data" directory under root */
static struct kernfs_node *kn_mondata;
+/*
+ * Used to store the max resource name width to display the schemata names in
+ * a tabular format.
+ */
+int max_name_width;
+
static struct seq_buf last_cmd_status;
static char last_cmd_status_buf[512];
@@ -2602,10 +2608,10 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
switch (r->schema_fmt) {
case RESCTRL_SCHEMA_BITMAP:
- s->fmt_str = "%d=%0*x";
+ s->fmt_str = "%d=%x";
break;
case RESCTRL_SCHEMA_RANGE:
- s->fmt_str = "%d=%0*u";
+ s->fmt_str = "%d=%u";
break;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8a7f58d67ed6..0f61673c9165 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -215,7 +215,6 @@ enum resctrl_schema_fmt {
* @ctrl_domains: RCU list of all control domains for this resource
* @mon_domains: RCU list of all monitor domains for this resource
* @name: Name to use in "schemata" file.
- * @data_width: Character width of data when displaying
* @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
@@ -233,7 +232,6 @@ struct rdt_resource {
struct list_head ctrl_domains;
struct list_head mon_domains;
char *name;
- int data_width;
u32 default_ctrl;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
--
2.39.2
* [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (5 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-09 18:02 ` Tony Luck
2024-10-23 21:14 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
` (36 subsequent siblings)
43 siblings, 2 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
__rdt_get_mem_config_amd() and __get_mem_config_intel() both use
the default_ctrl property as a maximum value. This is because the
MBA schema works differently between these platforms. Doing this
complicates determining whether the default_ctrl property belongs
to the arch code, or can be derived from the schema format.
Add a max_bw property for x86 platforms to specify their maximum
MBA bandwidth. This isn't needed for other schema formats.
This will allow the default_ctrl to be generated from the schema
properties when it is needed.
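A minimal userland sketch of the range check this enables is below. It
assumes a cut-down struct with only the two bounds and a hypothetical
helper name; the kernel's bw_validate() does more than this (for
example it skips the range check when the MBA software controller is in
use, as the hunk below shows).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cut-down version of struct resctrl_membw. */
struct membw {
	unsigned int min_bw;
	unsigned int max_bw;
};

/*
 * Parse a decimal bandwidth value and check it against [min_bw, max_bw].
 * On success the parsed value is stored in *out.
 */
static bool bw_in_range(const char *buf, const struct membw *m,
			unsigned long *out)
{
	unsigned long bw;
	char *end;

	errno = 0;
	bw = strtoul(buf, &end, 10);
	if (errno || end == buf || *end != '\0')
		return false;	/* not a plain decimal number */

	if (bw < m->min_bw || bw > m->max_bw)
		return false;	/* outside the advertised range */

	*out = bw;
	return true;
}
```

The upper bound comes from its own field, so it no longer has to double
as the reset value.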
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 3 +++
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 9 +++++----
include/linux/resctrl.h | 2 ++
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 4c16e58c4a1b..e79807a8f060 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -212,6 +212,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
hw_res->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
+ r->membw.max_bw = MAX_MBA_BW;
r->membw.arch_needs_linear = true;
if (ecx & MBA_IS_LINEAR) {
r->membw.delay_linear = true;
@@ -248,6 +249,8 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
hw_res->num_closid = edx + 1;
r->default_ctrl = 1 << eax;
+ r->schema_fmt = RESCTRL_SCHEMA_RANGE;
+ r->membw.max_bw = 1 << eax;
/* AMD does not use delay */
r->membw.delay_linear = false;
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 8d1bdfe89692..56c41bfd07e4 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -57,10 +57,10 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return false;
}
- if ((bw < r->membw.min_bw || bw > r->default_ctrl) &&
+ if ((bw < r->membw.min_bw || bw > r->membw.max_bw) &&
!is_mba_sc(r)) {
rdt_last_cmd_printf("MB value %ld out of range [%d,%d]\n", bw,
- r->membw.min_bw, r->default_ctrl);
+ r->membw.min_bw, r->membw.max_bw);
return false;
}
@@ -108,8 +108,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
*/
static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
{
- unsigned long first_bit, zero_bit, val;
+ u32 supported_bits = BIT_MASK(r->cache.cbm_len) - 1;
unsigned int cbm_len = r->cache.cbm_len;
+ unsigned long first_bit, zero_bit, val;
int ret;
ret = kstrtoul(buf, 16, &val);
@@ -118,7 +119,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
return false;
}
- if ((r->cache.min_cbm_bits > 0 && val == 0) || val > r->default_ctrl) {
+ if ((r->cache.min_cbm_bits > 0 && val == 0) || val > supported_bits) {
rdt_last_cmd_puts("Mask out of range\n");
return false;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0f61673c9165..b66cd977b658 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -165,6 +165,7 @@ enum membw_throttle_mode {
/**
* struct resctrl_membw - Memory bandwidth allocation related data
* @min_bw: Minimum memory bandwidth percentage user can request
+ * @max_bw: Maximum memory bandwidth value, used as the reset value
* @bw_gran: Granularity at which the memory bandwidth is allocated
* @delay_linear: True if memory B/W delay is in linear scale
* @arch_needs_linear: True if we can't configure non-linear resources
@@ -175,6 +176,7 @@ enum membw_throttle_mode {
*/
struct resctrl_membw {
u32 min_bw;
+ u32 max_bw;
u32 bw_gran;
u32 delay_linear;
bool arch_needs_linear;
--
2.39.2
* [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (6 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 21:15 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 09/40] x86/resctrl: Add helper for setting CPU default properties James Morse
` (35 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The struct rdt_resource default_ctrl is used by both the architecture
code for resetting the hardware controls, and by the filesystem parts
of resctrl to report to user-space.
This means the value has to be shared, but might not match the
properties of the control, e.g. a percentage greater than 100.
Instead, determine the default control value from a shared helper
resctrl_get_default_ctrl() that uses the schema properties to
determine the correct value.
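The derivation can be sketched in userland C as follows. The struct is a
hypothetical cut-down stand-in for struct rdt_resource, and the fallback
return value of 0 is an assumption of this sketch; the helper added by
this patch warns instead.

```c
#include <stdint.h>

/* Hypothetical userland mirror of enum resctrl_schema_fmt. */
enum schema_fmt {
	SCHEMA_BITMAP,
	SCHEMA_RANGE,
};

/* Hypothetical cut-down resource: only the fields the helper reads. */
struct resource {
	enum schema_fmt schema_fmt;
	unsigned int cbm_len;	/* number of bits in the bitmap, <= 32 */
	uint32_t max_bw;	/* maximum bandwidth value */
};

/* Derive the default control value from the schema properties. */
static uint32_t get_default_ctrl(const struct resource *r)
{
	switch (r->schema_fmt) {
	case SCHEMA_BITMAP:
		/* All cbm_len bits set; 64-bit shift keeps cbm_len == 32 defined. */
		return (uint32_t)((UINT64_C(1) << r->cbm_len) - 1);
	case SCHEMA_RANGE:
		return r->max_bw;
	}

	return 0;	/* unknown format: fallback chosen for this sketch only */
}
```

For a 20-bit bitmap this gives 0xfffff, and for a range schema it is
simply max_bw.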
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 16 +++++++---------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 +++---
include/linux/resctrl.h | 19 +++++++++++++++++--
3 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index e79807a8f060..d77bfa17447a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -143,7 +143,10 @@ static inline void cache_alloc_hsw_probe(void)
{
struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
struct rdt_resource *r = &hw_res->r_resctrl;
- u64 max_cbm = BIT_ULL_MASK(20) - 1, l3_cbm_0;
+ u64 max_cbm, l3_cbm_0;
+
+ r->cache.cbm_len = 20;
+ max_cbm = resctrl_get_default_ctrl(r);
if (wrmsrl_safe(MSR_IA32_L3_CBM_BASE, max_cbm))
return;
@@ -155,8 +158,6 @@ static inline void cache_alloc_hsw_probe(void)
return;
hw_res->num_closid = 4;
- r->default_ctrl = max_cbm;
- r->cache.cbm_len = 20;
r->cache.shareable_bits = 0xc0000;
r->cache.min_cbm_bits = 2;
r->cache.arch_has_sparse_bitmasks = false;
@@ -211,7 +212,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
hw_res->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
- r->default_ctrl = MAX_MBA_BW;
r->membw.max_bw = MAX_MBA_BW;
r->membw.arch_needs_linear = true;
if (ecx & MBA_IS_LINEAR) {
@@ -248,7 +248,6 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
hw_res->num_closid = edx + 1;
- r->default_ctrl = 1 << eax;
r->schema_fmt = RESCTRL_SCHEMA_RANGE;
r->membw.max_bw = 1 << eax;
@@ -280,8 +279,7 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
hw_res->num_closid = edx.split.cos_max + 1;
r->cache.cbm_len = eax.split.cbm_len + 1;
- r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
- r->cache.shareable_bits = ebx & r->default_ctrl;
+ r->cache.shareable_bits = ebx & resctrl_get_default_ctrl(r);
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
r->cache.arch_has_sparse_bitmasks = ecx.split.noncont;
r->alloc_capable = true;
@@ -328,7 +326,7 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
return MAX_MBA_BW - bw;
pr_warn_once("Non Linear delay-bw map not supported but queried\n");
- return r->default_ctrl;
+ return resctrl_get_default_ctrl(r);
}
static void mba_wrmsr_intel(struct msr_param *m)
@@ -437,7 +435,7 @@ static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
* For Memory Allocation: Set b/w requested to 100%
*/
for (i = 0; i < hw_res->num_closid; i++, dc++)
- *dc = r->default_ctrl;
+ *dc = resctrl_get_default_ctrl(r);
}
static void ctrl_domain_free(struct rdt_hw_ctrl_domain *hw_dom)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1707b04e901e..de2c169a1678 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -969,7 +969,7 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
struct resctrl_schema *s = of->kn->parent->priv;
struct rdt_resource *r = s->res;
- seq_printf(seq, "%x\n", r->default_ctrl);
+ seq_printf(seq, "%x\n", resctrl_get_default_ctrl(r));
return 0;
}
@@ -2866,7 +2866,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
hw_dom = resctrl_to_arch_ctrl_dom(d);
for (i = 0; i < hw_res->num_closid; i++)
- hw_dom->ctrl_val[i] = r->default_ctrl;
+ hw_dom->ctrl_val[i] = resctrl_get_default_ctrl(r);
msr_param.dom = d;
smp_call_function_any(&d->hdr.cpu_mask, rdt_ctrl_update, &msr_param, 1);
}
@@ -3401,7 +3401,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid)
}
cfg = &d->staged_config[CDP_NONE];
- cfg->new_ctrl = r->default_ctrl;
+ cfg->new_ctrl = resctrl_get_default_ctrl(r);
cfg->have_new_ctrl = true;
}
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index b66cd977b658..1a9cca95b5e8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -217,7 +217,6 @@ enum resctrl_schema_fmt {
* @ctrl_domains: RCU list of all control domains for this resource
* @mon_domains: RCU list of all monitor domains for this resource
* @name: Name to use in "schemata" file.
- * @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
* @cdp_capable: Is the CDP feature available on this resource
@@ -234,7 +233,6 @@ struct rdt_resource {
struct list_head ctrl_domains;
struct list_head mon_domains;
char *name;
- u32 default_ctrl;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
bool cdp_capable;
@@ -269,6 +267,23 @@ struct resctrl_schema {
u32 num_closid;
};
+/**
+ * resctrl_get_default_ctrl() - Return the default control value for this
+ * resource.
+ * @r: The resource whose default control type is queried.
+ */
+static inline u32 resctrl_get_default_ctrl(struct rdt_resource *r)
+{
+ switch (r->schema_fmt) {
+ case RESCTRL_SCHEMA_BITMAP:
+ return BIT_MASK(r->cache.cbm_len) - 1;
+ case RESCTRL_SCHEMA_RANGE:
+ return r->membw.max_bw;
+ }
+
+ return WARN_ON_ONCE(1);
+}
+
/* The number of closid supported by this resource regardless of CDP */
u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
u32 resctrl_arch_system_num_rmid_idx(void);
--
2.39.2
* [PATCH v5 09/40] x86/resctrl: Add helper for setting CPU default properties
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (7 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 10/40] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
` (34 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
rdtgroup_rmdir_ctrl() and rdtgroup_rmdir_mon() set the per-CPU
pqr_state for CPUs that were part of the rmdir()'d group.
Another architecture might not have a 'pqr_state'; its hardware may
need the values in a different format. MPAM's equivalents of RMID values
are not unique, and always need the CLOSID to be provided too.
There is only one caller that modifies a single value
(rdtgroup_rmdir_mon()). MPAM always needs both CLOSID and RMID
for the hardware value as these are written to the same system
register.
As rdtgroup_rmdir_mon() has the CLOSID on hand, only provide a
helper to set both values. These values are read by
__resctrl_sched_in(), but may be written by a different CPU without
any locking. Add READ_ONCE()/WRITE_ONCE() to avoid torn values.
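The shape of the helper can be sketched in userland C. The macros below
are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() that
rely on the GNU __typeof__ extension, and the fixed-size array is a
hypothetical replacement for the per-CPU variable.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel macros (GNU C __typeof__ needed). */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

#define NR_CPUS	4	/* hypothetical CPU count for this sketch */

struct pqr_state {
	uint32_t default_closid;
	uint32_t default_rmid;
};

/* Plain array standing in for the per-CPU pqr_state. */
static struct pqr_state pqr_state[NR_CPUS];

/* Writer side: always set both values for the given CPU. */
static void set_cpu_default_closid_rmid(int cpu, uint32_t closid,
					uint32_t rmid)
{
	WRITE_ONCE(pqr_state[cpu].default_closid, closid);
	WRITE_ONCE(pqr_state[cpu].default_rmid, rmid);
}

/* Reader side: each field is loaded exactly once. */
static void get_cpu_defaults(int cpu, uint32_t *closid, uint32_t *rmid)
{
	*closid = READ_ONCE(pqr_state[cpu].default_closid);
	*rmid = READ_ONCE(pqr_state[cpu].default_rmid);
}
```

Each field is accessed in one piece, but the pair is not updated
atomically as a unit: a reader can still see a new closid next to an old
rmid.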
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
* In rdtgroup_rmdir_mon(), (re)set CPU default closid based on the
parent control group, to avoid the appearance of referencing
something that we're in the process of destroying (even if it
doesn't make a difference because the victim mon group necessarily
has the same closid as the parent control group).
Update comment to match.
No (intentional) functional change.
---
arch/x86/include/asm/resctrl.h | 14 +++++++++++---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 20 ++++++++++++++------
2 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 8b1b6ce1e51b..6908cd0e6e40 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -4,8 +4,9 @@
#ifdef CONFIG_X86_CPU_RESCTRL
-#include <linux/sched.h>
#include <linux/jump_label.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
/*
* This value can never be a valid CLOSID, and is used when mapping a
@@ -96,8 +97,8 @@ static inline void resctrl_arch_disable_mon(void)
static inline void __resctrl_sched_in(struct task_struct *tsk)
{
struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
- u32 closid = state->default_closid;
- u32 rmid = state->default_rmid;
+ u32 closid = READ_ONCE(state->default_closid);
+ u32 rmid = READ_ONCE(state->default_rmid);
u32 tmp;
/*
@@ -132,6 +133,13 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
return val * scale;
}
+static inline void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid,
+ u32 rmid)
+{
+ WRITE_ONCE(per_cpu(pqr_state.default_closid, cpu), closid);
+ WRITE_ONCE(per_cpu(pqr_state.default_rmid, cpu), rmid);
+}
+
static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk,
u32 closid, u32 rmid)
{
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index de2c169a1678..b430f4465cbf 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3713,14 +3713,21 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
{
struct rdtgroup *prdtgrp = rdtgrp->mon.parent;
+ u32 closid, rmid;
int cpu;
/* Give any tasks back to the parent group */
rdt_move_group_tasks(rdtgrp, prdtgrp, tmpmask);
- /* Update per cpu rmid of the moved CPUs first */
+ /*
+ * Update per cpu closid/rmid of the moved CPUs first.
+ * Note: the closid will not change, but the arch code still needs it.
+ */
+ closid = prdtgrp->closid;
+ rmid = prdtgrp->mon.rmid;
for_each_cpu(cpu, &rdtgrp->cpu_mask)
- per_cpu(pqr_state.default_rmid, cpu) = prdtgrp->mon.rmid;
+ resctrl_arch_set_cpu_default_closid_rmid(cpu, closid, rmid);
+
/*
* Update the MSR on moved CPUs and CPUs which have moved
* task running on them.
@@ -3753,6 +3760,7 @@ static int rdtgroup_ctrl_remove(struct rdtgroup *rdtgrp)
static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
{
+ u32 closid, rmid;
int cpu;
/* Give any tasks back to the default group */
@@ -3763,10 +3771,10 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
&rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
/* Update per cpu closid and rmid of the moved CPUs first */
- for_each_cpu(cpu, &rdtgrp->cpu_mask) {
- per_cpu(pqr_state.default_closid, cpu) = rdtgroup_default.closid;
- per_cpu(pqr_state.default_rmid, cpu) = rdtgroup_default.mon.rmid;
- }
+ closid = rdtgroup_default.closid;
+ rmid = rdtgroup_default.mon.rmid;
+ for_each_cpu(cpu, &rdtgrp->cpu_mask)
+ resctrl_arch_set_cpu_default_closid_rmid(cpu, closid, rmid);
/*
* Update the MSR on moved CPUs and CPUs which have moved
--
2.39.2
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH v5 10/40] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (8 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 09/40] x86/resctrl: Add helper for setting CPU default properties James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 11/40] x86/resctrl: Export resctrl fs's init function James Morse
` (33 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
it uses to update the local CPU's default pqr values. This is a problem
once the resctrl parts move out to /fs/, as the arch code cannot
poke around inside struct rdtgroup.
Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpu_closid_rmid()
to be used as the target of an IPI, and pass the effective CLOSID
and RMID in a new struct.
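To illustrate the shape of this change outside the kernel, here is a minimal user-space sketch. It assumes hypothetical stand-ins (struct cpu_state for the per-CPU pqr_state, struct group for struct rdtgroup, and a plain loop in place of on_each_cpu_mask()); only the idea of handing the callback a small CLOSID/RMID struct, or NULL, is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU pqr_state; not the kernel's type. */
struct cpu_state {
	uint32_t default_closid;
	uint32_t default_rmid;
};

/* Same shape as struct resctrl_cpu_defaults in the patch. */
struct cpu_defaults {
	uint32_t closid;
	uint32_t rmid;
};

/* Hypothetical group type that only the "filesystem" side may look inside. */
struct group {
	uint32_t closid;
	uint32_t mon_rmid;
};

/* "Arch" side callback: sees only the plain defaults struct; info may be NULL. */
static void sync_cpu_closid_rmid(struct cpu_state *cpu, void *info)
{
	struct cpu_defaults *d = info;

	if (d) {
		cpu->default_closid = d->closid;
		cpu->default_rmid = d->rmid;
	}
	/* The real code would now reprogram the hardware for the current task. */
}

/* "Filesystem" side: unpack the group and pass on only what arch code needs. */
static void update_closid_rmid(struct cpu_state *cpus, size_t ncpus,
			       const struct group *g)
{
	struct cpu_defaults defaults, *p = NULL;
	size_t i;

	if (g) {
		defaults.closid = g->closid;
		defaults.rmid = g->mon_rmid;
		p = &defaults;
	}

	/* Stand-in for on_each_cpu_mask(): run the callback for every CPU. */
	for (i = 0; i < ncpus; i++)
		sync_cpu_closid_rmid(&cpus[i], p);
}

/* Small self-check showing both the non-NULL and the NULL case. */
static void example_usage(void)
{
	struct cpu_state cpus[2] = { { 0, 0 }, { 0, 0 } };
	struct group g = { 3, 7 };

	update_closid_rmid(cpus, 2, &g);
	assert(cpus[1].default_closid == 3 && cpus[1].default_rmid == 7);

	/* NULL group: defaults stay as they were. */
	update_closid_rmid(cpus, 2, NULL);
	assert(cpus[0].default_closid == 3 && cpus[0].default_rmid == 7);
}
```

The point of the intermediate struct is that the callback no longer needs the layout of the group type, so that type can stay private to the filesystem code.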
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
* To clarify the meanings of the new helper and struct:
Rename resctrl_arch_sync_cpu_default() to
resctrl_arch_sync_cpu_closid_rmid();
Rename struct resctrl_cpu_sync to struct resctrl_cpu_defaults;
Flesh out the comment block in <linux/resctrl.h>.
No functional change.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++----
include/linux/resctrl.h | 22 ++++++++++++++++++++++
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b430f4465cbf..04a6afedb070 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -346,13 +346,13 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
* from update_closid_rmid() is protected against __switch_to() because
* preemption is disabled.
*/
-static void update_cpu_closid_rmid(void *info)
+void resctrl_arch_sync_cpu_closid_rmid(void *info)
{
- struct rdtgroup *r = info;
+ struct resctrl_cpu_defaults *r = info;
if (r) {
this_cpu_write(pqr_state.default_closid, r->closid);
- this_cpu_write(pqr_state.default_rmid, r->mon.rmid);
+ this_cpu_write(pqr_state.default_rmid, r->rmid);
}
/*
@@ -367,11 +367,20 @@ static void update_cpu_closid_rmid(void *info)
* Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
*
* Per task closids/rmids must have been set up before calling this function.
+ * @r may be NULL.
*/
static void
update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
{
- on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
+ struct resctrl_cpu_defaults defaults, *p = NULL;
+
+ if (r) {
+ defaults.closid = r->closid;
+ defaults.rmid = r->mon.rmid;
+ p = &defaults;
+ }
+
+ on_each_cpu_mask(cpu_mask, resctrl_arch_sync_cpu_closid_rmid, p, 1);
}
static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 1a9cca95b5e8..4a3f8c171eb7 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -267,6 +267,28 @@ struct resctrl_schema {
u32 num_closid;
};
+struct resctrl_cpu_defaults {
+ u32 closid;
+ u32 rmid;
+};
+
+/**
+ * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
+ * Call via IPI.
+ * @info: If non-NULL, a pointer to a struct resctrl_cpu_defaults
+ * specifying the new CLOSID and RMID for tasks in the default
+ * resctrl ctrl and mon group when running on this CPU. If NULL,
+ * this CPU is not re-assigned to a different default group.
+ *
+ * Propagates reassignment of CPUs and/or tasks to different resctrl groups
+ * when requested by the resctrl core code.
+ *
+ * This function records the per-cpu defaults specified by @info (if any),
+ * and then reconfigures the CPU's hardware CLOSID and RMID for subsequent
+ * execution based on @current, in the same way as during a task switch.
+ */
+void resctrl_arch_sync_cpu_closid_rmid(void *info);
+
/**
* resctrl_get_default_ctrl() - Return the default control value for this
* resource.
--
2.39.2
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH v5 11/40] x86/resctrl: Export resctrl fs's init function
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (9 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 10/40] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-16 16:20 ` Tony Luck
2024-10-04 18:03 ` [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
` (32 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
rdtgroup_init() needs exporting so that arch code can call it once
it lives in core code. As this is one of the few functions exported,
rename it to have "resctrl" in the name. The same goes for the exit
call.
Rename x86's arch code init functions for RDT to have an arch
prefix to make it clear these are part of the architecture code.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Changed the voice of some of the commit message.
Changes since v1:
* Rename stale rdtgroup_init() to resctrl_init() in
arch/x86/kernel/cpu/resctrl/monitor.c comments.
No functional change.
* [Commit message only] Minor rewording to avoid "impersonating code".
* [Commit message only] Typo fix:
s/to have the resctrl/to have resctrl/ in commit message.
---
arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++------
arch/x86/kernel/cpu/resctrl/internal.h | 3 ---
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++----
include/linux/resctrl.h | 3 +++
5 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index d77bfa17447a..7aecade1337c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -1056,7 +1056,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
}
}
-static int __init resctrl_late_init(void)
+static int __init resctrl_arch_late_init(void)
{
struct rdt_resource *r;
int state, ret;
@@ -1079,7 +1079,7 @@ static int __init resctrl_late_init(void)
if (state < 0)
return state;
- ret = rdtgroup_init();
+ ret = resctrl_init();
if (ret) {
cpuhp_remove_state(state);
return ret;
@@ -1095,18 +1095,18 @@ static int __init resctrl_late_init(void)
return 0;
}
-late_initcall(resctrl_late_init);
+late_initcall(resctrl_arch_late_init);
-static void __exit resctrl_exit(void)
+static void __exit resctrl_arch_exit(void)
{
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
cpuhp_remove_state(rdt_online);
- rdtgroup_exit();
+ resctrl_exit();
if (r->mon_capable)
rdt_put_mon_l3_config();
}
-__exitcall(resctrl_exit);
+__exitcall(resctrl_arch_exit);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index b69722faa703..2bf08bd920f0 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -326,9 +326,6 @@ extern struct list_head rdt_all_groups;
extern int max_name_width;
-int __init rdtgroup_init(void);
-void __exit rdtgroup_exit(void);
-
/**
* struct rftype - describe each file in the resctrl file system
* @name: File name
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 00d906a1f51c..9cdca9d2bbde 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1031,7 +1031,7 @@ static int dom_data_init(struct rdt_resource *r)
/*
* RESCTRL_RESERVED_CLOSID and RESCTRL_RESERVED_RMID are special and
* are always allocated. These are used for the rdtgroup_default
- * control group, which will be setup later in rdtgroup_init().
+ * control group, which will be setup later in resctrl_init().
*/
idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID,
RESCTRL_RESERVED_RMID);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 04a6afedb070..61c8add103fe 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4217,14 +4217,14 @@ void resctrl_offline_cpu(unsigned int cpu)
}
/*
- * rdtgroup_init - rdtgroup initialization
+ * resctrl_init - resctrl filesystem initialization
*
* Setup resctrl file system including set up root, create mount point,
- * register rdtgroup filesystem, and initialize files under root directory.
+ * register resctrl filesystem, and initialize files under root directory.
*
* Return: 0 on success or -errno
*/
-int __init rdtgroup_init(void)
+int __init resctrl_init(void)
{
int ret = 0;
@@ -4272,7 +4272,7 @@ int __init rdtgroup_init(void)
return ret;
}
-void __exit rdtgroup_exit(void)
+void __exit resctrl_exit(void)
{
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 4a3f8c171eb7..3b0283fa7f80 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -403,4 +403,7 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
+int __init resctrl_init(void);
+void __exit resctrl_exit(void);
+
#endif /* _RESCTRL_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain()
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (10 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 11/40] x86/resctrl: Export resctrl fs's init function James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 21:16 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 13/40] x86/resctrl: Move resctrl types to a separate header James Morse
` (31 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
rdt_find_domain() finds a domain given a resource and a cache-id.
It's not quite right for the resctrl arch API as it also returns the
position to insert a new domain, which is needed when bringing a
domain online in the arch code.
Wrap rdt_find_domain() in another function resctrl_arch_find_domain()
in order to avoid the unnecessary argument outside the arch code.
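The wrapping pattern can be sketched in user-space C as follows. This is a simplified illustration, assuming a hypothetical singly linked struct domain sorted by id rather than the kernel's list_head and struct rdt_domain_hdr; the names find_domain() and arch_find_domain() are placeholders for rdt_find_domain() and resctrl_arch_find_domain().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical domain node; the list is kept sorted by ascending id. */
struct domain {
	int id;
	struct domain *next;
};

/*
 * Full-featured lookup. If the id is not found and pos is non-NULL,
 * *pos receives the first node with a larger id (NULL means "append").
 */
static struct domain *find_domain(struct domain *head, int id,
				  struct domain **pos)
{
	struct domain *d;

	for (d = head; d; d = d->next) {
		if (d->id == id)
			return d;
		/* List is sorted, so a larger id means the search is over. */
		if (d->id > id)
			break;
	}

	if (pos)
		*pos = d;

	return NULL;
}

/* Narrow wrapper for callers that never insert: no pos argument. */
static struct domain *arch_find_domain(struct domain *head, int id)
{
	return find_domain(head, id, NULL);
}

/* Small self-check exercising both entry points. */
static void example_usage(void)
{
	struct domain c = { 5, NULL };
	struct domain b = { 3, &c };
	struct domain a = { 1, &b };
	struct domain *pos = NULL;

	assert(arch_find_domain(&a, 3) == &b);
	assert(arch_find_domain(&a, 4) == NULL);
	assert(find_domain(&a, 4, &pos) == NULL && pos == &c);
}
```

Only the code that brings domains online needs the insertion position, so the three-argument form can remain static to that code.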
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v3:
* Used domain_list as a meaningful name instead of 'h'.
Changes since v1:
* [Commit message only] Minor rewording to avoid "impersonating code".
* [Commit message only] Typo fix:
s/in a another/in another/ in commit message.
---
arch/x86/kernel/cpu/resctrl/core.c | 10 ++++++++--
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 2 --
include/linux/resctrl.h | 3 +++
4 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7aecade1337c..fceb56697a4a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -402,8 +402,8 @@ void rdt_ctrl_update(void *arg)
* found (and NULL returned) then the first domain with id bigger than
* the input id can be returned to the caller via @pos.
*/
-struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
- struct list_head **pos)
+static struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
+ struct list_head **pos)
{
struct rdt_domain_hdr *d;
struct list_head *l;
@@ -424,6 +424,12 @@ struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
return NULL;
}
+struct rdt_domain_hdr *resctrl_arch_find_domain(struct list_head *domain_list,
+ int id)
+{
+ return rdt_find_domain(domain_list, id, NULL);
+}
+
static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 56c41bfd07e4..7ea362c099db 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -620,7 +620,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
* This file provides data from a single domain. Search
* the resource to find the domain with "domid".
*/
- hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
+ hdr = resctrl_arch_find_domain(&r->mon_domains, domid);
if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
ret = -ENOENT;
goto out;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2bf08bd920f0..89eb2604a16e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -578,8 +578,6 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn);
int rdtgroup_kn_mode_restrict(struct rdtgroup *r, const char *name);
int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
umode_t mask);
-struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id,
- struct list_head **pos);
ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off);
int rdtgroup_schemata_show(struct kernfs_open_file *of,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 3b0283fa7f80..7a39f271561c 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -309,6 +309,9 @@ static inline u32 resctrl_get_default_ctrl(struct rdt_resource *r)
/* The number of closid supported by this resource regardless of CDP */
u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
u32 resctrl_arch_system_num_rmid_idx(void);
+
+struct rdt_domain_hdr *resctrl_arch_find_domain(struct list_head *domain_list,
+ int id);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
/*
--
2.39.2
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH v5 13/40] x86/resctrl: Move resctrl types to a separate header
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (11 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
` (30 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
When resctrl is fully factored into core and per-arch code, each arch
will need to use some resctrl common definitions in order to define its
own specializations and helpers. Following conventional practice, it
would be desirable to put the dependent arch definitions in an
<asm/resctrl.h> header that is included by the common <linux/resctrl.h>
header. However, this can make it awkward to avoid a circular
dependency between <linux/resctrl.h> and the arch header.
To avoid such dependencies, move the affected common types and
constants into a new header that does not need to depend on
<linux/resctrl.h> or on the arch headers.
The same logic applies to the monitor-configuration defines, so move
these too.
Some kind of enumeration for events is needed between the filesystem
and architecture code. Take the x86 definition as it's convenient for
x86.
The definition of enum resctrl_event_id is needed to allow the
architecture code to define resctrl_arch_mon_ctx_alloc() and
resctrl_arch_mon_ctx_free().
The definition of enum resctrl_res_level is needed to allow the
architecture code to define resctrl_arch_set_cdp_enabled() and
resctrl_arch_get_cdp_enabled().
The bits for mbm_local_bytes_config et al are ABI, and must be the same
on all architectures. These are documented in
Documentation/arch/x86/resctrl.rst
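As a rough illustration of why these bit positions have to be shared, the sketch below reuses the defines moved by this patch in a user-space program. BIT() and GENMASK() are re-implemented locally for 32-bit values, and evt_config_valid() is a hypothetical helper, not a function from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space equivalents of the kernel's BIT() and GENMASK() (32-bit only). */
#define BIT(n)		(1u << (n))
#define GENMASK(h, l)	((~0u << (l)) & (~0u >> (31 - (h))))

/* Bit assignments as they appear in resctrl_types.h in this patch. */
#define READS_TO_LOCAL_MEM		BIT(0)
#define READS_TO_REMOTE_MEM		BIT(1)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	BIT(2)
#define NON_TEMP_WRITE_TO_REMOTE_MEM	BIT(3)
#define READS_TO_LOCAL_S_MEM		BIT(4)
#define READS_TO_REMOTE_S_MEM		BIT(5)
#define DIRTY_VICTIMS_TO_ALL_MEM	BIT(6)
#define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)

/* Hypothetical check: reject values with bits outside the defined range. */
static bool evt_config_valid(uint32_t val)
{
	return (val & ~MAX_EVT_CONFIG_BITS) == 0;
}

/* Small self-check of the mask and the validity helper. */
static void example_usage(void)
{
	uint32_t cfg = READS_TO_LOCAL_MEM | DIRTY_VICTIMS_TO_ALL_MEM;

	assert(MAX_EVT_CONFIG_BITS == 0x7f);
	assert(cfg == 0x41);
	assert(evt_config_valid(cfg));
	assert(!evt_config_valid(BIT(7)));
}
```

Because user-space writes these values as raw numbers, a different bit assignment on another architecture would silently change the meaning of existing configurations.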
The maintainers entry for these headers was missed when resctrl.h was
created. Add a wildcard entry to match both resctrl.h and
resctrl_types.h.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v3:
* Added header include.
* Corrected lists in the commit message.
Changes since v2:
* Added to the commit message why each of these things is necessary.
* Moved the enum resctrl_conf_type back to resctrl.h - this week arm's
CDP emulation code gets away without this...
Changes since v1:
* [Commit message only] Rewrite commit message to clarify the
rationale for refactoring the headers in this way.
---
MAINTAINERS | 1 +
arch/x86/include/asm/resctrl.h | 1 +
arch/x86/kernel/cpu/resctrl/internal.h | 24 ------------
include/linux/resctrl.h | 21 +---------
include/linux/resctrl_types.h | 54 ++++++++++++++++++++++++++
5 files changed, 57 insertions(+), 44 deletions(-)
create mode 100644 include/linux/resctrl_types.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c27f3190737f..fd5a1621c026 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19468,6 +19468,7 @@ S: Supported
F: Documentation/arch/x86/resctrl*
F: arch/x86/include/asm/resctrl.h
F: arch/x86/kernel/cpu/resctrl/
+F: include/linux/resctrl*.h
F: tools/testing/selftests/resctrl/
READ-COPY UPDATE (RCU)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 6908cd0e6e40..52f2326e2b1e 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -6,6 +6,7 @@
#include <linux/jump_label.h>
#include <linux/percpu.h>
+#include <linux/resctrl_types.h>
#include <linux/sched.h>
/*
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 89eb2604a16e..35761fe8dbc1 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -32,30 +32,6 @@
*/
#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
-/* Reads to Local DRAM Memory */
-#define READS_TO_LOCAL_MEM BIT(0)
-
-/* Reads to Remote DRAM Memory */
-#define READS_TO_REMOTE_MEM BIT(1)
-
-/* Non-Temporal Writes to Local Memory */
-#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
-
-/* Non-Temporal Writes to Remote Memory */
-#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
-
-/* Reads to Local Memory the system identifies as "Slow Memory" */
-#define READS_TO_LOCAL_S_MEM BIT(4)
-
-/* Reads to Remote Memory the system identifies as "Slow Memory" */
-#define READS_TO_REMOTE_S_MEM BIT(5)
-
-/* Dirty Victims to All Types of Memory */
-#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
-
-/* Max event bits supported */
-#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
-
/**
* cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
* aren't marked nohz_full
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 7a39f271561c..8894aed3c593 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -6,6 +6,7 @@
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/pid.h>
+#include <linux/resctrl_types.h>
/* CLOSID, RMID value used by the default control group */
#define RESCTRL_RESERVED_CLOSID 0
@@ -37,28 +38,8 @@ enum resctrl_conf_type {
CDP_DATA,
};
-enum resctrl_res_level {
- RDT_RESOURCE_L3,
- RDT_RESOURCE_L2,
- RDT_RESOURCE_MBA,
- RDT_RESOURCE_SMBA,
-
- /* Must be the last */
- RDT_NUM_RESOURCES,
-};
-
#define CDP_NUM_TYPES (CDP_DATA + 1)
-/*
- * Event IDs, the values match those used to program IA32_QM_EVTSEL before
- * reading IA32_QM_CTR on RDT systems.
- */
-enum resctrl_event_id {
- QOS_L3_OCCUP_EVENT_ID = 0x01,
- QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
- QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
-};
-
/**
* struct resctrl_staged_config - parsed configuration to be applied
* @new_ctrl: new ctrl value to be loaded
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
new file mode 100644
index 000000000000..51c51a1aabfb
--- /dev/null
+++ b/include/linux/resctrl_types.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2024 Arm Ltd.
+ * Based on arch/x86/kernel/cpu/resctrl/internal.h
+ */
+
+#ifndef __LINUX_RESCTRL_TYPES_H
+#define __LINUX_RESCTRL_TYPES_H
+
+/* Reads to Local DRAM Memory */
+#define READS_TO_LOCAL_MEM BIT(0)
+
+/* Reads to Remote DRAM Memory */
+#define READS_TO_REMOTE_MEM BIT(1)
+
+/* Non-Temporal Writes to Local Memory */
+#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
+
+/* Non-Temporal Writes to Remote Memory */
+#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
+
+/* Reads to Local Memory the system identifies as "Slow Memory" */
+#define READS_TO_LOCAL_S_MEM BIT(4)
+
+/* Reads to Remote Memory the system identifies as "Slow Memory" */
+#define READS_TO_REMOTE_S_MEM BIT(5)
+
+/* Dirty Victims to All Types of Memory */
+#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
+
+/* Max event bits supported */
+#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
+
+enum resctrl_res_level {
+ RDT_RESOURCE_L3,
+ RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
+ RDT_RESOURCE_SMBA,
+
+ /* Must be the last */
+ RDT_NUM_RESOURCES,
+};
+
+/*
+ * Event IDs, the values match those used to program IA32_QM_EVTSEL before
+ * reading IA32_QM_CTR on RDT systems.
+ */
+enum resctrl_event_id {
+ QOS_L3_OCCUP_EVENT_ID = 0x01,
+ QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
+ QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
+};
+
+#endif /* __LINUX_RESCTRL_TYPES_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (12 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 13/40] x86/resctrl: Move resctrl types to a separate header James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 21:32 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 15/40] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
` (29 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
On umount(), resctrl resets each resource back to its default
configuration. It only ever does this for all resources in one go.
reset_all_ctrls() is architecture specific as it works with struct
rdt_hw_resource.
Add an architecture helper to reset all resources.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
* Rename the for_each_capable_rdt_resource() introduced in the new
function resctrl_arch_reset_resources(), back to
for_each_alloc_capable_rdt_resource() as it was in the original code.
The change looked unintentional; and presumably a resource that does
not support resource allocation doesn't have any properties to
reset...
---
arch/x86/include/asm/resctrl.h | 2 ++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 +++++++++++-----
2 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 52f2326e2b1e..5622943f6354 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -16,6 +16,8 @@
*/
#define X86_RESCTRL_EMPTY_CLOSID ((u32)~0)
+void resctrl_arch_reset_resources(void);
+
/**
* struct resctrl_pqr_state - State cache for the PQR MSR
* @cur_rmid: The cached Resource Monitoring ID
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 61c8add103fe..a15198f90b29 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2883,6 +2883,14 @@ static int reset_all_ctrls(struct rdt_resource *r)
return 0;
}
+void resctrl_arch_reset_resources(void)
+{
+ struct rdt_resource *r;
+
+ for_each_alloc_capable_rdt_resource(r)
+ reset_all_ctrls(r);
+}
+
/*
* Move tasks from one to the other group. If @from is NULL, then all tasks
* in the systems are moved unconditionally (used for teardown).
@@ -2992,16 +3000,14 @@ static void rmdir_all_sub(void)
static void rdt_kill_sb(struct super_block *sb)
{
- struct rdt_resource *r;
-
cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
rdt_disable_ctx();
- /*Put everything back to default values. */
- for_each_alloc_capable_rdt_resource(r)
- reset_all_ctrls(r);
+ /* Put everything back to default values. */
+ resctrl_arch_reset_resources();
+
rmdir_all_sub();
rdt_pseudo_lock_release();
rdtgroup_default.mode = RDT_MODE_SHAREABLE;
--
2.39.2
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH v5 15/40] x86/resctrl: Move monitor exit work to a resctrl exit call
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (13 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 16/40] x86/resctrl: Move monitor init work to a resctrl init call James Morse
` (28 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
rdt_put_mon_l3_config() is called via the architecture's
resctrl_arch_exit() call, and appears to free the rmid_ptrs[]
and closid_num_dirty_rmid[] arrays. In reality this code is marked
__exit, and is removed by the linker as resctrl can't be built
as a module.
To separate the filesystem and architecture parts of resctrl,
this free()ing work needs to be triggered by the filesystem,
as these structures belong to the filesystem code.
Rename rdt_put_mon_l3_config() to resctrl_mon_resource_exit()
and call it from resctrl_exit(). The kfree() is currently
dependent on r->mon_capable.
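The resulting control flow (take the lock, bail out early if the resource cannot monitor, otherwise free and clear the pointers) can be sketched in user-space C as below. The lock is modelled with a hypothetical depth counter instead of rdtgroup_mutex, and rmid_table stands in for rmid_ptrs; none of these names come from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical resource descriptor with only the field used here. */
struct resource {
	bool mon_capable;
};

/* Stand-in for rdtgroup_mutex: a counter that must return to zero. */
static int lock_depth;
static void lock(void)   { lock_depth++; }
static void unlock(void) { lock_depth--; }

/* Stand-in for rmid_ptrs: state owned by the "filesystem" side. */
static int *rmid_table;

/* Free monitoring state, but only if the resource can monitor at all. */
static void mon_resource_exit(struct resource *r)
{
	lock();

	/* Check under the lock, then leave through the single unlock path. */
	if (!r->mon_capable)
		goto out_unlock;

	free(rmid_table);
	rmid_table = NULL;

out_unlock:
	unlock();
}

/* Small self-check covering both branches. */
static void example_usage(void)
{
	struct resource r = { .mon_capable = false };

	rmid_table = malloc(4 * sizeof(*rmid_table));
	assert(rmid_table);

	mon_resource_exit(&r);
	assert(rmid_table != NULL && lock_depth == 0);

	r.mon_capable = true;
	mon_resource_exit(&r);
	assert(rmid_table == NULL && lock_depth == 0);
}
```

Clearing the pointer after free() keeps a repeated call harmless, since free(NULL) is a no-op.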
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Added __exit so it can be removed in the next patch.
Changes since v3:
* Moved r->mon_capable check under the lock.
* Dropped references to resctrl_mon_resource_init() from the commit message.
* Fixed more resctrl typos.
Changes since v2:
* Dropped __exit as needed in the next patch.
Changes since v1:
* [Commit message only] Typo fixes:
s/restrl/resctrl/g
s/resctl/resctrl/g
* [Commit message only] Reword second paragraph to remove reference to
the MPAM error interrupt, which provides background rationale for a
later patch rather than for this patch, and so it is not really
relevant here.
---
arch/x86/kernel/cpu/resctrl/core.c | 5 -----
arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
arch/x86/kernel/cpu/resctrl/monitor.c | 12 +++++++++---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 ++
4 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index fceb56697a4a..830607986c06 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -1105,14 +1105,9 @@ late_initcall(resctrl_arch_late_init);
static void __exit resctrl_arch_exit(void)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
-
cpuhp_remove_state(rdt_online);
resctrl_exit();
-
- if (r->mon_capable)
- rdt_put_mon_l3_config();
}
__exitcall(resctrl_arch_exit);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 35761fe8dbc1..84a5bc840a98 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -579,7 +579,7 @@ void closid_free(int closid);
int alloc_rmid(u32 closid);
void free_rmid(u32 closid, u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
-void __exit rdt_put_mon_l3_config(void);
+void __exit resctrl_mon_resource_exit(void);
bool __init rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 9cdca9d2bbde..d6e178be34ad 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1044,10 +1044,13 @@ static int dom_data_init(struct rdt_resource *r)
return err;
}
-static void __exit dom_data_exit(void)
+static void __exit dom_data_exit(struct rdt_resource *r)
{
mutex_lock(&rdtgroup_mutex);
+ if (!r->mon_capable)
+ goto out_unlock;
+
if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
kfree(closid_num_dirty_rmid);
closid_num_dirty_rmid = NULL;
@@ -1056,6 +1059,7 @@ static void __exit dom_data_exit(void)
kfree(rmid_ptrs);
rmid_ptrs = NULL;
+out_unlock:
mutex_unlock(&rdtgroup_mutex);
}
@@ -1238,9 +1242,11 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
return 0;
}
-void __exit rdt_put_mon_l3_config(void)
+void __exit resctrl_mon_resource_exit(void)
{
- dom_data_exit();
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+ dom_data_exit(r);
}
void __init intel_rdt_mbm_apply_quirk(void)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a15198f90b29..6d5fe973cb92 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4283,4 +4283,6 @@ void __exit resctrl_exit(void)
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
sysfs_remove_mount_point(fs_kobj, "resctrl");
+
+ resctrl_mon_resource_exit();
}
--
2.39.2
* [PATCH v5 16/40] x86/resctrl: Move monitor init work to a resctrl init call
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (14 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 15/40] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
` (27 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
rdt_get_mon_l3_config() is called from the architecture's
resctrl_arch_late_init(), and initialises both architecture-specific
fields, such as hw_res->mon_scale, and, by calling dom_data_init(),
resctrl filesystem fields.
To separate the filesystem and architecture parts of resctrl, this
function needs splitting up.
Add resctrl_mon_resource_init() to do the filesystem specific work,
and call it from resctrl_init(). This runs later, but is still before
the filesystem is mounted and the rmid_ptrs[] array can be used.
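The ordering this gives resctrl_init() can be sketched in userspace as
follows. This is only a model under stated assumptions: the sketch_* names
and the failure knob are hypothetical, and the real function performs more
steps than the two shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical test knob: make the mount-point step fail. */
bool sketch_fail_mount_point;

/* Tracks whether the global monitor state is currently allocated. */
bool sketch_mon_allocated;

/* Models resctrl_mon_resource_init(): allocate the global monitor state. */
int sketch_mon_resource_init(void)
{
	sketch_mon_allocated = true;
	return 0;
}

/* Models resctrl_mon_resource_exit(); in this model it only undoes an init. */
void sketch_mon_resource_exit(void)
{
	assert(sketch_mon_allocated);
	sketch_mon_allocated = false;
}

/* Models sysfs_create_mount_point(), with an injectable failure. */
int sketch_create_mount_point(void)
{
	return sketch_fail_mount_point ? -1 : 0;
}

/* Allocate the monitor state first, and unwind it if a later step fails. */
int sketch_resctrl_init(void)
{
	int ret;

	ret = sketch_mon_resource_init();
	if (ret)
		return ret;

	ret = sketch_create_mount_point();
	if (ret) {
		sketch_mon_resource_exit();
		return ret;
	}

	return 0;
}
```

The point of the sketch is that every failure after the monitor allocation
has to release it again, which is why the exit function loses its __exit
marker in this patch.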
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Removed __exit markers
Changes since v3:
* Added a comment over resctrl_mon_resource_init().
* Added a comment over domain_setup_mon_state() to warn of cpuhp ordering.
* Added __init to resctrl_mon_resource_init().
Changes since v2:
* Added error handling for the case where sysfs files can't be created.
---
arch/x86/kernel/cpu/resctrl/internal.h | 3 +-
arch/x86/kernel/cpu/resctrl/monitor.c | 40 ++++++++++++++++++++------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 22 +++++++++++++-
3 files changed, 54 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 84a5bc840a98..4e2ec1fed2ff 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -579,13 +579,14 @@ void closid_free(int closid);
int alloc_rmid(u32 closid);
void free_rmid(u32 closid, u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
-void __exit resctrl_mon_resource_exit(void);
+void resctrl_mon_resource_exit(void);
bool __init rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
cpumask_t *cpumask, int evtid, int first);
+int __init resctrl_mon_resource_init(void);
void mbm_setup_overflow_handler(struct rdt_mon_domain *dom,
unsigned long delay_ms,
int exclude_cpu);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index d6e178be34ad..cecc96213c49 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1044,7 +1044,7 @@ static int dom_data_init(struct rdt_resource *r)
return err;
}
-static void __exit dom_data_exit(struct rdt_resource *r)
+static void dom_data_exit(struct rdt_resource *r)
{
mutex_lock(&rdtgroup_mutex);
@@ -1179,12 +1179,40 @@ static __init int snc_get_config(void)
return ret;
}
+/**
+ * resctrl_mon_resource_init() - Initialise global monitoring structures.
+ *
+ * Allocate and initialise global monitor resources that do not belong to a
+ * specific domain. i.e. the rmid_ptrs[] used for the limbo and free lists.
+ * Called once during boot after the struct rdt_resource's have been configured
+ * but before the filesystem is mounted.
+ * Resctrl's cpuhp callbacks may be called before this point to bring a domain
+ * online.
+ *
+ * Returns 0 for success, or -ENOMEM.
+ */
+int __init resctrl_mon_resource_init(void)
+{
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+ int ret;
+
+ if (!r->mon_capable)
+ return 0;
+
+ ret = dom_data_init(r);
+ if (ret)
+ return ret;
+
+ l3_mon_evt_init(r);
+
+ return 0;
+}
+
int __init rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
unsigned int threshold;
- int ret;
snc_nodes_per_l3_cache = snc_get_config();
@@ -1214,10 +1242,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
*/
resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
- ret = dom_data_init(r);
- if (ret)
- return ret;
-
if (rdt_cpu_has(X86_FEATURE_BMEC)) {
u32 eax, ebx, ecx, edx;
@@ -1235,14 +1259,12 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
}
}
- l3_mon_evt_init(r);
-
r->mon_capable = true;
return 0;
}
-void __exit resctrl_mon_resource_exit(void)
+void resctrl_mon_resource_exit(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6d5fe973cb92..7be1ff466559 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4089,6 +4089,19 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
mutex_unlock(&rdtgroup_mutex);
}
+/**
+ * domain_setup_mon_state() - Initialise domain monitoring structures.
+ * @r: The resource for the newly online domain.
+ * @d: The newly online domain.
+ *
+ * Allocate monitor resources that belong to this domain.
+ * Called when the first CPU of a domain comes online, regardless of whether
+ * the filesystem is mounted.
+ * During boot this may be called before global allocations have been made by
+ * resctrl_mon_resource_init().
+ *
+ * Returns 0 for success, or -ENOMEM.
+ */
static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain *d)
{
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
@@ -4239,9 +4252,15 @@ int __init resctrl_init(void)
rdtgroup_setup_default();
+ ret = resctrl_mon_resource_init();
+ if (ret)
+ return ret;
+
ret = sysfs_create_mount_point(fs_kobj, "resctrl");
- if (ret)
+ if (ret) {
+ resctrl_mon_resource_exit();
return ret;
+ }
ret = register_filesystem(&rdt_fs_type);
if (ret)
@@ -4274,6 +4293,7 @@ int __init resctrl_init(void)
cleanup_mountpoint:
sysfs_remove_mount_point(fs_kobj, "resctrl");
+ resctrl_mon_resource_exit();
return ret;
}
--
2.39.2
* [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (15 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 16/40] x86/resctrl: Move monitor init work to a resctrl init call James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-08 0:00 ` Tony Luck
2024-10-23 21:51 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
` (26 subsequent siblings)
43 siblings, 2 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The for_each_*_rdt_resource() helpers walk the architecture's array
of structures, using the resctrl visible part as an iterator. These
became over-complex when the structures were split into a
filesystem and architecture-specific struct. This approach avoided
the need to touch every call site, and was done before there was a
helper to retrieve a resource by rid.
Once the filesystem parts of resctrl are moved to /fs/, both the
architecture's resource array and the definition of those structures
are no longer accessible. To support resctrl, each architecture would
have to provide equally complex macros.
Rewrite the macro to make use of resctrl_arch_get_resource(), and
move these to the core header so existing x86 arch code continues
to use them.
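A small userspace model of such a walker is sketched here. The model_* names
are hypothetical stand-ins for the kernel's types and for
resctrl_arch_get_resource(), whose out-of-range behaviour is assumed to be a
NULL return. Read literally, a condition of the form
"(_r)->rid < RDT_NUM_RESOURCES - 1" ends the walk before the entry with the
highest rid, whereas the removed macro compared with "<=" against index
RDT_NUM_RESOURCES - 1. The model counts the visits for both shapes so the
difference can be checked:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_NUM_RESOURCES = 4 };

/* Hypothetical stand-in for struct rdt_resource. */
struct model_resource {
	int rid;
	bool mon_capable;
};

static struct model_resource model_resources[MODEL_NUM_RESOURCES] = {
	{ .rid = 0 }, { .rid = 1 }, { .rid = 2 },
	{ .rid = 3, .mon_capable = true },
};

/* Models resctrl_arch_get_resource(); NULL when out of range is assumed. */
struct model_resource *model_get_resource(int rid)
{
	if (rid < 0 || rid >= MODEL_NUM_RESOURCES)
		return NULL;
	return &model_resources[rid];
}

/* Same loop condition as the macro this patch adds to the core header. */
#define model_for_each_as_posted(_r)					\
	for ((_r) = model_get_resource(0);				\
	     (_r)->rid < MODEL_NUM_RESOURCES - 1;			\
	     (_r) = model_get_resource((_r)->rid + 1))

/* Hypothetical variant that walks until the lookup runs out of entries. */
#define model_for_each_all(_r)						\
	for ((_r) = model_get_resource(0);				\
	     (_r) != NULL;						\
	     (_r) = model_get_resource((_r)->rid + 1))

/* Number of loop-body executions with the condition as posted. */
int model_count_as_posted(void)
{
	struct model_resource *r;
	int n = 0;

	model_for_each_as_posted(r)
		n++;
	return n;
}

/* Number of loop-body executions with the NULL-terminated variant. */
int model_count_all(void)
{
	struct model_resource *r;
	int n = 0;

	model_for_each_all(r)
		n++;
	assert(r == NULL);	/* the walk ended because the lookup ran out */
	return n;
}
```

In this model the only monitor-capable entry is the last one, so a
mon-capable walker built on the first macro would never see it.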
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v3:
* Restructure the existing macros instead of open-coding the for loop.
Changes since v1:
* [Whitespace only] Fix bogus whitespace introduced in
rdtgroup_create_info_dir().
* [Commit message only] Typo fix:
s/architectures/architecture's/g
---
arch/x86/kernel/cpu/resctrl/internal.h | 29 --------------------------
include/linux/resctrl.h | 18 ++++++++++++++++
2 files changed, 18 insertions(+), 29 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 4e2ec1fed2ff..a434daa6dba4 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -472,14 +472,6 @@ extern struct rdt_hw_resource rdt_resources_all[];
extern struct rdtgroup rdtgroup_default;
extern struct dentry *debugfs_resctrl;
-static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
-{
- struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
-
- hw_res++;
- return &hw_res->r_resctrl;
-}
-
static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
{
return rdt_resources_all[l].cdp_enabled;
@@ -489,27 +481,6 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
void arch_mon_domain_online(struct rdt_resource *r, struct rdt_mon_domain *d);
-/*
- * To return the common struct rdt_resource, which is contained in struct
- * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
- */
-#define for_each_rdt_resource(r) \
- for (r = &rdt_resources_all[0].r_resctrl; \
- r <= &rdt_resources_all[RDT_NUM_RESOURCES - 1].r_resctrl; \
- r = resctrl_inc(r))
-
-#define for_each_capable_rdt_resource(r) \
- for_each_rdt_resource(r) \
- if (r->alloc_capable || r->mon_capable)
-
-#define for_each_alloc_capable_rdt_resource(r) \
- for_each_rdt_resource(r) \
- if (r->alloc_capable)
-
-#define for_each_mon_capable_rdt_resource(r) \
- for_each_rdt_resource(r) \
- if (r->mon_capable)
-
/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
union cpuid_0x10_1_eax {
struct {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8894aed3c593..f75f0409ae09 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -26,6 +26,24 @@ int proc_resctrl_show(struct seq_file *m,
/* max value for struct rdt_domain's mbps_val */
#define MBA_MAX_MBPS U32_MAX
+/* Walk all possible resources, with variants for only controls or monitors. */
+#define for_each_rdt_resource(_r) \
+ for ((_r) = resctrl_arch_get_resource(0); \
+ (_r)->rid < RDT_NUM_RESOURCES - 1; \
+ (_r) = resctrl_arch_get_resource((_r)->rid + 1))
+
+#define for_each_capable_rdt_resource(r) \
+ for_each_rdt_resource((r)) \
+ if ((r)->alloc_capable || (r)->mon_capable)
+
+#define for_each_alloc_capable_rdt_resource(r) \
+ for_each_rdt_resource((r)) \
+ if ((r)->alloc_capable)
+
+#define for_each_mon_capable_rdt_resource(r) \
+ for_each_rdt_resource((r)) \
+ if ((r)->mon_capable)
+
/**
* enum resctrl_conf_type - The type of configuration.
* @CDP_NONE: No prioritisation, both code and data are controlled or monitored.
--
2.39.2
* [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (16 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:00 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
` (25 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The architecture specific parts of resctrl have helpers to hide accesses
to the rdt_mon_features bitmap.
Once the filesystem parts of resctrl are moved, these can no longer live
in internal.h. Once these are exposed to the wider kernel, they should
have a 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
Move and rename the helpers that touch rdt_mon_features directly.
is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
so can be moved into that file.
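The resulting split can be modelled in userspace roughly as below. The
model_* names and the event numbering are assumptions made for illustration;
the real IDs and the rdt_mon_features bitmap are defined by the kernel
headers. The arch-side helpers read the bitmap directly, while the
filesystem-side helpers are built only from the arch helpers and the event
IDs:

```c
#include <assert.h>
#include <stdbool.h>

/* Event numbering assumed for illustration only. */
enum model_event_id {
	MODEL_L3_OCCUP_EVENT_ID     = 1,
	MODEL_L3_MBM_TOTAL_EVENT_ID = 2,
	MODEL_L3_MBM_LOCAL_EVENT_ID = 3,
};

/* Models the arch-owned rdt_mon_features bitmap. */
unsigned int model_mon_features;

/* Hypothetical setter, used here in place of CPU feature detection. */
void model_enable_event(enum model_event_id e)
{
	assert(e >= MODEL_L3_OCCUP_EVENT_ID && e <= MODEL_L3_MBM_LOCAL_EVENT_ID);
	model_mon_features |= 1u << e;
}

/* Arch side: the only code that looks at the bitmap. */
bool model_arch_is_llc_occupancy_enabled(void)
{
	return model_mon_features & (1u << MODEL_L3_OCCUP_EVENT_ID);
}

bool model_arch_is_mbm_total_enabled(void)
{
	return model_mon_features & (1u << MODEL_L3_MBM_TOTAL_EVENT_ID);
}

bool model_arch_is_mbm_local_enabled(void)
{
	return model_mon_features & (1u << MODEL_L3_MBM_LOCAL_EVENT_ID);
}

/* Filesystem side: composed from the arch helpers, no bitmap access. */
bool model_is_mbm_enabled(void)
{
	return model_arch_is_mbm_total_enabled() ||
	       model_arch_is_mbm_local_enabled();
}

/* Filesystem side: a range check on the event ID alone. */
bool model_is_mbm_event(int e)
{
	return e >= MODEL_L3_MBM_TOTAL_EVENT_ID &&
	       e <= MODEL_L3_MBM_LOCAL_EVENT_ID;
}
```

Another architecture would then only need to supply its own versions of the
three arch-side helpers.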
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/include/asm/resctrl.h | 16 +++++++++++
arch/x86/kernel/cpu/resctrl/core.c | 4 +--
arch/x86/kernel/cpu/resctrl/internal.h | 27 -----------------
arch/x86/kernel/cpu/resctrl/monitor.c | 18 ++++++------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 +++++++++++++++++---------
5 files changed, 53 insertions(+), 52 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 5622943f6354..a28a1d8a4ca8 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -44,6 +44,7 @@ DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
extern bool rdt_alloc_capable;
extern bool rdt_mon_capable;
+extern unsigned int rdt_mon_features;
DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
@@ -83,6 +84,21 @@ static inline void resctrl_arch_disable_mon(void)
static_branch_dec_cpuslocked(&rdt_enable_key);
}
+static inline bool resctrl_arch_is_llc_occupancy_enabled(void)
+{
+ return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
+}
+
+static inline bool resctrl_arch_is_mbm_total_enabled(void)
+{
+ return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
+}
+
+static inline bool resctrl_arch_is_mbm_local_enabled(void)
+{
+ return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
+}
+
/*
* __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
*
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 830607986c06..8db946ebc4ff 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -489,13 +489,13 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *hw_dom)
{
size_t tsize;
- if (is_mbm_total_enabled()) {
+ if (resctrl_arch_is_mbm_total_enabled()) {
tsize = sizeof(*hw_dom->arch_mbm_total);
hw_dom->arch_mbm_total = kcalloc(num_rmid, tsize, GFP_KERNEL);
if (!hw_dom->arch_mbm_total)
return -ENOMEM;
}
- if (is_mbm_local_enabled()) {
+ if (resctrl_arch_is_mbm_local_enabled()) {
tsize = sizeof(*hw_dom->arch_mbm_local);
hw_dom->arch_mbm_local = kcalloc(num_rmid, tsize, GFP_KERNEL);
if (!hw_dom->arch_mbm_local) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index a434daa6dba4..6b076216911c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -156,7 +156,6 @@ struct rmid_read {
void *arch_mon_ctx;
};
-extern unsigned int rdt_mon_features;
extern struct list_head resctrl_schema_all;
extern bool resctrl_mounted;
@@ -404,32 +403,6 @@ struct msr_param {
u32 high;
};
-static inline bool is_llc_occupancy_enabled(void)
-{
- return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
-}
-
-static inline bool is_mbm_total_enabled(void)
-{
- return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
-}
-
-static inline bool is_mbm_local_enabled(void)
-{
- return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
-}
-
-static inline bool is_mbm_enabled(void)
-{
- return (is_mbm_total_enabled() || is_mbm_local_enabled());
-}
-
-static inline bool is_mbm_event(int e)
-{
- return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
- e <= QOS_L3_MBM_LOCAL_EVENT_ID);
-}
-
/**
* struct rdt_hw_resource - arch private attributes of a resctrl resource
* @r_resctrl: Attributes of the resource used directly by resctrl.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index cecc96213c49..68e05bd0eb94 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -295,11 +295,11 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
{
struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
- if (is_mbm_total_enabled())
+ if (resctrl_arch_is_mbm_total_enabled())
memset(hw_dom->arch_mbm_total, 0,
sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
memset(hw_dom->arch_mbm_local, 0,
sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
}
@@ -569,7 +569,7 @@ void free_rmid(u32 closid, u32 rmid)
entry = __rmid_entry(idx);
- if (is_llc_occupancy_enabled())
+ if (resctrl_arch_is_llc_occupancy_enabled())
add_rmid_to_limbo(entry);
else
list_add_tail(&entry->list, &rmid_free_lru);
@@ -757,7 +757,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
struct list_head *head;
struct rdtgroup *entry;
- if (!is_mbm_local_enabled())
+ if (!resctrl_arch_is_mbm_local_enabled())
return;
r_mba = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
@@ -825,7 +825,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
* This is protected from concurrent reads from user
* as both the user and we hold the global mutex.
*/
- if (is_mbm_total_enabled()) {
+ if (resctrl_arch_is_mbm_total_enabled()) {
rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
rr.val = 0;
rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
@@ -839,7 +839,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
}
- if (is_mbm_local_enabled()) {
+ if (resctrl_arch_is_mbm_local_enabled()) {
rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
rr.val = 0;
rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
@@ -1089,11 +1089,11 @@ static void l3_mon_evt_init(struct rdt_resource *r)
{
INIT_LIST_HEAD(&r->evt_list);
- if (is_llc_occupancy_enabled())
+ if (resctrl_arch_is_llc_occupancy_enabled())
list_add_tail(&llc_occupancy_event.list, &r->evt_list);
- if (is_mbm_total_enabled())
+ if (resctrl_arch_is_mbm_total_enabled())
list_add_tail(&mbm_total_event.list, &r->evt_list);
- if (is_mbm_local_enabled())
+ if (resctrl_arch_is_mbm_local_enabled())
list_add_tail(&mbm_local_event.list, &r->evt_list);
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7be1ff466559..7ed295225da7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -108,6 +108,18 @@ void rdt_staged_configs_clear(void)
}
}
+static bool resctrl_is_mbm_enabled(void)
+{
+ return (resctrl_arch_is_mbm_total_enabled() ||
+ resctrl_arch_is_mbm_local_enabled());
+}
+
+static bool resctrl_is_mbm_event(int e)
+{
+ return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
+ e <= QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
/*
* Trivial allocator for CLOSIDs. Since h/w only supports a small number,
* we can keep a bitmap of free CLOSIDs in a single integer.
@@ -155,7 +167,7 @@ static int closid_alloc(void)
lockdep_assert_held(&rdtgroup_mutex);
if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
- is_llc_occupancy_enabled()) {
+ resctrl_arch_is_llc_occupancy_enabled()) {
cleanest_closid = resctrl_find_cleanest_closid();
if (cleanest_closid < 0)
return cleanest_closid;
@@ -2373,7 +2385,7 @@ static bool supports_mba_mbps(void)
struct rdt_resource *rmbm = resctrl_arch_get_resource(RDT_RESOURCE_L3);
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
- return (is_mbm_local_enabled() &&
+ return (resctrl_arch_is_mbm_local_enabled() &&
r->alloc_capable && is_mba_linear() &&
r->ctrl_scope == rmbm->mon_scope);
}
@@ -2740,7 +2752,7 @@ static int rdt_get_tree(struct fs_context *fc)
if (resctrl_arch_alloc_capable() || resctrl_arch_mon_capable())
resctrl_mounted = true;
- if (is_mbm_enabled()) {
+ if (resctrl_is_mbm_enabled()) {
r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
list_for_each_entry(dom, &r->mon_domains, hdr.list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
@@ -3114,7 +3126,7 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
if (ret)
return ret;
- if (!do_sum && is_mbm_event(mevt->evtid))
+ if (!do_sum && resctrl_is_mbm_event(mevt->evtid))
mon_event_read(&rr, r, d, prgrp, &d->hdr.cpu_mask, mevt->evtid, true);
}
@@ -4069,9 +4081,9 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
if (resctrl_mounted && resctrl_arch_mon_capable())
rmdir_mondata_subdir_allrdtgrp(r, d);
- if (is_mbm_enabled())
+ if (resctrl_is_mbm_enabled())
cancel_delayed_work(&d->mbm_over);
- if (is_llc_occupancy_enabled() && has_busy_rmid(d)) {
+ if (resctrl_arch_is_llc_occupancy_enabled() && has_busy_rmid(d)) {
/*
* When a package is going down, forcefully
* decrement rmid->ebusy. There is no way to know
@@ -4107,12 +4119,12 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
size_t tsize;
- if (is_llc_occupancy_enabled()) {
+ if (resctrl_arch_is_llc_occupancy_enabled()) {
d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
if (!d->rmid_busy_llc)
return -ENOMEM;
}
- if (is_mbm_total_enabled()) {
+ if (resctrl_arch_is_mbm_total_enabled()) {
tsize = sizeof(*d->mbm_total);
d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
if (!d->mbm_total) {
@@ -4120,7 +4132,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
return -ENOMEM;
}
}
- if (is_mbm_local_enabled()) {
+ if (resctrl_arch_is_mbm_local_enabled()) {
tsize = sizeof(*d->mbm_local);
d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
if (!d->mbm_local) {
@@ -4159,13 +4171,13 @@ int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d)
if (err)
goto out_unlock;
- if (is_mbm_enabled()) {
+ if (resctrl_is_mbm_enabled()) {
INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL,
RESCTRL_PICK_ANY_CPU);
}
- if (is_llc_occupancy_enabled())
+ if (resctrl_arch_is_llc_occupancy_enabled())
INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);
/*
@@ -4220,12 +4232,12 @@ void resctrl_offline_cpu(unsigned int cpu)
d = get_mon_domain_from_cpu(cpu, l3);
if (d) {
- if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
+ if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
mbm_setup_overflow_handler(d, 0, cpu);
}
- if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
- has_busy_rmid(d)) {
+ if (resctrl_arch_is_llc_occupancy_enabled() &&
+ cpu == d->cqm_work_cpu && has_busy_rmid(d)) {
cancel_delayed_work(&d->cqm_limbo);
cqm_setup_limbo_handler(d, 0, cpu);
}
--
2.39.2
* [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (17 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:04 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 20/40] x86/resctrl: Slightly clean-up mbm_config_show() James Morse
` (24 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
When BMEC is supported the resctrl event can be configured in a number
of ways. This depends on architecture support. rdt_get_mon_l3_config()
modifies the struct mon_evt and calls mbm_config_rftype_init() to create
the files that allow the configuration.
Splitting this into separate architecture and filesystem parts would
require the struct mon_evt and mbm_config_rftype_init() to be exposed.
Instead, add resctrl_arch_is_evt_configurable(), and use this from
resctrl_mon_resource_init() to initialise struct mon_evt and call
mbm_config_rftype_init().
resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
obviously benefit from being inlined. Putting it in core.c will allow
rdt_cpu_has() to eventually become static.
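A minimal userspace model of this query is sketched below. The model_* names,
the event numbering and the feature struct are hypothetical; the struct
stands in for what rdt_cpu_has() reports on x86. The filesystem side asks one
question per event and never sees the feature flags:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Event numbering assumed for illustration only. */
enum model_event_id {
	MODEL_L3_OCCUP_EVENT_ID     = 1,
	MODEL_L3_MBM_TOTAL_EVENT_ID = 2,
	MODEL_L3_MBM_LOCAL_EVENT_ID = 3,
};

/* Hypothetical stand-in for the CPU features rdt_cpu_has() would report. */
struct model_cpu_features {
	bool bmec;
	bool mbm_total;
	bool mbm_local;
};

/* Models resctrl_arch_is_evt_configurable() on the arch side. */
bool model_arch_is_evt_configurable(const struct model_cpu_features *f,
				    enum model_event_id evt)
{
	assert(f != NULL);

	if (!f->bmec)
		return false;

	switch (evt) {
	case MODEL_L3_MBM_TOTAL_EVENT_ID:
		return f->mbm_total;
	case MODEL_L3_MBM_LOCAL_EVENT_ID:
		return f->mbm_local;
	default:
		return false;
	}
}

/* Hypothetical stand-in for struct mon_evt on the filesystem side. */
struct model_mon_evt {
	enum model_event_id evtid;
	bool configurable;
};

/* Filesystem side: set the flag from the arch answer, nothing else. */
void model_mon_evt_init(const struct model_cpu_features *f,
			struct model_mon_evt *evt)
{
	evt->configurable = model_arch_is_evt_configurable(f, evt->evtid);
}
```

An architecture with no configurable events could implement the query as a
function that always returns false.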
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Moved all the __init changes to a later patch now that the exit gubbins
comes first.
---
arch/x86/kernel/cpu/resctrl/core.c | 15 +++++++++++++++
arch/x86/kernel/cpu/resctrl/monitor.c | 18 +++++++++---------
include/linux/resctrl.h | 2 ++
3 files changed, 26 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 8db946ebc4ff..b5ad1ed2a4de 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -866,6 +866,21 @@ bool __init rdt_cpu_has(int flag)
return ret;
}
+bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+{
+ if (!rdt_cpu_has(X86_FEATURE_BMEC))
+ return false;
+
+ switch (evt) {
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ return rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL);
+ default:
+ return false;
+ }
+}
+
static __init bool get_mem_config(void)
{
struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_MBA];
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 68e05bd0eb94..ae8552ef98e6 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1205,6 +1205,15 @@ int __init resctrl_mon_resource_init(void)
l3_mon_evt_init(r);
+ if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
+ mbm_total_event.configurable = true;
+ mbm_config_rftype_init("mbm_total_bytes_config");
+ }
+ if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID)) {
+ mbm_local_event.configurable = true;
+ mbm_config_rftype_init("mbm_local_bytes_config");
+ }
+
return 0;
}
@@ -1248,15 +1257,6 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
/* Detect list of bandwidth sources that can be tracked */
cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
-
- if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
- mbm_total_event.configurable = true;
- mbm_config_rftype_init("mbm_total_bytes_config");
- }
- if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
- mbm_local_event.configurable = true;
- mbm_config_rftype_init("mbm_local_bytes_config");
- }
}
r->mon_capable = true;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index f75f0409ae09..224a14d9aa7a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -313,6 +313,8 @@ struct rdt_domain_hdr *resctrl_arch_find_domain(struct list_head *domain_list,
int id);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
+bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
* [PATCH v5 20/40] x86/resctrl: Slightly clean-up mbm_config_show()
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (18 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-16 16:50 ` Tony Luck
2024-10-04 18:03 ` [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
` (23 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Christophe JAILLET, Shaopeng Tan
From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
'mon_info' is already zeroed in the list_for_each_entry() loop below.
There is no need to explicitly initialize it here. It just wastes some
space and cycles.
Remove this unneeded code.
On x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
74967 5103 1880 81950 1401e arch/x86/kernel/cpu/resctrl/rdtgroup.o
After:
=====
text data bss dec hex filename
74903 5103 1880 81886 13fde arch/x86/kernel/cpu/resctrl/rdtgroup.o
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://patch.msgid.link/6be685f7-e99d-42af-b26e-d5e542f597fd@intel.com/
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7ed295225da7..9294bf74f3a8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1624,7 +1624,7 @@ static void mondata_config_read(struct rdt_mon_domain *d, struct mon_config_info
static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
{
- struct mon_config_info mon_info = {0};
+ struct mon_config_info mon_info;
struct rdt_mon_domain *dom;
bool sep = false;
--
2.39.2
* [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (19 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 20/40] x86/resctrl: Slightly clean-up mbm_config_show() James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:19 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 22/40] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
` (22 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
mon_event_config_{read,write}() are called via IPI and access
model-specific registers to do their work.
To support another architecture, this needs abstracting.
Rename mon_event_config_{read,write}() to have a "resctrl_arch_"
prefix, and move their struct mon_config_info parameter into
<linux/resctrl.h>. This allows another architecture to supply an
implementation of these.
As struct mon_config_info is now exposed globally, give it a 'resctrl_'
prefix. MPAM systems need access to the domain to do this work, so add
the resource and domain to struct resctrl_mon_config_info.
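As a rough illustration (not part of this patch), the calling pattern can
be sketched in user-space C. Every mock_* name below is a stand-in
invented for this sketch, and the per-domain array only pretends to be a
configuration register. The direct call stands where the kernel would use
smp_call_function_any() on the domain's cpu_mask.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
struct mock_resource {
	const char *name;
};

struct mock_mon_domain {
	int id;
	uint32_t event_cfg[2];	/* pretend per-domain config registers */
};

/* Same four members as struct resctrl_mon_config_info in the patch. */
struct mock_mon_config_info {
	struct mock_resource	*r;
	struct mock_mon_domain	*d;
	uint32_t		evtid;
	uint32_t		mon_config;
};

/* Arch-side helper: takes a void pointer because IPI callbacks do. */
void mock_arch_mon_event_config_read(void *info)
{
	struct mock_mon_config_info *mon_info = info;

	assert(mon_info && mon_info->d);
	assert(mon_info->evtid < 2);
	mon_info->mon_config = mon_info->d->event_cfg[mon_info->evtid];
}

/* Filesystem-side caller: fill in resource, domain and event, then call. */
uint32_t mock_read_config(struct mock_resource *r,
			  struct mock_mon_domain *d, uint32_t evtid)
{
	struct mock_mon_config_info mon_info;

	memset(&mon_info, 0, sizeof(mon_info));
	mon_info.r = r;
	mon_info.d = d;
	mon_info.evtid = evtid;
	/* The kernel would issue an IPI to a CPU in the domain here. */
	mock_arch_mon_event_config_read(&mon_info);

	return mon_info.mon_config;
}
```

Carrying the domain inside the structure is what lets the helper keep a
single void pointer argument while still finding per-domain state.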
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v3:
* Added comments over the read/write helper to explain the type of the void
pointer.
Changes since v1:
* [Whitespace only] Re-tabbed struct resctrl_mon_config_info in
<linux/resctrl.h> to fit the prevailing style.
Non-functional change.
* [Commit message only] Reword to align with the actual naming of the
definitions and destination header file.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 34 +++++++++++++-------------
include/linux/resctrl.h | 25 +++++++++++++++++++
2 files changed, 42 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9294bf74f3a8..304fdf199de7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1571,11 +1571,6 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
return ret;
}
-struct mon_config_info {
- u32 evtid;
- u32 mon_config;
-};
-
#define INVALID_CONFIG_INDEX UINT_MAX
/**
@@ -1600,9 +1595,9 @@ static inline unsigned int mon_event_config_index_get(u32 evtid)
}
}
-static void mon_event_config_read(void *info)
+void resctrl_arch_mon_event_config_read(void *info)
{
- struct mon_config_info *mon_info = info;
+ struct resctrl_mon_config_info *mon_info = info;
unsigned int index;
u64 msrval;
@@ -1617,14 +1612,15 @@ static void mon_event_config_read(void *info)
mon_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
}
-static void mondata_config_read(struct rdt_mon_domain *d, struct mon_config_info *mon_info)
+static void mondata_config_read(struct resctrl_mon_config_info *mon_info)
{
- smp_call_function_any(&d->hdr.cpu_mask, mon_event_config_read, mon_info, 1);
+ smp_call_function_any(&mon_info->d->hdr.cpu_mask,
+ resctrl_arch_mon_event_config_read, mon_info, 1);
}
static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
{
- struct mon_config_info mon_info;
+ struct resctrl_mon_config_info mon_info;
struct rdt_mon_domain *dom;
bool sep = false;
@@ -1635,9 +1631,11 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
if (sep)
seq_puts(s, ";");
- memset(&mon_info, 0, sizeof(struct mon_config_info));
+ memset(&mon_info, 0, sizeof(struct resctrl_mon_config_info));
+ mon_info.r = r;
+ mon_info.d = dom;
mon_info.evtid = evtid;
- mondata_config_read(dom, &mon_info);
+ mondata_config_read(&mon_info);
seq_printf(s, "%d=0x%02x", dom->hdr.id, mon_info.mon_config);
sep = true;
@@ -1670,9 +1668,9 @@ static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
return 0;
}
-static void mon_event_config_write(void *info)
+void resctrl_arch_mon_event_config_write(void *info)
{
- struct mon_config_info *mon_info = info;
+ struct resctrl_mon_config_info *mon_info = info;
unsigned int index;
index = mon_event_config_index_get(mon_info->evtid);
@@ -1686,14 +1684,16 @@ static void mon_event_config_write(void *info)
static void mbm_config_write_domain(struct rdt_resource *r,
struct rdt_mon_domain *d, u32 evtid, u32 val)
{
- struct mon_config_info mon_info = {0};
+ struct resctrl_mon_config_info mon_info = {0};
/*
* Read the current config value first. If both are the same then
* no need to write it again.
*/
+ mon_info.r = r;
+ mon_info.d = d;
mon_info.evtid = evtid;
- mondata_config_read(d, &mon_info);
+ mondata_config_read(&mon_info);
if (mon_info.mon_config == val)
return;
@@ -1705,7 +1705,7 @@ static void mbm_config_write_domain(struct rdt_resource *r,
* are scoped at the domain level. Writing any of these MSRs
* on one CPU is observed by all the CPUs in the domain.
*/
- smp_call_function_any(&d->hdr.cpu_mask, mon_event_config_write,
+ smp_call_function_any(&d->hdr.cpu_mask, resctrl_arch_mon_event_config_write,
&mon_info, 1);
/*
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 224a14d9aa7a..0072c2e5947f 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -271,6 +271,13 @@ struct resctrl_cpu_defaults {
u32 rmid;
};
+struct resctrl_mon_config_info {
+ struct rdt_resource *r;
+ struct rdt_mon_domain *d;
+ u32 evtid;
+ u32 mon_config;
+};
+
/**
* resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
* Call via IPI.
@@ -315,6 +322,24 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+/**
+ * resctrl_arch_mon_event_config_write() - Write the config for a counter.
+ * @info: struct resctrl_mon_config_info describing the resource, domain
+ * and event.
+ *
+ * Must be called on a CPU that is a member of the specified domain.
+ */
+void resctrl_arch_mon_event_config_write(void *info);
+
+/**
+ * resctrl_arch_mon_event_config_read() - Read the config for a counter.
+ * @info: struct resctrl_mon_config_info describing the resource, domain
+ * and event.
+ *
+ * Must be called on a CPU that is a member of the specified domain.
+ */
+void resctrl_arch_mon_event_config_read(void *info);
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
* [PATCH v5 22/40] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (20 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:42 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
` (21 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The mbm_cfg_mask field lists the bits that user-space can set when
configuring an event. This value is output via the last_cmd_status
file.
Once the filesystem parts of resctrl are moved to live in /fs/, the
struct rdt_hw_resource is inaccessible to the filesystem code. Because
this value is output to user-space, it has to be accessible to the
filesystem code.
Move it to struct rdt_resource.
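For readers unfamiliar with the check that consumes this mask, here is a
minimal user-space sketch (not part of this patch). struct mock_resource
and mock_check_event_config() are names invented for illustration. Only
the mask comparison and the -EINVAL result mirror the hunk in
mon_config_write().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in holding only the field this patch moves. */
struct mock_resource {
	unsigned int mbm_cfg_mask;
};

/*
 * Reject any value that sets a bit outside the supported mask.
 * Returns 0 when val is acceptable, -EINVAL otherwise.
 */
int mock_check_event_config(const struct mock_resource *r, unsigned long val)
{
	assert(r);

	if ((val & r->mbm_cfg_mask) != val)
		return -EINVAL;

	return 0;
}
```

With a mask of 0x7f, a value of 0x15 passes and 0x80 is rejected, because
bit 7 is not in the supported set.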
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Change since v1:
* Reword comments to avoid being overly arch-specific.
---
arch/x86/kernel/cpu/resctrl/internal.h | 3 ---
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++---
include/linux/resctrl.h | 3 +++
4 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6b076216911c..92a94939cf93 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -415,8 +415,6 @@ struct msr_param {
* @msr_update: Function pointer to update QOS MSRs
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
- * @mbm_cfg_mask: Bandwidth sources that can be tracked when Bandwidth
- * Monitoring Event Configuration (BMEC) is supported.
* @cdp_enabled: CDP state of this resource
*
* Members of this structure are either private to the architecture
@@ -430,7 +428,6 @@ struct rdt_hw_resource {
void (*msr_update)(struct msr_param *m);
unsigned int mon_scale;
unsigned int mbm_width;
- unsigned int mbm_cfg_mask;
bool cdp_enabled;
};
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ae8552ef98e6..175fd7dbf34f 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1256,7 +1256,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
/* Detect list of bandwidth sources that can be tracked */
cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
- hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+ r->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
}
r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 304fdf199de7..0f839b5c59da 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1722,7 +1722,6 @@ static void mbm_config_write_domain(struct rdt_resource *r,
static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
{
- struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
char *dom_str = NULL, *id_str;
unsigned long dom_id, val;
struct rdt_mon_domain *d;
@@ -1749,9 +1748,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
}
/* Value from user cannot be more than the supported set of events */
- if ((val & hw_res->mbm_cfg_mask) != val) {
+ if ((val & r->mbm_cfg_mask) != val) {
rdt_last_cmd_printf("Invalid event configuration: max valid mask is 0x%02x\n",
- hw_res->mbm_cfg_mask);
+ r->mbm_cfg_mask);
return -EINVAL;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0072c2e5947f..84588ab1994d 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -218,6 +218,8 @@ enum resctrl_schema_fmt {
* @name: Name to use in "schemata" file.
* @schema_fmt: Which format string and parser is used for this schema.
* @evt_list: List of monitoring events
+ * @mbm_cfg_mask: Bandwidth sources that can be tracked when Bandwidth
+ * Monitoring Event Configuration (BMEC) is supported.
* @cdp_capable: Is the CDP feature available on this resource
*/
struct rdt_resource {
@@ -234,6 +236,7 @@ struct rdt_resource {
char *name;
enum resctrl_schema_fmt schema_fmt;
struct list_head evt_list;
+ unsigned int mbm_cfg_mask;
bool cdp_capable;
};
--
2.39.2
* [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (21 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 22/40] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:44 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 24/40] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
` (20 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
resctrl's pseudo lock has some copy-to-cache and measurement
functions that are micro-architecture specific. pseudo_lock_fn()
is not at all portable. Label these 'resctrl_arch_' so they stay
under /arch/x86.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/include/asm/resctrl.h | 5 ++++
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 36 ++++++++++++-----------
2 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index a28a1d8a4ca8..bf32d30e595b 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -205,6 +205,11 @@ static inline void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid
static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
void *ctx) { };
+u64 resctrl_arch_get_prefetch_disable_bits(void);
+int resctrl_arch_pseudo_lock_fn(void *_rdtgrp);
+int resctrl_arch_measure_cycles_lat_fn(void *_plr);
+int resctrl_arch_measure_l2_residency(void *_plr);
+int resctrl_arch_measure_l3_residency(void *_plr);
void resctrl_cpu_detect(struct cpuinfo_x86 *c);
#else
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 972e6b6b0481..56e59441d6ae 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -61,7 +61,8 @@ static const struct class pseudo_lock_class = {
};
/**
- * get_prefetch_disable_bits - prefetch disable bits of supported platforms
+ * resctrl_arch_get_prefetch_disable_bits - prefetch disable bits of supported
+ * platforms
* @void: It takes no parameters.
*
* Capture the list of platforms that have been validated to support
@@ -75,13 +76,13 @@ static const struct class pseudo_lock_class = {
* in the SDM.
*
* When adding a platform here also add support for its cache events to
- * measure_cycles_perf_fn()
+ * resctrl_arch_measure_l*_residency()
*
* Return:
* If platform is supported, the bits to disable hardware prefetchers, 0
* if platform is not supported.
*/
-static u64 get_prefetch_disable_bits(void)
+u64 resctrl_arch_get_prefetch_disable_bits(void)
{
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
boot_cpu_data.x86 != 6)
@@ -408,7 +409,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
}
/**
- * pseudo_lock_fn - Load kernel memory into cache
+ * resctrl_arch_pseudo_lock_fn - Load kernel memory into cache
* @_rdtgrp: resource group to which pseudo-lock region belongs
*
* This is the core pseudo-locking flow.
@@ -426,7 +427,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
*
* Return: 0. Waiter on waitqueue will be woken on completion.
*/
-static int pseudo_lock_fn(void *_rdtgrp)
+int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
{
struct rdtgroup *rdtgrp = _rdtgrp;
struct pseudo_lock_region *plr = rdtgrp->plr;
@@ -712,7 +713,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* Not knowing the bits to disable prefetching implies that this
* platform does not support Cache Pseudo-Locking.
*/
- prefetch_disable_bits = get_prefetch_disable_bits();
+ prefetch_disable_bits = resctrl_arch_get_prefetch_disable_bits();
if (prefetch_disable_bits == 0) {
rdt_last_cmd_puts("Pseudo-locking not supported\n");
return -EINVAL;
@@ -872,7 +873,8 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d)
}
/**
- * measure_cycles_lat_fn - Measure cycle latency to read pseudo-locked memory
+ * resctrl_arch_measure_cycles_lat_fn - Measure cycle latency to read
+ * pseudo-locked memory
* @_plr: pseudo-lock region to measure
*
* There is no deterministic way to test if a memory region is cached. One
@@ -885,7 +887,7 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d)
*
* Return: 0. Waiter on waitqueue will be woken on completion.
*/
-static int measure_cycles_lat_fn(void *_plr)
+int resctrl_arch_measure_cycles_lat_fn(void *_plr)
{
struct pseudo_lock_region *plr = _plr;
u32 saved_low, saved_high;
@@ -1069,7 +1071,7 @@ static int measure_residency_fn(struct perf_event_attr *miss_attr,
return 0;
}
-static int measure_l2_residency(void *_plr)
+int resctrl_arch_measure_l2_residency(void *_plr)
{
struct pseudo_lock_region *plr = _plr;
struct residency_counts counts = {0};
@@ -1107,7 +1109,7 @@ static int measure_l2_residency(void *_plr)
return 0;
}
-static int measure_l3_residency(void *_plr)
+int resctrl_arch_measure_l3_residency(void *_plr)
{
struct pseudo_lock_region *plr = _plr;
struct residency_counts counts = {0};
@@ -1205,18 +1207,18 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
plr->cpu = cpu;
if (sel == 1)
- thread = kthread_create_on_node(measure_cycles_lat_fn, plr,
- cpu_to_node(cpu),
+ thread = kthread_create_on_node(resctrl_arch_measure_cycles_lat_fn,
+ plr, cpu_to_node(cpu),
"pseudo_lock_measure/%u",
cpu);
else if (sel == 2)
- thread = kthread_create_on_node(measure_l2_residency, plr,
- cpu_to_node(cpu),
+ thread = kthread_create_on_node(resctrl_arch_measure_l2_residency,
+ plr, cpu_to_node(cpu),
"pseudo_lock_measure/%u",
cpu);
else if (sel == 3)
- thread = kthread_create_on_node(measure_l3_residency, plr,
- cpu_to_node(cpu),
+ thread = kthread_create_on_node(resctrl_arch_measure_l3_residency,
+ plr, cpu_to_node(cpu),
"pseudo_lock_measure/%u",
cpu);
else
@@ -1315,7 +1317,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
plr->thread_done = 0;
- thread = kthread_create_on_node(pseudo_lock_fn, rdtgrp,
+ thread = kthread_create_on_node(resctrl_arch_pseudo_lock_fn, rdtgrp,
cpu_to_node(plr->cpu),
"pseudo_lock/%u", plr->cpu);
if (IS_ERR(thread)) {
--
2.39.2
* [PATCH v5 24/40] x86/resctrl: Allow an architecture to disable pseudo lock
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (22 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
` (19 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Pseudo-lock relies on knowledge of the micro-architecture to disable
prefetchers etc.
On arm64 these controls are typically secure-only, meaning Linux can't
access them. Arm's cache-lockdown feature works in a very different
way. Resctrl's pseudo-lock isn't going to be used on arm64 platforms.
Add a Kconfig symbol that can be selected by the architecture. This
enables or disables building of the pseudo_lock.c file, and replaces
the functions with stubs. An additional IS_ENABLED() check is needed
in rdtgroup_mode_write() so that attempting to enable pseudo-lock
reports "Unknown or unsupported mode" to user-space via the
last_cmd_status file.
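The stub-plus-guard pattern can be sketched in user-space C as follows
(illustration only, not part of this patch). MOCK_PSEUDO_LOCK is a
hypothetical macro standing in for the Kconfig symbol, and the mock_*
functions are invented for this sketch. Returning -EINVAL for an
unrecognised mode is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the Kconfig symbol; build with =1 to select it. */
#ifndef MOCK_PSEUDO_LOCK
#define MOCK_PSEUDO_LOCK 0
#endif

#if MOCK_PSEUDO_LOCK
/* The real implementation would live in its own, conditionally built file. */
int mock_locksetup_enter(void)
{
	return 0;
}
#else
/* Stub used when the feature is compiled out. */
static inline int mock_locksetup_enter(void)
{
	return -EOPNOTSUPP;
}
#endif

/* Simplified mode parser: only two of the mode strings are modelled. */
int mock_mode_write(const char *buf)
{
	assert(buf);

	if (!strcmp(buf, "shareable"))
		return 0;

	/* The constant guard keeps the stub from ever being reached. */
	if (MOCK_PSEUDO_LOCK && !strcmp(buf, "pseudo-locksetup"))
		return mock_locksetup_enter();

	/* Corresponds to the "Unknown or unsupported mode" path. */
	return -EINVAL;
}
```

Because the guard is a compile-time constant, the string comparison and
the call disappear entirely when the feature is disabled, so user-space
sees the same error as for any other unknown mode rather than the stub's
-EOPNOTSUPP.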
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* "last_cmd file" -> "last_cmd_status file"
Changes since v2:
* Clarified the commit message as to where the error string is printed.
Changes since v1:
* [Commit message only] Typo fix:
s/psuedo/pseudo/g
---
arch/x86/Kconfig | 7 ++++
arch/x86/kernel/cpu/resctrl/Makefile | 5 +--
arch/x86/kernel/cpu/resctrl/internal.h | 48 +++++++++++++++++++++-----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 +-
4 files changed, 52 insertions(+), 11 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2852fcd82cbd..47ff2589fbce 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -498,6 +498,7 @@ config X86_CPU_RESCTRL
depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
select KERNFS
select PROC_CPU_RESCTRL if PROC_FS
+ select RESCTRL_FS_PSEUDO_LOCK
help
Enable x86 CPU resource control support.
@@ -514,6 +515,12 @@ config X86_CPU_RESCTRL
Say N if unsure.
+config RESCTRL_FS_PSEUDO_LOCK
+ bool
+ help
+ Software mechanism to pin data in a cache portion using
+ micro-architecture specific knowledge.
+
config X86_FRED
bool "Flexible Return and Event Delivery"
depends on X86_64
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index 4a06c37b9cf1..0c13b0befd8a 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
-obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o pseudo_lock.o
+obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
+obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o
+obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
CFLAGS_pseudo_lock.o = -I$(src)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 92a94939cf93..21109418b46a 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -505,14 +505,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_domain
unsigned long cbm);
enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
int rdtgroup_tasks_assigned(struct rdtgroup *r);
-int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
-int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
-bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsigned long cbm);
-bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d);
-int rdt_pseudo_lock_init(void);
-void rdt_pseudo_lock_release(void);
-int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
-void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r);
struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r);
int closids_supported(void);
@@ -546,4 +538,44 @@ void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
+#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
+int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
+int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
+bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsigned long cbm);
+bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d);
+int rdt_pseudo_lock_init(void);
+void rdt_pseudo_lock_release(void);
+int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
+void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
+#else
+static inline int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsigned long cbm)
+{
+ return false;
+}
+
+static inline bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d)
+{
+ return false;
+}
+
+static inline int rdt_pseudo_lock_init(void) { return 0; }
+static inline void rdt_pseudo_lock_release(void) { }
+static inline int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp) { }
+#endif /* CONFIG_RESCTRL_FS_PSEUDO_LOCK */
+
#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0f839b5c59da..3f10e6897daa 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1444,7 +1444,8 @@ static ssize_t rdtgroup_mode_write(struct kernfs_open_file *of,
goto out;
}
rdtgrp->mode = RDT_MODE_EXCLUSIVE;
- } else if (!strcmp(buf, "pseudo-locksetup")) {
+ } else if (IS_ENABLED(CONFIG_RESCTRL_FS_PSEUDO_LOCK) &&
+ !strcmp(buf, "pseudo-locksetup")) {
ret = rdtgroup_locksetup_enter(rdtgrp);
if (ret)
goto out;
--
2.39.2
* [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (23 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 24/40] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:53 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 26/40] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
` (18 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a
value provided by the architecture, but is largely read by other
architecture helpers.
Instead of exporting this value, make
resctrl_arch_get_prefetch_disable_bits() set it so that the other
arch-code helpers can use the cached value.
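A minimal user-space sketch of this "getter also caches" arrangement
follows (illustration only). The mock_* names and the two model
identifiers are invented for this sketch. Only the values 0xF and 0x5 and
the reset-then-set flow are taken from the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical CPU description; the model names are placeholders. */
enum mock_model { MOCK_MODEL_A, MOCK_MODEL_B, MOCK_MODEL_UNKNOWN };

struct mock_cpu {
	enum mock_model model;
};

/* Private to the "arch" side, as prefetch_disable_bits is after this patch. */
static uint64_t mock_prefetch_disable_bits;

/* Returns the bits for this CPU and caches them for the other helpers. */
uint64_t mock_arch_get_prefetch_disable_bits(const struct mock_cpu *cpu)
{
	assert(cpu);

	mock_prefetch_disable_bits = 0;

	switch (cpu->model) {
	case MOCK_MODEL_A:
		mock_prefetch_disable_bits = 0xF;
		break;
	case MOCK_MODEL_B:
		mock_prefetch_disable_bits = 0x5;
		break;
	default:
		break;
	}

	return mock_prefetch_disable_bits;
}

/* Another arch helper: reads the cached value instead of taking a parameter. */
uint64_t mock_arch_cached_bits(void)
{
	return mock_prefetch_disable_bits;
}

/* Filesystem-side caller: only needs to know whether the result is zero. */
int mock_locksetup_enter(const struct mock_cpu *cpu)
{
	if (mock_arch_get_prefetch_disable_bits(cpu) == 0)
		return -EINVAL;

	return 0;
}
```

Resetting the cached value on every call means an unsupported platform
can never leave a stale non-zero value behind.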
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 56e59441d6ae..0fee49fc153a 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -84,6 +84,8 @@ static const struct class pseudo_lock_class = {
*/
u64 resctrl_arch_get_prefetch_disable_bits(void)
{
+ prefetch_disable_bits = 0;
+
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
boot_cpu_data.x86 != 6)
return 0;
@@ -99,7 +101,8 @@ u64 resctrl_arch_get_prefetch_disable_bits(void)
* 3 DCU IP Prefetcher Disable (R/W)
* 63:4 Reserved
*/
- return 0xF;
+ prefetch_disable_bits = 0xF;
+ break;
case INTEL_ATOM_GOLDMONT:
case INTEL_ATOM_GOLDMONT_PLUS:
/*
@@ -110,10 +113,11 @@ u64 resctrl_arch_get_prefetch_disable_bits(void)
* 2 DCU Hardware Prefetcher Disable (R/W)
* 63:3 Reserved
*/
- return 0x5;
+ prefetch_disable_bits = 0x5;
+ break;
}
- return 0;
+ return prefetch_disable_bits;
}
/**
@@ -713,8 +717,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* Not knowing the bits to disable prefetching implies that this
* platform does not support Cache Pseudo-Locking.
*/
- prefetch_disable_bits = resctrl_arch_get_prefetch_disable_bits();
- if (prefetch_disable_bits == 0) {
+ if (resctrl_arch_get_prefetch_disable_bits() == 0) {
rdt_last_cmd_puts("Pseudo-locking not supported\n");
return -EINVAL;
}
--
2.39.2
* [PATCH v5 26/40] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (24 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
` (17 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
resctrl_arch_pseudo_lock_fn() has architecture-specific behaviour,
and takes a struct rdtgroup as an argument.
After the filesystem code moves to /fs/, the definition of struct
rdtgroup will not be available to the architecture code.
The only reason resctrl_arch_pseudo_lock_fn() wants the rdtgroup is
for the CLOSID. Embed that in the pseudo_lock_region as a closid,
and move the definition of struct pseudo_lock_region to resctrl.h.
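The resulting split can be sketched in user-space C like this
(illustration only, not part of this patch). All mock_* names are
invented, the structures are cut down to the members that matter here,
and the point at which the CLOSID is copied is an assumption of the
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Shared definition: visible to both filesystem and arch code. */
struct mock_pseudo_lock_region {
	uint32_t closid;	/* copied from the owning group */
	int cpu;
};

/* Filesystem-private definition: the arch code never sees this type. */
struct mock_rdtgroup {
	uint32_t closid;
	struct mock_pseudo_lock_region *plr;
};

/* Pretend hardware state that the arch helper programs. */
static uint32_t mock_active_closid;

/* Arch-side thread function: needs only the region, not the group. */
int mock_arch_pseudo_lock_fn(void *_plr)
{
	struct mock_pseudo_lock_region *plr = _plr;

	assert(plr);
	mock_active_closid = plr->closid;

	return 0;
}

/* Filesystem-side: copy the CLOSID into the region, then hand it over. */
int mock_pseudo_lock_create(struct mock_rdtgroup *rdtgrp)
{
	assert(rdtgrp && rdtgrp->plr);

	rdtgrp->plr->closid = rdtgrp->closid;

	return mock_arch_pseudo_lock_fn(rdtgrp->plr);
}

uint32_t mock_get_active_closid(void)
{
	return mock_active_closid;
}
```

The arch helper now depends only on the shared region type, so the group
structure can stay private to the filesystem code.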
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Change since v1:
* [Commit message only] Typo fix:
s/hw_closid/closid/g
---
arch/x86/include/asm/resctrl.h | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 37 ---------------------
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 13 ++++----
include/linux/resctrl.h | 39 +++++++++++++++++++++++
4 files changed, 47 insertions(+), 44 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index bf32d30e595b..6c1446ce43da 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -206,7 +206,7 @@ static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
void *ctx) { };
u64 resctrl_arch_get_prefetch_disable_bits(void);
-int resctrl_arch_pseudo_lock_fn(void *_rdtgrp);
+int resctrl_arch_pseudo_lock_fn(void *_plr);
int resctrl_arch_measure_cycles_lat_fn(void *_plr);
int resctrl_arch_measure_l2_residency(void *_plr);
int resctrl_arch_measure_l3_residency(void *_plr);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 21109418b46a..9c08efb0e198 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -208,43 +208,6 @@ struct mongroup {
u32 rmid;
};
-/**
- * struct pseudo_lock_region - pseudo-lock region information
- * @s: Resctrl schema for the resource to which this
- * pseudo-locked region belongs
- * @d: RDT domain to which this pseudo-locked region
- * belongs
- * @cbm: bitmask of the pseudo-locked region
- * @lock_thread_wq: waitqueue used to wait on the pseudo-locking thread
- * completion
- * @thread_done: variable used by waitqueue to test if pseudo-locking
- * thread completed
- * @cpu: core associated with the cache on which the setup code
- * will be run
- * @line_size: size of the cache lines
- * @size: size of pseudo-locked region in bytes
- * @kmem: the kernel memory associated with pseudo-locked region
- * @minor: minor number of character device associated with this
- * region
- * @debugfs_dir: pointer to this region's directory in the debugfs
- * filesystem
- * @pm_reqs: Power management QoS requests related to this region
- */
-struct pseudo_lock_region {
- struct resctrl_schema *s;
- struct rdt_ctrl_domain *d;
- u32 cbm;
- wait_queue_head_t lock_thread_wq;
- int thread_done;
- int cpu;
- unsigned int line_size;
- unsigned int size;
- void *kmem;
- unsigned int minor;
- struct dentry *debugfs_dir;
- struct list_head pm_reqs;
-};
-
/**
* struct rdtgroup - store rdtgroup's data in resctrl file system.
* @kn: kernfs node
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 0fee49fc153a..9bcd1d06b4e8 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -414,7 +414,7 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
/**
* resctrl_arch_pseudo_lock_fn - Load kernel memory into cache
- * @_rdtgrp: resource group to which pseudo-lock region belongs
+ * @_plr: the pseudo-lock region descriptor
*
* This is the core pseudo-locking flow.
*
@@ -431,10 +431,9 @@ static void pseudo_lock_free(struct rdtgroup *rdtgrp)
*
* Return: 0. Waiter on waitqueue will be woken on completion.
*/
-int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
+int resctrl_arch_pseudo_lock_fn(void *_plr)
{
- struct rdtgroup *rdtgrp = _rdtgrp;
- struct pseudo_lock_region *plr = rdtgrp->plr;
+ struct pseudo_lock_region *plr = _plr;
u32 rmid_p, closid_p;
unsigned long i;
u64 saved_msr;
@@ -494,7 +493,8 @@ int resctrl_arch_pseudo_lock_fn(void *_rdtgrp)
* pseudo-locked followed by reading of kernel memory to load it
* into the cache.
*/
- __wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, rdtgrp->closid);
+ __wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, plr->closid);
+
/*
* Cache was flushed earlier. Now access kernel memory to read it
* into cache region associated with just activated plr->closid.
@@ -1320,7 +1320,8 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
plr->thread_done = 0;
- thread = kthread_create_on_node(resctrl_arch_pseudo_lock_fn, rdtgrp,
+ plr->closid = rdtgrp->closid;
+ thread = kthread_create_on_node(resctrl_arch_pseudo_lock_fn, plr,
cpu_to_node(plr->cpu),
"pseudo_lock/%u", plr->cpu);
if (IS_ERR(thread)) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 84588ab1994d..e7354f581d3b 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -58,6 +58,45 @@ enum resctrl_conf_type {
#define CDP_NUM_TYPES (CDP_DATA + 1)
+/*
+ * struct pseudo_lock_region - pseudo-lock region information
+ * @s: Resctrl schema for the resource to which this
+ * pseudo-locked region belongs
+ * @closid: The closid that this pseudo-locked region uses
+ * @d: RDT domain to which this pseudo-locked region
+ * belongs
+ * @cbm: bitmask of the pseudo-locked region
+ * @lock_thread_wq: waitqueue used to wait on the pseudo-locking thread
+ * completion
+ * @thread_done: variable used by waitqueue to test if pseudo-locking
+ * thread completed
+ * @cpu: core associated with the cache on which the setup code
+ * will be run
+ * @line_size: size of the cache lines
+ * @size: size of pseudo-locked region in bytes
+ * @kmem: the kernel memory associated with pseudo-locked region
+ * @minor: minor number of character device associated with this
+ * region
+ * @debugfs_dir: pointer to this region's directory in the debugfs
+ * filesystem
+ * @pm_reqs: Power management QoS requests related to this region
+ */
+struct pseudo_lock_region {
+ struct resctrl_schema *s;
+ u32 closid;
+ struct rdt_ctrl_domain *d;
+ u32 cbm;
+ wait_queue_head_t lock_thread_wq;
+ int thread_done;
+ int cpu;
+ unsigned int line_size;
+ unsigned int size;
+ void *kmem;
+ unsigned int minor;
+ struct dentry *debugfs_dir;
+ struct list_head pm_reqs;
+};
+
/**
* struct resctrl_staged_config - parsed configuration to be applied
* @new_ctrl: new ctrl value to be loaded
--
2.39.2
* [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (25 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 26/40] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 22:59 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 28/40] x86/resctrl: Move get_config_index() to a header James Morse
` (16 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
thread_throttle_mode_init() is called from the architecture specific code
to make the 'thread_throttle_mode' file visible. The architecture specific
code has already set the membw.throttle_mode in the rdt_resource.
This doesn't need to be specific to the architecture: the throttle_mode
can be used by resctrl to determine if the 'thread_throttle_mode' file
should be visible.
Call thread_throttle_mode_init() from resctrl_init(), and check the
membw.throttle_mode on the MBA resource. This avoids publishing an
extra function between the architecture and filesystem code.
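The visibility decision can be sketched in isolation. This is a simplified userspace model under stated assumptions: the _MODEL enum values and the mba_resource_model structure are hypothetical stand-ins for the kernel's throttle mode values and struct rdt_resource, and the enum ordering is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the throttle mode values; ordering is assumed. */
enum throttle_mode_model {
	THROTTLE_UNDEFINED_MODEL = 0,
	THROTTLE_MAX_MODEL,
	THROTTLE_PER_THREAD_MODEL,
};

/* Hypothetical subset of the MBA resource that the check looks at. */
struct mba_resource_model {
	bool alloc_capable;
	enum throttle_mode_model throttle_mode;
};

/* True when the 'thread_throttle_mode' file should be made visible. */
static bool thread_throttle_mode_visible(const struct mba_resource_model *r)
{
	if (!r->alloc_capable)
		return false;

	return r->throttle_mode != THROTTLE_UNDEFINED_MODEL;
}
```

Because the decision depends only on state the architecture has already published in the resource, the filesystem code can make it without a callback from the architecture.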
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 1 -
arch/x86/kernel/cpu/resctrl/internal.h | 1 -
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 ++++++++-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index b5ad1ed2a4de..0da7314195af 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -228,7 +228,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
else
r->membw.throttle_mode = THREAD_THROTTLE_MAX;
- thread_throttle_mode_init();
r->alloc_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 9c08efb0e198..30de95e59129 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -495,7 +495,6 @@ void cqm_handle_limbo(struct work_struct *work);
bool has_busy_rmid(struct rdt_mon_domain *d);
void __check_limbo(struct rdt_mon_domain *d, bool force_free);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
-void __init thread_throttle_mode_init(void);
void __init mbm_config_rftype_init(const char *config);
void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3f10e6897daa..596f5f087834 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2048,10 +2048,15 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
return NULL;
}
-void __init thread_throttle_mode_init(void)
+static void __init thread_throttle_mode_init(void)
{
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
struct rftype *rft;
+ if (!r->alloc_capable ||
+ r->membw.throttle_mode == THREAD_THROTTLE_UNDEFINED)
+ return;
+
rft = rdtgroup_get_rftype_by_name("thread_throttle_mode");
if (!rft)
return;
@@ -4264,6 +4269,8 @@ int __init resctrl_init(void)
rdtgroup_setup_default();
+ thread_throttle_mode_init();
+
ret = resctrl_mon_resource_init();
if (ret)
return ret;
--
2.39.2
* [PATCH v5 28/40] x86/resctrl: Move get_config_index() to a header
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (26 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 29/40] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
` (15 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
get_config_index() is used by the architecture specific code to map a
CLOSID+type pair to an index in the configuration arrays.
MPAM needs to do this too to preserve the ABI to user-space; there is
no reason to do it differently.
Move the helper to a header file.
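For readers unfamiliar with the mapping, a small userspace restatement of the helper may help. The arithmetic is the same as in the patch; the _MODEL names and config_index_model() are hypothetical, chosen so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace copy of enum resctrl_conf_type. */
enum conf_type_model {
	CDP_NONE_MODEL,
	CDP_CODE_MODEL,
	CDP_DATA_MODEL,
};

/* Same arithmetic as the helper being moved by this patch. */
static uint32_t config_index_model(uint32_t closid, enum conf_type_model type)
{
	switch (type) {
	default:
	case CDP_NONE_MODEL:
		return closid;		/* one slot per CLOSID */
	case CDP_CODE_MODEL:
		return closid * 2 + 1;	/* odd slot of the pair */
	case CDP_DATA_MODEL:
		return closid * 2;	/* even slot of the pair */
	}
}
```

With CDP enabled, CLOSID 3 therefore maps to slot 6 for data and slot 7 for code, so the two configurations of one CLOSID sit next to each other and never collide with another CLOSID's pair.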
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v1:
* Reindent resctrl_get_config_index() as per coding-style.rst rules.
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++----------------
include/linux/resctrl.h | 15 +++++++++++++++
2 files changed, 18 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 7ea362c099db..c2c1010eb869 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -282,25 +282,12 @@ static int parse_line(char *line, struct resctrl_schema *s,
return -EINVAL;
}
-static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
-{
- switch (type) {
- default:
- case CDP_NONE:
- return closid;
- case CDP_CODE:
- return closid * 2 + 1;
- case CDP_DATA:
- return closid * 2;
- }
-}
-
int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type t, u32 cfg_val)
{
struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(d);
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
- u32 idx = get_config_index(closid, t);
+ u32 idx = resctrl_get_config_index(closid, t);
struct msr_param msr_param;
if (!cpumask_test_cpu(smp_processor_id(), &d->hdr.cpu_mask))
@@ -337,7 +324,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (!cfg->have_new_ctrl)
continue;
- idx = get_config_index(closid, t);
+ idx = resctrl_get_config_index(closid, t);
if (cfg->new_ctrl == hw_dom->ctrl_val[idx])
continue;
hw_dom->ctrl_val[idx] = cfg->new_ctrl;
@@ -457,7 +444,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type type)
{
struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(d);
- u32 idx = get_config_index(closid, type);
+ u32 idx = resctrl_get_config_index(closid, type);
return hw_dom->ctrl_val[idx];
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index e7354f581d3b..653d7cf41e64 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -382,6 +382,21 @@ void resctrl_arch_mon_event_config_write(void *info);
*/
void resctrl_arch_mon_event_config_read(void *info);
+/* For use by arch code to remap resctrl's smaller CDP CLOSID range */
+static inline u32 resctrl_get_config_index(u32 closid,
+ enum resctrl_conf_type type)
+{
+ switch (type) {
+ default:
+ case CDP_NONE:
+ return closid;
+ case CDP_CODE:
+ return closid * 2 + 1;
+ case CDP_DATA:
+ return closid * 2;
+ }
+}
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
* [PATCH v5 29/40] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (27 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 28/40] x86/resctrl: Move get_config_index() to a header James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 23:02 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
` (14 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
get_{mon,ctrl}_domain_from_cpu() are handy helpers that both the arch
code and resctrl need to use. Rename them to have a resctrl_ prefix
and move them to a header file.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 30 ---------------------
arch/x86/kernel/cpu/resctrl/internal.h | 2 --
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
include/linux/resctrl.h | 37 ++++++++++++++++++++++++++
5 files changed, 39 insertions(+), 34 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 0da7314195af..f484726a2588 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -349,36 +349,6 @@ static void cat_wrmsr(struct msr_param *m)
wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}
-struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
-{
- struct rdt_ctrl_domain *d;
-
- lockdep_assert_cpus_held();
-
- list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
- /* Find the domain that contains this CPU */
- if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
- return d;
- }
-
- return NULL;
-}
-
-struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
-{
- struct rdt_mon_domain *d;
-
- lockdep_assert_cpus_held();
-
- list_for_each_entry(d, &r->mon_domains, hdr.list) {
- /* Find the domain that contains this CPU */
- if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
- return d;
- }
-
- return NULL;
-}
-
u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
{
return resctrl_to_arch_res(r)->num_closid;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 30de95e59129..e939a0a28a49 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -468,8 +468,6 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_domain
unsigned long cbm);
enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
int rdtgroup_tasks_assigned(struct rdtgroup *r);
-struct rdt_ctrl_domain *get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r);
-struct rdt_mon_domain *get_mon_domain_from_cpu(int cpu, struct rdt_resource *r);
int closids_supported(void);
void closid_free(int closid);
int alloc_rmid(u32 closid);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 175fd7dbf34f..39c450624ed0 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -767,7 +767,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
idx = resctrl_arch_rmid_idx_encode(closid, rmid);
pmbm_data = &dom_mbm->mbm_local[idx];
- dom_mba = get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
+ dom_mba = resctrl_get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
if (!dom_mba) {
pr_warn_once("Failure to get domain for MBA update\n");
return;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 596f5f087834..ee9c3e4ee889 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4235,7 +4235,7 @@ void resctrl_offline_cpu(unsigned int cpu)
if (!l3->mon_capable)
goto out_unlock;
- d = get_mon_domain_from_cpu(cpu, l3);
+ d = resctrl_get_mon_domain_from_cpu(cpu, l3);
if (d) {
if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 653d7cf41e64..bbce79190b13 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -3,6 +3,7 @@
#define _RESCTRL_H
#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/pid.h>
@@ -397,6 +398,42 @@ static inline u32 resctrl_get_config_index(u32 closid,
}
}
+/*
+ * Caller must hold the cpuhp read lock to prevent the struct rdt_domain being
+ * freed.
+ */
+static inline struct rdt_ctrl_domain *
+resctrl_get_ctrl_domain_from_cpu(int cpu, struct rdt_resource *r)
+{
+ struct rdt_ctrl_domain *d;
+
+ lockdep_assert_cpus_held();
+
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
+ /* Find the domain that contains this CPU */
+ if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
+ return d;
+ }
+
+ return NULL;
+}
+
+static inline struct rdt_mon_domain *
+resctrl_get_mon_domain_from_cpu(int cpu, struct rdt_resource *r)
+{
+ struct rdt_mon_domain *d;
+
+ lockdep_assert_cpus_held();
+
+ list_for_each_entry(d, &r->mon_domains, hdr.list) {
+ /* Find the domain that contains this CPU */
+ if (cpumask_test_cpu(cpu, &d->hdr.cpu_mask))
+ return d;
+ }
+
+ return NULL;
+}
+
/*
* Update the ctrl_val and apply this config right now.
* Must be called on one of the domain's CPUs.
--
2.39.2
* [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (28 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 29/40] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-08 18:50 ` Tony Luck
2024-10-04 18:03 ` [PATCH v5 31/40] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
` (13 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
resctrl operates on configuration bitmaps and a bitmap of allocated
CLOSIDs; both are stored in a u32.
MPAM supports configuration/portion bitmaps and PARTIDs larger
than will fit in a u32.
Add some preprocessor values that make it clear why MPAM clamps
some of these values. This will make it easier to find code related
to these values if this resctrl behaviour ever changes.
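A sketch of how an architecture might use such limits, assuming hardware that reports larger values than resctrl can hold. The two macro values are the ones this patch defines; the helper names are hypothetical and are not taken from the MPAM driver.

```c
#include <assert.h>
#include <stdint.h>

/* Limits as defined by this patch. */
#define RESCTRL_MAX_CLOSID	32
#define RESCTRL_MAX_CBM		32

/* Hypothetical helper: the smaller of a hardware value and a limit. */
static uint32_t clamp_to_limit(uint32_t hw_value, uint32_t limit)
{
	return hw_value < limit ? hw_value : limit;
}

/* Hypothetical: how many hardware partition IDs resctrl could track. */
static uint32_t usable_closids(uint32_t hw_partids)
{
	return clamp_to_limit(hw_partids, RESCTRL_MAX_CLOSID);
}

/* Hypothetical: an all-ones bitmap of the usable width. */
static uint32_t usable_cbm_mask(uint32_t hw_bitmap_bits)
{
	uint32_t bits = clamp_to_limit(hw_bitmap_bits, RESCTRL_MAX_CBM);

	/* Shifting a 32-bit value by 32 is undefined, so handle it apart. */
	if (bits >= 32)
		return UINT32_MAX;

	return (UINT32_C(1) << bits) - 1;
}
```

Naming the limits keeps every clamp easy to find with a search, which matters if resctrl ever moves away from u32 storage.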
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
include/linux/resctrl.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index bbce79190b13..7af6c40764ed 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -27,6 +27,17 @@ int proc_resctrl_show(struct seq_file *m,
/* max value for struct rdt_domain's mbps_val */
#define MBA_MAX_MBPS U32_MAX
+/*
+ * Resctrl uses a u32 as a closid bitmap. The maximum closid is 32.
+ */
+#define RESCTRL_MAX_CLOSID 32
+
+/*
+ * Resctrl uses u32 to hold the user-space config. The maximum bitmap size is
+ * 32.
+ */
+#define RESCTRL_MAX_CBM 32
+
/* Walk all possible resources, with variants for only controls or monitors. */
#define for_each_rdt_resource(_r) \
for ((_r) = resctrl_arch_get_resource(0); \
--
2.39.2
* [PATCH v5 31/40] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (29 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
` (12 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
resctrl_sched_in() loads the architecture specific CPU MSRs with the
CLOSID and RMID values. This function was named before resctrl was
split to have architecture specific code, and generic filesystem code.
This function is obviously architecture specific, but does not begin
with 'resctrl_arch_', making it the odd one out in the functions an
architecture needs to support to enable resctrl.
Rename it for consistency. This is purely cosmetic.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/include/asm/resctrl.h | 4 ++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 ++++++------
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 6c1446ce43da..22e508c10059 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -177,7 +177,7 @@ static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored,
return READ_ONCE(tsk->rmid) == rmid;
}
-static inline void resctrl_sched_in(struct task_struct *tsk)
+static inline void resctrl_arch_sched_in(struct task_struct *tsk)
{
if (static_branch_likely(&rdt_enable_key))
__resctrl_sched_in(tsk);
@@ -214,7 +214,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c);
#else
-static inline void resctrl_sched_in(struct task_struct *tsk) {}
+static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
#endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ee9c3e4ee889..f77fab859c35 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -353,7 +353,7 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
}
/*
- * This is safe against resctrl_sched_in() called from __switch_to()
+ * This is safe against resctrl_arch_sched_in() called from __switch_to()
* because __switch_to() is executed with interrupts disabled. A local call
* from update_closid_rmid() is protected against __switch_to() because
* preemption is disabled.
@@ -372,7 +372,7 @@ void resctrl_arch_sync_cpu_closid_rmid(void *info)
* executing task might have its own closid selected. Just reuse
* the context switch code.
*/
- resctrl_sched_in(current);
+ resctrl_arch_sched_in(current);
}
/*
@@ -597,7 +597,7 @@ static void _update_task_closid_rmid(void *task)
* Otherwise, the MSR is updated when the task is scheduled in.
*/
if (task == current)
- resctrl_sched_in(task);
+ resctrl_arch_sched_in(task);
}
static void update_task_closid_rmid(struct task_struct *t)
@@ -655,7 +655,7 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
* Ensure the task's closid and rmid are written before determining if
* the task is current that will decide if it will be interrupted.
* This pairs with the full barrier between the rq->curr update and
- * resctrl_sched_in() during context switch.
+ * resctrl_arch_sched_in() during context switch.
*/
smp_mb();
@@ -2931,8 +2931,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
/*
* Order the closid/rmid stores above before the loads
* in task_curr(). This pairs with the full barrier
- * between the rq->curr update and resctrl_sched_in()
- * during context switch.
+ * between the rq->curr update and
+ * resctrl_arch_sched_in() during context switch.
*/
smp_mb();
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0917c7f25720..8697b02dabf1 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -211,7 +211,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
switch_fpu_finish(next_p);
/* Load the Intel cache allocation PQR MSR. */
- resctrl_sched_in(next_p);
+ resctrl_arch_sched_in(next_p);
return prev_p;
}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 226472332a70..3f1235d3bf1d 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -707,7 +707,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
}
/* Load the Intel cache allocation PQR MSR. */
- resctrl_sched_in(next_p);
+ resctrl_arch_sched_in(next_p);
return prev_p;
}
--
2.39.2
* [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (30 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 31/40] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 23:50 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
` (11 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
resctrl can't be built as a module, and the kernfs helpers are not exported
so this is unlikely to change. MPAM has an error interrupt which indicates
the MPAM driver has gone haywire. Should this occur, tasks could run with
the wrong control values, leading to bad performance for important tasks.
The MPAM driver needs a way to tell resctrl that no further configuration
should be attempted.
Using resctrl_exit() for this leaves the system in a funny state as
resctrl is still mounted, but cannot be un-mounted because the sysfs
directory that is typically used has been removed. Dave Martin suggests
this may cause systemd trouble in the future as not all filesystems
can be unmounted.
Add calls to remove all the files and directories in resctrl, and
remove the sysfs_remove_mount_point() call that leaves the system
in a funny state. When triggered, this causes all the resctrl files
to disappear. resctrl can be unmounted, but not mounted again.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f77fab859c35..bb5aadaf99b6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4319,9 +4319,9 @@ int __init resctrl_init(void)
void __exit resctrl_exit(void)
{
+ rdtgroup_destroy_root();
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
- sysfs_remove_mount_point(fs_kobj, "resctrl");
resctrl_mon_resource_exit();
}
--
2.39.2
* [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (31 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-23 23:56 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 34/40] x86/resctrl: Move is_mba_sc() out of core.c James Morse
` (10 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Because ARM's MPAM controls are probed using MMIO, resctrl can't be
initialised until enough CPUs are online to have determined the
system-wide supported num_closid. Arm64 also supports 'late onlined
secondaries', where only a subset of CPUs are online during boot.
These two combine to mean the MPAM driver may not be able to initialise
resctrl until user-space has brought 'enough' CPUs online.
To allow MPAM to initialise resctrl after __init text has been freed,
remove all the __init markings from resctrl.
The existing __exit markings cause these functions to be removed by the
linker as it has never been possible to build resctrl as a module. MPAM
has an error interrupt which causes the driver to reset and disable
itself. Remove the __exit markings to allow the MPAM driver to tear down
resctrl when an error occurs.
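As a rough illustration of the two call paths described above (late
initialisation once 'enough' CPUs are online, and teardown from an error
interrupt), here is a small user-space model. It is only a sketch: every
toy_* name, the cpus_needed threshold and the failed flag are hypothetical
stand-ins, not the MPAM driver's or resctrl's real code. The point is that
both calls happen long after boot, which is why the real functions cannot
live in __init or __exit sections.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "resctrl is currently registered". */
static bool toy_resctrl_up;

/* Models resctrl_init(): must stay callable after boot-time init is over. */
static int toy_resctrl_init(void)
{
	assert(!toy_resctrl_up);
	toy_resctrl_up = true;
	return 0;
}

/* Models resctrl_exit(): must exist even though resctrl is never a module. */
static void toy_resctrl_exit(void)
{
	toy_resctrl_up = false;
}

/* Hypothetical driver state; the fields are assumptions for this sketch. */
struct toy_driver {
	unsigned int cpus_needed;	/* CPUs required to know num_closid */
	unsigned int cpus_online;
	bool failed;			/* set once the error interrupt fires */
};

/* CPU hotplug path: may run from user-space onlining a late secondary. */
static void toy_cpu_online(struct toy_driver *drv)
{
	drv->cpus_online++;
	if (drv->failed || toy_resctrl_up)
		return;
	if (drv->cpus_online >= drv->cpus_needed)
		toy_resctrl_init();
}

/* Error interrupt path: the driver disables itself and tears resctrl down. */
static void toy_error_irq(struct toy_driver *drv)
{
	drv->failed = true;
	if (toy_resctrl_up)
		toy_resctrl_exit();
}
```

In this model a driver with cpus_needed = 2 only calls toy_resctrl_init()
on the second toy_cpu_online(), and never again once toy_error_irq() has
run.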
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Earlier __init marker removal migrated here.
---
arch/x86/kernel/cpu/resctrl/core.c | 6 +++---
arch/x86/kernel/cpu/resctrl/internal.h | 6 +++---
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++-----
include/linux/resctrl.h | 6 +++---
5 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index f484726a2588..f713ac628444 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -775,7 +775,7 @@ struct rdt_options {
bool force_off, force_on;
};
-static struct rdt_options rdt_options[] __initdata = {
+static struct rdt_options rdt_options[] __ro_after_init = {
RDT_OPT(RDT_FLAG_CMT, "cmt", X86_FEATURE_CQM_OCCUP_LLC),
RDT_OPT(RDT_FLAG_MBM_TOTAL, "mbmtotal", X86_FEATURE_CQM_MBM_TOTAL),
RDT_OPT(RDT_FLAG_MBM_LOCAL, "mbmlocal", X86_FEATURE_CQM_MBM_LOCAL),
@@ -815,7 +815,7 @@ static int __init set_rdt_options(char *str)
}
__setup("rdt", set_rdt_options);
-bool __init rdt_cpu_has(int flag)
+bool rdt_cpu_has(int flag)
{
bool ret = boot_cpu_has(flag);
struct rdt_options *o;
@@ -835,7 +835,7 @@ bool __init rdt_cpu_has(int flag)
return ret;
}
-bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
{
if (!rdt_cpu_has(X86_FEATURE_BMEC))
return false;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e939a0a28a49..4b7e370e71ac 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -474,13 +474,13 @@ int alloc_rmid(u32 closid);
void free_rmid(u32 closid, u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
void resctrl_mon_resource_exit(void);
-bool __init rdt_cpu_has(int flag);
+bool rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
cpumask_t *cpumask, int evtid, int first);
-int __init resctrl_mon_resource_init(void);
+int resctrl_mon_resource_init(void);
void mbm_setup_overflow_handler(struct rdt_mon_domain *dom,
unsigned long delay_ms,
int exclude_cpu);
@@ -493,7 +493,7 @@ void cqm_handle_limbo(struct work_struct *work);
bool has_busy_rmid(struct rdt_mon_domain *d);
void __check_limbo(struct rdt_mon_domain *d, bool force_free);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
-void __init mbm_config_rftype_init(const char *config);
+void mbm_config_rftype_init(const char *config);
void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 39c450624ed0..1fd47f8a0e18 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1191,7 +1191,7 @@ static __init int snc_get_config(void)
*
* Returns 0 for success, or -ENOMEM.
*/
-int __init resctrl_mon_resource_init(void)
+int resctrl_mon_resource_init(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
int ret;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index bb5aadaf99b6..e45a8a6b5ff7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2048,7 +2048,7 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
return NULL;
}
-static void __init thread_throttle_mode_init(void)
+static void thread_throttle_mode_init(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
struct rftype *rft;
@@ -2064,7 +2064,7 @@ static void __init thread_throttle_mode_init(void)
rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB;
}
-void __init mbm_config_rftype_init(const char *config)
+void mbm_config_rftype_init(const char *config)
{
struct rftype *rft;
@@ -4044,7 +4044,7 @@ static void rdtgroup_destroy_root(void)
rdtgroup_default.kn = NULL;
}
-static void __init rdtgroup_setup_default(void)
+static void rdtgroup_setup_default(void)
{
mutex_lock(&rdtgroup_mutex);
@@ -4260,7 +4260,7 @@ void resctrl_offline_cpu(unsigned int cpu)
*
* Return: 0 on success or -errno
*/
-int __init resctrl_init(void)
+int resctrl_init(void)
{
int ret = 0;
@@ -4317,7 +4317,7 @@ int __init resctrl_init(void)
return ret;
}
-void __exit resctrl_exit(void)
+void resctrl_exit(void)
{
rdtgroup_destroy_root();
debugfs_remove_recursive(debugfs_resctrl);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 7af6c40764ed..39303a0a398a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -374,7 +374,7 @@ struct rdt_domain_hdr *resctrl_arch_find_domain(struct list_head *domain_list,
int id);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
-bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
/**
* resctrl_arch_mon_event_config_write() - Write the config for a counter.
@@ -537,7 +537,7 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
-int __init resctrl_init(void);
-void __exit resctrl_exit(void);
+int resctrl_init(void);
+void resctrl_exit(void);
#endif /* _RESCTRL_H */
--
2.39.2
* [PATCH v5 34/40] x86/resctrl: Move is_mba_sc() out of core.c
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (32 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 35/40] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
` (9 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
is_mba_sc() is defined in core.c, but has no callers there. It does
not access any architecture private structures.
Move this to rdtgroup.c where the majority of callers are. This makes
the move of the filesystem code to /fs/ cleaner.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/core.c | 15 ---------------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index f713ac628444..ec7d244d8511 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -166,21 +166,6 @@ static inline void cache_alloc_hsw_probe(void)
rdt_alloc_capable = true;
}
-bool is_mba_sc(struct rdt_resource *r)
-{
- if (!r)
- r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
-
- /*
- * The software controller support is only applicable to MBA resource.
- * Make sure to check for resource type.
- */
- if (r->rid != RDT_RESOURCE_MBA)
- return false;
-
- return r->membw.mba_sc;
-}
-
/*
* rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
* exposed to user interface and the h/w understandable delay values.
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e45a8a6b5ff7..7a4716fb1604 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1493,6 +1493,21 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
return size;
}
+bool is_mba_sc(struct rdt_resource *r)
+{
+ if (!r)
+ r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
+
+ /*
+ * The software controller support is only applicable to MBA resource.
+ * Make sure to check for resource type.
+ */
+ if (r->rid != RDT_RESOURCE_MBA)
+ return false;
+
+ return r->membw.mba_sc;
+}
+
/*
* rdtgroup_size_show - Display size in bytes of allocated regions
*
--
2.39.2
* [PATCH v5 35/40] x86/resctrl: Add end-marker to the resctrl_event_id enum
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (33 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 34/40] x86/resctrl: Move is_mba_sc() out of core.c James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 36/40] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
` (8 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The resctrl_event_id enum gives names to the counter event numbers on x86.
These are used directly by resctrl.
To allow the MPAM driver to keep an array of these, the size of the enum
needs to be known.
Add a QOS_NUM_EVENTS define which can be used to size an array. This isn't
a member of the enum to avoid updating switch statements that would
otherwise be missing a case.
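A minimal user-space sketch of how such a define might be used follows.
The enum mirrors resctrl_event_id, but only the 0x03 value appears in this
patch; the other two values are assumed from the upstream header. The
toy_* names are hypothetical and are not part of resctrl or the MPAM
driver.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors enum resctrl_event_id; 0x01 and 0x02 are assumed, not in the hunk. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

/* Kept outside the enum, as in the patch. */
#define QOS_NUM_EVENTS	(QOS_L3_MBM_LOCAL_EVENT_ID + 1)

/*
 * Hypothetical per-event table indexed directly by event id.
 * Slot 0 is unused because the event ids start at 1.
 */
static uint64_t toy_event_count[QOS_NUM_EVENTS];

static void toy_count_event(enum resctrl_event_id evt)
{
	assert(evt >= QOS_L3_OCCUP_EVENT_ID && evt <= QOS_L3_MBM_LOCAL_EVENT_ID);
	toy_event_count[evt]++;
}

/*
 * Because QOS_NUM_EVENTS is a define and not an enumerator, this switch
 * covers every member of the enum without needing an extra case for it.
 */
static const char *toy_event_name(enum resctrl_event_id evt)
{
	switch (evt) {
	case QOS_L3_OCCUP_EVENT_ID:
		return "llc_occupancy";
	case QOS_L3_MBM_TOTAL_EVENT_ID:
		return "mbm_total_bytes";
	case QOS_L3_MBM_LOCAL_EVENT_ID:
		return "mbm_local_bytes";
	}
	return "unknown";
}
```

Had the end-marker been an enumerator instead, a compiler that warns about
unhandled enum values in a switch would ask for an additional case in
toy_event_name().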
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
include/linux/resctrl_types.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 51c51a1aabfb..70226f5ab3e3 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -51,4 +51,6 @@ enum resctrl_event_id {
QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};
+#define QOS_NUM_EVENTS (QOS_L3_MBM_LOCAL_EVENT_ID + 1)
+
#endif /* __LINUX_RESCTRL_TYPES_H */
--
2.39.2
* [PATCH v5 36/40] x86/resctrl: Remove a newline to avoid confusing the code move script
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (34 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 35/40] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 37/40] x86/resctrl: Split trace.h James Morse
` (7 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
The resctrl filesystem code will shortly be moved to /fs/. This involves
splitting all the existing files, with some functions remaining under
arch/x86, and others moving to fs/resctrl.
To make this reproducible, a python script does the heavy lif^W
copy-and-paste. This involves some clunky parsing of C code.
The parser gets confused by the newline after this #ifdef.
Just remove it.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7a4716fb1604..9696bdcc39f2 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -874,7 +874,6 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
}
#ifdef CONFIG_PROC_CPU_RESCTRL
-
/*
* A task can only be part of one resctrl control group and of one monitor
* group which is associated to that control group.
--
2.39.2
* [PATCH v5 37/40] x86/resctrl: Split trace.h
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (35 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 36/40] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code James Morse
` (6 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
trace.h contains all the tracepoints. After the move to /fs/resctrl, some
of these will be left behind. All the pseudo_lock tracepoints remain part
of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl.
Split trace.h so that each C file includes a different trace header file.
This means the trace header files are not modified when they are moved.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
arch/x86/kernel/cpu/resctrl/Makefile | 3 ++
arch/x86/kernel/cpu/resctrl/monitor.c | 4 ++-
arch/x86/kernel/cpu/resctrl/monitor_trace.h | 31 +++++++++++++++++++
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +-
.../resctrl/{trace.h => pseudo_lock_trace.h} | 24 +++-----------
5 files changed, 42 insertions(+), 22 deletions(-)
create mode 100644 arch/x86/kernel/cpu/resctrl/monitor_trace.h
rename arch/x86/kernel/cpu/resctrl/{trace.h => pseudo_lock_trace.h} (56%)
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index 0c13b0befd8a..909be78ec6da 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -2,4 +2,7 @@
obj-$(CONFIG_X86_CPU_RESCTRL) += core.o rdtgroup.o monitor.o
obj-$(CONFIG_X86_CPU_RESCTRL) += ctrlmondata.o
obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
+
+# To allow define_trace.h's recursive include:
CFLAGS_pseudo_lock.o = -I$(src)
+CFLAGS_monitor.o = -I$(src)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1fd47f8a0e18..b7662782ea59 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -26,7 +26,9 @@
#include <asm/resctrl.h>
#include "internal.h"
-#include "trace.h"
+
+#define CREATE_TRACE_POINTS
+#include "monitor_trace.h"
/**
* struct rmid_entry - dirty tracking for all RMID.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor_trace.h b/arch/x86/kernel/cpu/resctrl/monitor_trace.h
new file mode 100644
index 000000000000..ade67daf42c2
--- /dev/null
+++ b/arch/x86/kernel/cpu/resctrl/monitor_trace.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM resctrl
+
+#if !defined(_FS_RESCTRL_MONITOR_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _FS_RESCTRL_MONITOR_TRACE_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(mon_llc_occupancy_limbo,
+ TP_PROTO(u32 ctrl_hw_id, u32 mon_hw_id, int domain_id, u64 llc_occupancy_bytes),
+ TP_ARGS(ctrl_hw_id, mon_hw_id, domain_id, llc_occupancy_bytes),
+ TP_STRUCT__entry(__field(u32, ctrl_hw_id)
+ __field(u32, mon_hw_id)
+ __field(int, domain_id)
+ __field(u64, llc_occupancy_bytes)),
+ TP_fast_assign(__entry->ctrl_hw_id = ctrl_hw_id;
+ __entry->mon_hw_id = mon_hw_id;
+ __entry->domain_id = domain_id;
+ __entry->llc_occupancy_bytes = llc_occupancy_bytes;),
+ TP_printk("ctrl_hw_id=%u mon_hw_id=%u domain_id=%d llc_occupancy_bytes=%llu",
+ __entry->ctrl_hw_id, __entry->mon_hw_id, __entry->domain_id,
+ __entry->llc_occupancy_bytes)
+ );
+
+#endif /* _FS_RESCTRL_MONITOR_TRACE_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE monitor_trace
+#include <trace/define_trace.h>
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 9bcd1d06b4e8..60ed5be212e1 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -30,7 +30,7 @@
#include "internal.h"
#define CREATE_TRACE_POINTS
-#include "trace.h"
+#include "pseudo_lock_trace.h"
/*
* The bits needed to disable hardware prefetching varies based on the
diff --git a/arch/x86/kernel/cpu/resctrl/trace.h b/arch/x86/kernel/cpu/resctrl/pseudo_lock_trace.h
similarity index 56%
rename from arch/x86/kernel/cpu/resctrl/trace.h
rename to arch/x86/kernel/cpu/resctrl/pseudo_lock_trace.h
index 2a506316b303..5a0fae61d3ee 100644
--- a/arch/x86/kernel/cpu/resctrl/trace.h
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock_trace.h
@@ -2,8 +2,8 @@
#undef TRACE_SYSTEM
#define TRACE_SYSTEM resctrl
-#if !defined(_TRACE_RESCTRL_H) || defined(TRACE_HEADER_MULTI_READ)
-#define _TRACE_RESCTRL_H
+#if !defined(_X86_RESCTRL_PSEUDO_LOCK_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _X86_RESCTRL_PSEUDO_LOCK_TRACE_H
#include <linux/tracepoint.h>
@@ -35,25 +35,9 @@ TRACE_EVENT(pseudo_lock_l3,
TP_printk("hits=%llu miss=%llu",
__entry->l3_hits, __entry->l3_miss));
-TRACE_EVENT(mon_llc_occupancy_limbo,
- TP_PROTO(u32 ctrl_hw_id, u32 mon_hw_id, int domain_id, u64 llc_occupancy_bytes),
- TP_ARGS(ctrl_hw_id, mon_hw_id, domain_id, llc_occupancy_bytes),
- TP_STRUCT__entry(__field(u32, ctrl_hw_id)
- __field(u32, mon_hw_id)
- __field(int, domain_id)
- __field(u64, llc_occupancy_bytes)),
- TP_fast_assign(__entry->ctrl_hw_id = ctrl_hw_id;
- __entry->mon_hw_id = mon_hw_id;
- __entry->domain_id = domain_id;
- __entry->llc_occupancy_bytes = llc_occupancy_bytes;),
- TP_printk("ctrl_hw_id=%u mon_hw_id=%u domain_id=%d llc_occupancy_bytes=%llu",
- __entry->ctrl_hw_id, __entry->mon_hw_id, __entry->domain_id,
- __entry->llc_occupancy_bytes)
- );
-
-#endif /* _TRACE_RESCTRL_H */
+#endif /* _X86_RESCTRL_PSEUDO_LOCK_TRACE_H */
#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH .
-#define TRACE_INCLUDE_FILE trace
+#define TRACE_INCLUDE_FILE pseudo_lock_trace
#include <trace/define_trace.h>
--
2.39.2
* [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (36 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 37/40] x86/resctrl: Split trace.h James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-08 23:03 ` Tony Luck
2024-10-24 0:08 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 39/40] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
` (5 subsequent siblings)
43 siblings, 2 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
for the common parts of the resctrl interface and make X86_CPU_RESCTRL
select this.
Adding an include of asm/resctrl.h to linux/resctrl.h allows the
/fs/resctrl files to switch over to using this header instead.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v4:
* Tweaking of the commit message.
Changes since v3:
* Reworded 'if unsure say N' from the Kconfig text, the user doesn't have
the choice anyway at this point.
* Added PWD to monitor.o's CFLAGS for the ftrace rube-goldberg build machine.
* Added split trace files.
Changes since v2:
* Dropped KERNFS dependency from arch side Kconfig.
* Added empty trace.h file.
* Merged asm->linux includes from Dave's patch to decouple those
patches from this series.
Changes since v1:
* Rename new file psuedo_lock.c to pseudo_lock.c, to match the name
of the original file (and to be less surprising).
* [Whitespace only] Under RESCTRL_FS in fs/resctrl/Kconfig, delete
alignment space in orphaned select ... if (which has nothing to line
up with any more).
* [Whitespace only] Reflow and re-tab Kconfig additions.
---
MAINTAINERS | 1 +
arch/Kconfig | 8 +++++
arch/x86/Kconfig | 5 +--
arch/x86/kernel/cpu/resctrl/internal.h | 3 +-
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
fs/Kconfig | 1 +
fs/Makefile | 1 +
fs/resctrl/Kconfig | 37 +++++++++++++++++++++++
fs/resctrl/Makefile | 6 ++++
fs/resctrl/ctrlmondata.c | 0
fs/resctrl/internal.h | 0
fs/resctrl/monitor.c | 0
fs/resctrl/monitor_trace.h | 0
fs/resctrl/pseudo_lock.c | 0
fs/resctrl/pseudo_lock_trace.h | 0
fs/resctrl/rdtgroup.c | 0
include/linux/resctrl.h | 4 +++
19 files changed, 65 insertions(+), 7 deletions(-)
create mode 100644 fs/resctrl/Kconfig
create mode 100644 fs/resctrl/Makefile
create mode 100644 fs/resctrl/ctrlmondata.c
create mode 100644 fs/resctrl/internal.h
create mode 100644 fs/resctrl/monitor.c
create mode 100644 fs/resctrl/monitor_trace.h
create mode 100644 fs/resctrl/pseudo_lock.c
create mode 100644 fs/resctrl/pseudo_lock_trace.h
create mode 100644 fs/resctrl/rdtgroup.c
diff --git a/MAINTAINERS b/MAINTAINERS
index fd5a1621c026..f4a785a44c83 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19468,6 +19468,7 @@ S: Supported
F: Documentation/arch/x86/resctrl*
F: arch/x86/include/asm/resctrl.h
F: arch/x86/kernel/cpu/resctrl/
+F: fs/resctrl/
F: include/linux/resctrl*.h
F: tools/testing/selftests/resctrl/
diff --git a/arch/Kconfig b/arch/Kconfig
index 98157b38f5cf..55865b903d9d 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1456,6 +1456,14 @@ config STRICT_MODULE_RWX
config ARCH_HAS_PHYS_TO_DMA
bool
+config ARCH_HAS_CPU_RESCTRL
+ bool
+ help
+ An architecture selects this option to indicate that the necessary
+ hooks are provided to support the common memory system usage
+ monitoring and control interfaces provided by the 'resctrl'
+ filesystem (see RESCTRL_FS).
+
config HAVE_ARCH_COMPILER_H
bool
help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 47ff2589fbce..dd6a4b19d5b7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -496,8 +496,9 @@ config X86_MPPARSE
config X86_CPU_RESCTRL
bool "x86 CPU resource control support"
depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
- select KERNFS
- select PROC_CPU_RESCTRL if PROC_FS
+ depends on MISC_FILESYSTEMS
+ select ARCH_HAS_CPU_RESCTRL
+ select RESCTRL_FS
select RESCTRL_FS_PSEUDO_LOCK
help
Enable x86 CPU resource control support.
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 4b7e370e71ac..973fddf7e9a3 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,10 +7,9 @@
#include <linux/kernfs.h>
#include <linux/fs_context.h>
#include <linux/jump_label.h>
+#include <linux/resctrl.h>
#include <linux/tick.h>
-#include <asm/resctrl.h>
-
#define L3_QOS_CDP_ENABLE 0x01ULL
#define L2_QOS_CDP_ENABLE 0x01ULL
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b7662782ea59..9ae709ba5744 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -19,11 +19,11 @@
#include <linux/cpu.h>
#include <linux/module.h>
+#include <linux/resctrl.h>
#include <linux/sizes.h>
#include <linux/slab.h>
#include <asm/cpu_device_id.h>
-#include <asm/resctrl.h>
#include "internal.h"
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 60ed5be212e1..3d3a0d952cbd 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -18,12 +18,12 @@
#include <linux/mman.h>
#include <linux/perf_event.h>
#include <linux/pm_qos.h>
+#include <linux/resctrl.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <asm/cacheflush.h>
#include <asm/cpu_device_id.h>
-#include <asm/resctrl.h>
#include <asm/perf_event.h>
#include "../../events/perf_event.h" /* For X86_CONFIG() */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9696bdcc39f2..5a47e223830c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -18,6 +18,7 @@
#include <linux/fs_parser.h>
#include <linux/sysfs.h>
#include <linux/kernfs.h>
+#include <linux/resctrl.h>
#include <linux/seq_buf.h>
#include <linux/seq_file.h>
#include <linux/sched/signal.h>
@@ -28,7 +29,6 @@
#include <uapi/linux/magic.h>
-#include <asm/resctrl.h>
#include "internal.h"
DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
diff --git a/fs/Kconfig b/fs/Kconfig
index 949895cff872..2069e49e8099 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -335,6 +335,7 @@ source "fs/omfs/Kconfig"
source "fs/hpfs/Kconfig"
source "fs/qnx4/Kconfig"
source "fs/qnx6/Kconfig"
+source "fs/resctrl/Kconfig"
source "fs/romfs/Kconfig"
source "fs/pstore/Kconfig"
source "fs/sysv/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 61679fd587b7..619a102f81d7 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -130,3 +130,4 @@ obj-$(CONFIG_EROFS_FS) += erofs/
obj-$(CONFIG_VBOXSF_FS) += vboxsf/
obj-$(CONFIG_ZONEFS_FS) += zonefs/
obj-$(CONFIG_BPF_LSM) += bpf_fs_kfuncs.o
+obj-$(CONFIG_RESCTRL_FS) += resctrl/
diff --git a/fs/resctrl/Kconfig b/fs/resctrl/Kconfig
new file mode 100644
index 000000000000..3a3a75dad40d
--- /dev/null
+++ b/fs/resctrl/Kconfig
@@ -0,0 +1,37 @@
+config RESCTRL_FS
+ bool "CPU Resource Control Filesystem (resctrl)"
+ depends on ARCH_HAS_CPU_RESCTRL
+ select KERNFS
+ select PROC_CPU_RESCTRL if PROC_FS
+ help
+ Some architectures provide hardware facilities to group tasks and
+ monitor and control their usage of memory system resources such as
+ caches and memory bandwidth. Examples of such facilities include
+ Intel's Resource Director Technology (Intel(R) RDT) and AMD's
+ Platform Quality of Service (AMD QoS).
+
+ If your system has the necessary support and you want to be able to
+ assign tasks to groups and manipulate the associated resource
+ monitors and controls from userspace, say Y here to get a mountable
+ 'resctrl' filesystem that lets you do just that.
+
+ If nothing mounts or prods the 'resctrl' filesystem, resource
+ controls and monitors are left in a quiescent, permissive state.
+
+ On architectures where this can be disabled independently, it is
+ safe to say N.
+
+ See <file:Documentation/arch/x86/resctrl.rst> for more information.
+
+config RESCTRL_FS_PSEUDO_LOCK
+ bool
+ help
+ Software mechanism to pin data in a cache portion using
+ micro-architecture specific knowledge.
+
+config RESCTRL_RMID_DEPENDS_ON_CLOSID
+ bool
+ help
+ Enabled by the architecture when the RMID values depend on the CLOSID.
+ This causes the closid allocator to search for CLOSID with clean
+ RMID.
diff --git a/fs/resctrl/Makefile b/fs/resctrl/Makefile
new file mode 100644
index 000000000000..e67f34d2236a
--- /dev/null
+++ b/fs/resctrl/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_RESCTRL_FS) += rdtgroup.o ctrlmondata.o monitor.o
+obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK) += pseudo_lock.o
+
+# To allow define_trace.h's recursive include:
+CFLAGS_monitor.o = -I$(src)
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/monitor_trace.h b/fs/resctrl/monitor_trace.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/pseudo_lock_trace.h b/fs/resctrl/pseudo_lock_trace.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 39303a0a398a..6b64bfa45673 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -9,6 +9,10 @@
#include <linux/pid.h>
#include <linux/resctrl_types.h>
+#ifdef CONFIG_ARCH_HAS_CPU_RESCTRL
+#include <asm/resctrl.h>
+#endif
+
/* CLOSID, RMID value used by the default control group */
#define RESCTRL_RESERVED_CLOSID 0
#define RESCTRL_RESERVED_RMID 0
--
2.39.2
* [PATCH v5 39/40] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (37 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-04 18:03 ` [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
` (4 subsequent siblings)
43 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Dave Martin, Shaopeng Tan
Once the filesystem parts of resctrl move to fs/resctrl, it cannot rely
on definitions in x86's internal.h.
Move definitions in internal.h that need to be shared between the
filesystem and architecture code to header files that fs/resctrl can
include.
Doing this separately means the filesystem code only moves between files
of the same name, instead of having these changes mixed in too.
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Changes since v3:
* Changed the number of hyphens at the end of the commit message.
Changes since v2:
* Dropped the rfflags and some other defines from being moved.
Changes since v1:
* Revert apparently unintentional duplication of a couple of variable
declarations in <linux/resctrl.h>.
No functional change.
---
arch/x86/include/asm/resctrl.h | 3 +++
arch/x86/kernel/cpu/resctrl/core.c | 5 +++++
arch/x86/kernel/cpu/resctrl/internal.h | 9 ---------
include/linux/resctrl_types.h | 3 +++
4 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 22e508c10059..332b0617d728 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -212,6 +212,9 @@ int resctrl_arch_measure_l2_residency(void *_plr);
int resctrl_arch_measure_l3_residency(void *_plr);
void resctrl_cpu_detect(struct cpuinfo_x86 *c);
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l);
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+
#else
static inline void resctrl_arch_sched_in(struct task_struct *tsk) {}
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index ec7d244d8511..816d9af6b36b 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -289,6 +289,11 @@ static void rdt_get_cdp_l2_config(void)
rdt_get_cdp_config(RDT_RESOURCE_L2);
}
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
+{
+ return rdt_resources_all[l].cdp_enabled;
+}
+
static void mba_wrmsr_amd(struct msr_param *m)
{
struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(m->dom);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 973fddf7e9a3..d1de46a62416 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -17,8 +17,6 @@
#define CQM_LIMBOCHECK_INTERVAL 1000
#define MBM_CNTR_WIDTH_BASE 24
-#define MBM_OVERFLOW_INTERVAL 1000
-#define MAX_MBA_BW 100u
#define MBA_IS_LINEAR 0x4
#define MBM_CNTR_WIDTH_OFFSET_AMD 20
@@ -404,13 +402,6 @@ extern struct rdt_hw_resource rdt_resources_all[];
extern struct rdtgroup rdtgroup_default;
extern struct dentry *debugfs_resctrl;
-static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
-{
- return rdt_resources_all[l].cdp_enabled;
-}
-
-int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
-
void arch_mon_domain_online(struct rdt_resource *r, struct rdt_mon_domain *d);
/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 70226f5ab3e3..b84a6e0834a7 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -7,6 +7,9 @@
#ifndef __LINUX_RESCTRL_TYPES_H
#define __LINUX_RESCTRL_TYPES_H
+#define MAX_MBA_BW 100u
+#define MBM_OVERFLOW_INTERVAL 1000
+
/* Reads to Local DRAM Memory */
#define READS_TO_LOCAL_MEM BIT(0)
--
2.39.2
* [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (38 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 39/40] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
@ 2024-10-04 18:03 ` James Morse
2024-10-08 23:08 ` Tony Luck
2024-10-04 21:18 ` [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
` (3 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-04 18:03 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
To support more than one architecture, resctrl needs to move from arch/x86
to live under fs. Moving all the code breaks any series on the mailing
list, so it needs scheduling carefully.
Maintaining the patch that moves all this code has proved labour intensive.
It's also near-impossible to review that no inadvertent changes have
crept in.
To solve these problems, temporarily add a hacky python program that
lists all the functions that should move, and those that should stay.
No attempt to parse C code is made, this thing tries to name 'blocks'
based on heuristics about the kernel coding style. It's fragile, but
good enough for its single use here.
This causes the original files to be regenerated, which will add
newlines that are not present in the original file.
I don't suggest this gets merged.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
resctrl_copy_pasta.py | 779 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 779 insertions(+)
create mode 100644 resctrl_copy_pasta.py
diff --git a/resctrl_copy_pasta.py b/resctrl_copy_pasta.py
new file mode 100644
index 000000000000..227e67eac4c4
--- /dev/null
+++ b/resctrl_copy_pasta.py
@@ -0,0 +1,779 @@
+#!/usr/bin/python
+import sys;
+import os;
+import re;
+
+############
+
+SRC_DIR = "arch/x86/kernel/cpu/resctrl";
+DST_DIR = "fs/resctrl";
+
+resctrl_files = [
+ "ctrlmondata.c",
+ "internal.h",
+ "monitor.c",
+ "pseudo_lock.c",
+ "rdtgroup.c",
+ "pseudo_lock_trace.h",
+ "monitor_trace.h",
+];
+
+functions_to_keep = [
+ # common
+ "pr_fmt",
+
+ # core.c
+ "domain_list_lock",
+ "resctrl_arch_late_init",
+ "resctrl_arch_exit",
+ "resctrl_cpu_detect",
+ "rdt_cpu_has",
+ "resctrl_arch_is_evt_configurable",
+ "get_mem_config",
+ "get_slow_mem_config",
+ "get_rdt_alloc_resources",
+ "get_rdt_mon_resources",
+ "__check_quirks_intel",
+ "check_quirks",
+ "get_rdt_resources",
+ "rdt_init_res_defs_intel",
+ "rdt_init_res_defs_amd",
+ "rdt_init_res_defs",
+ "resctrl_cpu_detect",
+ "resctrl_arch_late_init",
+ "resctrl_arch_exit",
+ "setup_default_ctrlval",
+ "domain_free",
+ "domain_setup_ctrlval",
+ "arch_domain_mbm_alloc",
+ "domain_add_cpu",
+ "domain_remove_cpu",
+ "clear_closid_rmid",
+ "resctrl_arch_online_cpu",
+ "resctrl_arch_offline_cpu",
+ "resctrl_arch_find_domain",
+ "resctrl_arch_get_num_closid",
+ "rdt_ctrl_update",
+ "domain_init",
+ "resctrl_arch_get_resource",
+ "cache_alloc_hsw_probe",
+ "rdt_get_mb_table",
+ "__get_mem_config_intel",
+ "__rdt_get_mem_config_amd",
+ "rdt_get_cache_alloc_cfg",
+ "rdt_get_cdp_config",
+ "rdt_get_cdp_l3_config",
+ "rdt_get_cdp_l2_config",
+ "resctrl_arch_get_cdp_enabled",
+ "set_rdt_options",
+ "pqr_state",
+ "rdt_resources_all",
+ "delay_bw_map",
+ "rdt_options",
+ "cat_wrmsr",
+ "mba_wrmsr_amd",
+ "mba_wrmsr_intel",
+ "anonymous-enum",
+ "rdt_find_domain",
+ "rdt_alloc_capable",
+ "rdt_online",
+ "RDT_OPT",
+
+ # ctrlmondata.c
+ "apply_config",
+ "resctrl_arch_update_one",
+ "resctrl_arch_update_domains",
+ "resctrl_arch_get_config",
+
+ # internal.h
+ "L3_QOS_CDP_ENABLE",
+ "L2_QOS_CDP_ENABLE",
+ "MBM_CNTR_WIDTH_OFFSET_AMD",
+ "arch_mbm_state",
+ "rdt_hw_ctrl_domain",
+ "rdt_hw_mon_domain",
+ "resctrl_to_arch_ctrl_dom",
+ "resctrl_to_arch_mon_dom",
+ "msr_param",
+ "rdt_hw_resource",
+ "resctrl_to_arch_res",
+ "rdt_resources_all",
+ "resctrl_inc",
+ "for_each_rdt_resource",
+ "for_each_capable_rdt_resource",
+ "for_each_alloc_capable_rdt_resource",
+ "for_each_mon_capable_rdt_resource",
+ "arch_mon_domain_online",
+ "cpuid_0x10_1_eax",
+ "cpuid_0x10_3_eax",
+ "cpuid_0x10_x_ecx",
+ "cpuid_0x10_x_edx",
+ "rdt_ctrl_update",
+ "rdt_get_mon_l3_config",
+ "rdt_cpu_has",
+ "intel_rdt_mbm_apply_quirk",
+ "rdt_domain_reconfigure_cdp",
+
+ # monitor.c
+ "rdt_mon_capable",
+ "rdt_mon_features",
+ "CF",
+ "snc_nodes_per_l3_cache",
+ "mbm_cf_table",
+ "mbm_cf_rmidthreshold",
+ "mbm_cf",
+ "logical_rmid_to_physical_rmid",
+ "__rmid_read_phys",
+ "get_corrected_mbm_count",
+ "__rmid_read",
+ "get_arch_mbm_state",
+ "resctrl_arch_reset_rmid",
+ "resctrl_arch_reset_rmid_all",
+ "mbm_overflow_count",
+ "resctrl_arch_rmid_read",
+ "snc_cpu_ids",
+ "snc_get_config",
+ "rdt_get_mon_l3_config",
+ "intel_rdt_mbm_apply_quirk",
+
+ # pseudo_lock.c
+ "prefetch_disable_bits",
+ "resctrl_arch_get_prefetch_disable_bits",
+ "resctrl_arch_pseudo_lock_fn",
+ "resctrl_arch_measure_cycles_lat_fn",
+ "perf_miss_attr",
+ "perf_hit_attr",
+ "residency_counts",
+ "measure_residency_fn",
+ "resctrl_arch_measure_l2_residency",
+ "resctrl_arch_measure_l3_residency",
+
+ # rdtgroup.c
+ "rdt_enable_key",
+ "rdt_mon_enable_key",
+ "rdt_alloc_enable_key",
+ "resctrl_arch_sync_cpu_closid_rmid",
+ "INVALID_CONFIG_INDEX",
+ "mon_event_config_index_get",
+ "resctrl_arch_mon_event_config_read",
+ "resctrl_arch_mon_event_config_write",
+ "l3_qos_cfg_update",
+ "l2_qos_cfg_update",
+ "set_cache_qos_cfg",
+ "rdt_domain_reconfigure_cdp",
+ "cdp_enable",
+ "cdp_disable",
+ "resctrl_arch_set_cdp_enabled",
+ "reset_all_ctrls",
+ "resctrl_arch_reset_resources",
+
+ # pseudo_lock_trace.h
+ "TRACE_SYSTEM",
+ "pseudo_lock_mem_latency",
+ "pseudo_lock_l2",
+ "pseudo_lock_l3",
+];
+
+functions_to_move = [
+ # common
+ "pr_fmt",
+
+ # ctrlmondata.c
+ "rdt_parse_data",
+ "(ctrlval_parser_t)",
+ "bw_validate",
+ "parse_bw",
+ "cbm_validate",
+ "parse_cbm",
+ "get_parser",
+ "parse_line",
+ "rdtgroup_parse_resource",
+ "rdtgroup_schemata_write",
+ "show_doms",
+ "rdtgroup_schemata_show",
+ "smp_mon_event_count",
+ "mon_event_read",
+ "rdtgroup_mondata_show",
+
+ # internal.h
+ "MBM_OVERFLOW_INTERVAL",
+ "CQM_LIMBOCHECK_INTERVAL",
+ "cpumask_any_housekeeping",
+ "rdt_fs_context",
+ "rdt_fc2context",
+ "mon_evt",
+ "mon_data_bits",
+ "rmid_read",
+ "resctrl_schema_all",
+ "resctrl_mounted",
+ "rdt_group_type",
+ "rdtgrp_mode",
+ "mongroup",
+ "rdtgroup",
+ "RFTYPE_FLAGS_CPUS_LIST",
+ "rdt_all_groups",
+ "rftype",
+ "mbm_state",
+ "is_mba_sc",
+
+ # monitor.c
+ "rmid_entry",
+ "rmid_free_lru",
+ "closid_num_dirty_rmid",
+ "rmid_limbo_count",
+ "rmid_ptrs",
+ "resctrl_rmid_realloc_threshold",
+ "resctrl_rmid_realloc_limit",
+ "__rmid_entry",
+ "limbo_release_entry",
+ "__check_limbo",
+ "has_busy_rmid",
+ "resctrl_find_free_rmid",
+ "resctrl_find_cleanest_closid",
+ "alloc_rmid",
+ "add_rmid_to_limbo",
+ "free_rmid",
+ "get_mbm_state",
+ "__mon_event_count",
+ "mbm_bw_count",
+ "mon_event_count",
+ "update_mba_bw",
+ "mbm_update",
+ "cqm_handle_limbo",
+ "cqm_setup_limbo_handler",
+ "mbm_handle_overflow",
+ "mbm_setup_overflow_handler",
+ "dom_data_init",
+ "dom_data_exit",
+ "llc_occupancy_event",
+ "mbm_total_event",
+ "mbm_local_event",
+ "l3_mon_evt_init",
+ "resctrl_mon_resource_init",
+ "resctrl_mon_resource_exit",
+
+ # pseudo_lock.c
+ "pseudo_lock_major",
+ "pseudo_lock_minor_avail",
+ "pseudo_lock_devnode",
+ "pseudo_lock_class",
+ "pseudo_lock_minor_get",
+ "pseudo_lock_minor_release",
+ "region_find_by_minor",
+ "pseudo_lock_pm_req",
+ "pseudo_lock_cstates_relax",
+ "pseudo_lock_cstates_constrain",
+ "pseudo_lock_region_clear",
+ "pseudo_lock_region_init",
+ "pseudo_lock_init",
+ "pseudo_lock_region_alloc",
+ "pseudo_lock_free",
+ "rdtgroup_monitor_in_progress",
+ "rdtgroup_locksetup_user_restrict",
+ "rdtgroup_locksetup_user_restore",
+ "rdtgroup_locksetup_enter",
+ "rdtgroup_locksetup_exit",
+ "rdtgroup_cbm_overlaps_pseudo_locked",
+ "rdtgroup_pseudo_locked_in_hierarchy",
+ "pseudo_lock_measure_cycles",
+ "pseudo_lock_measure_trigger",
+ "pseudo_measure_fops",
+ "rdtgroup_pseudo_lock_create",
+ "rdtgroup_pseudo_lock_remove",
+ "pseudo_lock_dev_open",
+ "pseudo_lock_dev_release",
+ "pseudo_lock_dev_mremap",
+ "pseudo_mmap_ops",
+ "pseudo_lock_dev_mmap",
+ "pseudo_lock_dev_fops",
+ "rdt_pseudo_lock_init",
+ "rdt_pseudo_lock_release",
+
+ # rdtgroup.c
+ "rdtgroup_mutex",
+ "rdt_root",
+ "rdtgroup_default",
+ "rdt_all_groups",
+ "resctrl_schema_all",
+ "resctrl_mounted",
+ "kn_info",
+ "kn_mongrp",
+ "kn_mondata",
+ "max_name_width",
+ "last_cmd_status",
+ "last_cmd_status_buf",
+ "rdtgroup_setup_root",
+ "rdtgroup_destroy_root",
+ "debugfs_resctrl",
+ "resctrl_debug",
+ "rdt_last_cmd_clear",
+ "rdt_last_cmd_puts",
+ "rdt_last_cmd_printf",
+ "rdt_staged_configs_clear",
+ "resctrl_is_mbm_enabled",
+ "resctrl_is_mbm_event",
+ "closid_free_map",
+ "closid_free_map_len",
+ "closids_supported",
+ "closid_init",
+ "closid_alloc",
+ "closid_free",
+ "closid_allocated",
+ "rdtgroup_mode_by_closid",
+ "rdt_mode_str",
+ "rdtgroup_mode_str",
+ "rdtgroup_kn_set_ugid",
+ "rdtgroup_add_file",
+ "rdtgroup_seqfile_show",
+ "rdtgroup_file_write",
+ "rdtgroup_kf_single_ops",
+ "kf_mondata_ops",
+ "is_cpu_list",
+ "rdtgroup_cpus_show",
+ "update_closid_rmid",
+ "cpus_mon_write",
+ "cpumask_rdtgrp_clear",
+ "cpus_ctrl_write",
+ "rdtgroup_cpus_write",
+ "rdtgroup_remove",
+ "_update_task_closid_rmid",
+ "update_task_closid_rmid",
+ "task_in_rdtgroup",
+ "__rdtgroup_move_task",
+ "is_closid_match",
+ "is_rmid_match",
+ "rdtgroup_tasks_assigned",
+ "rdtgroup_task_write_permission",
+ "rdtgroup_move_task",
+ "rdtgroup_tasks_write",
+ "show_rdt_tasks",
+ "rdtgroup_tasks_show",
+ "rdtgroup_closid_show",
+ "rdtgroup_rmid_show",
+ "proc_resctrl_show",
+ "rdt_last_cmd_status_show",
+ "rdt_num_closids_show",
+ "rdt_default_ctrl_show",
+ "rdt_min_cbm_bits_show",
+ "rdt_shareable_bits_show",
+ "rdt_bit_usage_show",
+ "rdt_min_bw_show",
+ "rdt_num_rmids_show",
+ "rdt_mon_features_show",
+ "rdt_bw_gran_show",
+ "rdt_delay_linear_show",
+ "max_threshold_occ_show",
+ "rdt_thread_throttle_mode_show",
+ "max_threshold_occ_write",
+ "rdtgroup_mode_show",
+ "resctrl_peer_type",
+ "rdt_has_sparse_bitmasks_show",
+ "__rdtgroup_cbm_overlaps",
+ "rdtgroup_cbm_overlaps",
+ "rdtgroup_mode_test_exclusive",
+ "rdtgroup_mode_write",
+ "rdtgroup_cbm_to_size",
+ "rdtgroup_size_show",
+ "mondata_config_read",
+ "mbm_config_show",
+ "mbm_total_bytes_config_show",
+ "mbm_local_bytes_config_show",
+ "mbm_config_write_domain",
+ "mon_config_write",
+ "mbm_total_bytes_config_write",
+ "mbm_local_bytes_config_write",
+ "res_common_files",
+ "rdtgroup_add_files",
+ "rdtgroup_get_rftype_by_name",
+ "thread_throttle_mode_init",
+ "mbm_config_rftype_init",
+ "rdtgroup_kn_mode_restrict",
+ "rdtgroup_kn_mode_restore",
+ "rdtgroup_mkdir_info_resdir",
+ "fflags_from_resource",
+ "rdtgroup_create_info_dir",
+ "mongroup_create_dir",
+ "is_mba_linear",
+ "mba_sc_domain_allocate",
+ "mba_sc_domain_destroy",
+ "supports_mba_mbps",
+ "set_mba_sc",
+ "kernfs_to_rdtgroup",
+ "rdtgroup_kn_get",
+ "rdtgroup_kn_put",
+ "rdtgroup_kn_lock_live",
+ "rdtgroup_kn_unlock",
+ "rdt_disable_ctx",
+ "rdt_enable_ctx",
+ "schemata_list_add",
+ "schemata_list_create",
+ "schemata_list_destroy",
+ "rdt_get_tree",
+ "rdt_param",
+ "rdt_fs_parameters",
+ "rdt_parse_param",
+ "rdt_fs_context_free",
+ "rdt_fs_context_ops",
+ "rdt_init_fs_context",
+ "rdt_move_group_tasks",
+ "free_all_child_rdtgrp",
+ "rmdir_all_sub",
+ "rdt_kill_sb",
+ "rdt_fs_type",
+ "mon_addfile",
+ "mon_rmdir_one_subdir",
+ "rmdir_mondata_subdir_allrdtgrp",
+ "mon_add_all_files",
+ "mkdir_mondata_subdir",
+ "mkdir_mondata_subdir_allrdtgrp",
+ "mkdir_mondata_subdir_alldom",
+ "mkdir_mondata_all",
+ "cbm_ensure_valid",
+ "__init_one_rdt_domain",
+ "rdtgroup_init_cat",
+ "rdtgroup_init_mba",
+ "rdtgroup_init_alloc",
+ "mkdir_rdt_prepare_rmid_alloc",
+ "mkdir_rdt_prepare_rmid_free",
+ "mkdir_rdt_prepare",
+ "mkdir_rdt_prepare_clean",
+ "rdtgroup_mkdir_mon",
+ "rdtgroup_mkdir_ctrl_mon",
+ "is_mon_groups",
+ "rdtgroup_mkdir",
+ "rdtgroup_rmdir_mon",
+ "rdtgroup_ctrl_remove",
+ "rdtgroup_rmdir_ctrl",
+ "rdtgroup_rmdir",
+ "mongrp_reparent",
+ "rdtgroup_rename",
+ "rdtgroup_show_options",
+ "rdtgroup_kf_syscall_ops",
+ "rdtgroup_setup_root",
+ "rdtgroup_destroy_root",
+ "rdtgroup_setup_default",
+ "domain_destroy_mon_state",
+ "resctrl_offline_ctrl_domain",
+ "resctrl_offline_mon_domain",
+ "domain_setup_mon_state",
+ "resctrl_online_ctrl_domain",
+ "resctrl_online_mon_domain",
+ "resctrl_online_cpu",
+ "clear_childcpus",
+ "resctrl_offline_cpu",
+ "resctrl_init",
+ "resctrl_exit",
+
+ # monitor_trace.h
+ "TRACE_SYSTEM",
+ "mon_llc_occupancy_limbo",
+];
+
+############
+
+builtin_non_functions = ["__setup", "__exitcall", "__printf"];
+builtin_one_arg_macros = ["LIST_HEAD", "DEFINE_MUTEX", "DEFINE_STATIC_KEY_FALSE"];
+types = ["bool", "char", "int", "u32", "long", "u64"];
+
+def get_array_name(line):
+ tok = re.search(r'([^\s]+?)\[\]', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+
+
+def get_struct_name(line):
+ tok = re.search(r'struct ([^\s]+?) {', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+
+def get_enum_name(line):
+ tok = re.search(r'enum ([^\s]+?) {', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+
+def get_union_name(line):
+ tok = re.search(r'union ([^\s]+?) {', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+
+
+def get_macro_name(line):
+ tok = re.search(r'#define[\s]+([^\s]+?)\(', line)
+ if (tok):
+ return tok.group(1);
+
+ tok = re.search(r'#define[\s]+([^\s]+?)[\s]+[^\s]+?\n', line)
+ if (tok):
+ return tok.group(1);
+
+ return None;
+
+
+def get_macro_target(line):
+ tok = re.search(r'[^\s]+?\(([^\s]+?)\);\n', line)
+ if (tok):
+ return tok.group(1);
+
+ return None;
+
+
+# Things like 'bool my_bool;'
+def get_object_name(line):
+ # remove things that don't change the meaning of the name
+ if line.startswith("static "):
+ line = line[len("static "):];
+ if line.startswith("extern "):
+ line = line[len("extern "):];
+ if line.startswith("unsigned "):
+ line = line[len("unsigned "):];
+
+ # Note the trailing semicolon..
+ tok = re.search(r'([^\s]+)\s[\*]*([^\s\[\],;]+)', line)
+ if tok:
+ if tok.group(1) in types:
+ return tok.group(2);
+
+ tok = re.search(r'struct\s[^\s]+\s[\*]*([^\s;]+)', line)
+ if tok:
+ return tok.group(1);
+
+ tok = re.search(r'enum\s[^\s]+\s([^\s;]+)', line)
+ if tok:
+ return tok.group(1);
+
+ return None;
+
+
+# Is there a name for this block of code?
+#
+# Function names are the token before '(' ... assuming there is only one '('.
+# This also handles structs and arrays,
+def get_block_name(line):
+ # remove things that don't change the meaning of the name
+ if (" __read_mostly" in line):
+ line = line.replace(" __read_mostly", "");
+ if (" __initconst" in line):
+ line = line.replace(" __initconst", "");
+
+ if line == "enum {\n":
+ return "anonymous-enum";
+ if (line.startswith("#define ")):
+ return get_macro_name(line);
+
+ if ("=" in line):
+ tok = re.search(r'[\*]*([^\s\[\]]+?)[\s\[\]]*=', line)
+ else:
+ tok = re.search(r'[\*]*([^\s]+?)\(.+?', line)
+
+ if (tok is None):
+ if ("[]" in line):
+ return get_array_name(line);
+ if (line.startswith("struct") and line.endswith("{\n")):
+ return get_struct_name(line);
+ if (line.startswith("enum") and line.endswith("{\n")):
+ return get_enum_name(line);
+ if (line.startswith("union") and line.endswith("{\n")):
+ return get_union_name(line);
+ if (line.endswith(";\n") and '(' not in line):
+ return get_object_name(line);
+ if (line.endswith("= {\n") and '(' not in line):
+ return get_object_name(line);
+ return None;
+
+ func_name = tok.group(1);
+ if (func_name in builtin_one_arg_macros):
+ tok = re.search(r'[^\(]+\(([^\s]+?)\)', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+ elif (func_name == "DEFINE_PER_CPU"):
+ tok = re.search(r'DEFINE_PER_CPU\(.+?, ([^\s]+?)\)', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+ elif (func_name == "TRACE_EVENT"):
+ tok = re.search(r'TRACE_EVENT\((.+?),', line)
+ if (tok is None):
+ return None;
+ return tok.group(1);
+ elif (func_name == "late_initcall"):
+ return get_macro_target(line);
+ else:
+ return func_name;
+
+def output_function_body(body, file):
+ # Mandatory whitespace between blocks
+ if os.lseek(file.fileno(), 0, os.SEEK_CUR) > 0:
+ file.write("\n".encode());
+
+ for out_line in body:
+ file.write(out_line.encode());
+
+# Where should we put this block of code?
+def output_function(name, body, files):
+ output = False;
+ (new_src, new_dst) = files;
+
+ if (len(body)) == 0:
+ return;
+
+ # Output to both files
+ if (name is None):
+ output_function_body(body, new_src);
+ output_function_body(body, new_dst);
+ output = True;
+ if (name in functions_to_keep):
+ output_function_body(body, new_src);
+ output = True;
+ if (name in functions_to_move):
+ output_function_body(body, new_dst);
+ output = True;
+
+ if not output:
+ print("Missing function name: "+name);
+ #print(body);
+
+def reset_parser():
+ global function_name;
+ global define_name;
+ global function_body;
+ global in_define;
+
+ function_name = None;
+ define_name = None;
+ function_body = [];
+ in_define = False;
+
+############
+
+for file in resctrl_files:
+ function_name = None;
+ # function_names take priority over defines, this is only used when
+ # no function_name was found
+ define_name = None;
+ function_body = [];
+ # Nothing clever - this is just to detect newlines between functions
+ in_function = False;
+ in_define = False;
+
+ src_path = SRC_DIR + "/" + str(file);
+ if (not os.path.isfile(src_path)):
+ continue;
+ dst_path = DST_DIR + "/" + str(file);
+
+ orig_file = open(src_path, "r");
+ lines = orig_file.readlines();
+
+ # Now unlink the original file, so it can be re-created with new
+ # contents.
+ try:
+ os.unlink(src_path);
+ except Exception as err:
+ print(f"Failed to unlink source file: {err}");
+ sys.exit(1);
+
+ # non-buffering is so we can snoop the fd offset to avoid trailing newlines
+ new_src = open(src_path, "wb", buffering=0);
+ new_dst = open(dst_path, "wb", buffering=0);
+
+ for line in lines:
+ # Empty lines outside a function - reset the function tracking
+ if (line == "\n" and not in_function):
+ if function_name is None and define_name is not None:
+ function_name = define_name;
+ output_function(function_name, function_body, (new_src, new_dst));
+ reset_parser();
+
+ # Function prototypes are a funny C thing - reset the function tracking
+ elif (line[0].isspace() and not in_function and line.endswith(");\n")):
+ function_body += [line];
+ output_function(function_name, function_body, (new_src, new_dst));
+ reset_parser();
+
+ # Lines that begin with whitespace are part of the current function.
+ elif (line[0].isspace()):
+ function_body += [line];
+
+ # Next, try to find the kind of line that contains a function name
+
+ # Ignore lines with comment markers, braces
+ elif (line.startswith("/*")):
+ function_body += [line];
+ elif (line.startswith("*/")):
+ function_body += [line];
+ elif (line.startswith("//")):
+ function_body += [line];
+ elif (line == "{\n"):
+ function_body += [line];
+ in_function = True;
+ elif (line == "}\n"):
+ function_body += [line];
+ in_function = False;
+ elif (line == "};\n"):
+ function_body += [line];
+ in_function = False;
+
+ elif (line.startswith("#include")):
+ function_body += [line];
+ elif (line.startswith("#if ")):
+ function_body += [line];
+ elif (line.startswith("#ifdef ")):
+ function_body += [line];
+ elif (line.startswith("#ifndef ")):
+ function_body += [line];
+ elif (line.startswith("#else")):
+ function_body += [line];
+ elif (line.startswith("#endif")):
+ function_body += [line];
+ elif (line.startswith("#undef ")):
+ function_body += [line];
+ elif (line.startswith("#define")):
+ function_body += [line];
+ define_name = get_block_name(line);
+ if line.endswith("\\\n"):
+ in_define = True;
+ elif in_define and line.endswith("\\\n"):
+ function_body += [line];
+
+ # goto was always a crime
+ elif (' ' not in line and line.endswith(":\n")):
+ function_body += [line];
+
+ # Try and parse a function/array name
+
+ # Things like late_initcall() aren't function names, but belong to
+ # the previous function.
+ elif (get_block_name(line) in builtin_non_functions):
+ function_body += [line];
+
+ # Start a new block if we can get a block name for this line
+ elif (get_block_name(line) != None and function_name is None):
+ _name = get_block_name(line);
+
+ if (line.endswith("{\n")):
+ in_function = True;
+
+ # Is this a function prototype? Output it now
+ if (line.endswith(";\n")):
+ function_body += [line];
+ output_function(_name, function_body, (new_src, new_dst));
+ reset_parser();
+ else:
+ function_name = _name;
+ function_body += [line];
+
+ # Failed to parse a function name ... did it get split up?
+ elif (line.startswith("static")):
+ function_body += [line];
+
+ else:
+ print("Unknown: '" + line + "'");
+
+ # Output whatever is left in the buffer
+ output_function(function_name, function_body, (new_src, new_dst));
+
+ orig_file.close();
--
2.39.2
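As a rough illustration of the approach described in the commit message, the
sketch below names a block from its first line and routes it to the file that
stays, the file that moves, or both. It is a simplified stand-in and not the
script itself: block_name(), route(), KEEP and MOVE are hypothetical names,
the two sets are shortened examples, and only the function-name regex is
borrowed from get_block_name().

```python
import re

# Hypothetical, shortened stand-ins for the two lists in the script.
KEEP = {"resctrl_arch_get_cdp_enabled", "rdt_get_cdp_config"}
MOVE = {"rdtgroup_mkdir", "closid_alloc"}


def block_name(first_line):
    """Guess a block's name: the token immediately before the first '('."""
    tok = re.search(r'[\*]*([^\s]+?)\(.+?', first_line)
    return tok.group(1) if tok else None


def route(first_line):
    """Return where a block should be written: 'src', 'dst', both or neither."""
    name = block_name(first_line)
    if name is None:
        # Unnamed blocks (includes, comments) are copied to both files.
        return ["src", "dst"]
    dests = []
    if name in KEEP:
        dests.append("src")
    if name in MOVE:
        dests.append("dst")
    # An empty list means the name is missing from both lists.
    return dests
```

A name that appears in neither list yields an empty result here, which
corresponds to the "Missing function name" message the script prints.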
* Re: [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (39 preceding siblings ...)
2024-10-04 18:03 ` [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
@ 2024-10-04 21:18 ` Reinette Chatre
2024-10-07 17:29 ` James Morse
2024-10-08 23:24 ` Tony Luck
` (2 subsequent siblings)
43 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-04 21:18 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin
Hi James,
While you wait for folks to consider these changes, could you please take a look
at [1] from an Arm/MPAM perspective? My understanding is that Arm/MPAM also
requires assigning counters to do monitoring and [1] aims to create
a generic interface to do so. Is [1] something that Arm/MPAM can build on
or are there some changes that can be made before its inclusion to help
with future MPAM support?
Thank you
Reinette
[1] https://lore.kernel.org/all/cover.1725488488.git.babu.moger@amd.com/
* Re: [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2024-10-04 21:18 ` [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
@ 2024-10-07 17:29 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-07 17:29 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin
Hi Reinette,
On 04/10/2024 22:18, Reinette Chatre wrote:
> While you wait for folks to consider these changes, could you please take a look
> at [1] from an Arm/MPAM perspective? My understanding is that Arm/MPAM also
> requires assigning counters to do monitoring and [1] aims to create
> a generic interface to do so. Is [1] something that Arm/MPAM can build on
> or are there some changes that can be made before its inclusion to help
> with future MPAM support?
Yup, this is on my list to get back into this week.
I did rebase the MPAM tree over Babu's v6, but haven't had the chance to test it [0].
This is where my previous feedback on where the arch/fs split needed to be came from.
Thanks,
James
[0] https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=mpam/abmc/v6
* Re: [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2024-10-04 18:03 ` [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
@ 2024-10-08 0:00 ` Tony Luck
2024-10-08 16:40 ` Reinette Chatre
2024-10-23 21:51 ` Reinette Chatre
1 sibling, 1 reply; 102+ messages in thread
From: Tony Luck @ 2024-10-08 0:00 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:24PM +0000, James Morse wrote:
> The for_each_*_rdt_resource() helpers walk the architecture's array
> of structures, using the resctrl visible part as an iterator. These
> became over-complex when the structures were split into a
> filesystem and architecture-specific struct. This approach avoided
> the need to touch every call site, and was done before there was a
> helper to retrieve a resource by rid.
>
> Once the filesystem parts of resctrl are moved to /fs/, both the
> architecture's resource array, and the definition of those structures
> is no longer accessible. To support resctrl, each architecture would
> have to provide equally complex macros.
>
> Rewrite the macro to make use of resctrl_arch_get_resource(), and
> move these to the core header so existing x86 arch code continues
> to use them.
Apologies if this comment was suggested against earlier versions
of this series.
Did you consider replacing rdt_resources_all[] with a list (in the filesystem
code) instead of an array (in the architecture code)?
List would start empty. Architecture init code would enumerate features
and add entries to the list for those that exist and are to be enabled.
The "for_each" macros then walk the list (variants for all entries,
for "alloc_capable" and for "mon_capable"). Note that only enabled
entries appear on the lists.
There are currently a bunch of places in filesystem code that
do:
r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
or
r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
those could become:
r = resctrl_arch_get_mba_resource();
r = resctrl_arch_get_l3_resource();
Then the whole "enum resctrl_res_level" and ->rid field in
struct rdt_resource could go away? Remaining uses look like
distinguishing MBA from SMBA. Perhaps better done with a
flags word?
Advantage of doing this would be to avoid the generic
enum resctrl_res_level having to be a superset of all
features across all architectures. E.g. ARM might want
to add L4/L5 resources, X86 may have some that ARM will
never need. RiscV may also follow some divergent path.
If this v5 series is close to being applied then I don't
want to derail with a re-write at this late stage.
All of this could be done as a cleanup after this series
has been applied.
-Tony
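The list-based scheme suggested above can be modelled in a few lines. This is
a Python sketch of the idea only, not kernel code: Resource,
register_resource(), for_each_alloc_capable() and for_each_mon_capable() are
hypothetical names, the resource names are illustrative, and the flags word
stands in for the MBA/SMBA distinction mentioned above.

```python
from dataclasses import dataclass
from enum import Flag


class ResFlags(Flag):
    ALLOC_CAPABLE = 1
    MON_CAPABLE = 2
    SLOW_MEM = 4  # hypothetical bit that would tell SMBA apart from MBA


@dataclass
class Resource:
    name: str
    flags: ResFlags = ResFlags(0)


# Owned by the filesystem code; starts empty.
resources = []


def register_resource(res):
    """Called by architecture init code for each feature it enables."""
    resources.append(res)


def for_each_alloc_capable():
    """Walk only the registered resources that support allocation."""
    return [r for r in resources if ResFlags.ALLOC_CAPABLE in r.flags]


def for_each_mon_capable():
    """Walk only the registered resources that support monitoring."""
    return [r for r in resources if ResFlags.MON_CAPABLE in r.flags]


# An architecture that found L3 (alloc + mon) and MBA (alloc only).
register_resource(Resource("L3", ResFlags.ALLOC_CAPABLE | ResFlags.MON_CAPABLE))
register_resource(Resource("MB", ResFlags.ALLOC_CAPABLE))
```

Because only enabled resources are ever registered, the walkers need no
shared enum that covers every architecture's features.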
* Re: [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2024-10-08 0:00 ` Tony Luck
@ 2024-10-08 16:40 ` Reinette Chatre
2024-10-18 17:07 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-08 16:40 UTC (permalink / raw)
To: Tony Luck, James Morse
Cc: x86, linux-kernel, Fenghua Yu, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Tony,
On 10/7/24 5:00 PM, Tony Luck wrote:
> On Fri, Oct 04, 2024 at 06:03:24PM +0000, James Morse wrote:
>> The for_each_*_rdt_resource() helpers walk the architecture's array
>> of structures, using the resctrl visible part as an iterator. These
>> became over-complex when the structures were split into a
>> filesystem and architecture-specific struct. This approach avoided
>> the need to touch every call site, and was done before there was a
>> helper to retrieve a resource by rid.
>>
>> Once the filesystem parts of resctrl are moved to /fs/, both the
>> architecture's resource array, and the definition of those structures
>> is no longer accessible. To support resctrl, each architecture would
>> have to provide equally complex macros.
>>
>> Rewrite the macro to make use of resctrl_arch_get_resource(), and
>> move these to the core header so existing x86 arch code continues
>> to use them.
>
> Apologies if this comment was suggested against earlier versions
> of this series.
>
> Did you consider replacing rdt_resources_all[] with a list (in the filesystem
> code) instead of an array (in the architecture code)?
>
> List would start empty. Architecture init code would enumerate features
> and add entries to the list for those that exist and are to be enabled.
>
> The "for_each" macros then walk the list (variants for all entries,
> for "alloc_capable" and for "mon_capable"). Note that only enabled
> entries appear on the lists.
>
> There are currently a bunch of places in filesystem code that
> do:
> r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
> or
> r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>
> those could become:
>
> r = resctrl_arch_get_mba_resource();
>
> r = resctrl_arch_get_l3_resource();
>
> Then the whole "enum resctrl_res_level" and ->rid field in
> struct rdt_resource could go away? Remaining uses look like
> distinguishing MBA from SMBA. Perhaps better done with a
> flags word?
>
> Advantage of doing this would be to avoid the generic
> enum resctrl_res_level having to be a superset of all
> features across all architectures. E.g. ARM might want
> to add L4/L5 resources, X86 may have some that ARM will
> never need. RiscV may also follow some divergent path.
Ideally resctrl fs would remain as an interface that a user can use to interact
with all architectures without knowing architecture specific details. Platform
differences can be exposed by resctrl in a generic way to support this.
I am afraid that allowing architectures to diverge would require resctrl fs users
to additionally know which platform they are running on.
> If this v5 series is close to being applied then I don't
> want to derail with a re-write at this late stage.
> All of this could be done as a cleanup after this series
> has been applied.
Due to the already significant size of this work I think it would make it easier
if the number of functional changes is minimal. Specifically, only those functional
changes that are required to accomplish the goal of moving the code.
Considering that one goal of this proposal is to support architectural
flexibility I do think it would be easier to understand its impact if it
is implemented on top of the arch/fs split.
Reinette
* Re: [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions
2024-10-04 18:03 ` [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
@ 2024-10-08 18:50 ` Tony Luck
2025-02-07 15:46 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Tony Luck @ 2024-10-08 18:50 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:37PM +0000, James Morse wrote:
> resctrl operates on configuration bitmaps and a bitmap of allocated
> CLOSID, both are stored in a u32.
>
> MPAM supports configuration/portion bitmaps and PARTIDs larger
> than will fit in a u32.
>
> Add some preprocessor values that make it clear why MPAM clamps
> some of these values. This will make it easier to find code related
> to these values if this resctrl behaviour ever changes.
...
> +#define RESCTRL_MAX_CLOSID 32
Do you really need to do this? Intel x86 architecture allows for more than
32 CLOSIDs, it's just expensive in h/w to get past 16 ... so I picked
that trivial bitmap allocator in ages past. But if ARM can have more,
then why would you need to clamp the value? File system code could ask
architecture code to allocate a CLOSID. On x86 that will fail when there
are no more CLOSIDs, so filesystem will fail the mkdir(2).
Or, since you put closid_alloc() into the filesystem code you could
change the closid_free_map to u64.
If you really do want to have this #define ... maybe you should use it
in place of the hard coded 32 here:
static void closid_init(void)
{
struct resctrl_schema *s;
u32 rdt_min_closid = 32;
}
> +#define RESCTRL_MAX_CBM 32
Intel x86 could plausibly expand the cache bitmap size (the MSRs
that store them currently have bits 63:32 reserved, but that could be
changed). The only 32-bit limits are the CPUID field that enumerates
CBM_LEN and the CPUID field that enumerates the shared bitmap. The
length has space for expansion, the share bitfield does not. So if
Intel did go to more than 32-bits we'd be stuck making sure any shared
bits were in the lower 32-bits.
-Tony
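
A rough userspace sketch of the "change closid_free_map to u64" option
mentioned above. It is only loosely modelled on the trivial bitmap allocator
being discussed; MAX_CLOSID, the -1 error return, and the reservation of
CLOSID 0 for the default group are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit: one bit per CLOSID in a 64-bit free map. */
#define MAX_CLOSID 64

static uint64_t closid_free_map;

/* Mark the first num_closid IDs free, then reserve CLOSID 0 (assumed default group). */
static void closid_init(unsigned int num_closid)
{
	if (num_closid > MAX_CLOSID)
		num_closid = MAX_CLOSID;

	/* Shifting a 64-bit value by 64 is undefined, so handle that case apart. */
	if (num_closid == 64)
		closid_free_map = UINT64_MAX;
	else
		closid_free_map = (UINT64_C(1) << num_closid) - 1;

	closid_free_map &= ~UINT64_C(1);
}

/* Hand out the lowest free CLOSID, or -1 when none are left. */
static int closid_alloc(void)
{
	for (int id = 0; id < MAX_CLOSID; id++) {
		if (closid_free_map & (UINT64_C(1) << id)) {
			closid_free_map &= ~(UINT64_C(1) << id);
			return id;
		}
	}
	return -1;
}

/* Return a CLOSID to the free map. */
static void closid_free(int id)
{
	closid_free_map |= UINT64_C(1) << id;
}
```

Going beyond 64 would need a real bitmap or an arch callback, which is the
other option raised above.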
* Re: [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code
2024-10-04 18:03 ` [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code James Morse
@ 2024-10-08 23:03 ` Tony Luck
2024-10-24 0:08 ` Reinette Chatre
1 sibling, 0 replies; 102+ messages in thread
From: Tony Luck @ 2024-10-08 23:03 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:45PM +0000, James Morse wrote:
> Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL
> for the common parts of the resctrl interface and make X86_CPU_RESCTRL
> select this.
>
> Adding an include of asm/resctrl.h to linux/resctrl.h allows the
> /fs/resctrl files to switch over to using this header instead.
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 4b7e370e71ac..973fddf7e9a3 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -7,10 +7,9 @@
> #include <linux/kernfs.h>
> #include <linux/fs_context.h>
> #include <linux/jump_label.h>
> +#include <linux/resctrl.h>
> #include <linux/tick.h>
internal.h already has a #include of <linux/resctrl.h> it doesn't need
another one.
-Tony
* Re: [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2024-10-04 18:03 ` [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
@ 2024-10-08 23:08 ` Tony Luck
2024-10-24 0:17 ` Reinette Chatre
0 siblings, 1 reply; 102+ messages in thread
From: Tony Luck @ 2024-10-08 23:08 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:47PM +0000, James Morse wrote:
> +functions_to_move = [
> + # common
> + "pr_fmt",
> +
> + # ctrlmon.c
> + "rdt_parse_data",
> + "(ctrlval_parser_t)",
> + "bw_validate",
> + "parse_bw",
> + "cbm_validate",
> + "parse_cbm",
> + "get_parser",
> + "parse_line",
> + "rdtgroup_parse_resource",
> + "rdtgroup_schemata_write",
> + "show_doms",
> + "rdtgroup_schemata_show",
> + "smp_mon_event_count",
> + "mon_event_read",
> + "rdtgroup_mondata_show",
> +
> + # internal.h
> + "MBM_OVERFLOW_INTERVAL",
> + "CQM_LIMBOCHECK_INTERVAL",
> + "cpumask_any_housekeeping",
> + "rdt_fs_context",
> + "rdt_fc2context",
> + "mon_evt",
> + "mon_data_bits",
> + "rmid_read",
> + "resctrl_schema_all",
> + "resctrl_mounted",
> + "rdt_group_type",
> + "rdtgrp_mode",
> + "mongroup",
> + "rdtgroup",
> + "RFTYPE_FLAGS_CPUS_LIST",
Something goes wrong with moving the RFTYPE_* defines. A new copy
shows up in fs/resctrl/internal.h but the old copy isn't removed from
arch/x86/kernel/cpu/resctrl/internal.h
-Tony
* Re: [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (40 preceding siblings ...)
2024-10-04 21:18 ` [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
@ 2024-10-08 23:24 ` Tony Luck
2024-10-17 17:43 ` Tony Luck
2024-12-06 7:17 ` Shaopeng Tan (Fujitsu)
43 siblings, 0 replies; 102+ messages in thread
From: Tony Luck @ 2024-10-08 23:24 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin
On Fri, Oct 04, 2024 at 06:03:07PM +0000, James Morse wrote:
> Changes since v4?:
> * Dropped the percentage/mbps distinction, this can be future cleanup as I
> think the difference matters to user-space. These are both treated as a
> 'range'.
> * Picked a pre-requisite cleanup patch from Christophe to make merging
> easier.
> * More of the __init/__exit stuff has been consolidated in the patch that removes
> them from filesystem code.
> Regardless, changes are noted on each patch.
>
> ~
>
> This is the final series that allows other architectures to implement resctrl.
> The final patch to move the code has been omitted, but can be generated using
> the python script at the end of the series.
> The final move is a bit of a monster. I don't expect that to get merged as part
> of this series - we should wait for it to make less impact on other series.
I've been playing around with this series (after running the python
script to get the full effect of where the code is headed).
Things seem to look pretty good. I haven't noticed anything failing
(but I haven't done extensive testing).
I've skimmed over the patches and posted some nitpicks on a couple
of them. I'll try to find time to make a proper review pass.
Just to see whether I still understand the new layout I took my
patches for mba_MBps[1] and applied them on top of all of this.
The process was pretty simple, most functions that were changed
just moved to fs/resctrl. Just a few places where I needed to
tweak things to fit on top of the changes (like using the
new resctrl_arch_get_resource() to lookup rdt_resources
instead of indexing into the rdt_resources_all[] array,
or function names changed to add "_arch").
Overall it looks pretty good.
-Tony
[1] https://lore.kernel.org/all/20241003191228.67541-1-tony.luck@intel.com/
* Re: [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw
2024-10-04 18:03 ` [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
@ 2024-10-09 18:02 ` Tony Luck
2024-10-23 21:14 ` Reinette Chatre
1 sibling, 0 replies; 102+ messages in thread
From: Tony Luck @ 2024-10-09 18:02 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:14PM +0000, James Morse wrote:
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 8d1bdfe89692..56c41bfd07e4 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -57,10 +57,10 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
> return false;
> }
>
> - if ((bw < r->membw.min_bw || bw > r->default_ctrl) &&
> + if ((bw < r->membw.min_bw || bw > r->membw.max_bw) &&
> !is_mba_sc(r)) {
> rdt_last_cmd_printf("MB value %ld out of range [%d,%d]\n", bw,
> - r->membw.min_bw, r->default_ctrl);
> + r->membw.min_bw, r->membw.max_bw);
> return false;
> }
>
Heads up. There is a patch to this function in the TIP x86/urgent
branch. So will likely go into v6.12-rc3. So this patch will need
to refactor on top of:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=2b5648416e47933939dc310c4ea1e29404f35630
-Tony
* Re: [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2024-10-04 18:03 ` [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
@ 2024-10-15 22:57 ` Tony Luck
2024-10-18 17:07 ` James Morse
2024-10-23 21:03 ` Reinette Chatre
1 sibling, 1 reply; 102+ messages in thread
From: Tony Luck @ 2024-10-15 22:57 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:09PM +0000, James Morse wrote:
> +struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
> +{
> + if (l >= RDT_NUM_RESOURCES)
> + return NULL;
> +
> + return &rdt_resources_all[l].r_resctrl;
> +}
Is this a bit fragile if someone adds a new item in enum resctrl_res_level
but doesn't add a new entry to struct rdt_hw_resource rdt_resources_all[]
in arch/x86/kernel/cpu/resctrl/core.c?
Any caller of resctrl_arch_get_resource(new item name) will get past
the check "if (l >= RDT_NUM_RESOURCES)" and then return a pointer past
the end of the rdt_resources_all[] array.
Maybe make sure the array is padded out to the right size?
struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
...
};
-Tony
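
To illustrate the padding idea in isolation: when an array is declared with an
explicit size and designated initializers, any enum value without an
initializer still gets a zero-filled slot, so the bounds check and the array
length cannot drift apart. The names below (enum res_level, struct hw_res,
get_res()) are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resource IDs; RES_NEW plays the role of a newly added enum item. */
enum res_level { RES_L3, RES_L2, RES_MBA, RES_NEW, NUM_RES };

struct hw_res {
	const char *name;
};

/*
 * Explicitly sized by the enum. RES_NEW has no initializer here, but it
 * still owns a zeroed entry rather than lying past the end of the array.
 */
static struct hw_res res_all[NUM_RES] = {
	[RES_L3]  = { .name = "L3" },
	[RES_L2]  = { .name = "L2" },
	[RES_MBA] = { .name = "MB" },
};

/* Same shape as the helper under review: bounds check, then index. */
static struct hw_res *get_res(enum res_level l)
{
	if ((unsigned int)l >= NUM_RES)
		return NULL;

	return &res_all[l];
}
```

Without the explicit size the array would have only three entries here, and
get_res(RES_NEW) would pass the check yet point outside the array.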
* Re: [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values
2024-10-04 18:03 ` [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values James Morse
@ 2024-10-15 23:15 ` Tony Luck
2024-10-18 17:07 ` James Morse
2024-10-23 21:14 ` Reinette Chatre
1 sibling, 1 reply; 102+ messages in thread
From: Tony Luck @ 2024-10-15 23:15 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:11PM +0000, James Morse wrote:
> +static ctrlval_parser_t *get_parser(struct rdt_resource *r)
> +{
> + switch (r->schema_fmt) {
> + case RESCTRL_SCHEMA_BITMAP:
> + return &parse_cbm;
> + case RESCTRL_SCHEMA_RANGE:
> + return &parse_bw;
> + }
> +
> + return NULL;
> +}
Is it really worth making this a helper function? It's only
used once.
> +
> /*
> * For each domain in this resource we expect to find a series of:
> * id=mask
> @@ -204,6 +225,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
> static int parse_line(char *line, struct resctrl_schema *s,
> struct rdtgroup *rdtgrp)
> {
> + ctrlval_parser_t *parse_ctrlval = get_parser(s->res);
No check to see if get_parser() returned NULL.
> enum resctrl_conf_type t = s->conf_type;
> struct resctrl_staged_config *cfg;
> struct rdt_resource *r = s->res;
> @@ -235,7 +257,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
> if (d->hdr.id == dom_id) {
> data.buf = dom;
> data.rdtgrp = rdtgrp;
> - if (r->parse_ctrlval(&data, s, d))
> + if (parse_ctrlval(&data, s, d))
> return -EINVAL;
Without the helper this could be:
switch (r->schema_fmt) {
case RESCTRL_SCHEMA_BITMAP:
if (parse_cbm(&data, s, d))
return -EINVAL;
break;
case RESCTRL_SCHEMA_RANGE:
if (parse_bw(&data, s, d))
return -EINVAL;
break;
default:
WARN_ON_ONCE(1);
return -EINVAL;
}
-Tony
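
The dispatch-with-default pattern above can be tried outside the kernel with a
small stand-alone sketch. The parsers here are toy stand-ins (hex for bitmaps,
decimal for ranges) and all names are hypothetical; only the switch structure
mirrors the suggestion.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical schema formats, mirroring the BITMAP/RANGE split. */
enum schema_fmt { SCHEMA_BITMAP, SCHEMA_RANGE };

/* Toy parser: accept the whole string as a hexadecimal bitmap. */
static int parse_bitmap(const char *buf, unsigned long *out)
{
	char *end;

	*out = strtoul(buf, &end, 16);
	return (end == buf || *end != '\0') ? -EINVAL : 0;
}

/* Toy parser: accept the whole string as a decimal value. */
static int parse_range(const char *buf, unsigned long *out)
{
	char *end;

	*out = strtoul(buf, &end, 10);
	return (end == buf || *end != '\0') ? -EINVAL : 0;
}

/* Dispatch inline; an unknown format fails instead of calling through NULL. */
static int parse_ctrlval(enum schema_fmt fmt, const char *buf,
			 unsigned long *out)
{
	switch (fmt) {
	case SCHEMA_BITMAP:
		return parse_bitmap(buf, out);
	case SCHEMA_RANGE:
		return parse_range(buf, out);
	default:
		return -EINVAL;
	}
}
```

The default arm is what replaces the missing NULL check: a format added to the
enum without a parser results in -EINVAL rather than a NULL dereference.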
* Re: [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format
2024-10-04 18:03 ` [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format James Morse
@ 2024-10-15 23:29 ` Tony Luck
2024-10-18 17:07 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Tony Luck @ 2024-10-15 23:29 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:13PM +0000, James Morse wrote:
> The resctrl architecture code provides a data_width for the controls of
> each resource. This is used to zero pad all control values in the schemata
> file so they appear in columns. The same is done with the resource names
> to complete the visual effect. e.g.
> | SMBA:0=2048
> | L3:0=00ff
>
> AMD platforms discover their maximum bandwidth for the MB resource from
> firmware, but hard-code the data_width to 4. If the maximum bandwidth
> requires more digits - the tabular format is silently broken.
> If new schema are added resctrl will need to be able to determine the
> maximum width. The benefit of this pretty-printing is questionable.
Agreed. It's particularly non-useful for L2 resources on systems with
hundred+ cores. The L2 line in schemata is very long and doesn't look
"pretty" at all. Padding may make it even longer.
It never worked with the mba_MBps mount option because the field
width wasn't updated for prettiness. E.g.
$ cat schemata
MB:0=4294967295;1=4294967295
L3:0=fff;1=fff
> Instead of handling runtime discovery of the data_width for AMD platforms,
> remove the feature. These fields are always zero padded so should be
> harmless to remove if the whole field has been treated as a number.
> In the above example, this would now look like this:
Huzzah!
Reviewed-by: Tony Luck <tony.luck@intel.com>
-Tony
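
For readers unfamiliar with the tabular format being removed, here is a small
userspace sketch of the difference, assuming the padding is done with printf
field widths. fmt_padded() and fmt_plain() are hypothetical helpers, and the
widths are passed in by hand rather than discovered from the resource.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old style: right-justify the name and zero pad the value to data_width. */
static int fmt_padded(char *buf, size_t len, int name_width, int data_width,
		      const char *name, unsigned int val)
{
	return snprintf(buf, len, "%*s:0=%0*x", name_width, name,
			data_width, val);
}

/* New style: no alignment, no padding. */
static int fmt_plain(char *buf, size_t len, const char *name, unsigned int val)
{
	return snprintf(buf, len, "%s:0=%x", name, val);
}
```

With name_width and data_width both 4, fmt_padded() turns ("L3", 0xff) into
"  L3:0=00ff", while fmt_plain() gives "L3:0=ff". A parser that reads the
field as a number sees the same value either way.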
* Re: [PATCH v5 11/40] x86/resctrl: Export resctrl fs's init function
2024-10-04 18:03 ` [PATCH v5 11/40] x86/resctrl: Export resctrl fs's init function James Morse
@ 2024-10-16 16:20 ` Tony Luck
0 siblings, 0 replies; 102+ messages in thread
From: Tony Luck @ 2024-10-16 16:20 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:18PM +0000, James Morse wrote:
> rdtgroup_init() needs exporting so that arch code can call it once
^^^^^^^^^
> it lives in core code. As this is one of the few functions exported,
> rename it to have "resctrl" in the name. The same goes for the exit
> call.
>
> Rename x86's arch code init functions for RDT to have an arch
> prefix to make it clear these are part of the architecture code.
You aren't "exporting" these symbols in the Linux sense (as you
point out elsewhere in the series resctrl can't be built as a
module because the kernfs routines are not exported).
Maybe this commit description should just say it renames functions
to make it clear which are on the filesystem side of the boundary
and which are arch specific code?
-Tony
* Re: [PATCH v5 20/40] x86/resctrl: Slightly clean-up mbm_config_show()
2024-10-04 18:03 ` [PATCH v5 20/40] x86/resctrl: Slightly clean-up mbm_config_show() James Morse
@ 2024-10-16 16:50 ` Tony Luck
0 siblings, 0 replies; 102+ messages in thread
From: Tony Luck @ 2024-10-16 16:50 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Christophe JAILLET, Shaopeng Tan
On Fri, Oct 04, 2024 at 06:03:27PM +0000, James Morse wrote:
> From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
>
> 'mon_info' is already zeroed in the list_for_each_entry() loop below.
> There is no need to explicitly initialize it here. It just wastes some
> space and cycles.
>
> Remove this un-needed code.
Boris has applied this one to tip x86/cache. Drop it from the series.
-Tony
* Re: [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (41 preceding siblings ...)
2024-10-08 23:24 ` Tony Luck
@ 2024-10-17 17:43 ` Tony Luck
2024-12-06 7:17 ` Shaopeng Tan (Fujitsu)
43 siblings, 0 replies; 102+ messages in thread
From: Tony Luck @ 2024-10-17 17:43 UTC (permalink / raw)
To: James Morse
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin
Summary of my comments:
0002 - fragile, but simple fix
0004 - drop the helper function, or add NULL check?
0007 - needs rebase against tip x86/urgent
0011 - fix commit comment
0020 - already in tip x86/cache, drop from series
0030 - Use new RESCTRL_MAX_CLOSID to replace hard coded constant
0038 - duplicate #include
So nothing major.
Reviewed-by: Tony Luck <tony.luck@intel.com>
-Tony
* Re: [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format
2024-10-15 23:29 ` Tony Luck
@ 2024-10-18 17:07 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-18 17:07 UTC (permalink / raw)
To: Tony Luck
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Tony,
On 16/10/2024 00:29, Tony Luck wrote:
> On Fri, Oct 04, 2024 at 06:03:13PM +0000, James Morse wrote:
>> The resctrl architecture code provides a data_width for the controls of
>> each resource. This is used to zero pad all control values in the schemata
>> file so they appear in columns. The same is done with the resource names
>> to complete the visual effect. e.g.
>> | SMBA:0=2048
>> | L3:0=00ff
>>
>> AMD platforms discover their maximum bandwidth for the MB resource from
>> firmware, but hard-code the data_width to 4. If the maximum bandwidth
>> requires more digits - the tabular format is silently broken.
>> If new schema are added resctrl will need to be able to determine the
>> maximum width. The benefit of this pretty-printing is questionable.
>
> Agreed. It's particularly non-useful for L2 resources on systems with
> hundred+ cores. The L2 line in schemata is very long and doesn't look
> "pretty" at all. Padding may make it even longer.
> It never worked with the mba_MBps mount option because the field
> width wasn't updated for prettiness. E.g.
>
> $ cat schemata
> MB:0=4294967295;1=4294967295
> L3:0=fff;1=fff
Good point - I'll add that to the commit message.
>> Instead of handling runtime discovery of the data_width for AMD platforms,
>> remove the feature. These fields are always zero padded so should be
>> harmless to remove if the whole field has been treated as a number.
>> In the above example, this would now look like this:
>
> Huzzah!
>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
Thanks!
James
* Re: [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2024-10-08 16:40 ` Reinette Chatre
@ 2024-10-18 17:07 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-18 17:07 UTC (permalink / raw)
To: Reinette Chatre, Tony Luck
Cc: x86, linux-kernel, Fenghua Yu, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Tony, Reinette,
On 08/10/2024 17:40, Reinette Chatre wrote:
> On 10/7/24 5:00 PM, Tony Luck wrote:
>> On Fri, Oct 04, 2024 at 06:03:24PM +0000, James Morse wrote:
>>> The for_each_*_rdt_resource() helpers walk the architecture's array
>>> of structures, using the resctrl visible part as an iterator. These
>>> became over-complex when the structures were split into a
>>> filesystem and architecture-specific struct. This approach avoided
>>> the need to touch every call site, and was done before there was a
>>> helper to retrieve a resource by rid.
>>>
>>> Once the filesystem parts of resctrl are moved to /fs/, both the
>>> architecture's resource array, and the definition of those structures
>>> is no longer accessible. To support resctrl, each architecture would
>>> have to provide equally complex macros.
>>>
>>> Rewrite the macro to make use of resctrl_arch_get_resource(), and
>>> move these to the core header so existing x86 arch code continues
>>> to use them.
>> Apologies if this comment was suggested against earlier versions
>> of this series.
>>
>> Did you consider replacing rdt_resources_all[] with a list (in the filesystem
>> code) instead of an array (in the architecture code)?
I didn't consider this, but it would be a more natural fit for the secret for loops that
are all over the resctrl code.
>> List would start empty. Architecture init code would enumerate features
>> and add entries to the list for those that exist and are to be enabled.
That saves the 'can't return NULL' wart - but that was intended to be temporary - and only
a headache for !x86 architectures.
>> The "for_each" macros then walk the list (variants for all entries,
>> for "alloc_capable" and for "mon_capable"). Note that only enabled
>> entries appear on the lists.
>>
>> There are currently a bunch of places in filesystem code that
>> do:
>> r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
>> or
>> r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>>
>> those could become:
>>
>> r = resctrl_arch_get_mba_resource();
>>
>> r = resctrl_arch_get_l3_resource();
Where these walk this list instead of 'knowing' the offset.
(just in case I'm missing a trick here)
>> Then the whole "enum resctrl_res_level" and ->rid field in
>> struct rdt_resource could go away?
I think level is still going to be useful for cache resources - that is something we
expose via the sysfs cpu cache/indexX stuff too. I'd like resctrl to generate the names of
resources - just to ensure they are the same on every architecture.
The rid is an existing field just to make the array searching work.
>> Remaining uses look like
>> distinguishing MBA from SMBA. Perhaps better done with a
>> flags word?
>>
>> Advantage of doing this would be to avoid the generic
>> enum resctrl_res_level having to be a superset of all
>> features across all architectures.
Ah, I see this as an advantage - it's much harder for an architecture to add a new type of
control or resource than it is to provide compatibility with one that is already there.
This in turn is better for user-space.
MPAM's bandwidth controls don't have the same control format as Intel RDT - but it's much
better for everyone if I convert the values to hide the differences instead of trying to
shoehorn in ARM_MB as a new resource, only to find another architecture grows something
similar.
The difficult bit is making sure new resources/controls are as generic as possible,
meaning other architectures can adopt them. (L3's bitmap is a good example).
(and I agree there will always be platform specific things each camp has)
>> E.g. ARM might want to add L4/L5 resources,
/me shudders.
I've seen folk wanting to add the 'system cache' - which sits where the L3 should be, but
behaves differently. (And ACPI's "Memory Side Caches", which gives me a hilarious TLA
collision to navigate.)
I've argued neither of these are L<n> caches because they aren't visible to user-space in
/sys/devices/system/cpu/cpu0/cache ...
[..]
> Ideally resctrl fs would remain as an interface that a user can use to interact
> with all architectures without knowing architecture specific details. Platform
> differences can be exposed by resctrl in a generic way to support this.
> I am afraid that allowing architectures to diverge would require resctrl fs users
> to additionally know which platform they are running on.
>
>> If this v5 series is close to being applied then I don't
>> want to derail with a re-write at this late stage.
>> All of this could be done as a cleanup after this series
>> has been applied.
>
> Due to the already significant size of this work I think it would make it easier
> if the number of functional changes is minimal. Specifically, only those functional
> changes that are required to accomplish the goal of moving the code.
Yup - hence the need for !alloc_capable && !mon_capable resources behind
resctrl_arch_get_resource() - this is keeping the behaviour of the existing code.
> Considering that one goal of this proposal is to support architectural
> flexibility I do think it would be easier to understand its impact if it
> is implemented on top of the arch/fs split.
Makes sense to me,
Thanks,
James
* Re: [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2024-10-15 22:57 ` Tony Luck
@ 2024-10-18 17:07 ` James Morse
2024-10-18 17:14 ` Luck, Tony
0 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2024-10-18 17:07 UTC (permalink / raw)
To: Tony Luck
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Tony,
On 15/10/2024 23:57, Tony Luck wrote:
> On Fri, Oct 04, 2024 at 06:03:09PM +0000, James Morse wrote:
>> +struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
>> +{
>> + if (l >= RDT_NUM_RESOURCES)
>> + return NULL;
>> +
>> + return &rdt_resources_all[l].r_resctrl;
>> +}
>
> Is this a bit fragile if someone adds a new item in enum resctrl_res_level
> but doesn't add a new entry to struct rdt_hw_resource rdt_resources_all[]
> in arch/x86/kernel/cpu/resctrl/core.c
>
> Any caller of resctrl_arch_get_resource(new item name) will get past
> the check "if (l >= RDT_NUM_RESOURCES)" and then return a pointer past
> the end of the rdt_resources_all[] array.
>
> Maybe make sure the array is padded out to the right size?
>
> struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES - 1] = {
> ...
> };
Sure.
I was planning to do away with the 'must not return NULL' behaviour before extra resources
start appearing. It's done like this to avoid the churn when x86 supports 'all' the
resources anyway, but you're right, it can be less churn and safer at the same time!
Thanks,
James
* Re: [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values
2024-10-15 23:15 ` Tony Luck
@ 2024-10-18 17:07 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-10-18 17:07 UTC (permalink / raw)
To: Tony Luck
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Tony,
On 16/10/2024 00:15, Tony Luck wrote:
> On Fri, Oct 04, 2024 at 06:03:11PM +0000, James Morse wrote:
>> +static ctrlval_parser_t *get_parser(struct rdt_resource *r)
>> +{
>> + switch (r->schema_fmt) {
>> + case RESCTRL_SCHEMA_BITMAP:
>> + return &parse_cbm;
>> + case RESCTRL_SCHEMA_RANGE:
>> + return &parse_bw;
>> + }
>> +
>> + return NULL;
>> +}
>
> Is it really worth making this a helper function? It's only
> used once.
Moved. This was just to avoid bloating the caller with boiler-plate.
>> +
>> /*
>> * For each domain in this resource we expect to find a series of:
>> * id=mask
>> @@ -204,6 +225,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
>> static int parse_line(char *line, struct resctrl_schema *s,
>> struct rdtgroup *rdtgrp)
>> {
>> + ctrlval_parser_t *parse_ctrlval = get_parser(s->res);
>
> No check to see if get_parser() returned NULL.
No - but you must have passed it a non-existent enum value for that to happen, so we're
already in memory corruption territory. (I probably should have made get_parser()
WARN_ON_ONCE() when returning NULL)
>> enum resctrl_conf_type t = s->conf_type;
>> struct resctrl_staged_config *cfg;
>> struct rdt_resource *r = s->res;
>> @@ -235,7 +257,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
>> if (d->hdr.id == dom_id) {
>> data.buf = dom;
>> data.rdtgrp = rdtgrp;
>> - if (r->parse_ctrlval(&data, s, d))
>> + if (parse_ctrlval(&data, s, d))
>> return -EINVAL;
>
> Without the helper this could be:
>
> switch (r->schema_fmt) {
> case RESCTRL_SCHEMA_BITMAP:
> if (parse_cbm(&data, s, d))
> return -EINVAL;
> break;
> case RESCTRL_SCHEMA_RANGE:
> if (parse_bw(&data, s, d))
> return -EINVAL;
> break;
> default:
> WARN_ON_ONCE(1);
> return -EINVAL;
> }
I'd prefer the switch statement to have no default so that it triggers a compiler warning
when future enum entries are added. This way the compiler can find cases where a new
schema format missed a bit - the result doesn't need to be booted on hardware to trigger a
warning.
To avoid 'break' in a loop not breaking out of the loop, and to avoid bloating the loop
I've kept the function pointer so the non-existent enum case is handled with the rest of
the errors at the top of the function:
| /* Walking r->domains, ensure it can't race with cpuhp */
| lockdep_assert_cpus_held();
|
| switch (r->schema_fmt) {
| case RESCTRL_SCHEMA_BITMAP:
| parse_ctrlval = &parse_cbm;
| break;
| case RESCTRL_SCHEMA_RANGE:
| parse_ctrlval = &parse_bw;
| break;
| }
|
| if (WARN_ON_ONCE(!parse_ctrlval))
| return -EINVAL;
|
| if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
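The pattern described above (a switch with no default so the compiler flags unhandled enumerators, plus a single NULL check before the loop) can be sketched in isolation as below. This is only an illustrative user-space sketch: enum schema_fmt, parser_fn, select_parser() and parse_line_sketch() are made-up stand-ins, not the kernel's types or functions, and fprintf() stands in for WARN_ON_ONCE().

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the schema formats discussed in this thread. */
enum schema_fmt {
	SCHEMA_BITMAP,
	SCHEMA_RANGE,
};

/* Hypothetical parser signature; the kernel's ctrlval_parser_t takes more. */
typedef int (*parser_fn)(const char *buf);

static int parse_bitmap(const char *buf) { (void)buf; return 0; }
static int parse_range(const char *buf)  { (void)buf; return 1; }

/*
 * No 'default:' label, so a compiler with -Wswitch (enabled by -Wall on
 * GCC and Clang) warns if an enumerator is added but not handled here.
 */
static parser_fn select_parser(enum schema_fmt fmt)
{
	parser_fn fn = NULL;

	switch (fmt) {
	case SCHEMA_BITMAP:
		fn = parse_bitmap;
		break;
	case SCHEMA_RANGE:
		fn = parse_range;
		break;
	}

	return fn;
}

/* Reject an unknown format once, up front, before any per-domain loop. */
static int parse_line_sketch(enum schema_fmt fmt, const char *line)
{
	parser_fn fn = select_parser(fmt);

	if (!fn) {
		fprintf(stderr, "unknown schema format %d\n", (int)fmt);
		return -1;
	}

	return fn(line);
}
```

Keeping the switch outside the loop also sidesteps the 'break' ambiguity mentioned above, since the loop body only ever calls through the pointer.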
Thanks,
James
* RE: [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2024-10-18 17:07 ` James Morse
@ 2024-10-18 17:14 ` Luck, Tony
0 siblings, 0 replies; 102+ messages in thread
From: Luck, Tony @ 2024-10-18 17:14 UTC (permalink / raw)
To: James Morse
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Yu, Fenghua,
Chatre, Reinette, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi@huawei.com,
D Scott Phillips OS, carl@os.amperecomputing.com,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
tan.shaopeng@fujitsu.com, baolin.wang@linux.alibaba.com,
Jamie Iles, Xin Hao, peternewman@google.com,
dfustini@baylibre.com, amitsinght@marvell.com, David Hildenbrand,
Rex Nie, Dave Martin, Shaopeng Tan
> > Maybe make sure the array is padded out to the right size?
> >
> > struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES - 1] = {
Thinko on my part. The " - 1" is wrong. Array size must be RDT_NUM_RESOURCES.
> > ...
> > };
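To illustrate the corrected sizing, here is a minimal stand-alone sketch, assuming simplified stand-in types (enum res_level, struct resource, resources_all[] and get_resource() are hypothetical and only mirror the shape of the code under review). With the array dimension given as the enum's terminating value, an enumerator without an initialiser still gets a zero-filled slot, so the bounds check in the helper is sufficient.

```c
#include <stddef.h>

enum res_level {
	RES_L3,
	RES_L2,
	RES_MBA,
	RES_NEW,	/* hypothetical entry added without an initialiser */
	NUM_RESOURCES,	/* must be last */
};

struct resource {
	int rid;
	int alloc_capable;
	int mon_capable;
	const char *name;
};

/*
 * Explicit size: entries without an initialiser are zero-filled, so
 * RES_NEW is a valid, not-capable element rather than out of bounds.
 */
static struct resource resources_all[NUM_RESOURCES] = {
	[RES_L3]  = { .rid = RES_L3,  .alloc_capable = 1, .name = "L3" },
	[RES_L2]  = { .rid = RES_L2,  .alloc_capable = 1, .name = "L2" },
	[RES_MBA] = { .rid = RES_MBA, .alloc_capable = 1, .name = "MB" },
};

static struct resource *get_resource(enum res_level l)
{
	if (l >= NUM_RESOURCES)
		return NULL;

	return &resources_all[l];
}
```

One caveat with this sketch: the zero-filled slot also has rid == 0, so any code that relies on rid would still want it set explicitly for every entry.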
>
> Sure.
>
> I was planning to do away with the 'must not return NULL' behaviour before extra resources
> start appearing. It's done like this to avoid the churn when x86 supports 'all' the
> resources anyway, but you're right, it can be less churn and safer at the same time!
-Tony
* Re: [PATCH v5 05/40] x86/resctrl: Use schema type to determine the schema format string
2024-10-04 18:03 ` [PATCH v5 05/40] x86/resctrl: Use schema type to determine the schema format string James Morse
@ 2024-10-21 17:39 ` Reinette Chatre
0 siblings, 0 replies; 102+ messages in thread
From: Reinette Chatre @ 2024-10-21 17:39 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 11153271cbdc..896350e9fb32 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2600,6 +2600,15 @@ static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type
> if (cl > max_name_width)
> max_name_width = cl;
>
> + switch (r->schema_fmt) {
> + case RESCTRL_SCHEMA_BITMAP:
> + s->fmt_str = "%d=%0*x";
> + break;
> + case RESCTRL_SCHEMA_RANGE:
> + s->fmt_str = "%d=%0*u";
> + break;
> + }
> +
The parsing of user input happens after the creation of the schema list. If the goal
is to protect against incorrect arch settings then I think schemata_list_add() initialization of
s->fmt_str needs similar WARN_ON_ONCE() treatment as planned [1] for parse_ctrlval within
parse_line().
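As a rough user-space illustration of that point, the sketch below picks the format string from the schema format and refuses to format anything if the value is outside the enum. The two format strings are the ones from the quoted hunk; everything else (enum schema_fmt, schema_fmt_str(), format_domain()) is a hypothetical stand-in, and the plain error return stands in for a WARN_ON_ONCE().

```c
#include <stddef.h>
#include <stdio.h>

enum schema_fmt {
	SCHEMA_BITMAP,
	SCHEMA_RANGE,
};

/* Returns NULL for a value outside the enum so the caller can bail out. */
static const char *schema_fmt_str(enum schema_fmt fmt)
{
	switch (fmt) {
	case SCHEMA_BITMAP:
		return "%d=%0*x";	/* domain id, zero-padded hex mask */
	case SCHEMA_RANGE:
		return "%d=%0*u";	/* domain id, zero-padded decimal */
	}

	return NULL;
}

/* Format one "id=value" pair; -1 means the schema format was unknown. */
static int format_domain(char *buf, size_t len, enum schema_fmt fmt,
			 int dom_id, int width, unsigned int val)
{
	const char *fmt_str = schema_fmt_str(fmt);

	if (!fmt_str)
		return -1;

	return snprintf(buf, len, fmt_str, dom_id, width, val);
}
```

Checking at the point where the string is chosen means a bad arch setting is caught when the schema list is built, before any user input is parsed.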
Reinette
[1] https://lore.kernel.org/all/d48e65cc-3c7b-4a93-80a2-fa0d676e88c4@arm.com/
* Re: [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list
2024-10-04 18:03 ` [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
2024-10-15 22:57 ` Tony Luck
@ 2024-10-23 21:03 ` Reinette Chatre
1 sibling, 0 replies; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:03 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> Resctrl occasionally wants to know something about a specific resource,
> in these cases it reaches into the arch code's rdt_resources_all[]
> array.
>
> Once the filesystem parts of resctrl are moved to /fs/, this means it
> will need visibility of the architecture specific struct
> rdt_hw_resource definition, and the array of all resources. All
> architectures would also need a r_resctrl member in this struct.
>
> Instead, abstract this via a helper to allow architectures to do
> different things here. Move the level enum to the resctrl header and
> add a helper to retrieve the struct rdt_resource by 'rid'.
>
> resctrl_arch_get_resource() should not return NULL for any value in
> the enum, it may instead return a dummy resource that is
> !alloc_enabled && !mon_enabled.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
There are duplicate "Tested-by" tags from Shaopeng.
(found by checkpatch.pl)
Reinette
* Re: [PATCH v5 03/40] x86/resctrl: Remove fflags from struct rdt_resource
2024-10-04 18:03 ` [PATCH v5 03/40] x86/resctrl: Remove fflags from struct rdt_resource James Morse
@ 2024-10-23 21:03 ` Reinette Chatre
2024-12-20 18:10 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:03 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> The resctrl arch code specifies whether a resource controls a cache or
> memory using the fflags field. This field is then used by resctrl to
> determine which files should be exposed in the filesystem.
>
> Allowing the architecture to pick this value means the RFTYPE_
> flags have to be in a shared header, and allows an architecture
> to create a combination that resctrl does not support.
>
> Remove the fflags field, and pick the value based on the resource
> id.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v4:
> * Removed an extra space
Looks like this fixup was squashed into the next patch instead.
> * Fixed a typo
> ---
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 6225d0b7e9ee..2abe17574407 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2160,6 +2160,20 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
> return ret;
> }
>
> +static u32 fflags_from_resource(struct rdt_resource *r)
What is the motivation for the return type of u32? I am trying to understand why this is needed
considering the value returned, variable it is assigned to, and the functions that use it
(rdtgroup_mkdir_info_resdir() and rdtgroup_add_files()) all use unsigned long.
> +{
> + switch (r->rid) {
> + case RDT_RESOURCE_L3:
> + case RDT_RESOURCE_L2:
> + return RFTYPE_RES_CACHE;
> + case RDT_RESOURCE_MBA:
> + case RDT_RESOURCE_SMBA:
> + return RFTYPE_RES_MB;
> + }
> +
> + return WARN_ON_ONCE(1);
> +}
> +
> static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> {
> struct resctrl_schema *s;
> @@ -2180,14 +2194,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> /* loop over enabled controls, these are all alloc_capable */
> list_for_each_entry(s, &resctrl_schema_all, list) {
> r = s->res;
> - fflags = r->fflags | RFTYPE_CTRL_INFO;
> + fflags = fflags_from_resource(r) | RFTYPE_CTRL_INFO;
> ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
> if (ret)
> goto out_destroy;
> }
>
> for_each_mon_capable_rdt_resource(r) {
> - fflags = r->fflags | RFTYPE_MON_INFO;
> + fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
Fixup did not make it here.
Reinette
* Re: [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values
2024-10-04 18:03 ` [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values James Morse
2024-10-15 23:15 ` Tony Luck
@ 2024-10-23 21:14 ` Reinette Chatre
2024-12-20 18:10 ` James Morse
1 sibling, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:14 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 2abe17574407..11153271cbdc 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2201,7 +2201,7 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> }
>
> for_each_mon_capable_rdt_resource(r) {
> - fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
> + fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
> sprintf(name, "%s_MON", r->name);
> ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
> if (ret)
Stray hunk.
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 496ddcaa4ecf..54ec87339038 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -183,7 +183,6 @@ struct resctrl_membw {
> u32 *mb_map;
> };
>
> -struct rdt_parse_data;
> struct resctrl_schema;
>
> enum resctrl_scope {
> @@ -192,6 +191,17 @@ enum resctrl_scope {
> RESCTRL_L3_NODE,
> };
>
> +/**
> + * enum resctrl_schema_fmt - The format user-space provides for a schema.
> + * @RESCTRL_SCHEMA_BITMAP: The schema is a bitmap in hex.
> + * @RESCTRL_SCHEMA_RANGE: The schema is a number, either a percentage
> + * or a MBps value.
The description of RESCTRL_SCHEMA_RANGE appears to aim to be specific. Considering this
it should also include the "multiples of one eighth GB/s" input option used on
AMD systems.
The software controller is the only user of actual bandwidth and for its
use it should be "MiBps".
> + */
> +enum resctrl_schema_fmt {
> + RESCTRL_SCHEMA_BITMAP,
> + RESCTRL_SCHEMA_RANGE,
> +};
> +
> /**
> * struct rdt_resource - attributes of a resctrl resource
> * @rid: The index of the resource
> @@ -208,7 +218,7 @@ enum resctrl_scope {
> * @data_width: Character width of data when displaying
> * @default_ctrl: Specifies default cache cbm or memory B/W percent.
> * @format_str: Per resource format string to show domain value
> - * @parse_ctrlval: Per resource function pointer to parse control values
> + * @schema_fmt: Which format string and parser is used for this schema.
Please fix alignment.
> * @evt_list: List of monitoring events
> * @cdp_capable: Is the CDP feature available on this resource
> */
> @@ -227,9 +237,7 @@ struct rdt_resource {
> int data_width;
> u32 default_ctrl;
> const char *format_str;
> - int (*parse_ctrlval)(struct rdt_parse_data *data,
> - struct resctrl_schema *s,
> - struct rdt_ctrl_domain *d);
> + enum resctrl_schema_fmt schema_fmt;
> struct list_head evt_list;
> bool cdp_capable;
> };
Reinette
* Re: [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw
2024-10-04 18:03 ` [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
2024-10-09 18:02 ` Tony Luck
@ 2024-10-23 21:14 ` Reinette Chatre
2024-12-20 18:10 ` James Morse
1 sibling, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:14 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> __rdt_get_mem_config_amd() and __get_mem_config_intel() both use
> the default_ctrl property as a maximum value. This is because the
> MBA schema works differently between these platforms. Doing this
The schema works differently but they can still use the same property
as maximum, is that a problem?
> complicates determining whether the default_ctrl property belongs
> to the arch code, or can be derived from the schema format.
So instead of Intel and AMD both using default_ctrl as a maximum this patch
introduces a new max_bw with both using that as maximum instead.
It is unclear how this change addresses that complication.
>
> Add a max_bw property for x86 platforms to specify their maximum
> MBA bandwidth. This isn't needed for other schema formats.
It is not clear to me how replacing one value with a new value that is
used in exactly the same way addresses the initial complaint of complication.
>
> This will allow the default_ctrl to be generated from the schema
> properties when it is needed.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v2:
> * This patch is new.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 3 +++
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 9 +++++----
> include/linux/resctrl.h | 2 ++
> 3 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 4c16e58c4a1b..e79807a8f060 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -212,6 +212,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
> hw_res->num_closid = edx.split.cos_max + 1;
> max_delay = eax.split.max_delay + 1;
> r->default_ctrl = MAX_MBA_BW;
> + r->membw.max_bw = MAX_MBA_BW;
> r->membw.arch_needs_linear = true;
> if (ecx & MBA_IS_LINEAR) {
> r->membw.delay_linear = true;
> @@ -248,6 +249,8 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
> cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
> hw_res->num_closid = edx + 1;
> r->default_ctrl = 1 << eax;
> + r->schema_fmt = RESCTRL_SCHEMA_RANGE;
Stray change?
> + r->membw.max_bw = 1 << eax;
>
> /* AMD does not use delay */
> r->membw.delay_linear = false;
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 8d1bdfe89692..56c41bfd07e4 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -57,10 +57,10 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
> return false;
> }
>
> - if ((bw < r->membw.min_bw || bw > r->default_ctrl) &&
> + if ((bw < r->membw.min_bw || bw > r->membw.max_bw) &&
> !is_mba_sc(r)) {
> rdt_last_cmd_printf("MB value %ld out of range [%d,%d]\n", bw,
> - r->membw.min_bw, r->default_ctrl);
> + r->membw.min_bw, r->membw.max_bw);
> return false;
> }
>
> @@ -108,8 +108,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
> */
> static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> {
> - unsigned long first_bit, zero_bit, val;
> + u32 supported_bits = BIT_MASK(r->cache.cbm_len + 1) - 1;
Same issue as V4:
https://lore.kernel.org/all/ca528ebd-fb76-40cd-a495-88c2de443cd8@intel.com/
> unsigned int cbm_len = r->cache.cbm_len;
> + unsigned long first_bit, zero_bit, val;
> int ret;
>
> ret = kstrtoul(buf, 16, &val);
> @@ -118,7 +119,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> return false;
> }
>
> - if ((r->cache.min_cbm_bits > 0 && val == 0) || val > r->default_ctrl) {
> + if ((r->cache.min_cbm_bits > 0 && val == 0) || val > supported_bits) {
> rdt_last_cmd_puts("Mask out of range\n");
> return false;
> }
The above two changes have nothing to do with memory bandwidth. They are unrelated
to the changelog.
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 0f61673c9165..b66cd977b658 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -165,6 +165,7 @@ enum membw_throttle_mode {
> /**
> * struct resctrl_membw - Memory bandwidth allocation related data
> * @min_bw: Minimum memory bandwidth percentage user can request
> + * @max_bw: Maximum memory bandwidth value, used as the reset value
> * @bw_gran: Granularity at which the memory bandwidth is allocated
> * @delay_linear: True if memory B/W delay is in linear scale
> * @arch_needs_linear: True if we can't configure non-linear resources
> @@ -175,6 +176,7 @@ enum membw_throttle_mode {
> */
> struct resctrl_membw {
> u32 min_bw;
> + u32 max_bw;
> u32 bw_gran;
> u32 delay_linear;
> bool arch_needs_linear;
Reinette
* Re: [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it
2024-10-04 18:03 ` [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
@ 2024-10-23 21:15 ` Reinette Chatre
2025-02-07 15:42 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:15 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> The struct rdt_resource default_ctrl is used by both the architecture
> code for resetting the hardware controls, and by the filesystem parts
> of resctrl to report to user-space.
This is not accurate. The hardware controls may be different from
what is reported to user-space, for example when MBA software controller
is active.
>
> This means the value has to be shared, but might not match the
> properties of the control. e.g. a percentage greater than 100.
Are you referring to software controller here? When would this not
match properties of control? That this value may not match properties
of control contradicts first paragraph that this is used to reset the
control ...
>
> Instead, determine the default control value from a shared helper
> resctrl_get_default_ctrl() that uses the schema properties to
> determine the correct value.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
Reinette
* Re: [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain()
2024-10-04 18:03 ` [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
@ 2024-10-23 21:16 ` Reinette Chatre
2025-02-07 15:42 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:16 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> rdt_find_domain() finds a domain given a resource and a cache-id.
> It's not quite right for the resctrl arch API as it also returns the
> position to insert a new domain, which is needed when bringing a
> domain online in the arch code.
>
> Wrap rdt_find_domain() in another function resctrl_arch_find_domain()
> in order to avoid the unnecessary argument outside the arch code.
I do not understand the motivation for this split. There does not seem to be
anything arch specific about rdt_find_domain(). Why does this need to
be arch API? Why can resctrl and all archs not use rdt_find_domain()?
Will MPAM organize domains differently somehow?
Reinette
* Re: [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources
2024-10-04 18:03 ` [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
@ 2024-10-23 21:32 ` Reinette Chatre
2025-02-07 15:43 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:32 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> On umount(), resctrl resets each resource back to its default
> configuration. It only ever does this for all resources in one go.
>
> reset_all_ctrls() is architecture specific as it works with struct
> rdt_hw_resource.
>
> Add an architecture helper to reset all resources.
>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v1:
> * Rename the for_each_capable_rdt_resource() introduced in the new
> function resctrl_arch_reset_resources(), back to
> for_each_alloc_capable_rdt_resource() as it was in the original code.
>
> The change looked unintentional; and presumably a resource that does
> not support resource allocation doesn't have any properties to
> reset...
> ---
> arch/x86/include/asm/resctrl.h | 2 ++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 +++++++++++-----
> 2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
> index 52f2326e2b1e..5622943f6354 100644
> --- a/arch/x86/include/asm/resctrl.h
> +++ b/arch/x86/include/asm/resctrl.h
> @@ -16,6 +16,8 @@
> */
> #define X86_RESCTRL_EMPTY_CLOSID ((u32)~0)
>
> +void resctrl_arch_reset_resources(void);
> +
> /**
> * struct resctrl_pqr_state - State cache for the PQR MSR
> * @cur_rmid: The cached Resource Monitoring ID
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 61c8add103fe..a15198f90b29 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2883,6 +2883,14 @@ static int reset_all_ctrls(struct rdt_resource *r)
> return 0;
> }
>
> +void resctrl_arch_reset_resources(void)
> +{
> + struct rdt_resource *r;
> +
> + for_each_alloc_capable_rdt_resource(r)
> + reset_all_ctrls(r);
> +}
Wouldn't this require all archs to have a duplicate helper as above with
only reset_all_ctrls() actually being arch specific?
What if it is instead:
resctrl_reset_alloc_resources() or reset_alloc_resources() or ...
{
struct rdt_resource *r;
for_each_alloc_capable_rdt_resource(r)
resctrl_arch_reset_all_ctrls(r);
}
With above archs only need to implement the actual reset code.
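A stand-alone sketch of that split, with mocked-up data so it can run in user space: the walk over allocation-capable resources is written once on the "fs" side, and only the per-resource reset is left to the "arch" side. The names here (struct resource, arch_reset_all_ctrls(), reset_alloc_resources()) are illustrative assumptions, not the names the series settles on.

```c
#include <stddef.h>

#define NUM_RESOURCES 3

struct resource {
	int rid;
	int alloc_capable;
	unsigned int ctrl_val;		/* current (mock) control value */
	unsigned int default_ctrl;	/* value to restore on reset */
};

static struct resource resources_all[NUM_RESOURCES] = {
	{ .rid = 0, .alloc_capable = 1, .ctrl_val = 0x0f, .default_ctrl = 0xff },
	{ .rid = 1, .alloc_capable = 0, .ctrl_val = 7,    .default_ctrl = 0 },
	{ .rid = 2, .alloc_capable = 1, .ctrl_val = 30,   .default_ctrl = 100 },
};

/* "arch" side: the only part each architecture would have to provide. */
static void arch_reset_all_ctrls(struct resource *r)
{
	r->ctrl_val = r->default_ctrl;
}

/* "fs" side: generic walk, written once. Returns how many were reset. */
static int reset_alloc_resources(void)
{
	int i, n = 0;

	for (i = 0; i < NUM_RESOURCES; i++) {
		struct resource *r = &resources_all[i];

		if (!r->alloc_capable)
			continue;

		arch_reset_all_ctrls(r);
		n++;
	}

	return n;
}
```

In this arrangement a second architecture would add its own arch_reset_all_ctrls() and nothing else.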
> +
> /*
> * Move tasks from one to the other group. If @from is NULL, then all tasks
> * in the systems are moved unconditionally (used for teardown).
> @@ -2992,16 +3000,14 @@ static void rmdir_all_sub(void)
>
> static void rdt_kill_sb(struct super_block *sb)
> {
> - struct rdt_resource *r;
> -
> cpus_read_lock();
> mutex_lock(&rdtgroup_mutex);
>
> rdt_disable_ctx();
>
> - /*Put everything back to default values. */
> - for_each_alloc_capable_rdt_resource(r)
> - reset_all_ctrls(r);
> + /* Put everything back to default values. */
> + resctrl_arch_reset_resources();
> +
> rmdir_all_sub();
> rdt_pseudo_lock_release();
> rdtgroup_default.mode = RDT_MODE_SHAREABLE;
Reinette
* Re: [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2024-10-04 18:03 ` [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
2024-10-08 0:00 ` Tony Luck
@ 2024-10-23 21:51 ` Reinette Chatre
2025-02-07 15:44 ` James Morse
1 sibling, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 21:51 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> The for_each_*_rdt_resource() helpers walk the architecture's array
> of structures, using the resctrl visible part as an iterator. These
> became over-complex when the structures were split into a
> filesystem and architecture-specific struct. This approach avoided
> the need to touch every call site, and was done before there was a
> helper to retrieve a resource by rid.
>
> Once the filesystem parts of resctrl are moved to /fs/, both the
> architecture's resource array, and the definition of those structures
> is no longer accessible. To support resctrl, each architecture would
> have to provide equally complex macros.
>
> Rewrite the macro to make use of resctrl_arch_get_resource(), and
> move these to the core header so existing x86 arch code continues
> to use them.
The last part is not clear, why does it need to be moved to core
header for x86 to use it?
...
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 8894aed3c593..f75f0409ae09 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -26,6 +26,24 @@ int proc_resctrl_show(struct seq_file *m,
> /* max value for struct rdt_domain's mbps_val */
> #define MBA_MAX_MBPS U32_MAX
>
> +/* Walk all possible resources, with variants for only controls or monitors. */
> +#define for_each_rdt_resource(_r) \
> + for ((_r) = resctrl_arch_get_resource(0); \
> + (_r)->rid < RDT_NUM_RESOURCES - 1; \
I do not think this reaches all resources ... should this perhaps be:
(_r) && (_r)->rid < RDT_NUM_RESOURCES
> + (_r) = resctrl_arch_get_resource((_r)->rid + 1))
> +
> +#define for_each_capable_rdt_resource(r) \
> + for_each_rdt_resource((r)) \
> + if ((r)->alloc_capable || (r)->mon_capable)
> +
> +#define for_each_alloc_capable_rdt_resource(r) \
> + for_each_rdt_resource((r)) \
> + if ((r)->alloc_capable)
> +
> +#define for_each_mon_capable_rdt_resource(r) \
> + for_each_rdt_resource((r)) \
> + if ((r)->mon_capable)
> +
> /**
> * enum resctrl_conf_type - The type of configuration.
> * @CDP_NONE: No prioritisation, both code and data are controlled or monitored.
Reinette
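
The loop-bound concern above can be reproduced outside the kernel. The following is a minimal sketch, not the kernel code: `struct resource`, `get_resource()`, `NUM_RESOURCES` and both macro names are hypothetical stand-ins, and the lookup is assumed to return NULL for an out-of-range id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only what the loop needs. */
enum { NUM_RESOURCES = 4 };

struct resource {
	int rid;
};

static struct resource resources[NUM_RESOURCES] = {
	{ .rid = 0 }, { .rid = 1 }, { .rid = 2 }, { .rid = 3 },
};

/* Assumption: an out-of-range id yields NULL. */
static struct resource *get_resource(int rid)
{
	if (rid < 0 || rid >= NUM_RESOURCES)
		return NULL;
	return &resources[rid];
}

/* Loop condition as posted: stops before the final resource. */
#define for_each_resource_posted(_r)			\
	for ((_r) = get_resource(0);			\
	     (_r)->rid < NUM_RESOURCES - 1;		\
	     (_r) = get_resource((_r)->rid + 1))

/* Loop condition as suggested in review: NULL-checked, full range. */
#define for_each_resource_suggested(_r)			\
	for ((_r) = get_resource(0);			\
	     (_r) && (_r)->rid < NUM_RESOURCES;		\
	     (_r) = get_resource((_r)->rid + 1))

static int count_posted(void)
{
	struct resource *r;
	int n = 0;

	/* The posted condition dereferences the first lookup unconditionally. */
	assert(get_resource(0) != NULL);
	for_each_resource_posted(r)
		n++;
	return n;
}

static int count_suggested(void)
{
	struct resource *r;
	int n = 0;

	for_each_resource_suggested(r)
		n++;
	return n;
}
```

With four mock resources, `count_posted()` visits three and never reaches rid 3, while `count_suggested()` visits all four and terminates on the NULL returned for rid 4.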
* Re: [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h
2024-10-04 18:03 ` [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
@ 2024-10-23 22:00 ` Reinette Chatre
2025-02-07 15:44 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:00 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> The architecture specific parts of resctrl have helpers to hide accesses
> to the rdt_mon_features bitmap.
hmmmm ... no ... this patch creates those helpers.
>
> Once the filesystem parts of resctrl are moved, these can no longer live
> in internal.h. Once these are exposed to the wider kernel, they should
> have a 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
>
> Move and rename the helpers that touch rdt_mon_features directly.
> is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
> so can be moved into that file.
There seems to be a contradiction here ... earlier patch moved the
event IDs to common header so this makes these events shared between
resctrl and all archs. rdt_mon_features bitmap positions are
the common event IDs. Why should rdt_mon_features thus be considered arch
specific if bits that can be set are not?
The patch may be ok if MPAM wants to do something different here but
motivating it as "this is arch specific and needs to be hidden by helpers"
is a stretch since there is nothing arch specific about it.
Reinette
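
To illustrate the point that the bitmap positions are just the shared event IDs, here is a small sketch of helpers built on such a bitmap. Everything in it is hypothetical (the enum, its values, the helper names); it only shows that nothing in the bit test itself depends on the architecture.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative event IDs, assumed to be shared by filesystem and arch code. */
enum event_id {
	EVT_L3_OCCUPANCY = 1,
	EVT_MBM_TOTAL    = 2,
	EVT_MBM_LOCAL    = 3,
};

/* One bit per event ID, so the bit positions are the common event IDs. */
static unsigned int mon_features;

static void mon_feature_set(enum event_id evt)
{
	/* Keep the shift inside the width of the bitmap. */
	assert(evt > 0 && evt < 32);
	mon_features |= 1U << evt;
}

static bool mon_feature_enabled(enum event_id evt)
{
	return (mon_features & (1U << evt)) != 0;
}

static bool mbm_total_enabled(void)
{
	return mon_feature_enabled(EVT_MBM_TOTAL);
}

static bool mbm_local_enabled(void)
{
	return mon_feature_enabled(EVT_MBM_LOCAL);
}

static bool any_mbm_enabled(void)
{
	return mbm_total_enabled() || mbm_local_enabled();
}
```

Whether such helpers live behind a 'resctrl_arch_' prefix is then a question of where the bitmap is stored, not of what the bits mean.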
* Re: [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
2024-10-04 18:03 ` [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
@ 2024-10-23 22:04 ` Reinette Chatre
2025-02-07 15:44 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:04 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> When BMEC is supported the resctrl event can be configured in a number
> of ways. This depends on architecture support. rdt_get_mon_l3_config()
> modifies the struct mon_evt and calls mbm_config_rftype_init() to create
> the files that allow the configuration.
>
> Splitting this into separate architecture and filesystem parts would
> require the struct mon_evt and mbm_config_rftype_init() to be exposed.
>
> Instead, add resctrl_arch_is_evt_configurable(), and use this from
> resctrl_mon_resource_init() to initialise struct mon_evt and call
> mbm_config_rftype_init().
> resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
> obviously benefit from being inlined. Putting it in core.c will allow
> rdt_cpu_has() to eventually become static.
Why bother with rdt_cpu_has() when there are all those helpers available
from previous patch?
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reinette
* Re: [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2024-10-04 18:03 ` [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
@ 2024-10-23 22:19 ` Reinette Chatre
2025-02-07 15:45 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:19 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> @@ -315,6 +322,24 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
>
> bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
>
> +/**
> + * resctrl_arch_mon_event_config_write() - Write the config for a counter.
Please avoid the term "counter" for events ... the upcoming AMD work adds support
for counters*.
> + * @info: struct resctrl_mon_config_info describing the resource, domain
> + * and event.
Expected "config" to appear as part of the information about a function
that claims to write config?
> + *
> + * Must be called on a CPU that is a member of the specified domain.
I am not sure about this. Is this documentation intended to support authors of
arch code? In that case it may be instead useful to know that this function will be
called on a CPU that is a member of the specified domain to avoid confusion from arch
side whether it needs to take some action to ensure function is called on right CPU.
Called on a CPU that is a member of the specified domain.
> + */
> +void resctrl_arch_mon_event_config_write(void *info);
> +
> +/**
> + * resctrl_arch_mon_event_config_read() - Read the config for a counter.
counter -> event
> + * @info: struct resctrl_mon_config_info describing the resource, domain
> + * and event.
Copy&paste? No information on how config is returned.
> + *
> + * Must be called on a CPU that is a member of the specified domain.
Same comment as above.
> + */
> +void resctrl_arch_mon_event_config_read(void *info);
> +
> /*
> * Update the ctrl_val and apply this config right now.
> * Must be called on one of the domain's CPUs.
Reinette
* Awaiting Arm's feedback on whether that works for MPAM.
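
The documentation questions above (where the config value lives, and how a read returns it) come from passing everything through a single void pointer. A rough sketch of that calling convention follows; the struct layout, field names and function names are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_EVENTS = 4 };

/* Hypothetical per-domain state: one config word per event. */
struct mon_domain {
	uint32_t event_config[MAX_EVENTS];
};

/* Hypothetical argument block handed over as a void pointer. */
struct mon_config_info {
	struct mon_domain *d;
	unsigned int evtid;
	uint32_t mon_config;	/* input for write, output for read */
};

/* Assumed to be called on a CPU that is a member of info->d. */
static void mon_event_config_write(void *_info)
{
	struct mon_config_info *info = _info;

	assert(info->evtid < MAX_EVENTS);
	info->d->event_config[info->evtid] = info->mon_config;
}

/* Assumed to be called on a CPU that is a member of info->d. */
static void mon_event_config_read(void *_info)
{
	struct mon_config_info *info = _info;

	assert(info->evtid < MAX_EVENTS);
	info->mon_config = info->d->event_config[info->evtid];
}
```

In this shape the kernel-doc for the read helper would need to say that the result is returned through a field of the info structure, which is the detail the review asks for.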
* Re: [PATCH v5 22/40] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource
2024-10-04 18:03 ` [PATCH v5 22/40] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
@ 2024-10-23 22:42 ` Reinette Chatre
0 siblings, 0 replies; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:42 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> The mbm_cfg_mask field lists the bits that user-space can set when
> configuring an event. This value is output via the last_cmd_status
> file.
>
> Once the filesystem parts of resctrl are moved to live in /fs/, the
> struct rdt_hw_resource is inaccessible to the filesystem code. Because
> this value is output to user-space, it has to be accessible to the
> filesystem code.
>
> Move it to struct rdt_resource.
>
fyi:
https://lore.kernel.org/all/fc424d4bee8ac9887703fecbaad26dba3c633f72.1728495588.git.babu.moger@amd.com/
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 6b076216911c..92a94939cf93 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -415,8 +415,6 @@ struct msr_param {
> * @msr_update: Function pointer to update QOS MSRs
> * @mon_scale: cqm counter * mon_scale = occupancy in bytes
> * @mbm_width: Monitor width, to detect and correct for overflow.
> - * @mbm_cfg_mask: Bandwidth sources that can be tracked when Bandwidth
> - * Monitoring Event Configuration (BMEC) is supported.
> * @cdp_enabled: CDP state of this resource
> *
> * Members of this structure are either private to the architecture
> @@ -430,7 +428,6 @@ struct rdt_hw_resource {
> void (*msr_update)(struct msr_param *m);
> unsigned int mon_scale;
> unsigned int mbm_width;
> - unsigned int mbm_cfg_mask;
> bool cdp_enabled;
> };
>
...
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 0072c2e5947f..84588ab1994d 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -218,6 +218,8 @@ enum resctrl_schema_fmt {
> * @name: Name to use in "schemata" file.
> * @schema_fmt: Which format string and parser is used for this schema.
> * @evt_list: List of monitoring events
> + * @mbm_cfg_mask: Bandwidth sources that can be tracked when Bandwidth
> + * Monitoring Event Configuration (BMEC) is supported.
Making this arch specific feature something all archs can support is probably the
right time to remove the AMD marketing name. Maybe just:
Bandwidth sources that can be tracked when memory bandwidth
monitoring events can be configured.
Please feel free to improve.
Reinette
* Re: [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
2024-10-04 18:03 ` [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
@ 2024-10-23 22:44 ` Reinette Chatre
2025-02-07 15:45 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:44 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> resctrl's pseudo lock has some copy-to-cache and measurement
> functions that are micro-architecture specific. pseudo_lock_fn()
> is not at all portable. Label these 'resctrl_arch_' so they stay
> under /arch/x86.
The changelog does not mention why static is also dropped during the rename,
or why the functions are moved to a header file while no call sites are changed.
Reinette
* Re: [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code
2024-10-04 18:03 ` [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
@ 2024-10-23 22:53 ` Reinette Chatre
2025-02-07 15:46 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:53 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a
> value provided by the architecture, but is largely read by other
> architecture helpers.
>
> Instead of exporting this value, make
> resctrl_arch_get_prefetch_disable_bits() set it so that the other
> arch-code helpers can use the cached-value.
The "exporting" term has already caused some confusion. How about:
Make resctrl_arch_get_prefetch_disable_bits() set prefetch_disable_bits
so that it can be isolated to arch-code from where the other arch-code
helpers can use its cached-value.
(Please feel free to improve.)
Reinette
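
The reworded changelog describes a getter that also caches its result for other arch helpers. A minimal sketch of that pattern is below. The model numbers, bit values and all function names are made up for illustration; only the idea of "the getter sets the cached value" comes from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Cached value, private to the (mock) arch code. */
static uint64_t prefetch_disable_bits;

/* Hypothetical model lookup; ids and values are illustrative only. */
static uint64_t lookup_prefetch_bits(int cpu_model)
{
	switch (cpu_model) {
	case 1:
		return 0xF;
	case 2:
		return 0x5;
	default:
		return 0;	/* model not supported */
	}
}

/* Returns the bits to the caller and caches them for later helpers. */
static uint64_t arch_get_prefetch_disable_bits(int cpu_model)
{
	prefetch_disable_bits = lookup_prefetch_bits(cpu_model);
	return prefetch_disable_bits;
}

/* Another arch helper that relies on the cached value. */
static uint64_t arch_prefetch_ctl_for_locking(uint64_t current_ctl)
{
	/* The getter must have run, and succeeded, before this is used. */
	assert(prefetch_disable_bits != 0);
	return current_ctl | prefetch_disable_bits;
}
```

The caller outside the arch code only ever sees the return value, so the variable itself no longer needs to be visible to it.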
* Re: [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl
2024-10-04 18:03 ` [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
@ 2024-10-23 22:59 ` Reinette Chatre
2025-02-10 13:22 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 22:59 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> thread_throttle_mode_init() is called from the architecture specific code
> to make the 'thread_throttle_mode' file visible. The architecture specific
> code has already set the membw.throttle_mode in the rdt_resource.
>
> This doesn't need to be specific to the architecture, the throttle_mode
> can be used by resctrl to determine if the 'thread_throttle_mode' file
> should be visible.
>
> Call thread_throttle_mode_init() from resctrl_setup(), check the
> membw.throttle_mode on the MBA resource. This avoids publishing an
> extra function between the architecture and filesystem code.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 1 -
> arch/x86/kernel/cpu/resctrl/internal.h | 1 -
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 ++++++++-
> 3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index b5ad1ed2a4de..0da7314195af 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -228,7 +228,6 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
> r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
> else
> r->membw.throttle_mode = THREAD_THROTTLE_MAX;
> - thread_throttle_mode_init();
>
> r->alloc_capable = true;
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 9c08efb0e198..30de95e59129 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -495,7 +495,6 @@ void cqm_handle_limbo(struct work_struct *work);
> bool has_busy_rmid(struct rdt_mon_domain *d);
> void __check_limbo(struct rdt_mon_domain *d, bool force_free);
> void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
> -void __init thread_throttle_mode_init(void);
> void __init mbm_config_rftype_init(const char *config);
> void rdt_staged_configs_clear(void);
> bool closid_allocated(unsigned int closid);
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 3f10e6897daa..596f5f087834 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2048,10 +2048,15 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
> return NULL;
> }
>
> -void __init thread_throttle_mode_init(void)
> +static void __init thread_throttle_mode_init(void)
> {
> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
> struct rftype *rft;
>
> + if (!r->alloc_capable ||
> + r->membw.throttle_mode == THREAD_THROTTLE_UNDEFINED)
> + return;
> +
The goal from the changelog is to make "thread_throttle_mode_init()" not be specific
to an architecture. It does so by checking the value of rdt_resource->resctrl_membw->membw_throttle_mode.
I thus expect that as part of being non-architectural it should check this for all
resources that initialize resctrl_membw, this includes RDT_RESOURCE_SMBA.
Reinette
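
One way to read the suggestion above is a visibility check that walks every resource with bandwidth properties instead of only MBA. The sketch below uses mock types; `struct resource`, the enum names and `thread_throttle_mode_visible()` are hypothetical, and which resources count as bandwidth resources is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum throttle_mode {
	THROTTLE_UNDEFINED = 0,
	THROTTLE_MAX,
	THROTTLE_PER_THREAD,
};

enum resource_id { RES_L3, RES_L2, RES_MBA, RES_SMBA, RES_COUNT };

/* Mock resource: only the fields the check looks at. */
struct resource {
	bool alloc_capable;
	enum throttle_mode throttle_mode;
};

static struct resource resources[RES_COUNT];

static struct resource *get_resource(enum resource_id rid)
{
	assert(rid < RES_COUNT);
	return &resources[rid];
}

/* Assumption: these are the resources that carry bandwidth properties. */
static const enum resource_id membw_resources[] = { RES_MBA, RES_SMBA };

/* True if any bandwidth resource has a defined throttle mode. */
static bool thread_throttle_mode_visible(void)
{
	size_t i;

	for (i = 0; i < sizeof(membw_resources) / sizeof(membw_resources[0]); i++) {
		const struct resource *r = get_resource(membw_resources[i]);

		if (r->alloc_capable && r->throttle_mode != THROTTLE_UNDEFINED)
			return true;
	}
	return false;
}
```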
* Re: [PATCH v5 29/40] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl
2024-10-04 18:03 ` [PATCH v5 29/40] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
@ 2024-10-23 23:02 ` Reinette Chatre
0 siblings, 0 replies; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 23:02 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 653d7cf41e64..bbce79190b13 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -3,6 +3,7 @@
> #define _RESCTRL_H
>
> #include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> #include <linux/kernel.h>
> #include <linux/list.h>
> #include <linux/pid.h>
> @@ -397,6 +398,42 @@ static inline u32 resctrl_get_config_index(u32 closid,
> }
> }
>
> +/*
> + * Caller must hold the cpuhp read lock to prevent the struct rdt_domain being
> + * freed.
"prevent the struct rdt_domain from being freed."?
Reinette
* Re: [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2024-10-04 18:03 ` [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
@ 2024-10-23 23:50 ` Reinette Chatre
2025-02-07 15:54 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 23:50 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
> resctrl can't be built as a module, and the kernfs helpers are not exported
> so this is unlikely to change. MPAM has an error interrupt which indicates
> the MPAM driver has gone haywire. Should this occur tasks could run with
> the wrong control values, leading to bad performance for important tasks.
> The MPAM driver needs a way to tell resctrl that no further configuration
> should be attempted.
>
> Using resctrl_exit() for this leaves the system in a funny state as
> resctrl is still mounted, but cannot be un-mounted because the sysfs
> directory that is typically used has been removed. Dave Martin suggests
> this may cause systemd trouble in the future as not all filesystems
> can be unmounted.
>
> Add calls to remove all the files and directories in resctrl, and
> remove the sysfs_remove_mount_point() call that leaves the system
> in a funny state. When triggered, this causes all the resctrl files
> to disappear. resctrl can be unmounted, but not mounted again.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index f77fab859c35..bb5aadaf99b6 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4319,9 +4319,9 @@ int __init resctrl_init(void)
>
> void __exit resctrl_exit(void)
> {
> + rdtgroup_destroy_root();
If I understand correctly, rdtgroup_destroy_root() can now be called
twice, first during the error interrupt and then on unmount. Would the
second call be safe? I am not familiar with this code but I
see kernfs_destroy_root() and __kernfs_remove() dereferencing pointers
without checks. I wonder if this needs to be made safer with a:
rdtgroup_destroy_root()
{
if (rdtgroup_default.kn) {
kernfs_destroy_root();
rdtgroup_default.kn = NULL;
}
}
> debugfs_remove_recursive(debugfs_resctrl);
> unregister_filesystem(&rdt_fs_type);
> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>
This breaks symmetry with resctrl_init(). The changelog describes the
motivation clearly but once this line is removed it will be difficult to
get back to this motivation. Could this function get a comment to explain
why the mount point is not removed? This will be helpful to anybody following
this work that may attempt to "fix" the asymmetry by cleaning up the
mount point created during init.
> resctrl_mon_resource_exit();
> }
Reinette
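
The guard proposed above makes the teardown idempotent. A self-contained sketch of the same idea, with the kernfs parts replaced by mocks (`struct kn_mock`, `mock_destroy_root()` and `destroy_root_once()` are hypothetical names, not kernel functions):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernfs node the default group holds on to. */
struct kn_mock {
	int unused;
};

static struct kn_mock root_node;
static struct kn_mock *default_kn = &root_node;
static int destroy_calls;

/* Stand-in for the real teardown, which dereferences its argument. */
static void mock_destroy_root(struct kn_mock *kn)
{
	assert(kn != NULL);
	destroy_calls++;
}

/* Safe to call from both the error path and unmount. */
static void destroy_root_once(void)
{
	if (default_kn) {
		mock_destroy_root(default_kn);
		default_kn = NULL;
	}
}
```

Calling `destroy_root_once()` a second time is a no-op because the pointer was cleared by the first call. This sketch says nothing about locking between the two callers, which the real code would still have to get right.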
* Re: [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols
2024-10-04 18:03 ` [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
@ 2024-10-23 23:56 ` Reinette Chatre
2025-02-07 15:54 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-23 23:56 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
> Because ARM's MPAM controls are probed using MMIO, resctrl can't be
> initialised until enough CPUs are online to have determined the
> system-wide supported num_closid. Arm64 also supports 'late onlined
> secondaries', where only a subset of CPUs are online during boot.
>
> These two combine to mean the MPAM driver may not be able to initialise
> resctrl until user-space has brought 'enough' CPUs online.
>
> To allow MPAM to initialise resctrl after __init text has been free'd,
> remove all the __init markings from resctrl.
>
> The existing __exit markings cause these functions to be removed by the
> linker as it has never been possible to build resctrl as a module. MPAM
> has an error interrupt which causes the driver to reset and disable
> itself. Remove the __exit markings to allow the MPAM driver to tear down
> resctrl when an error occurs.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Changes since v4:
> * Earlier __init marker removal migrated here.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 6 +++---
> arch/x86/kernel/cpu/resctrl/internal.h | 6 +++---
> arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++-----
> include/linux/resctrl.h | 6 +++---
> 5 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index f484726a2588..f713ac628444 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -775,7 +775,7 @@ struct rdt_options {
> bool force_off, force_on;
> };
>
> -static struct rdt_options rdt_options[] __initdata = {
> +static struct rdt_options rdt_options[] __ro_after_init = {
> RDT_OPT(RDT_FLAG_CMT, "cmt", X86_FEATURE_CQM_OCCUP_LLC),
> RDT_OPT(RDT_FLAG_MBM_TOTAL, "mbmtotal", X86_FEATURE_CQM_MBM_TOTAL),
> RDT_OPT(RDT_FLAG_MBM_LOCAL, "mbmlocal", X86_FEATURE_CQM_MBM_LOCAL),
> @@ -815,7 +815,7 @@ static int __init set_rdt_options(char *str)
> }
> __setup("rdt", set_rdt_options);
>
> -bool __init rdt_cpu_has(int flag)
> +bool rdt_cpu_has(int flag)
I assume this can be dropped when resctrl_arch_is_evt_configurable() uses
a helper instead?
Reinette
* Re: [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code
2024-10-04 18:03 ` [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code James Morse
2024-10-08 23:03 ` Tony Luck
@ 2024-10-24 0:08 ` Reinette Chatre
2025-02-07 15:54 ` James Morse
1 sibling, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-24 0:08 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 10/4/24 11:03 AM, James Morse wrote:
...
> +++ b/fs/resctrl/Kconfig
> @@ -0,0 +1,37 @@
> +config RESCTRL_FS
> + bool "CPU Resource Control Filesystem (resctrl)"
> + depends on ARCH_HAS_CPU_RESCTRL
> + select KERNFS
> + select PROC_CPU_RESCTRL if PROC_FS
> + help
> + Some architectures provide hardware facilities to group tasks and
> + monitor and control their usage of memory system resources such as
> + caches and memory bandwidth. Examples of such facilities include
> + Intel's Resource Director Technology (Intel(R) RDT) and AMD's
> + Platform Quality of Service (AMD QoS).
> +
> + If your system has the necessary support and you want to be able to
> + assign tasks to groups and manipulate the associated resource
> + monitors and controls from userspace, say Y here to get a mountable
> + 'resctrl' filesystem that lets you do just that.
> +
> + If nothing mounts or prods the 'resctrl' filesystem, resource
> + controls and monitors are left in a quiescent, permissive state.
> +
> + On architectures where this can be disabled independently, it is
> + safe to say N.
> +
> + See <file:Documentation/arch/x86/resctrl.rst> for more information.
> +
> +config RESCTRL_FS_PSEUDO_LOCK
> + bool
> + help
> + Software mechanism to pin data in a cache portion using
> + micro-architecture specific knowledge.
> +
There now seem to be two copies of this ... patch #23 added this exact same
"config RESCTRL_FS_PSEUDO_LOCK" snippet to arch/x86/Kconfig.
> +config RESCTRL_RMID_DEPENDS_ON_CLOSID
> + bool
> + help
> + Enable by the architecture when the RMID values depend on the CLOSID.
"Enable by" -> "Enabled by"?
> + This causes the closid allocator to search for CLOSID with clean
> + RMID.
Reinette
* Re: [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2024-10-08 23:08 ` Tony Luck
@ 2024-10-24 0:17 ` Reinette Chatre
2025-02-07 15:55 ` James Morse
0 siblings, 1 reply; 102+ messages in thread
From: Reinette Chatre @ 2024-10-24 0:17 UTC (permalink / raw)
To: Tony Luck, James Morse
Cc: x86, linux-kernel, Fenghua Yu, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi James,
On 10/8/24 4:08 PM, Tony Luck wrote:
> On Fri, Oct 04, 2024 at 06:03:47PM +0000, James Morse wrote:
>> +functions_to_move = [
>> + # common
>> + "pr_fmt",
>> +
>> + # ctrlmon.c
>> + "rdt_parse_data",
>> + "(ctrlval_parser_t)",
>> + "bw_validate",
>> + "parse_bw",
>> + "cbm_validate",
>> + "parse_cbm",
>> + "get_parser",
>> + "parse_line",
>> + "rdtgroup_parse_resource",
>> + "rdtgroup_schemata_write",
>> + "show_doms",
>> + "rdtgroup_schemata_show",
>> + "smp_mon_event_count",
>> + "mon_event_read",
>> + "rdtgroup_mondata_show",
>> +
>> + # internal.h
>> + "MBM_OVERFLOW_INTERVAL",
>> + "CQM_LIMBOCHECK_INTERVAL",
>> + "cpumask_any_housekeeping",
>> + "rdt_fs_context",
>> + "rdt_fc2context",
>> + "mon_evt",
>> + "mon_data_bits",
>> + "rmid_read",
>> + "resctrl_schema_all",
>> + "resctrl_mounted",
>> + "rdt_group_type",
>> + "rdtgrp_mode",
>> + "mongroup",
>> + "rdtgroup",
>> + "RFTYPE_FLAGS_CPUS_LIST",
>
> Something goes wrong with moving the RFTYPE_* defines. A new copy
> shows up in fs/resctrl/internal.h but the old copy isn't removed from
> arch/x86/kernel/cpu/resctrl/internal.h
>
There seem to be a few more duplicates: RMID_VAL_ERROR, RMID_VAL_UNAVAIL,
MBM_CNTR_WIDTH_OFFSET_MAX, RDT_DELETED.
Also please check the trace files after the move and related CFLAGS.
arch/x86/kernel/cpu/resctrl/monitor_trace.h is just empty and the
arch/x86/kernel/cpu/resctrl/Makefile still has the "CFLAGS_monitor.o = -I$(src)"
Also please check fs/resctrl/pseudo_lock_trace.h
Reinette
* RE: [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
` (42 preceding siblings ...)
2024-10-17 17:43 ` Tony Luck
@ 2024-12-06 7:17 ` Shaopeng Tan (Fujitsu)
43 siblings, 0 replies; 102+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2024-12-06 7:17 UTC (permalink / raw)
To: 'James Morse', x86@kernel.org,
linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
carl@os.amperecomputing.com, lcherian@marvell.com,
bobo.shaobowang@huawei.com, baolin.wang@linux.alibaba.com,
Jamie Iles, Xin Hao, peternewman@google.com,
dfustini@baylibre.com, amitsinght@marvell.com, David Hildenbrand,
Rex Nie, Dave Martin
Hello James,
I have no further comments on this series.
I also ran the resctrl selftests on an Intel(R) Xeon(R) Gold 6254 CPU and found no problems.
Best regards,
Shaopeng TAN
* Re: [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw
2024-10-23 21:14 ` Reinette Chatre
@ 2024-12-20 18:10 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-12-20 18:10 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:14, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> __rdt_get_mem_config_amd() and __get_mem_config_intel() both use
>> the default_ctrl property as a maximum value. This is because the
>> MBA schema works differently between these platforms. Doing this
> The schema works differently but they can still use the same property
> as maximum, is that a problem?
I think it's a problem for user-space - but it's an existing problem.
Today resctrl uses the default as the maximum. This makes that property explicit.
>> complicates determining whether the default_ctrl property belongs
>> to the arch code, or can be derived from the schema format.
>
> So instead of Intel and AMD both using default_ctrl as a maximum this patch
> introduces a new max_bw with both using that as maximum instead.
> Unclear how this change fixes the unclear complication.
Is the default value something that can be determined from the schema format?
Previously, no - because the default value is where the per-platform maximum bandwidth is
stashed. By making that explicit, we can drop the spoon feeding the architecture code has
to do to tell resctrl that the maximum bitmap is all-ones, and the maximum percentage is
100. Bandwidth is always going to be weird like this, so it makes sense to special case it.
I'll add some form of this text to the commit message.
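As a rough sketch of the idea above (not the code from the series), the default can be computed from the schema format once the maximum bandwidth is stored explicitly. All names here (schema_fmt_sketch, resource_sketch, default_ctrl_sketch) are hypothetical stand-ins for the kernel structures, and cbm_len is assumed to be at most 32:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the schema format and resource properties. */
enum schema_fmt_sketch { SCHEMA_BITMAP_SKETCH, SCHEMA_RANGE_SKETCH };

struct resource_sketch {
	enum schema_fmt_sketch fmt;
	unsigned int cbm_len;	/* valid bitmap bits, BITMAP only (assumed <= 32) */
	uint32_t max_bw;	/* explicit maximum bandwidth, RANGE only */
};

/* Derive the default control value instead of having the arch code supply it. */
static uint32_t default_ctrl_sketch(const struct resource_sketch *r)
{
	switch (r->fmt) {
	case SCHEMA_BITMAP_SKETCH:
		/* All-ones over cbm_len bits; the 64-bit shift stays defined at 32. */
		return (uint32_t)((UINT64_C(1) << r->cbm_len) - 1);
	case SCHEMA_RANGE_SKETCH:
		/* For bandwidth, the default is the per-platform maximum. */
		return r->max_bw;
	}
	return 0;
}
```

With this shape, the architecture code only has to describe cbm_len or max_bw, and never the default itself.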
>> Add a max_bw property for x86 platforms to specify their maximum
>> MBA bandwidth. This isn't needed for other schema formats.
>
> It is not clear to me how replacing one value with a new value that is
> used in exactly the same way addresses the initial complaint of complication.
>
>>
>> This will allow the default_ctrl to be generated from the schema
>> properties when it is needed.
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 4c16e58c4a1b..e79807a8f060 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -248,6 +249,8 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
>> cpuid_count(0x80000020, subleaf, &eax, &ebx, &ecx, &edx);
>> hw_res->num_closid = edx + 1;
>> r->default_ctrl = 1 << eax;
>> + r->schema_fmt = RESCTRL_SCHEMA_RANGE;
>
> Stray change?
Yes, when these were different values, AMD had to override the value in the table.
Since they got merged back together, it's the same.
Thanks!
>> + r->membw.max_bw = 1 << eax;
>>
>> /* AMD does not use delay */
>> r->membw.delay_linear = false;
>> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> index 8d1bdfe89692..56c41bfd07e4 100644
>> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> @@ -108,8 +108,9 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
>> */
>> static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> {
>> - unsigned long first_bit, zero_bit, val;
>> + u32 supported_bits = BIT_MASK(r->cache.cbm_len + 1) - 1;
>
> Same issue as V4:
> https://lore.kernel.org/all/ca528ebd-fb76-40cd-a495-88c2de443cd8@intel.com/
Huh. I was sure I'd fixed that when you first pointed it out.
Sorry about that!
>> @@ -118,7 +119,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> return false;
>> }
>>
>> - if ((r->cache.min_cbm_bits > 0 && val == 0) || val > r->default_ctrl) {
>> + if ((r->cache.min_cbm_bits > 0 && val == 0) || val > supported_bits) {
>> rdt_last_cmd_puts("Mask out of range\n");
>> return false;
>> }
>
> The above two changes have nothing to do with memory bandwidth. They are unrelated
> to the changelog.
Yes, these should be in the next patch.
Thanks!
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values
2024-10-23 21:14 ` Reinette Chatre
@ 2024-12-20 18:10 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-12-20 18:10 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:14, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 496ddcaa4ecf..54ec87339038 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -192,6 +191,17 @@ enum resctrl_scope {
>> RESCTRL_L3_NODE,
>> };
>>
>> +/**
>> + * enum resctrl_schema_fmt - The format user-space provides for a schema.
>> + * @RESCTRL_SCHEMA_BITMAP: The schema is a bitmap in hex.
>> + * @RESCTRL_SCHEMA_RANGE: The schema is a number, either a percentage
>> + * or a MBps value.
>
> The description of RESCTRL_SCHEMA_RANGE appears to aim to be specific. Considering this
> it should also include the "multiples of one eighth GB/s" input option used on
> AMD systems.
I really don't want to define something like that as being general purpose.
This is an intermediate step to splitting 'range' into: percentage, mibps or 'platform'.
Eventually the AMD fraction-of-GB/s would be 'platform', with resctrl unable to tell
user-space what the unit is (it doesn't today either).
I have a series to do this for MPAM's cache-capacity scheme which takes a percentage for
caches like L2 or L3. I'd like percentage to be something that can be specified as the
schema format because that gives us the opportunity to expose common properties of
percentage controls to user-space from the filesystem code. e.g. the schema format,
percentage min and granularity - the last two can only be done today if it's a bandwidth
you control, and user-space just has to know what the format of the control is.
Most of MPAM's controls are either bitmaps or something we can pretend is a percentage.
The odd two are PRI(ority), which is some kind of cost or weight, and the bandwidth
stride scheme, which is similarly a cost or weight. I'd describe these as 'platform' if
they are ever supported upstream. If another architecture has a similar control format it
can be added and those MPAM controls can be switched over.
If you think the comment is too specific, I'll change it to say it's a decimal number.
Splitting it up into what that number means will come back in a later series.
> The software controller is the only user of actual bandwidth and for its
> use it should be "MiBps".
This would no longer match the command line argument mba_MBps, or the other code comments.
I don't think this is worth the churn as it could never be consistent. I'll add this as
a future cleanup patch so we can see how noisy it is going to be.
(
fs/resctrl/ctrlmondata.c | 4 ++--
fs/resctrl/internal.h | 2 +-
fs/resctrl/monitor.c | 6 +++---
fs/resctrl/rdtgroup.c | 18 +++++++++---------
include/linux/resctrl.h | 10 +++++-----
5 files changed, 20 insertions(+), 20 deletions(-)
)
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 03/40] x86/resctrl: Remove fflags from struct rdt_resource
2024-10-23 21:03 ` Reinette Chatre
@ 2024-12-20 18:10 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2024-12-20 18:10 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:03, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> The resctrl arch code specifies whether a resource controls a cache or
>> memory using the fflags field. This field is then used by resctrl to
>> determine which files should be exposed in the filesystem.
>>
>> Allowing the architecture to pick this value means the RFTYPE_
>> flags have to be in a shared header, and allows an architecture
>> to create a combination that resctrl does not support.
>>
>> Remove the fflags field, and pick the value based on the resource
>> id.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 6225d0b7e9ee..2abe17574407 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2160,6 +2160,20 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>> return ret;
>> }
>>
>> +static u32 fflags_from_resource(struct rdt_resource *r)
>
> What is motivation for the return type of u32? I am trying to understand why this is needed
> considering the value returned, variable it is assigned to, and the functions that use it
> (rdtgroup_mkdir_info_resdir() and rdtgroup_add_files()) all use unsigned long.
There are only 10 bits defined. It looks like I just auto-typed int, then corrected it to
u32, with the intention of the compiler generating a warning if it ever attempts to mask
in a value larger than the return type.
I've changed it to unsigned long.
>> +{
>> + switch (r->rid) {
>> + case RDT_RESOURCE_L3:
>> + case RDT_RESOURCE_L2:
>> + return RFTYPE_RES_CACHE;
>> + case RDT_RESOURCE_MBA:
>> + case RDT_RESOURCE_SMBA:
>> + return RFTYPE_RES_MB;
>> + }
>> +
>> + return WARN_ON_ONCE(1);
>> +}
>> +
>> static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>> {
>> struct resctrl_schema *s;
>> @@ -2180,14 +2194,14 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>> /* loop over enabled controls, these are all alloc_capable */
>> list_for_each_entry(s, &resctrl_schema_all, list) {
>> r = s->res;
>> - fflags = r->fflags | RFTYPE_CTRL_INFO;
>> + fflags = fflags_from_resource(r) | RFTYPE_CTRL_INFO;
>> ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
>> if (ret)
>> goto out_destroy;
>> }
>>
>> for_each_mon_capable_rdt_resource(r) {
>> - fflags = r->fflags | RFTYPE_MON_INFO;
>> + fflags = fflags_from_resource(r) | RFTYPE_MON_INFO;
>
> Fixup did not make it here.
Fixed,
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it
2024-10-23 21:15 ` Reinette Chatre
@ 2025-02-07 15:42 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:42 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:15, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> The struct rdt_resource default_ctrl is used by both the architecture
>> code for resetting the hardware controls, and by the filesystem parts
>> of resctrl to report to user-space.
>
> This is not accurate. The hardware controls may be different from
> what is reported to user-space, for example when MBA software controller
> is active.
That is what I'm trying to say with both paragraphs. This is an odd one when trying
to reduce the duplication in defining properties - it's used as the default, except when it
isn't.
>> This means the value has to be shared, but might not match the
>> properties of the control. e.g. a percentage greater than 100.
>
> Are you referring to software controller here? When would this not
> match properties of control? That this value may not match properties
> of control contradicts first paragraph that this is used to reset the
> control ...
I am - the hardware takes a percentage. Sometimes resctrl blindly reports that to
user-space as the default/maximum, but if mba_sc is in use, user-space is told something
different - but the hardware still takes a percentage. This value is used as the default,
except when it isn't. This is messy to generalise.
I'll replace the commit message with:
-----------%<-----------
The struct rdt_resource default_ctrl is used by both the architecture code for
resetting the hardware controls, and sometimes by the filesystem code as the
default value for the schema, unless the bandwidth software controller is in use.
Having the default exposed by the architecture code causes unnecessary duplication
for each architecture as the default value must be specified, but can be derived
from other schema properties. Now that the maximum bandwidth is explicitly
described, resctrl can derive the default value from the schema format and the
other resource properties.
-----------%<-----------
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain()
2024-10-23 21:16 ` Reinette Chatre
@ 2025-02-07 15:42 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:42 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:16, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> rdt_find_domain() finds a domain given a resource and a cache-id.
>> It's not quite right for the resctrl arch API as it also returns the
>> position to insert a new domain, which is needed when bringing a
>> domain online in the arch code.
>>
>> Wrap rdt_find_domain() in another function resctrl_arch_find_domain()
>> in order to avoid the unnecessary argument outside the arch code.
>
> I do not understand the motivation for this split. There does not seem to be
> anything arch specific about rdt_find_domain().
Not arch specific, but behaviour only the arch code could use - returning the position to
insert a new domain, which only the arch code will ever do.
> Why does this need to
> be arch API? Why can resctrl and all archs not use rdt_find_domain()?
Just to remove the extra boilerplate caused by the ability to insert an entry - which the
filesystem never does.
If no-one cares, I'll just move it to a header file as-is.
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources
2024-10-23 21:32 ` Reinette Chatre
@ 2025-02-07 15:43 ` James Morse
2025-02-13 23:52 ` Reinette Chatre
0 siblings, 1 reply; 102+ messages in thread
From: James Morse @ 2025-02-07 15:43 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:32, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> On umount(), resctrl resets each resource back to its default
>> configuration. It only ever does this for all resources in one go.
>>
>> reset_all_ctrls() is architecture specific as it works with struct
>> rdt_hw_resource.
>>
>> Add an architecture helper to reset all resources.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 61c8add103fe..a15198f90b29 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2883,6 +2883,14 @@ static int reset_all_ctrls(struct rdt_resource *r)
>> return 0;
>> }
>>
>> +void resctrl_arch_reset_resources(void)
>> +{
>> + struct rdt_resource *r;
>> +
>> + for_each_alloc_capable_rdt_resource(r)
>> + reset_all_ctrls(r);
>> +}
> Wouldn't this require all archs to have a duplicate helper as above with
> only the resctrl_all_ctrls() actually being arch specific?
I was hoping to be able to save a few IPIs by doing the per-core work once, instead of
per-resource-per-core ... but it's only done on umount, so I doubt anyone will complain!
> What if it is instead:
> resctrl_reset_alloc_resources() or reset_alloc_resources() or ...
> {
> struct rdt_resource *r;
>
> for_each_alloc_capable_rdt_resource(r)
> resctrl_arch_reset_all_ctrls(r);
> }
>
> With above archs only need to implement the actual reset code.
I opted for one helper that does everything as that is the only example today.
This has the advantage that the filesystem code can now reset a specific resource.
Sure, let's do that.
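A minimal sketch of the split being agreed here, with the generic walk on the filesystem side and only the per-resource reset on the architecture side. The types and names (res_sketch, arch_reset_all_ctrls_sketch, reset_alloc_resources_sketch) are hypothetical and only illustrate the shape:

```c
#include <stddef.h>

#define NUM_RES_SKETCH 3

/* Hypothetical stand-in for struct rdt_resource. */
struct res_sketch {
	int alloc_capable;
	unsigned int ctrl_val;		/* current control value */
	unsigned int default_ctrl;	/* value to restore on reset */
};

static struct res_sketch res_table_sketch[NUM_RES_SKETCH] = {
	{ 1, 0x0f, 0xff },
	{ 0, 0x33, 0xff },	/* not alloc capable: must be left alone */
	{ 1, 50, 100 },
};

/* Architecture side: reset the controls of one resource only. */
static void arch_reset_all_ctrls_sketch(struct res_sketch *r)
{
	r->ctrl_val = r->default_ctrl;
}

/* Filesystem side: the generic walk over alloc-capable resources. */
static void reset_alloc_resources_sketch(void)
{
	size_t i;

	for (i = 0; i < NUM_RES_SKETCH; i++) {
		if (res_table_sketch[i].alloc_capable)
			arch_reset_all_ctrls_sketch(&res_table_sketch[i]);
	}
}
```

Each architecture then implements only the per-resource reset, and the loop is not duplicated.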
Thanks,
James
[0] https://lore.kernel.org/all/20230419111111.477118-8-dfustini@baylibre.com/
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers
2024-10-23 21:51 ` Reinette Chatre
@ 2025-02-07 15:44 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:44 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 22:51, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> The for_each_*_rdt_resource() helpers walk the architecture's array
>> of structures, using the resctrl visible part as an iterator. These
>> became over-complex when the structures were split into a
>> filesystem and architecture-specific struct. This approach avoided
>> the need to touch every call site, and was done before there was a
>> helper to retrieve a resource by rid.
>>
>> Once the filesystem parts of resctrl are moved to /fs/, both the
>> architecture's resource array, and the definition of those structures
>> is no longer accessible. To support resctrl, each architecture would
>> have to provide equally complex macros.
>>
>> Rewrite the macro to make use of resctrl_arch_get_resource(), and
>> move these to the core header so existing x86 arch code continues
>> to use them.
>
> The last part is not clear, why does it need to be moved to core
> header for x86 to use it?
If it moves to fs/resctrl/internal.h, this works for the filesystem code, but not x86.
If it stays in arch/x86/kernel/cpu/resctrl/internal.h, then the existing filesystem users break.
Moving it to include/linux/resctrl.h means both filesystem and arch code can include it.
I'll add the header file path to the commit message.
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 8894aed3c593..f75f0409ae09 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -26,6 +26,24 @@ int proc_resctrl_show(struct seq_file *m,
>> /* max value for struct rdt_domain's mbps_val */
>> #define MBA_MAX_MBPS U32_MAX
>>
>> +/* Walk all possible resources, with variants for only controls or monitors. */
>> +#define for_each_rdt_resource(_r) \
>> + for ((_r) = resctrl_arch_get_resource(0); \
>> + (_r)->rid < RDT_NUM_RESOURCES - 1; \
> I do not think this reaches all resources ... should this perhaps be:
> (_r) && (_r)->rid < RDT_NUM_RESOURCES
Good catch - I wrongly assumed I was off-by-one when this blew up because the increment is
executed before this expression, and the existing 'RDT_NUM_RESOURCES - 1' reinforced that.
Adding the "(_r) &&" was the correct fix.
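To illustrate why the NULL check matters, here is a small user-space sketch of such a walker. It assumes the increment step looks up the next resource by rid and that the lookup returns NULL once the rid is out of range; the quoted diff is cut before that part of the macro, so both are assumptions, and all names are hypothetical:

```c
#include <stddef.h>

#define NUM_RESOURCES_SKETCH 4

/* Hypothetical stand-in for the filesystem-visible resource. */
struct res_sketch {
	int rid;
};

static struct res_sketch resources_sketch[NUM_RESOURCES_SKETCH] = {
	{ 0 }, { 1 }, { 2 }, { 3 },
};

/* Assumed behaviour: NULL for any rid outside the table. */
static struct res_sketch *get_resource_sketch(int rid)
{
	if (rid < 0 || rid >= NUM_RESOURCES_SKETCH)
		return NULL;
	return &resources_sketch[rid];
}

/*
 * The increment runs before the condition is re-evaluated, so after the
 * last resource (_r) is NULL and must be tested before (_r)->rid is read.
 */
#define for_each_resource_sketch(_r)					\
	for ((_r) = get_resource_sketch(0);				\
	     (_r) && (_r)->rid < NUM_RESOURCES_SKETCH;			\
	     (_r) = get_resource_sketch((_r)->rid + 1))

/* Count how many resources the walker visits. */
static int count_resources_sketch(void)
{
	struct res_sketch *r;
	int n = 0;

	for_each_resource_sketch(r)
		n++;
	return n;
}
```

With a bound of 'NUM_RESOURCES_SKETCH - 1' and no NULL check, the walk would stop one resource early instead of visiting all four.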
Thanks!
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h
2024-10-23 22:00 ` Reinette Chatre
@ 2025-02-07 15:44 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:44 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
On 23/10/2024 23:00, Reinette Chatre wrote:
> Hi James,
>
> On 10/4/24 11:03 AM, James Morse wrote:
>> The architecture specific parts of resctrl have helpers to hide accesses
>> to the rdt_mon_features bitmap.
>
> hmmmm ... no ... this patch creates those helpers.
is_mbm_total_enabled() and is_mbm_local_enabled() from the subject were added
by commit 9f52425ba303 ("x86/intel_rdt/mbm: Basic counting of MBM events (total and
local)"), way back in 2017.
I'll add some of the helper names to this paragraph, but I think a list impedes readability.
>> Once the filesystem parts of resctrl are moved, these can no longer live
>> in internal.h. Once these are exposed to the wider kernel, they should
>> have a 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
>>
>> Move and rename the helpers that touch rdt_mon_features directly.
>> is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c,
>> so can be moved into that file.
>
> There seems to be a contradiction here ... earlier patch moved the
> event IDs to common header so this makes these events shared between
> resctrl and all archs.
Unique identifiers were needed for the events that are shared by all architectures - using
the x86 hardware values is simple enough, and benefits the x86 architecture code. It was
an easy choice because today they are 1,2,3 ...
> rdt_mon_features bitmap positions are
> the common event IDs. Why should rdt_mon_features thus be considered arch
> specific if bits that can be set are not?
The values are passed into the helper, it's up to the architecture code what it does with
them. For example, MPAM currently uses these to check pointers in an array, but once it
exposes events that resctrl doesn't offer to user-space, it will need to do more pointer
chasing.
I don't think it's a good idea to require data values to be exposed between the
architecture and filesystem code. It's simple today, but having to maintain a shared
bitmap of event types across architectures sounds like a headache.
Helpers like this have a much clearer and closely defined behaviour, and are much harder
to abuse. When one architecture needs something different, it's free to do so. If one
architecture wants to expose something like rdt_mon_features and test the bits - all that
can be inlined into the caller.
(Currently the realloc-threshold is the only data value exposed because it would have been
more churn to abstract it)
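A rough sketch of the distinction, assuming an architecture that does keep a bitmap internally. The event values follow the "1,2,3" mentioned above; the names (event_id_sketch, mon_features_sketch and the two helpers) are hypothetical, not the kernel's:

```c
#include <stdbool.h>

/* Event identifiers shared between the filesystem and arch code. */
enum event_id_sketch {
	EVT_L3_OCCUP_SKETCH = 1,
	EVT_MBM_TOTAL_SKETCH = 2,
	EVT_MBM_LOCAL_SKETCH = 3,
};

/* Private to this (hypothetical) architecture: never seen by callers. */
static unsigned long mon_features_sketch;

/* Arch-internal: record that an event is supported. */
static void arch_enable_event_sketch(enum event_id_sketch evt)
{
	mon_features_sketch |= 1UL << evt;
}

/* The only thing the filesystem code gets to call. */
static bool arch_is_event_enabled_sketch(enum event_id_sketch evt)
{
	return (mon_features_sketch & (1UL << evt)) != 0;
}
```

Another architecture could implement the same helper with an array lookup or pointer chasing, and the callers would not change.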
> The patch may be ok if MPAM wants to do something different here but
> motivating it as "this is arch specific and needs to be hidden by helpers"
> is a stretch since there is nothing arch specific about it.
My view will be coloured because at one point I did have helpers to remap 'resctrl event
enum' numbers back to x86's hardware counters. The cunning plan was for the compiler to
optimise it out - unless it proved impossible - which the compiler could work out.
But I figured it would be simpler to get rid of it and use the enum values directly. (the
actual values don't matter to MPAM - as long as the enum isn't too big).
I'll reword this to cover why exposing helpers instead of an unsigned-long is preferable.
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC
2024-10-23 22:04 ` Reinette Chatre
@ 2025-02-07 15:44 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:44 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 23:04, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> When BMEC is supported the resctrl event can be configured in a number
>> of ways. This depends on architecture support. rdt_get_mon_l3_config()
>> modifies the struct mon_evt and calls mbm_config_rftype_init() to create
>> the files that allow the configuration.
>>
>> Splitting this into separate architecture and filesystem parts would
>> require the struct mon_evt and mbm_config_rftype_init() to be exposed.
>>
>> Instead, add resctrl_arch_is_evt_configurable(), and use this from
>> resctrl_mon_resource_init() to initialise struct mon_evt and call
>> mbm_config_rftype_init().
>> resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't
>> obviously benefit from being inlined. Putting it in core.c will allow
>> rdt_cpu_has() to eventually become static.
>
> Why bother with rdt_cpu_has() when there are all those helpers available
> from previous patch?
It's what the existing code does... The helpers in the previous patch are about support
for an event/counter-type, e.g. whether mbm-total exists or not?
resctrl_arch_is_evt_configurable() is to check properties of a particular
event/counter-type, e.g. now you know mbm-total exists - can it be configured?
rdt_cpu_has() is how x86 determines this today.
The rdt_cpu_has() angle in the commit messages is that it's a good thing if it's all
contained in one file, and as it's searching an array of architecture specific
command-line options, it's not going to be easy to inline resctrl_arch_is_evt_configurable().
Yes we could abuse rdt_mon_features to move these existing calls to something exposed as
bits in an unsigned long (and I'll be very glad its hidden behind helpers!) - but this is
done once when the filesystem is mounted, so it doesn't seem worth the churn.
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers
2024-10-23 22:19 ` Reinette Chatre
@ 2025-02-07 15:45 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:45 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 23:19, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> @@ -315,6 +322,24 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
>>
>> bool __init resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
>>
>> +/**
>> + * resctrl_arch_mon_event_config_write() - Write the config for a counter.
>
> Please avoid the term "counter" for events ... the upcoming AMD work adds support
> for counters*.
Fixed.
>> + * @info: struct resctrl_mon_config_info describing the resource, domain
>> + * and event.
>
> Expected "config" to appear as part of information about function
> that claims to writes config?
I'll rename the variable.
>> + *
>> + * Must be called on a CPU that is a member of the specified domain.
> I am not sure about this. Is this documentation intended to support authors of
> arch code? In that case it may be instead useful to know that this function will be
> called on a CPU that is a member of the specified domain to avoid confusion from arch
> side whether it needs to take some action to ensure function is called on right CPU.
>
> Called on a CPU that is a member of the specified domain.
I don't think we have a clear idea who this documentation is for - I'm assuming it's for the
caller...
I'll change this to "Called via IPI to reach a CPU that is a member of the specified
domain", as that covers both views, and adds that its called in irq context.
>> + */
>> +void resctrl_arch_mon_event_config_write(void *info);
>> +
>> +/**
>> + * resctrl_arch_mon_event_config_read() - Read the config for a counter.
>
> counter -> event
>
>> + * @info: struct resctrl_mon_config_info describing the resource, domain
>> + * and event.
>
> Copy&paste? No information on how config is returned.
Laziness - there is only one possible place it could be! I've added:
| * Reads resource, domain and eventid from @config_info and reads the
| * hardware config value into config_info->mon_config.
and similar on the other function.
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions
2024-10-23 22:44 ` Reinette Chatre
@ 2025-02-07 15:45 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:45 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 23:44, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> resctrl's pseudo lock has some copy-to-cache and measurement
>> functions that are micro-architecture specific. pseudo_lock_fn()
>> is not at all portable. Label these 'resctrl_arch_' so they stay
>> under /arch/x86.
>
> No mention in changelog why static is also dropped during rename and
> functions moved to a header file while no call sites are changed.
I'll add the following to the commit message:
| To expose these functions to the filesystem code they need an entry
| in a header file, and can't be marked static.
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code
2024-10-23 22:53 ` Reinette Chatre
@ 2025-02-07 15:46 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:46 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 23:53, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a
>> value provided by the architecture, but is largely read by other
>> architecture helpers.
>>
>> Instead of exporting this value, make
>> resctrl_arch_get_prefetch_disable_bits() set it so that the other
>> arch-code helpers can use the cached-value.
>
> The "exporting" term has already caused some confusion. How about:
>
> Make resctrl_arch_get_prefetch_disable_bits() set prefetch_disable_bits
> so that it can be isolated to arch-code from where the other arch-code
> helpers can use its cached-value.
>
> (Please feel free to improve.)
I'd cleaned those up - but I'll go with your version as it'll be in the
correct voice.
Thanks,
James
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions
2024-10-08 18:50 ` Tony Luck
@ 2025-02-07 15:46 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:46 UTC (permalink / raw)
To: Tony Luck
Cc: x86, linux-kernel, Fenghua Yu, Reinette Chatre, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Tony,
On 08/10/2024 19:50, Tony Luck wrote:
> On Fri, Oct 04, 2024 at 06:03:37PM +0000, James Morse wrote:
>> resctrl operates on configuration bitmaps and a bitmap of allocated
>> CLOSID, both are stored in a u32.
>>
>> MPAM supports configuration/portion bitmaps and PARTIDs larger
>> than will fit in a u32.
>>
>> Add some preprocessor values that make it clear why MPAM clamps
>> some of these values. This will make it easier to find code related
>> to these values if this resctrl behaviour ever changes.
>
> ...
>
>> +#define RESCTRL_MAX_CLOSID 32
>
> Do you really need to do this? Intel x86 architecture allows for more than
> 32 CLOSIDs, it's just expensive in h/w to get past 16 ... so I picked
> that trivial bitmap allocator in ages past. But if ARM can have more,
> then why would you need to clamp the value?
Just to avoid touching this code now! For feature-parity - no-one can argue they have a
33-CLOSID use-case that works on x86, but not on arm64. My cunning plan was to flush out
the people who cared ...
> File system code could ask
> architecture code to allocate a CLOSID. On x86 that will fail when there
> are no more CLOSIDs, so filesystem will fail the mkdir(2).
I didn't have a strong reason to pull the allocator out to be arch-specific, so just left
it where it was. That avoided having to think about abstracting the limbo thread.
> Or, since you put closid_alloc() into the filesystem code you could
> change the closid_free_map to u64.
Marvell sent a patch doing exactly that - so they must have a platform with more than 32
PARTIDs. The version I'm carrying around removes the limit:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.12-rc1&id=61e646352b8286b869f8ee0627cec81f01009e3e
(I did think it would be more invasive)
As the existing limit is not important, I'll swap this patch for the one linked above,
which removes the CLOSID limit, in this series.
(The MAX_CBM bitmap size can be inferred from things like struct resctrl_staged_config's
new_ctrl u32)
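As an illustration only (this is not the code from the commit linked above), a first-free
bitmap allocator that is not tied to a single u32 could be sketched in user-space C roughly
as below. Every name here (struct closid_map, SKETCH_MAX_CLOSID, the closid_map_*()
helpers) is hypothetical, and the sketch does not model the reserved default CLOSID or the
limbo handling.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound for this sketch; not a value taken from resctrl. */
#define SKETCH_MAX_CLOSID 128
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS ((SKETCH_MAX_CLOSID + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct closid_map {
	unsigned long free_bits[MAP_WORDS];	/* bit set => CLOSID is free */
	unsigned int num_closid;		/* number of usable CLOSIDs */
};

/* Mark the first num_closid IDs as free. */
void closid_map_init(struct closid_map *m, unsigned int num_closid)
{
	unsigned int i;

	assert(num_closid <= SKETCH_MAX_CLOSID);
	memset(m->free_bits, 0, sizeof(m->free_bits));
	m->num_closid = num_closid;
	for (i = 0; i < num_closid; i++)
		m->free_bits[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* Return the lowest free CLOSID, or -1 when none are left. */
int closid_map_alloc(struct closid_map *m)
{
	unsigned int i;

	for (i = 0; i < m->num_closid; i++) {
		unsigned long bit = 1UL << (i % BITS_PER_WORD);

		if (m->free_bits[i / BITS_PER_WORD] & bit) {
			m->free_bits[i / BITS_PER_WORD] &= ~bit;
			return (int)i;
		}
	}
	return -1;
}

/* Hand a previously allocated CLOSID back to the map. */
void closid_map_free(struct closid_map *m, unsigned int closid)
{
	assert(closid < m->num_closid);
	m->free_bits[closid / BITS_PER_WORD] |= 1UL << (closid % BITS_PER_WORD);
}
```

With an array of words the only limit left is the size chosen for the array, so a caller
asking for the 33rd ID gets it as long as num_closid allows, and gets -1 (which the
filesystem could turn into a failed mkdir(2)) once the IDs run out.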
> If you really do want to have this #define ... maybe you should use it
> in place of the hard coded 32 here:
>
> static void closid_init(void)
> {
>         struct resctrl_schema *s;
>         u32 rdt_min_closid = 32;
> }
Oops - I'd missed that.
>> +#define RESCTRL_MAX_CBM 32
>
> Intel x86 could plausibly expand the cache bitmap size (the MSRs
> that store them currently have bits 63:32 reserved, but that could be
> changed).
> The only 32-bit limits are the CPUID field that enumerates
> CBM_LEN and the CPUID field that enumerates the shared bitmap. The
> length has space for expansion, the shared bitfield does not. So if
> Intel did go to more than 32-bits we'd be stuck making sure any shared
> bits were in the lower 32-bits.
Thanks for looking those up - I thought these limits were hard-and-fast!
MPAM's limit for the control values for the cache bitmaps is 4K - I suggest we ignore this!
MPAM can't describe a shared bitmap; if someone ever needed that, it would have to be
done by a firmware table. The CPU's max CLOSID limit is 2^16.
Thanks,
James
* Re: [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
2024-10-23 23:50 ` Reinette Chatre
@ 2025-02-07 15:54 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:54 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 24/10/2024 00:50, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>> resctrl can't be built as a module, and the kernfs helpers are not exported
>> so this is unlikely to change. MPAM has an error interrupt which indicates
>> the MPAM driver has gone haywire. Should this occur tasks could run with
>> the wrong control values, leading to bad performance for important tasks.
>> The MPAM driver needs a way to tell resctrl that no further configuration
>> should be attempted.
>>
>> Using resctrl_exit() for this leaves the system in a funny state as
>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>> directory that is typically used has been removed. Dave Martin suggests
>> this may cause systemd trouble in the future as not all filesystems
>> can be unmounted.
>>
>> Add calls to remove all the files and directories in resctrl, and
>> remove the sysfs_remove_mount_point() call that leaves the system
>> in a funny state. When triggered, this causes all the resctrl files
>> to disappear. resctrl can be unmounted, but not mounted again.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index f77fab859c35..bb5aadaf99b6 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -4319,9 +4319,9 @@ int __init resctrl_init(void)
>>
>> void __exit resctrl_exit(void)
>> {
>> + rdtgroup_destroy_root();
>
> If I understand correctly, rdtgroup_destroy_root() can now be called
> twice, first during the error interrupt and then on unmount. Would the
> second call be safe?
Hmmm, I thought the mount point would be holding a reference, but this is undoing the work
done at mount time, not init time. Yes, it's not safe.
As there is no caller of resctrl_exit() until the MPAM driver, I had another piece left
until later - which covers what happens if the error triggers when resctrl is not mounted:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.12-rc1&id=44bb27404b4ce6744fdd4058d1fc07ed2f8d1a9f
(which also covers serialising this against umount if the caller is really unlucky)
> I am not familiar with this code but I
> see kernfs_destroy_root() and __kernfs_remove() dereferencing pointers
> without checks. I wonder if this needs to be made safer with a:
> rdtgroup_destroy_root()
> {
>         if (rdtgroup_default.kn) {
>                 kernfs_destroy_root();
>                 rdtgroup_default.kn = NULL;
>         }
> }
My version checked rdt_root - but nothing actually nobbles that. Your version is a lot
better. Thanks!
If there was a helper to reverse kernfs_root_to_node(), it'd be possible to remove
rdt_root completely - but its contents are private to kernfs.
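To spell out why the NULL check makes a second call harmless, here is a small user-space
sketch of the same guard pattern. The types and helpers (struct fake_node, struct
fake_group, fake_group_*()) are stand-ins invented for the example, not kernfs or resctrl
interfaces, and the sketch deliberately ignores the locking needed to serialise against
umount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the kernel objects; hypothetical, for illustration only. */
struct fake_node {
	int unused;
};

struct fake_group {
	struct fake_node *kn;	/* NULL once the root has been destroyed */
};

int destroy_calls;		/* counts real teardowns, for the example */

void fake_group_create_root(struct fake_group *g)
{
	assert(!g->kn);		/* creating twice would leak in this sketch */
	g->kn = malloc(sizeof(*g->kn));
}

/* Safe to call more than once: only the first call does any work. */
void fake_group_destroy_root(struct fake_group *g)
{
	if (!g->kn)
		return;

	free(g->kn);
	g->kn = NULL;		/* later callers see there is nothing to do */
	destroy_calls++;
}
```

The pointer doubles as the "already torn down" flag, so the error path and the unmount
path can both call the destroy helper without needing to know which one ran first.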
>> debugfs_remove_recursive(debugfs_resctrl);
>> unregister_filesystem(&rdt_fs_type);
>> - sysfs_remove_mount_point(fs_kobj, "resctrl");
> This breaks symmetry with resctrl_init(). The changelog describes the
> motivation clearly but once this line is removed it will be difficult to
> get back to this motivation. Could this function get a comment to explain
> why the mount point is not removed? This will be helpful to anybody following
> this work that may attempt to "fix" the asymmetry by cleaning up the
> mount point created during init.
Sure. I've added some kdoc to explain where/when this is called, and what it does at a
high level:
| /**
| * resctrl_exit() - Remove the resctrl filesystem and free resources.
| *
| * Called by the architecture code in response to a fatal error.
| * Resctrl files and structures are removed from kernfs to prevent further
| * configuration.
| */
Then specifically:
| /*
| * The sysfs mount point added by resctrl_init() is not removed so that
| * it can be used to umount resctrl.
| */
Thanks,
James
* Re: [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols
2024-10-23 23:56 ` Reinette Chatre
@ 2025-02-07 15:54 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:54 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 24/10/2024 00:56, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> Because ARM's MPAM controls are probed using MMIO, resctrl can't be
>> initialised until enough CPUs are online to have determined the
>> system-wide supported num_closid. Arm64 also supports 'late onlined
>> secondaries', where only a subset of CPUs are online during boot.
>>
>> These two combine to mean the MPAM driver may not be able to initialise
>> resctrl until user-space has brought 'enough' CPUs online.
>>
>> To allow MPAM to initialise resctrl after __init text has been free'd,
>> remove all the __init markings from resctrl.
>>
>> The existing __exit markings cause these functions to be removed by the
>> linker as it has never been possible to build resctrl as a module. MPAM
>> has an error interrupt which causes the driver to reset and disable
>> itself. Remove the __exit markings to allow the MPAM driver to tear down
>> resctrl when an error occurs.
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index f484726a2588..f713ac628444 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -775,7 +775,7 @@ struct rdt_options {
>> bool force_off, force_on;
>> };
>>
>> -static struct rdt_options rdt_options[] __initdata = {
>> +static struct rdt_options rdt_options[] __ro_after_init = {
>> RDT_OPT(RDT_FLAG_CMT, "cmt", X86_FEATURE_CQM_OCCUP_LLC),
>> RDT_OPT(RDT_FLAG_MBM_TOTAL, "mbmtotal", X86_FEATURE_CQM_MBM_TOTAL),
>> RDT_OPT(RDT_FLAG_MBM_LOCAL, "mbmlocal", X86_FEATURE_CQM_MBM_LOCAL),
>> @@ -815,7 +815,7 @@ static int __init set_rdt_options(char *str)
>> }
>> __setup("rdt", set_rdt_options);
>>
>> -bool __init rdt_cpu_has(int flag)
>> +bool rdt_cpu_has(int flag)
> I assume this can be dropped when resctrl_arch_is_evt_configurable() uses
> a helper instead?
resctrl_arch_is_evt_configurable() is that helper! If we wanted to decouple this in the
x86 arch code, it could do the rdt_cpu_has() stuff at boot, then test flags in
rdt_mon_features - but that isn't how it works today, and I don't think it's worth the churn.
This __init marker causes a mismatched-sections warning because it's called via
resctrl_arch_is_evt_configurable() from non __init marked code.
Thanks,
James
* Re: [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code
2024-10-24 0:08 ` Reinette Chatre
@ 2025-02-07 15:54 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:54 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 24/10/2024 01:08, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
> ...
>> +++ b/fs/resctrl/Kconfig
>> @@ -0,0 +1,37 @@
>> +config RESCTRL_FS
>> + bool "CPU Resource Control Filesystem (resctrl)"
>> + depends on ARCH_HAS_CPU_RESCTRL
>> + select KERNFS
>> + select PROC_CPU_RESCTRL if PROC_FS
>> + help
>> + Some architectures provide hardware facilities to group tasks and
>> + monitor and control their usage of memory system resources such as
>> + caches and memory bandwidth. Examples of such facilities include
>> + Intel's Resource Director Technology (Intel(R) RDT) and AMD's
>> + Platform Quality of Service (AMD QoS).
>> +
>> + If your system has the necessary support and you want to be able to
>> + assign tasks to groups and manipulate the associated resource
>> + monitors and controls from userspace, say Y here to get a mountable
>> + 'resctrl' filesystem that lets you do just that.
>> +
>> + If nothing mounts or prods the 'resctrl' filesystem, resource
>> + controls and monitors are left in a quiescent, permissive state.
>> +
>> + On architectures where this can be disabled independently, it is
>> + safe to say N.
>> +
>> + See <file:Documentation/arch/x86/resctrl.rst> for more information.
>> +
>> +config RESCTRL_FS_PSEUDO_LOCK
>> + bool
>> + help
>> + Software mechanism to pin data in a cache portion using
>> + micro-architecture specific knowledge.
>> +
> There now seems to be two copies of this ... patch #23 added this exact same
> "config RESCTRL_FS_PSEUDO_LOCK" snippet to arch/x86/Kconfig
Oops - I'm surprised Kbuild didn't bark at me about that!
>> +config RESCTRL_RMID_DEPENDS_ON_CLOSID
>> + bool
>> + help
>> + Enable by the architecture when the RMID values depend on the CLOSID.
>
> "Enable by" -> "Enabled by"?
Yup,
Thanks,
James
* Re: [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl
2024-10-24 0:17 ` Reinette Chatre
@ 2025-02-07 15:55 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-07 15:55 UTC (permalink / raw)
To: Reinette Chatre, Tony Luck
Cc: x86, linux-kernel, Fenghua Yu, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H Peter Anvin, Babu Moger,
shameerali.kolothum.thodi, D Scott Phillips OS, carl, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Shaopeng Tan
Hi Reinette, Tony,
On 24/10/2024 01:17, Reinette Chatre wrote:
> On 10/8/24 4:08 PM, Tony Luck wrote:
>> On Fri, Oct 04, 2024 at 06:03:47PM +0000, James Morse wrote:
>>> +functions_to_move = [
>>> + # common
>>> + "pr_fmt",
>>> +
>>> + # ctrlmon.c
>>> + "rdt_parse_data",
>>> + "(ctrlval_parser_t)",
>>> + "bw_validate",
>>> + "parse_bw",
>>> + "cbm_validate",
>>> + "parse_cbm",
>>> + "get_parser",
>>> + "parse_line",
>>> + "rdtgroup_parse_resource",
>>> + "rdtgroup_schemata_write",
>>> + "show_doms",
>>> + "rdtgroup_schemata_show",
>>> + "smp_mon_event_count",
>>> + "mon_event_read",
>>> + "rdtgroup_mondata_show",
>>> +
>>> + # internal.h
>>> + "MBM_OVERFLOW_INTERVAL",
>>> + "CQM_LIMBOCHECK_INTERVAL",
>>> + "cpumask_any_housekeeping",
>>> + "rdt_fs_context",
>>> + "rdt_fc2context",
>>> + "mon_evt",
>>> + "mon_data_bits",
>>> + "rmid_read",
>>> + "resctrl_schema_all",
>>> + "resctrl_mounted",
>>> + "rdt_group_type",
>>> + "rdtgrp_mode",
>>> + "mongroup",
>>> + "rdtgroup",
>>> + "RFTYPE_FLAGS_CPUS_LIST",
>>
>> Something goes wrong with moving the RFTYPE_* defines. A new copy
>> shows up in fs/resctrl/internal.h but the old copy isn't removed from
>> arch/x86/kernel/cpu/resctrl/internal.h
>>
> There seems to be a few more duplicates: RMID_VAL_ERROR, RMID_VAL_UNAVAIL,
> MBM_CNTR_WIDTH_OFFSET_MAX, RDT_DELETED.
I had a follow-up patch to remove those[0] - but I've fixed it in the script now.
These were mostly due to a missing else when spotting macros that weren't multi-line.
RDT_DELETED was a good laugh - that has a tab after the #define, not a space - I stared at
that for a good while!
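The move script itself is Python; purely to illustrate the tab-versus-space pitfall, the
same matching idea can be sketched in C as below. The helper name line_defines() is
invented for the example, and it only handles a "#define" that starts at the beginning of
the line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'line' is "#define <name> ..." where the gap after
 * "#define" is any run of spaces or tabs, not just a single space.
 */
bool line_defines(const char *line, const char *name)
{
	static const char kw[] = "#define";
	size_t len;

	assert(line && name);
	len = strlen(name);

	if (strncmp(line, kw, sizeof(kw) - 1) != 0)
		return false;
	line += sizeof(kw) - 1;

	/* Require at least one blank, then skip the rest of the gap. */
	if (*line != ' ' && *line != '\t')
		return false;
	while (*line == ' ' || *line == '\t')
		line++;

	if (strncmp(line, name, len) != 0)
		return false;

	/* The name must end here, so FOO does not match FOOBAR. */
	return !(isalnum((unsigned char)line[len]) || line[len] == '_');
}
```

A matcher that compares against the literal string "#define " would miss the tab case,
which is the kind of duplicate described above.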
> Also please check the trace files after the move and related CFLAGS.
> arch/x86/kernel/cpu/resctrl/monitor_trace.h is just empty and the
> arch/x86/kernel/cpu/resctrl/Makefile still has the "CFLAGS_monitor.o = -I$(src)"
> Also please check fs/resctrl/pseudo_lock_trace.h
I'm afraid this last one is expected. There is a follow-up patch to remove the harmless
boiler plate that gets generated here:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.12-rc1&id=02cd8d6facc3d3a8984b260203d71cba9d2e462c
I could hack in something to delete the files - but it'd need to know something about C in
order to remove the #includes too - otherwise the result can't be built.
If this is a problem, we can ask whoever generates the 'final' version of the patch to
merge that follow-up in.
I'll add a list of these corner cases to the 'commit message'. (we shouldn't commit this!)
The additional corner case is the includes for 'asm' or relative paths that would break
the build on other architectures - but don't matter yet.
I couldn't include those patches in this series as they don't apply without posting the
generated patch ...
Thanks,
James
[0]
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.12-rc1&id=2fada2eb99e2814984bbba9106a9695ef6d4b8b1
* Re: [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl
2024-10-23 22:59 ` Reinette Chatre
@ 2025-02-10 13:22 ` James Morse
0 siblings, 0 replies; 102+ messages in thread
From: James Morse @ 2025-02-10 13:22 UTC (permalink / raw)
To: Reinette Chatre, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi Reinette,
On 23/10/2024 23:59, Reinette Chatre wrote:
> On 10/4/24 11:03 AM, James Morse wrote:
>> thread_throttle_mode_init() is called from the architecture specific code
>> to make the 'thread_throttle_mode' file visible. The architecture specific
>> code has already set the membw.throttle_mode in the rdt_resource.
>>
>> This doesn't need to be specific to the architecture, the throttle_mode
>> can be used by resctrl to determine if the 'thread_throttle_mode' file
>> should be visible.
>>
>> Call thread_throttle_mode_init() from resctrl_setup(), check the
>> membw.throttle_mode on the MBA resource. This avoids publishing an
>> extra function between the architecture and filesystem code.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 3f10e6897daa..596f5f087834 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2048,10 +2048,15 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
>> return NULL;
>> }
>>
>> -void __init thread_throttle_mode_init(void)
>> +static void __init thread_throttle_mode_init(void)
>> {
>> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_MBA);
>> struct rftype *rft;
>>
>> + if (!r->alloc_capable ||
>> + r->membw.throttle_mode == THREAD_THROTTLE_UNDEFINED)
>> + return;
>> +
>
> The goal from the changelog is to make "thread_throttle_mode_init()" not be specific
> to an architecture. It does so by checking the value of rdt_resource->resctrl_membw->membw_throttle_mode.
> I thus expect that as part of being non-architectural it should check this for all
> resources that initialize resctrl_membw, this includes RDT_RESOURCE_SMBA.
Sure.
Adding this creates the new corner-case where MBA has a throttle_mode but SMBA does not.
I'll add the corresponding logic to rdt_thread_throttle_mode_show() to print 'undefined'
to user-space if that ever happens.
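A rough user-space sketch of that logic, under the assumption that the file becomes
visible if any bandwidth resource has a defined mode and that each resource is reported
separately. The enum, struct and function names are stand-ins chosen for the example
rather than the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's throttle mode values. */
enum sketch_throttle_mode {
	SKETCH_THROTTLE_UNDEFINED,
	SKETCH_THROTTLE_MAX,
	SKETCH_THROTTLE_PER_THREAD,
};

/* Stand-in for the parts of a bandwidth resource this check needs. */
struct sketch_res {
	bool alloc_capable;
	enum sketch_throttle_mode mode;
};

/* Show the file if any allocation-capable resource has a defined mode. */
bool sketch_throttle_file_visible(const struct sketch_res *res, size_t n)
{
	size_t i;

	assert(res != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (res[i].alloc_capable &&
		    res[i].mode != SKETCH_THROTTLE_UNDEFINED)
			return true;
	}
	return false;
}

/* String reported to user-space for one resource. */
const char *sketch_throttle_mode_str(const struct sketch_res *r)
{
	switch (r->mode) {
	case SKETCH_THROTTLE_PER_THREAD:
		return "per-thread";
	case SKETCH_THROTTLE_MAX:
		return "max";
	default:
		return "undefined";
	}
}
```

In the corner case described above, one resource would make the file visible while the
other falls through to the "undefined" string.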
Thanks,
James
* Re: [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources
2025-02-07 15:43 ` James Morse
@ 2025-02-13 23:52 ` Reinette Chatre
0 siblings, 0 replies; 102+ messages in thread
From: Reinette Chatre @ 2025-02-13 23:52 UTC (permalink / raw)
To: James Morse, x86, linux-kernel
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, Babu Moger, shameerali.kolothum.thodi,
D Scott Phillips OS, carl, lcherian, bobo.shaobowang,
tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao, peternewman,
dfustini, amitsinght, David Hildenbrand, Rex Nie, Dave Martin,
Shaopeng Tan
Hi James,
On 2/7/25 7:43 AM, James Morse wrote:
> Hi Reinette,
>
> On 23/10/2024 22:32, Reinette Chatre wrote:
>> On 10/4/24 11:03 AM, James Morse wrote:
>>> On umount(), resctrl resets each resource back to its default
>>> configuration. It only ever does this for all resources in one go.
>>>
>>> reset_all_ctrls() is architecture specific as it works with struct
>>> rdt_hw_resource.
>>>
>>> Add an architecture helper to reset all resources.
>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 61c8add103fe..a15198f90b29 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -2883,6 +2883,14 @@ static int reset_all_ctrls(struct rdt_resource *r)
>>> return 0;
>>> }
>>>
>>> +void resctrl_arch_reset_resources(void)
>>> +{
>>> + struct rdt_resource *r;
>>> +
>>> + for_each_alloc_capable_rdt_resource(r)
>>> + reset_all_ctrls(r);
>>> +}
>
>> Wouldn't this require all archs to have a duplicate helper as above with
>> only the reset_all_ctrls() actually being arch specific?
>
> I was hoping to be able to save a few IPI by doing the per-core work once, instead of
> per-resource-per-core ... but it's only done on umount, so I doubt anyone will complain!
This is very reasonable but that is not what the code does today and the
helper is added to today's code without providing insight into future optimizations. It
sounds as though MPAM was planning something different (better) for which the helper in this
version would be appropriate and I expect that x86 could also benefit from that. I
understand that this is not a "fast" path but it raises the question of how optimizations
across archs should be handled. Ideally we should look across archs for ideal helpers
and not force one arch to adapt to what works for another. I am not advocating for a
change to what you submitted in v6 but instead would like to share that I think by not
having a discussion before switching to a new helper there was a missed opportunity for
both archs.
Reinette
Thread overview: 102+ messages
2024-10-04 18:03 [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem code to /fs/resctrl James Morse
2024-10-04 18:03 ` [PATCH v5 01/40] x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors James Morse
2024-10-04 18:03 ` [PATCH v5 02/40] x86/resctrl: Add a helper to avoid reaching into the arch code resource list James Morse
2024-10-15 22:57 ` Tony Luck
2024-10-18 17:07 ` James Morse
2024-10-18 17:14 ` Luck, Tony
2024-10-23 21:03 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 03/40] x86/resctrl: Remove fflags from struct rdt_resource James Morse
2024-10-23 21:03 ` Reinette Chatre
2024-12-20 18:10 ` James Morse
2024-10-04 18:03 ` [PATCH v5 04/40] x86/resctrl: Use schema type to determine how to parse schema values James Morse
2024-10-15 23:15 ` Tony Luck
2024-10-18 17:07 ` James Morse
2024-10-23 21:14 ` Reinette Chatre
2024-12-20 18:10 ` James Morse
2024-10-04 18:03 ` [PATCH v5 05/40] x86/resctrl: Use schema type to determine the schema format string James Morse
2024-10-21 17:39 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 06/40] x86/resctrl: Remove data_width and the tabular format James Morse
2024-10-15 23:29 ` Tony Luck
2024-10-18 17:07 ` James Morse
2024-10-04 18:03 ` [PATCH v5 07/40] x86/resctrl: Add max_bw to struct resctrl_membw James Morse
2024-10-09 18:02 ` Tony Luck
2024-10-23 21:14 ` Reinette Chatre
2024-12-20 18:10 ` James Morse
2024-10-04 18:03 ` [PATCH v5 08/40] x86/resctrl: Generate default_ctrl instead of sharing it James Morse
2024-10-23 21:15 ` Reinette Chatre
2025-02-07 15:42 ` James Morse
2024-10-04 18:03 ` [PATCH v5 09/40] x86/resctrl: Add helper for setting CPU default properties James Morse
2024-10-04 18:03 ` [PATCH v5 10/40] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid() James Morse
2024-10-04 18:03 ` [PATCH v5 11/40] x86/resctrl: Export resctrl fs's init function James Morse
2024-10-16 16:20 ` Tony Luck
2024-10-04 18:03 ` [PATCH v5 12/40] x86/resctrl: Wrap resctrl_arch_find_domain() around rdt_find_domain() James Morse
2024-10-23 21:16 ` Reinette Chatre
2025-02-07 15:42 ` James Morse
2024-10-04 18:03 ` [PATCH v5 13/40] x86/resctrl: Move resctrl types to a separate header James Morse
2024-10-04 18:03 ` [PATCH v5 14/40] x86/resctrl: Add a resctrl helper to reset all the resources James Morse
2024-10-23 21:32 ` Reinette Chatre
2025-02-07 15:43 ` James Morse
2025-02-13 23:52 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 15/40] x86/resctrl: Move monitor exit work to a resctrl exit call James Morse
2024-10-04 18:03 ` [PATCH v5 16/40] x86/resctrl: Move monitor init work to a resctrl init call James Morse
2024-10-04 18:03 ` [PATCH v5 17/40] x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkers James Morse
2024-10-08 0:00 ` Tony Luck
2024-10-08 16:40 ` Reinette Chatre
2024-10-18 17:07 ` James Morse
2024-10-23 21:51 ` Reinette Chatre
2025-02-07 15:44 ` James Morse
2024-10-04 18:03 ` [PATCH v5 18/40] x86/resctrl: Export the is_mbm_*_enabled() helpers to asm/resctrl.h James Morse
2024-10-23 22:00 ` Reinette Chatre
2025-02-07 15:44 ` James Morse
2024-10-04 18:03 ` [PATCH v5 19/40] x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMEC James Morse
2024-10-23 22:04 ` Reinette Chatre
2025-02-07 15:44 ` James Morse
2024-10-04 18:03 ` [PATCH v5 20/40] x86/resctrl: Slightly clean-up mbm_config_show() James Morse
2024-10-16 16:50 ` Tony Luck
2024-10-04 18:03 ` [PATCH v5 21/40] x86/resctrl: Change mon_event_config_{read,write}() to be arch helpers James Morse
2024-10-23 22:19 ` Reinette Chatre
2025-02-07 15:45 ` James Morse
2024-10-04 18:03 ` [PATCH v5 22/40] x86/resctrl: Move mbm_cfg_mask to struct rdt_resource James Morse
2024-10-23 22:42 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 23/40] x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functions James Morse
2024-10-23 22:44 ` Reinette Chatre
2025-02-07 15:45 ` James Morse
2024-10-04 18:03 ` [PATCH v5 24/40] x86/resctrl: Allow an architecture to disable pseudo lock James Morse
2024-10-04 18:03 ` [PATCH v5 25/40] x86/resctrl: Make prefetch_disable_bits belong to the arch code James Morse
2024-10-23 22:53 ` Reinette Chatre
2025-02-07 15:46 ` James Morse
2024-10-04 18:03 ` [PATCH v5 26/40] x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plr James Morse
2024-10-04 18:03 ` [PATCH v5 27/40] x86/resctrl: Move thread_throttle_mode_init() to be managed by resctrl James Morse
2024-10-23 22:59 ` Reinette Chatre
2025-02-10 13:22 ` James Morse
2024-10-04 18:03 ` [PATCH v5 28/40] x86/resctrl: Move get_config_index() to a header James Morse
2024-10-04 18:03 ` [PATCH v5 29/40] x86/resctrl: Claim get_{mon,ctrl}_domain_from_cpu() helpers for resctrl James Morse
2024-10-23 23:02 ` Reinette Chatre
2024-10-04 18:03 ` [PATCH v5 30/40] x86/resctrl: Describe resctrl's bitmap size assumptions James Morse
2024-10-08 18:50 ` Tony Luck
2025-02-07 15:46 ` James Morse
2024-10-04 18:03 ` [PATCH v5 31/40] x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" James Morse
2024-10-04 18:03 ` [PATCH v5 32/40] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point James Morse
2024-10-23 23:50 ` Reinette Chatre
2025-02-07 15:54 ` James Morse
2024-10-04 18:03 ` [PATCH v5 33/40] x86/resctrl: Drop __init/__exit on assorted symbols James Morse
2024-10-23 23:56 ` Reinette Chatre
2025-02-07 15:54 ` James Morse
2024-10-04 18:03 ` [PATCH v5 34/40] x86/resctrl: Move is_mba_sc() out of core.c James Morse
2024-10-04 18:03 ` [PATCH v5 35/40] x86/resctrl: Add end-marker to the resctrl_event_id enum James Morse
2024-10-04 18:03 ` [PATCH v5 36/40] x86/resctrl: Remove a newline to avoid confusing the code move script James Morse
2024-10-04 18:03 ` [PATCH v5 37/40] x86/resctrl: Split trace.h James Morse
2024-10-04 18:03 ` [PATCH v5 38/40] fs/resctrl: Add boiler plate for external resctrl code James Morse
2024-10-08 23:03 ` Tony Luck
2024-10-24 0:08 ` Reinette Chatre
2025-02-07 15:54 ` James Morse
2024-10-04 18:03 ` [PATCH v5 39/40] x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl James Morse
2024-10-04 18:03 ` [PATCH v5 40/40] x86/resctrl: Add python script to move resctrl code to /fs/resctrl James Morse
2024-10-08 23:08 ` Tony Luck
2024-10-24 0:17 ` Reinette Chatre
2025-02-07 15:55 ` James Morse
2024-10-04 21:18 ` [PATCH v5 00/40] x86/resctrl: Move the resctrl filesystem " Reinette Chatre
2024-10-07 17:29 ` James Morse
2024-10-08 23:24 ` Tony Luck
2024-10-17 17:43 ` Tony Luck
2024-12-06 7:17 ` Shaopeng Tan (Fujitsu)